# Data Description

On this page, we provide a detailed analysis of the Mobike trajectory data that we use in the UrbanBike project. The content consists of four sections:

- Basic Statistics
- Mobike Mobility Characteristics
- Insight Discoveries
- NOTE: Data Used in the Demo System

## Basic Statistics

**Road Networks**. We use the road network of Shanghai, China from Bing Map, which contains 333,766 vertices and 440,922 edges.

**Mobike Trajectories**. Each Mobike trajectory contains a bike ID, a user ID, the time range of the trajectory, a pair of start/end locations, and a sequence of intermediate GPS points.
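To make the record layout concrete, here is a minimal sketch of how one trajectory could be represented in Python. The field names, the string IDs, and the (longitude, latitude) coordinate order are assumptions made for illustration; the actual schema of the dataset is not specified here.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Tuple

# (longitude, latitude) in degrees; the coordinate order is an assumption.
Point = Tuple[float, float]


@dataclass
class Trajectory:
    """One Mobike trip; all field names are hypothetical."""
    bike_id: str
    user_id: str
    start_time: datetime
    end_time: datetime
    start_location: Point
    end_location: Point
    gps_points: List[Point]  # intermediate GPS samples, in time order

    def duration_minutes(self) -> float:
        """Length of the temporal range in minutes."""
        return (self.end_time - self.start_time).total_seconds() / 60.0


# Made-up example record, not taken from the dataset.
example = Trajectory(
    bike_id="bike-0001",
    user_id="user-0001",
    start_time=datetime(2016, 9, 1, 8, 0, 0),
    end_time=datetime(2016, 9, 1, 8, 12, 0),
    start_location=(121.3200, 31.2400),
    end_location=(121.3300, 31.2450),
    gps_points=[(121.3250, 31.2420)],
)
```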

#### Figure 1: Mobike Trajectory Data Distribution.

The Mobike dataset we collected contains one month of data (2016/09/01 to 2016/09/30) from the city of Shanghai (**Figure 1** gives an overview of the spatial distribution). The dataset contains 13,063 unique users, 3,971 bikes, and 230,303 trajectories (with a total of 18,039,283 unique GPS points).
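The totals above imply some simple per-user and per-bike averages. The short sketch below derives them directly from the reported counts; the variable names are illustrative, and the averages are plain ratios, not figures reported by the project.

```python
# Counts reported for the one-month dataset.
users = 13_063
bikes = 3_971
trajectories = 230_303
gps_points = 18_039_283

# Plain ratios derived from the counts above.
trips_per_user = trajectories / users
trips_per_bike = trajectories / bikes
points_per_trip = gps_points / trajectories

print(f"trips per user:       {trips_per_user:.1f}")
print(f"trips per bike:       {trips_per_bike:.1f}")
print(f"GPS points per trip:  {points_per_trip:.1f}")
```

On average this is roughly 18 trips per user, 58 trips per bike, and 78 GPS points per trajectory over the month.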

## Mobike Mobility Characteristics

### Trip length distribution

#### Figure 2: Bike Trip Length Distribution.

**Figure 2** summarizes the distribution of trip lengths of the Mobike users. The majority of the trajectories are relatively short: more than 70% of the trips are shorter than 2 km, likely because riders have limited physical strength. This observation is consistent with the assumption that shared bike services address the “last mile problem” in public transportation systems.
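A statistic such as "share of trips shorter than 2 km" could be computed along the following lines. This is a sketch under assumptions: trip length is approximated as the sum of great-circle (haversine) distances between consecutive GPS points, which may differ from how lengths were measured for Figure 2 (for example, along the map-matched road path). The function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]  # (longitude, latitude) in degrees (assumed order)


def haversine_km(lon1: float, lat1: float, lon2: float, lat2: float) -> float:
    """Great-circle distance in km, using a mean Earth radius of 6371 km."""
    r = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def trip_length_km(points: Sequence[Point]) -> float:
    """Sum of distances between consecutive GPS points."""
    return sum(
        haversine_km(p[0], p[1], q[0], q[1])
        for p, q in zip(points, points[1:])
    )


def share_shorter_than(lengths_km: List[float], threshold_km: float = 2.0) -> float:
    """Fraction of trips strictly shorter than the threshold."""
    if not lengths_km:
        raise ValueError("no trips given")
    return sum(1 for d in lengths_km if d < threshold_km) / len(lengths_km)


# Made-up trip lengths, for illustration only.
sample_share = share_shorter_than([0.5, 1.5, 3.0, 1.0])
```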

### Trip duration distribution

#### Figure 3: Bike Trip Duration Distribution.

**Figure 3** gives the trip duration distribution, where the majority of the trips are within 30 minutes. There are two reasons: 1) most of the trips are shorter than 2 km, which should be completed within 15 minutes, and 2) the pricing plan of Mobike charges a user one RMB per 30 minutes (we can also notice a sudden drop around the 30-minute mark).
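The pricing effect can be illustrated with a small fare function. The sketch assumes that every started 30-minute block is billed as a full block, which is one plausible reading of "one RMB per 30 minutes"; the exact rounding rule is not stated here, and `fare_rmb` is a hypothetical name.

```python
import math


def fare_rmb(duration_minutes: float, block_minutes: int = 30, price_per_block: int = 1) -> int:
    """Fare in RMB, assuming each started block is charged in full."""
    if duration_minutes <= 0:
        raise ValueError("duration must be positive")
    return math.ceil(duration_minutes / block_minutes) * price_per_block


# Under this assumption the fare doubles just past the 30-minute mark.
fares = {minutes: fare_rmb(minutes) for minutes in (10, 29, 30, 31, 60, 61)}
```

Under this reading, a rider who ends the trip just before the 30-minute boundary pays half of what a slightly longer trip would cost, which is consistent with the drop visible in the figure.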

### Trip temporal distribution

#### Figure 4: Bike Trip Temporal Distribution.

**Figure 4** illustrates the distribution of trip starting times. There are two clear usage peaks, corresponding to the morning and evening rush hours. Interestingly, there is still significant usage very late at night (10:00 PM to 3:00 AM), which may be generated by people working overtime.
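An hourly histogram like the one in Figure 4 could be built as sketched below. Binning by hour of day and the helper names are assumptions; the bin width used for the figure is not specified here.

```python
from collections import Counter
from datetime import datetime
from typing import Iterable, List


def trips_per_hour(start_times: Iterable[datetime]) -> List[int]:
    """Number of trips starting in each hour of the day (index 0..23)."""
    counts = Counter(t.hour for t in start_times)
    return [counts.get(hour, 0) for hour in range(24)]


def is_late_night(t: datetime) -> bool:
    """True for start times in the 10:00 PM to 3:00 AM window."""
    return t.hour >= 22 or t.hour < 3


# Made-up start times, for illustration only.
starts = [
    datetime(2016, 9, 1, 8, 5),
    datetime(2016, 9, 1, 8, 40),
    datetime(2016, 9, 1, 18, 15),
    datetime(2016, 9, 1, 23, 30),
    datetime(2016, 9, 2, 1, 10),
]
hourly = trips_per_hour(starts)
late_night_trips = sum(1 for t in starts if is_late_night(t))
```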

### Edge traversal distribution

#### Figure 5: Edge Traversal Distribution.

**Figure 5** depicts the distribution of edges with respect to the number of trajectories traversing them, in log scale. Most of the edges are covered by fewer than 100 trajectories, which suggests that bike users have diverse destinations. On the other hand, over 2,000 edges are each traversed by more than 1,000 trajectories, which supports the need for planning effective bike paths.
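The per-edge counts behind such a distribution could be obtained as in the following sketch. It assumes that each trajectory has already been map-matched to a sequence of road-network edge IDs, and that a trajectory is counted once per edge even if it revisits that edge; both are assumptions, as are the function names.

```python
from collections import Counter
from typing import Dict, Hashable, Iterable, Sequence


def edge_traversal_counts(matched: Iterable[Sequence[Hashable]]) -> Counter:
    """Number of distinct trajectories that traverse each edge."""
    counts: Counter = Counter()
    for edge_ids in matched:
        counts.update(set(edge_ids))  # count each trajectory once per edge
    return counts


def order_of_magnitude_histogram(counts: Counter) -> Dict[int, int]:
    """Number of edges per decade: key 0 is 1-9, key 1 is 10-99, and so on."""
    hist: Counter = Counter()
    for n in counts.values():
        hist[len(str(n)) - 1] += 1  # decimal digits minus one
    return dict(hist)


# Made-up map-matched trajectories, for illustration only.
matched_trajectories = [
    ["e1", "e2"],
    ["e2", "e3", "e2"],
    ["e2"],
]
per_edge = edge_traversal_counts(matched_trajectories)
decades = order_of_magnitude_histogram(per_edge)
```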

## Insight Discoveries

#### Figure 6: Spatial Insights of Mobike Data.

### Spatial Hot Spots

**Figure 6a** gives the top-2 spots with the most trip starts, where the upper one is a subway terminal station (Jinyun Road Station of Line 13), and the lower one is a very popular shopping mall (Bailian Zhonghuan Commerce Plaza). The intuition behind the observation is straightforward: although the mall is very popular, it is not close to any subway station, which makes cycling the best option; similarly, for the terminal station, the fastest and most economical way home is cycling.
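One simple way to find such hot spots is to bin trip start locations into a regular grid and rank the cells by count, as sketched below. The grid-based approach, the cell size of 0.005 degrees, and the function names are assumptions for illustration; the method actually used to produce Figure 6a is not described here.

```python
import math
from collections import Counter
from typing import Iterable, List, Tuple

Point = Tuple[float, float]  # (longitude, latitude) in degrees (assumed order)
Cell = Tuple[int, int]


def grid_cell(lon: float, lat: float, cell_deg: float = 0.005) -> Cell:
    """Index of the grid cell that contains the given location."""
    return (math.floor(lon / cell_deg), math.floor(lat / cell_deg))


def top_start_cells(starts: Iterable[Point], k: int = 2, cell_deg: float = 0.005) -> List[Tuple[Cell, int]]:
    """The k grid cells with the most trip starts, with their counts."""
    counts = Counter(grid_cell(lon, lat, cell_deg) for lon, lat in starts)
    return counts.most_common(k)


# Made-up start locations, for illustration only.
start_locations = (
    [(121.3201, 31.2401)] * 3
    + [(121.4001, 31.2001)] * 2
    + [(121.5001, 31.3001)]
)
hot_spots = top_start_cells(start_locations, k=2)
```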

### Star Pattern

We further investigate the travel directions around the spatial hot spots, and we discover that the bike trips go to different destinations from the same starting location, like multiple edges sharing one endpoint. We call this a star pattern, as shown by the arrows in **Figure 6b**.

### Temporal Imbalance

#### Figure 7: Temporal Differences of Trips.

**Figure 7** gives the Mobike trip locations at different time periods. In the early morning (**Figure 7a**), more trips start in the residential areas, while from around 08:00 a.m. to 10:00 a.m., more trips start at the subway station (**Figure 7b**). After analyzing their destinations, it is clear that in the early morning, people who live nearby ride bikes to the subway station to commute to work. Then, about an hour later, a different group of people arrives at the subway station and rides to the nearby malls and offices for work.

## NOTE: Data Used in the Demo System

The Mobike dataset we use in this demo system **only contains the trajectories of 8 days**, from 2016/09/01 to 2016/09/30.