Journal Pre-proof

Demand-aware mobile bike-sharing service using collaborative computing and information fusion in 5G IoT environment

Xiaoxian Yang, Yueshen Xu, Yishan Zhou, Shengli Song, Yinchen Wu

PII: S2352-8648(22)00126-2
DOI: https://doi.org/10.1016/j.dcan.2022.06.004
Reference: DCAN 458
To appear in: Digital Communications and Networks
Received Date: 11 July 2021
Revised Date: 4 June 2022
Accepted Date: 12 June 2022

Please cite this article as: X. Yang, Y. Xu, Y. Zhou, S. Song, Y. Wu, Demand-aware mobile bike-sharing service using collaborative computing and information fusion in 5G IoT environment, Digital Communications and Networks (2022), doi: https://doi.org/10.1016/j.dcan.2022.06.004.

This is a PDF file of an article that has undergone enhancements after acceptance, such as the addition of a cover page and metadata, and formatting for readability, but it is not yet the definitive version of record. This version will undergo additional copyediting, typesetting and review before it is published in its final form, but we are providing this version to give early visibility of the article. Please note that, during the production process, errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

© 2022 Chongqing University of Posts and Telecommunications. Production and hosting by Elsevier B.V. on behalf of KeAi Communications Co. Ltd.
Digital Communications and Networks (DCN)
journal homepage: www.elsevier.com/locate/dcan

Demand-aware mobile bike-sharing service using collaborative computing and information fusion in 5G IoT environment

Xiaoxian Yang (a), Yueshen Xu (b,*), Yishan Zhou (b), Shengli Song (b), Yinchen Wu (b)

(a) School of Computer and Information Engineering, Shanghai Polytechnic University, Shanghai, 201209, China
(b) School of Computer Science and Technology, Xidian University, Xi’an, 710126, China

* Corresponding author. The two authors Xiaoxian Yang and Yueshen Xu contribute equally to this paper, so Yueshen Xu is also the co-first author of this paper.
E-mail addresses: xxyang@sspu.edu.cn (X. Yang), ysxu@xidian.edu.cn (Y. Xu), yishanzhouh@hotmail.com (Y. Zhou), shlsong@xidian.edu.cn (S. Song), yinchenwu@stu.xidian.edu.cn (Y. Wu).

Abstract

Mobile bike-sharing services have been prevalently used in many cities as an important urban commuting service and a promising way to build smart cities, especially in the new era of 5G and Internet-of-Things (IoT) environments. A mobile bike-sharing service makes commuting convenient for people and imparts new vitality to urban transportation systems. In the real world, the problems of no docks or no bikes at bike-sharing stations often arise because of several inevitable reasons such as the uncertainty of bike usage. In addition to pure manual rebalancing, several works have attempted to predict the demand for bikes. In this paper, we devised a bike-sharing service with highly accurate demand prediction using collaborative computing and information fusion. We combined the information of bike demands at different time periods and the locations of stations, and proposed a dynamic clustering algorithm for station clustering. We carefully analyzed historical bike-sharing records and 5G IoT environment data and discovered the group of features that impact the demand for bikes. We combined the discovered information and proposed an XGBoost-based regression model to predict the rental and return demand. We performed extensive experiments on two real-world datasets. The results confirm that, compared to some existing methods, our method produces superior prediction results and performance and improves the availability of bike-sharing service in 5G IoT environments.

© 2022 Published by Elsevier Ltd.

KEYWORDS: Mobile bike-sharing service, Demand prediction, Collaborative computing, Information fusion, 5G IoT

1. Introduction

As a green and new mode of transportation, mobile bike-sharing services improve the diversity of city transportation and act as a convenient commuting service, helping solve the first-and-last mile problem in urban public transportation and effectively alleviating traffic jams. With mobile bike-sharing services, people do not need to buy or ride their own bikes but can still enjoy the sharing service. There are two mainstream types of bike-sharing services. One is the docked bike-sharing service, which is usually managed by metropolitan transportation service bureaus; users follow a service procedure to pick up a bike at one station, ride, and return the bike to another station. As an important way to build smart cities, the docked mobile bike-sharing service has been deployed in more than 700 cities [1] worldwide, including many famous cities that serve tens of millions of people, such as New York (Citi Bike, https://www.citibikenyc.com/), London (Beryl, https://beryl.cc/bikeshare/cities), Paris (Vélib’ Métropole, https://www.velib-metropole.fr/), Tokyo (Docomo, https://docomo-cycle.jp/), Beijing (http://bjggzxc.jtw.beijing.gov.cn/) and Hangzhou (https://www.hangzhou.com.cn/hzbike/). The other type is the non-docked bike-sharing service, which is usually managed by companies such as Meituan Bike (formerly known as Mobike, https://mobike.com/global/) and Hello Bike (https://www.helloglobal.com/). In this second type of service, people do not need to return bikes to stations, but the companies and governments encourage them to return bikes in a given area, and the operating companies face the problem of bike scheduling. In this paper, we consider the docked bike-sharing service as the main study case.

Although mobile bike-sharing services bring much convenience, they also face many challenges. For many bike stations, the usage mode of bikes is random, and excessive demand often occurs because of diverse factors, such as time, location and environment. There are two typical cases in which the demand for bikes is not satisfied and the quality-of-service (QoS) deteriorates, especially in 5G IoT environments. One is that no bikes are available for users to pick up, and the other is that no docks are available to return bikes. Thus, it is crucial to solve the task of demand prediction.

Some existing bike-sharing services try to monitor the status of bike stations, including the number of available bikes and docks [2]. In recent years, some studies have attempted to design bike scheduling strategies for the case in which a demand failure occurs. For example, some researchers designed an optimized route for vehicles that transport bikes among stations [3][4]. Some researchers introduced the concept of inventory and modeled the schedule as a non-stationary Markov chain [5]. Others used mathematical programming tools (e.g., integer programming and non-linear programming) to design transportation routes for bike scheduling [6]. Note that rebalancing bikes after a demand failure occurs is always a late measure, and it cannot guarantee bike availability. Thus, in this paper, we aimed to predict the demand for bikes at bike stations using information fusion and a collaborative approach in a 5G IoT environment.

In the task of demand prediction for bike-sharing services, some researchers define the task as a time series problem and use historical records to predict future values. Researchers have proposed several prediction models. The autoregressive moving average model (ARMA) and autoregressive integrated moving average model (ARIMA) are some representative models [7]. Other researchers used classification or regression methods to better realize demand prediction. For example, a classification method was used to predict the bike demand [8]. Furthermore, regression models (e.g., k-nearest neighbor (KNN) regression) have been used to predict the hourly demand [6].

However, we find that in reality, a station is not isolated but is closely related to the nearby stations. The following observations motivated us to introduce clustering analysis and careful feature analysis in this study. For a station that is near an office building but has a limited number of available bikes, the bike demand cannot be satisfied until more bikes are transported to this station. In such a case, users are likely to pick up bikes from nearby stations. Hence, the demands of a cluster of stations that are near each other may better reflect the complete demand of users. The demand prediction for an individual station cannot account for the help that is sought from nearby stations, so considering the bike demand for individual stations only results in low accuracy. The bike demand of a cluster is more stable than that of a single station. Moreover, we observe that the bike usage of a bike station over a long time usually shows similar patterns, which can be another useful feature. Besides, the weather condition also has a clear impact on bike demand [9]. Thus, taking full advantage of different features of bike usage is important. We expect that the study of different types of features can result in a new breakthrough to enhance bike demand prediction.

Through careful observation and investigation, we find that the bike demand is impacted by environmental factors, including temporal factors and meteorological factors. We propose a demand prediction framework for mobile bike-sharing service in 5G IoT environments using information fusion and collaborative computing. Specifically, we propose a dynamic location-aware clustering algorithm and introduce an XGBoost regression model [10] as the prediction model. Our contributions can be summarized as follows.

1. We performed a comprehensive study on a series of potential features and fusion of information, such as date, time, seasons, and weather conditions. We found that these factors have a clear impact on bike demand in the mobile service. Such information fusion is especially important for machine learning-based prediction methods.
2. We proposed a collaborative framework for demand prediction in the bike-sharing service. We constructed a weighted adjacency graph where weighted edges represent similarities among stations. We partitioned the adjacency graph into small sub-graphs and dynamically identified clusters hierarchically. Such a dynamic clustering approach can guarantee high similarity of the demand in a cluster in 5G IoT environments.
3. We proposed a representation method for the discovered features. We employed XGBoost as the regression model in the bike demand prediction problem to predict hourly demand.
4. We evaluated the performance of our framework using two real-world datasets containing real bike-sharing service data. The results show that our method provides excellent results, and compared to the state-of-the-art methods, our model has a clearly lower prediction error.
The rest of this paper is organized as follows. Section 2 presents the related works. Section 3 gives an overview of the developed mobile bike-sharing service. Section 4 elaborates the analysis of diverse features and the method of information fusion. Section 5 explains the proposed collaborative demand prediction approach in 5G IoT environments. Section 6 presents the experimental results and analysis. Section 7 concludes the paper and discusses future work.

2. Related Work

The problems in bike-sharing services have received increasing attention from academia and industry. This study focuses on three aspects, namely, service planning, demand prediction, and bike scheduling.

Collaborative Computing. Many cities currently do not have bike-sharing services and aim to build their own service. To build a bike-sharing service, many factors, such as construction cost, geographical locations and QoS, should be considered. These factors will determine the service scale and quality parameters, such as the number of stations, station deployment, and total number of bikes. In [11, 12], the authors computed the total number of bikes necessary in a bike-sharing service based on many factors. In [13], the authors focused on the construction of bike paths based on large-scale bike trajectory data. The authors in [14] tried to compute the number and locations of bike stations, considering the interests of users and investors and QoS constraints. The authors in [15, 16] gave suggestions for the design of station capacity, in addition to the deployment of station locations.

Information Fusion. Demand prediction is a crucial task that aims to predict the demand for bikes and docks to avoid unnecessary bike scheduling and improve bike availability. Existing studies on demand prediction can be summarized into two categories, i.e., studies based on cluster-level and station-level demand prediction. Bike usage is affected by many factors, such as weather, time, and external events, and the demand at one station has a strong connection with the neighboring stations. Therefore, station-level prediction involves many challenges [17]. Demand prediction was defined as a time series problem in [18, 19], and an ARIMA model was employed to predict the bike demand. Rixey et al. [20] built a linear regression model for predicting monthly rentals by studying the network effect of the scale and geographical distribution of bike-sharing stations. Cluster-level prediction assumes that if people arrive at a bike station that does not have enough bikes or docks, they will go to another nearby station. Researchers tried to group stations into clusters and predicted the demand of clusters. Li et al. [21] used K-means twice as the clustering method: the first clustering was based on the geographic distance between stations, and the second clustering was based on the results of the first clustering and bike usage patterns. Chen et al. [22] constructed a weighted correlation network to model the relationship among stations and predicted the over-demand probability of each cluster. Zhou et al. [23] clustered stations by the fast-greedy community detection method, which identified the core and representative clusters at certain time windows. In [24], K-means clustering was applied to divide all stations into several clusters, and latent Dirichlet allocation (LDA) was used to explore latent bike riding patterns.

Mobile Service Clustering. A bike-sharing service involves a large number of bikes and stations, and hence clustering is a feasible way to reduce the problem complexity [25, 26, 27]. In this paper, we applied clustering and developed prediction models in a low complexity scenario. We constructed a weighted adjacency graph, partitioned the graph into small sub-clusters, and merged highly correlated sub-clusters into bigger clusters. The results show that the quality of clusters is clearly better than that achieved by previous clustering methods. Hence, our clustering results provide a better basis for the subsequent demand prediction.

3. Mobile Bike-sharing Service

In this section, we provide the definitions used in this work, and present an overview of our framework.

3.1. Preliminary Information and Problem Definition

Definition 1. Riding record. A riding record $rt = (s_o, s_d, \tau_o, \tau_d)$ is a history record of bike usage from a starting station $s_o$ to a return station $s_d$. $\tau_o$ denotes the rental time at station $s_o$ and $\tau_d$ denotes the return time at station $s_d$.

Definition 2. Cluster. We define a group of nearby stations with similar weights as a cluster $C$. The weights are computed using the bike demand in a given time interval $[t_1, t_2]$ and the geographic distance $d$.

Definition 3. Bike usage. We define the sum of the absolute number of bikes rented $U_i^-(t)$ and bikes returned $U_i^+(t)$ at station $S_i$ during a given time $[t, t+\Delta]$ as the bike usage $U_i(t)$ at station $S_i$.

Definition 4. Rental demand and return demand. We define the number of bikes rented $U_i^-(t)$ at station $S_i$ in a time period $t$ as the rental demand at station $S_i$. The number of bikes returned $U_i^+(t)$ at station $S_i$ in a time period $t$ is defined as the return demand at station $S_i$.

Problem Definition. Cluster-level bike demand prediction. Given a set of history riding records $RT = (rt_1, rt_2, ..., rt_H)$, the problem of cluster-level demand prediction is to predict the bike rental and return demand for each cluster during a future period. The period in our work is set to 1 h. In the real world, bikes are usually rescheduled at intervals longer than 1 h [18], so hourly prediction meets the requirements of real-world applications and is more fine-grained.

4. Features Analysis and Information Fusion

The raw data contain much redundant information, and hence simply using raw data usually leads to complex computation and noise. Hence, we comprehensively examined the potential factors for bike demand prediction. We conducted empirical feature analyses from various aspects using the bike trip records of all New York bike-sharing stations and meteorological data from 2014/04/01 to 2019/11/30. These two types of data can be accessed publicly, and we crawled them from the web pages of the New York Citi Bike System [2] and New York Weather [28].

4.1. Temporal Information

4.1.1. Daily Temporal Analysis

Fig. 1 shows the bike usage at different hours in one day and on different days in a week. The time period is from 2019/04/01 to 2019/11/30. It can be observed that the bike demands on working days and rest days have different peak and off-peak periods. Note that the rest days include weekends and holidays. The peak demand of working days occurs in two time periods, namely, the morning (7:00 to 9:00) and evening rush hours (17:00 to 19:00). In contrast, the peak demand of rest days is concentrated in the 10:00-20:00 period. The usage pattern represents the behaviors of people using the bike-sharing service. The above observation indicates that working days and rest days have different usage patterns. It also shows that the daily peak distribution differs between the two usage patterns, but is stable within working days and within rest days.

Fig. 1: Hourly bike demand analysis on working days and rest days. (a) Working days; (b) Rest days

4.1.2. Seasons

To study the impact of seasonal features on bike demand, we crawled the bike trip records of all Citi Bike stations from 2018/1/1 to 2019/11/30. We define the start and end dates of the spring, summer, autumn, and winter seasons according to the equinox and solstice [29], as shown in Table 1.

Table 1: Dates of four seasons

Season    Start and end dates
Spring    03/22-06/20
Summer    06/21-09/22
Fall      09/23-12/20
Winter    12/21-03/21

We observe that the changes in the average bike demand per hour are almost the same as those on the working days in Section 4.1.1, which illustrates the necessity to learn the features on working days and rest days separately. In particular, the bike demand on rest days cannot be ignored. As shown in Fig. 2, the summer (green line) has the highest average bike demand, while the winter (peach line) has the lowest average demand, which is probably caused by the weather conditions. The same hours in different seasons can have different demands. For example, the daytime demand in winter is about half of that in spring. Based on the analysis, we consider the seasonal features as an influencing factor.

Fig. 2: Hourly bike demand in four seasons (spring, summer, fall and winter)

4.2. Meteorological Information

Several previous studies analyzed the effect of weather on bike riding [6][9]. The results indicate that bike demand can vary under different meteorological conditions. However, existing studies of weather features are insufficient. In this section, we give a comprehensive analysis of meteorological features, including weather condition, humidity, precipitation intensity, air temperature, and wind speed.

4.2.1. Weather Condition

We quantitatively analyzed the effects of weather conditions on bike demand. Bike usage differs across the hours of a day. Hence, we used the morning and evening rush hours to study the impact of different weather conditions on bike demand, and the results are shown in Fig. 3. We classified weather conditions into four categories according to the impact on bike demand: $U_1$ = {clear, partly-cloudy}, $U_2$ = {cloudy}, $U_3$ = {fog, wind, rain} and $U_4$ = {sleet, snow}.
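The four weather categories above can be sketched as a simple lookup followed by a one-hot vector. This is a minimal illustration, not the authors' code: the dictionary name `WEATHER_CATEGORY`, the function `weather_one_hot`, and the assumption that raw condition labels arrive as lowercase strings such as "partly-cloudy" are all hypothetical.

```python
# Hypothetical lookup from a raw weather-condition label to the four
# categories U1..U4 listed in the text. Label spellings are assumptions.
WEATHER_CATEGORY = {
    "clear": "U1", "partly-cloudy": "U1",
    "cloudy": "U2",
    "fog": "U3", "wind": "U3", "rain": "U3",
    "sleet": "U4", "snow": "U4",
}
CATEGORIES = ["U1", "U2", "U3", "U4"]


def weather_one_hot(condition):
    """Return a 4-element 0/1 list marking the category of `condition`."""
    key = condition.strip().lower()
    if key not in WEATHER_CATEGORY:
        raise ValueError("unknown weather condition: " + condition)
    category = WEATHER_CATEGORY[key]
    return [1 if c == category else 0 for c in CATEGORIES]


if __name__ == "__main__":
    for label in ("clear", "rain", "snow"):
        print(label, weather_one_hot(label))
```

Raising an error on an unknown label, rather than silently mapping it to a default category, is a design choice of this sketch; a real pipeline might prefer a fallback category.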
We used the four weather categories as meteorological features to understand the bike demand.

Fig. 3: The average hourly bike demand during rush hours in different weather conditions

4.2.2. Humidity and Precipitation Intensity

Because the effects of humidity and precipitation intensity on bike demand are similar, we discuss these factors together. We find that as humidity increases, the demand to rent bikes decreases. Fig. 4(a) shows that a higher humidity is related to a lower hourly bike usage. The reason may be that under high humidity conditions, people easily feel uncomfortable, which deteriorates the riding experience. Moreover, the low bike demand at extremely low humidity (less than 0.25%) is likely caused by cold temperature, as extremely low humidity often occurs in a cold environment. Precipitation intensity represents the degree of precipitation on rainy or snowy days. For example, a precipitation intensity larger than 0.3 in/h indicates heavy rain. It can be found from Fig. 4(b) that as the precipitation intensity increases, the bike demand decreases. When the precipitation intensity is greater than 0.3 in/h, the bike demand drops to almost 0.

Fig. 4: The impact of humidity and precipitation intensity on bike demand in rush hours. (a) Humidity; (b) Precipitation Intensity

4.2.3. Temperature and Wind Speed

We examined the effect of temperature and wind speed on bike rental demand in rush hours. Fig. 5(a) shows that the bike rental demand per hour increases as the temperature increases. For wind speed, we observe that as the wind speed increases, the bike rental demand first increases and then decreases, as shown in Fig. 5(b). Hence, in this paper, temperature and wind speed are also taken as meteorological features that affect the bike demand.

Fig. 5: The impact of temperature and wind speed on bike demand in rush hours. (a) Temperature; (b) Wind Speed

5. Collaborative Demand Prediction in 5G IoT Environments

5.1. Similarity Computation

The demand in a cluster is more regular than that in an individual station, and the prediction results of clusters will be more stable. To produce a more accurate bike demand for the stations in a cluster, our objective is to find the relationship between stations. The records of renting and returning bikes at each station at any time are contained in historical trip records, so we utilize the recorded bike demand to generate the similarity between stations. As shown in Fig. 2, based on our observation, the bike demands of all stations decreased dramatically from 00:00 to 5:00 on working days and from 00:00 to 6:00 on rest days. In these two periods, the difference in bike demand between stations cannot be reflected, and so we remove the data of these two periods. We divided the remaining periods into four groups for working days and into three groups for rest days, as shown in Table 2. The groups are denoted as $TG_i$ ($i = 1, \ldots, 7$). The usage pattern is stable within a time group. For example, in the morning rush hours, the bike usage first increases and then decreases. The bike demand in a time group reflects the correlation and similarity among stations.

Table 2: Time groups on working days and rest days

Day type        Time group                        Duration
Working days    morning rush hours ($TG_1$)       6:00-9:00
                daytime ($TG_2$)                  10:00-15:00
                evening rush hours ($TG_3$)       16:00-20:00
                nighttime ($TG_4$)                21:00-24:00
Rest days       early morning hours ($TG_5$)      7:00-10:00
                daytime ($TG_6$)                  11:00-20:00
                nighttime ($TG_7$)                21:00-24:00

We represent station $s_i$ by the bike demand of different time groups. Specifically, for station $s_i$, we construct the bike riding records vector of the $t_q$-th day, which is defined as

$$ f_{t_q}(s_i) = \begin{cases} [U_i(TG_1), U_i(TG_2), U_i(TG_3), U_i(TG_4)], & t_q = 1 \\ [U_i(TG_5), U_i(TG_6), U_i(TG_7)], & t_q = 0 \end{cases} \tag{1} $$

where $t_q = 1$ if the $t_q$-th day is a working day and $t_q = 0$ if the $t_q$-th day is a rest day. $U_i(TG_1)$ to $U_i(TG_7)$ denote the number of bike riding records in the different time groups (TG) of the day $t_q$ of station $s_i$. The complete bike-riding records vector of station $s_i$ is

$$ \mathbf{f}(s_i) = [f_{t_1}(s_i), f_{t_2}(s_i), \ldots, f_{t_K}(s_i)] \tag{2} $$

We compute the Pearson correlation coefficient [30] to determine the similarity between station $s_i$ and station $s_j$, and this coefficient is denoted as $sim(s_i, s_j)$.

$$ sim(s_i, s_j) = \frac{cov(\mathbf{f}(s_i), \mathbf{f}(s_j))}{\sigma_{\mathbf{f}(s_i)} \, \sigma_{\mathbf{f}(s_j)}} \tag{3} $$

where $cov(\mathbf{f}(s_i), \mathbf{f}(s_j))$ is the covariance of $\mathbf{f}(s_i)$ and $\mathbf{f}(s_j)$, and $\sigma_{\mathbf{f}(s_i)}$ is the standard deviation of $\mathbf{f}(s_i)$. Finally, we normalize $sim(s_i, s_j)$ to [0,1] to measure the bike demand similarity between station $s_i$ and station $s_j$.

5.2. Location-aware Stations Clustering

5.2.1. Adjacency Graph Construction

Our task was to dynamically cluster stations into groups, where each cluster contains stations that have similar bike demand and that are separated by small geographical distances. We constructed an adjacency graph $G = (V, E)$, where the set of nodes $V = \{s_1, s_2, ..., s_n\}$ represents all $n$ stations, and $E$ is the set of edges connecting pairs of stations. The stations in a cluster should be similar in terms of both geographic location and bike demand. So, for constructing the graph, we only needed to investigate the neighboring nodes of each station and did not need to consider stations located far away. For each node, we select the $k$ nodes with the top $k$ edge weights by the KNN approach to construct the adjacency graph $G$. We computed the edge weight using both bike usage similarity and geographic distance as follows.
W(si , s j ) = sim(si , s j ) + log( τ ) dist(si , s j ) (4) where dist(si , s j ) is the geographic distance between station si and station s j and τ is the neighborhood distance threshold. The weight was computed as a combination of sim(si , s j ) and distance value. The distance value function rewards distances smaller than the threshold τ and penalizes distances larger than τ. The adjacency graph was divided into a group of sub-clusters through graph partitioning. The subcluster pairs that will merge into clusters should have Paper Title (The title should be descriptive, not full sentence) 7 high relative inter-connectivity and relative closeness. The relative inter-connectivity and relative closeness between two clusters are denoted as RI(Ci , C j ) and RC(Ci , C j ), respectively. Relative inter-connectivity represents the absolute inter-connectivity between Ci and C j that is normalized by the inter-connectivity between the two clusters. RI(Ci , C j ) = |EC(Ci , C j )| (5) |ECCi |+|ECC j | 2 where EC(Ci , C j ) is the absolute inter-connectivity between two clusters and represents the sum of the weights of the edges connecting the nodes in cluster Ci and the nodes in cluster C j . ECCi is the internal interconnectivity of a cluster Ci and is computed as the sum of the weights of the truncated edges that partition Ci roughly into halves. Relative closeness represents the absolute closeness between Ci and C j normalized by the internal closeness of the two clusters. + (6) |C j | |Ci |+|C j | S̄ ECC j na lP re where |Ci | is the number of nodes in cluster Ci . S̄ EC(Ci ,C j ) is the absolute closeness of the two clusters and represents the average weight of the edges connecting the nodes in cluster Ci and the nodes in cluster C j . S̄ ECCi is the internal closeness of cluster Ci and is computed as the average weight of the truncated edges that partition Ci into halves. ro of S̄ EC(Ci ,C j ) |Ci | |Ci |+|C j | S̄ ECCi 1. Construct a graph Gk . 
For a node si , if the weight value from s j to si is one of the top k maximum values among the weight values from all nodes to si , we add a weighted edge between s j and si (line 1). For each node, we use the KNN approach to find all top k weighted neighbor nodes that are connected with weighted edges to construct a graph Gk . As the number of neighbor nodes k is far smaller than the total number of nodes, Gk is a sparse graph. 2. Partition the graph into sub-clusters. We use pway partitioning algorithm [31] to partition the adjacency graph into M sub-clusters. M represents the number of sub-clusters (line 2). 3. Merge into clusters. The sub-clusters are composed into a set of sub-cluster pairs, and the subcluster pairs are iteratively selected from the set of sub-cluster pairs. If all stations in a sub-cluster pair satisfy the distance constraint, we calculate the value function between sub-cluster pairs and select the sub-cluster pair with the highest value function to merge (line 5 to line 11). The merging continues until only m sub-clusters are left. The clustering result is m clusters. -p RC(Ci , C j ) = Fig. 6: A toy example of LHC algorithm Jo ur 5.2.2. Dynamic Clustering Algorithm To produce clusters with similar bike demand and small geographic distance, the constructed graph is partitioned into sub-clusters. We dynamically select sub-cluster pairs and merge them into a new subcluster such that both the relative inter-connectivity and relative closeness between the sub-cluster pairs are high. The sub-clusters remaining after merging are the final clusters. We propose a location-aware hierarchal clustering (LHC) algorithm for stations, to find sub-cluster pairs with high relative inter-connectivity and relative closeness. The edge weights in subcluster pairs are computed by Eq.4, and the value for merging is computed as follows. value(Ci , C j ) = RI(Ci , C j ) × RC(Ci , C j )α (7) where α is a constant that weighs the importance of RI and RC. 
If $\alpha > 1$, the relative closeness plays a more important role, and if $\alpha < 1$, the relative inter-connectivity is dominant. We conducted preliminary experiments, and the results show that the clustering result is not sensitive to $\alpha$; $\alpha$ is set to 2 in all experiments. The proposed LHC method has three steps, and a toy example is shown in Fig. 6. The nodes represent stations, and nodes of the same color belong to the same sub-cluster. Each of the generated $m$ clusters has two properties, namely high bike demand similarity and close location. These two properties are crucial for demand prediction. The pseudocode of the LHC algorithm is shown in Algorithm 1.

Algorithm 1: The LHC Algorithm
Input: $S = \{S_i\}_{i=0}^{n}$, similarity set $\{simi(S_i, S_j)\}_{i,j=1...n}$, the number of sub-clusters $M$, parameter $m$;
Output: $m$ clusters: $C_1, C_2, ..., C_m$;
1  Construct stations set $\{S_i\}_{i=0}^{n}$ to graph $G_k$ by KNN approach based on $\{simi(S_i, S_j)\}_{i,j=1...n}$ and locations;
2  Partition $M$ sub-clusters $C_1, C_2, ..., C_m$ by k-way partitioning algorithm;
3  Initialize $k = 0$;
4  $k = M - n$, $ln = M$;
5  for $i = 1 : k$ do
6      if $ln < m$ then
7          return $m$ clusters $C_1, C_2, ..., C_m$;
8      Generate $C_n^2$ sub-cluster pairs by merging;
9      for $j = 1 : C_n^2$ do
10         if sub-cluster pairs satisfy distance constraint then
11             Find a sub-cluster pair with highest value by value function to merge;
12     $ln = ln - 1$

5.3. Demand Prediction with XGBoost Regression
The demand prediction process involves two phases, namely, feature representation and XGBoost regression.

5.3.1. Feature Representation
The extracted features are of two types, i.e., temporal and meteorological features. More specifically, temporal features contain isworkday, isrestday, hours, months and seasons. Meteorological features contain weather conditions, temperature, wind speed, humidity and precipitation intensity. All the features are extracted from the data of weather reports and bike trip history records [2][27]. Because different features have different original representations, we propose two ways of representing the features.
1. One-hot encoding. Use 1 and 0 to represent the presence and absence of one feature. We divide the weather conditions in the meteorological data into four weather categories, which are U1 = {clear, partly-cloudy}, U2 = {cloudy}, U3 = {fog, wind, rain} and U4 = {sleet, snow}. We set a new feature daytype to indicate the day of a week, representing Monday to Sunday. We apply one-hot encoding to isworkday, isrestday, daytype, weather categories, hours, months, and seasons.
2. Numerical encoding. The values of the features are used directly. We applied numerical encoding to temperature, wind speed, humidity, and precipitation intensity.
After feature extraction, we utilized all features by combining the temporal and meteorological features into a matrix, where the columns represent the individual features and the rows represent combination vectors of all types of features.

5.3.2. XGBoost-based Regression Model
We employed XGBoost as the regression model to realize demand prediction in the bike-sharing service. XGBoost is a tree boosting model [10] based on an ensemble of decision trees; it is widely used in regression tasks and provides competitive performance. In our work, we consider a data set with $n$ examples and $m$ features.

$$ D = \{(x_i, y_i)\} \quad (|D| = n,\ x_i \in \mathbb{R}^m,\ y_i \in \mathbb{R}) \tag{8} $$

A tree ensemble model with $K$ additive functions is used to predict the bike demand, and $\hat{y}_i$ denotes the predicted result.

$$ \hat{y}_i = \phi(x_i) = \sum_{k=1}^{K} f_k(x_i), \quad f_k \in \mathcal{F} \tag{9} $$

where $\mathcal{F}$ is the space of regression trees, defined as

$$ \mathcal{F} = \{ f(x) = w_{q(x)} \} \quad (q : \mathbb{R}^m \to T,\ w \in \mathbb{R}^T) \tag{10} $$

where $q$ represents the structure of each tree with $T$ leaves. Each $f_k$ corresponds to an independent tree structure $q$ and leaf weights $w$. For a given sample, XGBoost follows the decision rules in the trees to assign the sample to leaf nodes according to its features, and sums up the scores of those leaf nodes to compute the final prediction result. The following regularized objective is minimized to learn the set of functions used in the model.

$$ L(\phi) = \sum_{i} l(\hat{y}_i, y_i) + \sum_{k} \Omega(f_k) \tag{11} $$

where $l$ is a differentiable convex loss function that measures the difference between the prediction $\hat{y}_i$ and the real value $y_i$. To minimize (11), in each iteration, the function $f_t$ that most improves the model is greedily added. Formally, $\hat{y}_i^{(t)}$ represents the prediction of the $i$-th instance at the $t$-th iteration, and the objective function to be minimized in the $t$-th iteration is

$$ L^{(t)} = \sum_{i=1}^{n} l\big(y_i, \hat{y}_i^{(t-1)} + f_t(x_i)\big) + \Omega(f_t) \tag{12} $$

To accelerate the optimization of (12), we introduce Taylor's second-order approximation, and (12) can be transformed into a new objective function.

$$ L^{(t)} \simeq \sum_{i=1}^{n} \Big[ l\big(y_i, \hat{y}_i^{(t-1)}\big) + g_i f_t(x_i) + \tfrac{1}{2} h_i f_t^2(x_i) \Big] + \Omega(f_t) \tag{13} $$

where $g_i = \partial_{\hat{y}_i^{(t-1)}} l(y_i, \hat{y}_i^{(t-1)})$ and $h_i = \partial^2_{\hat{y}_i^{(t-1)}} l(y_i, \hat{y}_i^{(t-1)})$ are the first- and second-order partial gradients of the loss function. Finally, the XGBoost model minimizes the objective function (13) and outputs the bike demand prediction results.

6. Experiments and Evaluations
6.1. Datasets
We performed experiments on two real-world bike-sharing service datasets obtained from the database of New York City. We collected bike-sharing riding records and meteorological data from 1st April to 30th September in 2014, and from 1st November in 2018 to 30th November in 2019. The details of the two datasets are presented in Table 3.

Table 3: Details of the datasets
Data source: New York City
Item | First dataset | Second dataset
Time period | 1st April to 30th September in 2014 | 1st November in 2018 to 30th November in 2019
Bike-sharing data: #Stations | 331 | 706
Bike-sharing data: #Bikes | 6,800 | 19,800
Bike-sharing data: #Riding records | 5,359,995 | 23,752,004
Meteorological data: temporal frequency | hourly | hourly

1. Citi Bike data. Bike-sharing trip records are publicly available from the platform of the Citi Bike System of New York City [2]. We crawled 5.35 million records from 1st April to 30th September in 2014 to form the first dataset, and 23.75 million records from 1st November in 2018 to 30th November in 2019 to form the second dataset. The record format is (trip duration, start time, stop time, start station ID, end station ID). For the first dataset, we used all stations that appeared in the bike-sharing riding records during the corresponding period. For the second dataset, new stations that were built within the corresponding period were included along with the already existing stations. As training data, the data for the periods from 1st April to 10th September in 2014, and from 1st November in 2018 to 31st July in 2019 were used. The remaining data from 11th to 30th September in 2014, and from 1st August to 30th November in 2019 were used as test data.
2. Meteorological data. We collected the hourly meteorological data of New York City from April 2014 to November 2019. Hence, the time period of the meteorological data covers both bike-sharing datasets. The format of the meteorological data is (timestamp, weather condition, temperature, wind speed, humidity, precipitation intensity). A small proportion of the meteorological data was missing, and each missing value was filled in with the data of the previous hour.

6.2. Compared Methods and Metrics
Our proposed method of predicting the hourly rental and return demand is named LHC-XGBoost, as it involves the LHC algorithm and XGBoost regression. To better evaluate our method, we compare it with the following well-known baselines.
1. HA. HA is short for Historical Average, which predicts the rental and return number by using the average number of historical rental and return demand in each time period [8][19]. For example, for 10:00 am to 11:00 am on work days, the time periods are all the historical times in the training set from 10:00 am to 11:00 am on all work days.
2. ARMA. ARMA considers the rental and return demands as time series and predicts the demand in time periods. The time periods are set identical to those in HA [8][32].
3. ARIMA. The ARIMA model is designed to handle non-stationary data and considers the bike riding data as a time series [30]. The difference between ARIMA and ARMA is that the raw data in ARIMA are processed by the first-order difference and transformed into more stable data.
4. HP-KNN (BC). HP is short for hierarchical prediction and BC is short for bi-partite clustering. HP-KNN (BC) uses BC clustering to complete station clustering, predicts the entire bike-sharing demand of the city and allocates the entire trip records to each cluster based on the capacity proportion of each cluster [8]. The proportion is learned by KNN.
5. HP-MSI (BC). MSI is short for Multi-similarity-based Inference. HP-MSI (BC) also uses a BC method to complete station clustering, predicts the entire bike-sharing demand of the city and allocates the entire demand to stations based on the capacity proportion of each cluster. The proportion is learned by MSI [8].
6. MFR-ARMA. MFR is short for Multiple Factor Regression, which uses a weighted k-means clustering method to complete station clustering and predicts the rental and return demands with a multi-factor regression model with ARMA [32].
7. GBRT. GBRT is short for Gradient Boosted Regression Tree, which uses the negative gradient of the loss function at the current model value as an approximation of the residuals to fit a regression tree. The rental and return demands are predicted individually by GBRT [30].
8. RF. RF is short for Random Forest, which is a powerful tool for multivariate classification and regression problems. An RF model is composed of multiple random trees, and the average value of the outputs of the random trees is used as the predicted value [33].
Although neural network-based methods can also be used in prediction tasks, those methods usually require substantial computational resources [31]. Furthermore, neural network-based methods are not as representative as the typical prediction methods for time series data. In contrast, ARMA and ARIMA are widely used prediction methods for time series data. Thus, neural network-based methods are not competitive enough for the studied task of this paper, and hence they were not selected as methods for comparison.
Metrics. To evaluate the performance of different methods, we employed Root Mean Squared Logarithmic Error (RMSLE) and Error Rate (ER) as the metrics, because they have been widely used in bike-sharing service evaluation [8][33].

$$ RMSLE = \sqrt{ \frac{1}{T} \sum_{t=1}^{T} \frac{1}{n} \sum_{i=1}^{n} \big( \log(\bar{Y}_{C_i,t} + 1) - \log(Y_{C_i,t} + 1) \big)^2 } \tag{14} $$

$$ ER = \frac{1}{T} \sum_{t=1}^{T} \frac{ \sum_{i=1}^{n} |\bar{Y}_{C_i,t} - Y_{C_i,t}| }{ \sum_{i=1}^{n} Y_{C_i,t} } \tag{15} $$

where $Y_{C_i,t}$ denotes the ground truth of the rental and return demand of cluster $C_i$ during time $t$, and $\bar{Y}_{C_i,t}$ is the predicted value.

6.3. Evaluation of Clustering Analysis
Based on our observation, a larger number of clusters is likely to lead to a lower accuracy of demand prediction. If the number of clusters is equal to the number of stations, each cluster is an individual station. Then, it is difficult to generate accurate predictions because the bike demand at a single station fluctuates randomly. If there is only one cluster, all bike stations are contained in that cluster, and the demand prediction accuracy for the entire city can be high, but city-wide demand prediction results are not useful for individual stations. Thus, the selection of an appropriate number of clusters that can capture the demand correlation between stations and meet the distance constraints requires knowledge and experience. A previous study [8] proposed a BC method to cluster stations. Clustering was performed on the first bike-sharing dataset (1st April to 30th September in 2014). As shown in Table 3, the first dataset contains 331 stations, 6,800 bikes and 5,359,995 bike-sharing riding records. From the clustering results, the following observations can be made.
1. To compare the clustering results obtained by our LHC algorithm and BC, following the setting in [8], we set the same number of clusters $m$ as 23. The clustering results are shown in Fig. 7, where Fig. 7(a) shows the clustering results of the LHC algorithm and Fig. 7(b) shows the clustering results of the BC method. It can be seen that our dynamic clustering algorithm LHC generates more intra-connected clusters, with more similar demand and tighter station locations. In some clusters that are generated by the BC method, the stations are more scattered; a typical example is the stations colored sky-blue in Fig. 7(b).
2. As our LHC method can dynamically generate clusters, it is not necessary to set a fixed number of clusters. If the stop criterion of the LHC algorithm (Algorithm 1 in Section 5.2.2) is changed to stop when no sub-cluster pairs can be merged, a more reasonable number of clusters is obtained automatically. By running the LHC algorithm in this way, we obtained 16 clusters, which are shown in Fig. 7(c). With the generated 16 clusters, our method yields the best prediction performance. The RMSLE is 0.303 and ER is 0.234 for rental demand prediction, and the RMSLE is 0.289 and ER is 0.227 for return demand prediction.

Fig. 7: Clustering comparison. (a) LHC algorithm with 23 clusters; (b) BC algorithm with 23 clusters; (c) LHC algorithm with 16 clusters. The same colors denote the stations that are contained in the same cluster.
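The modified stop criterion mentioned in item 2 above (stop when no sub-cluster pair can be merged) can be sketched as a greedy loop. This is an illustrative sketch, not the authors' implementation: the distance constraint is assumed here to mean that every cross pair of stations lies within a hypothetical `max_dist`, and `value_fn` is a hypothetical stand-in for the value function of Eq. (7).

```python
from itertools import combinations
from math import dist


def merge_until_stuck(points, clusters, value_fn, max_dist):
    """Greedily merge clusters until no pair satisfies the distance constraint.

    points:   dict station_id -> (x, y) coordinates
    clusters: iterable of iterables of station ids (initial sub-clusters)
    value_fn: hypothetical scoring function value_fn(cluster_a, cluster_b)
    max_dist: assumed constraint, every cross pair must be within this distance
    """
    clusters = [set(c) for c in clusters]
    while True:
        candidates = []
        for a, b in combinations(range(len(clusters)), 2):
            # Distance constraint (assumption): all cross pairs are close enough.
            admissible = all(
                dist(points[p], points[q]) <= max_dist
                for p in clusters[a] for q in clusters[b]
            )
            if admissible:
                candidates.append((value_fn(clusters[a], clusters[b]), a, b))
        if not candidates:
            # No pair can be merged: the number of clusters is decided here.
            return clusters
        _, a, b = max(candidates)
        clusters[a] |= clusters[b]
        del clusters[b]  # b > a, so index a stays valid


# Toy layout: two groups of two stations, far apart from each other.
toy_points = {0: (0.0, 0.0), 1: (0.0, 1.0), 2: (10.0, 0.0), 3: (10.0, 1.0)}


def toy_value(ca, cb):
    # Illustrative value: prefer the pair with the smallest minimum distance.
    return -min(dist(toy_points[p], toy_points[q]) for p in ca for q in cb)


toy_result = merge_until_stuck(toy_points, [[0], [1], [2], [3]], toy_value, 2.0)
```

In the toy layout, the loop ends with two clusters because the two groups are too far apart to satisfy the assumed constraint, so the number of clusters follows from the data rather than from a preset target.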
We also investigated the clustering performance of the LHC algorithm on the second dataset, for which the time period ranges from 1st November in 2018 to 30th November in 2019. The LHC algorithm also yields superior clustering results, which are shown in Fig. 8. Because of the newly constructed stations, more stations are included in the second dataset. From Table 3, we can see that the numbers of stations (706), bikes (19,800) and riding records (23,752,004) in the second dataset are all clearly larger than those in the first dataset. More stations result in more sub-cluster pairs, and the LHC algorithm can choose the most appropriate sub-cluster pairs to merge and produce better clustering results. We clustered 600 stations from 1st November in 2018 to 30th November in 2019 in New York City using the LHC algorithm and obtained 31 clusters, as shown in Fig. 8.

Fig. 8: Clustering results from 1st November in 2018 to 30th November in 2019. (a) The geographic map of clustering; (b) Clustering over New York City map. The same colors denote the stations that are contained in the same cluster.

6.4. Evaluation of Demand Prediction
We compared the demand prediction accuracy of our methods with well-known or state-of-the-art methods for the two datasets. Table 4 shows the average RMSLE and ER over all hours in the test set of the first dataset.
1. We first compared our methods with HA plus geographical grid clustering (GC) and HA plus BC. In the GC algorithm, a city is divided into uniform grids and the stations that are in the same grid form a cluster. For the clustering results, HA (historical average) is employed to yield the demand prediction results. It can be seen that HA, HA (GC) and HA (BC) perform worse than our methods, namely LHC-GBRT, LHC-RF and LHC-XGBoost.
2. We compared our methods with the following widely used or state-of-the-art methods, namely ARMA, HP-KNN (BC), HP-MSI (BC), MFR-ARMA, GBRT and RF. For fairness, we used the same features for the machine learning-based methods, including GBRT and RF, and the parameters were all set according to the settings of the original papers. We carefully fine-tuned each model. From Table 4, it can be seen that our methods (LHC-GBRT, LHC-RF and LHC-XGBoost) yield the highest prediction accuracy on both evaluation measures. For example, compared to HP-MSI (BC) and MFR-ARMA, LHC-XGBoost reduces the RMSLE of rental demand prediction by 0.028 and 0.011, respectively, and reduces ER by 0.024 and 0.019, respectively. Furthermore, compared to HP-MSI (BC) and MFR-ARMA, LHC-XGBoost reduces the RMSLE of return demand prediction by 0.036 and 0.02, respectively, and reduces ER by 0.036 and 0.026, respectively.

Table 5 shows the average RMSLE and ER over all hours in the test set of the second dataset, corresponding to the time period from 1st November, 2018 to 30th November, 2019. The experimental results demonstrate that the three methods (LHC-GBRT, LHC-RF, LHC-XGBoost) with our LHC clustering algorithm all produce superior prediction accuracy. The method LHC-XGBoost produces the best prediction accuracy.

Table 4: Prediction errors comparison of rental and return demands for the first dataset
Model | Rental RMSLE | Rental ER | Return RMSLE | Return ER
HA (GC) | 0.387 | 0.353 | 0.377 | 0.347
HA (BC) | 0.372 | 0.355 | 0.365 | 0.352
HA | 0.367 | 0.354 | 0.356 | 0.352
ARMA | 0.380 | 0.366 | 0.369 | 0.363
ARIMA | 0.359 | 0.350 | 0.351 | 0.348
HP-KNN (BC) | 0.358 | 0.299 | 0.360 | 0.295
HP-MSI (BC) | 0.349 | 0.282 | 0.350 | 0.290
MFR-ARMA | 0.332 | 0.277 | 0.334 | 0.280
LHC-GBRT | 0.328 | 0.268 | 0.319 | 0.261
LHC-RF | 0.329 | 0.267 | 0.318 | 0.258
LHC-XGBoost | 0.321 | 0.258 | 0.314 | 0.254
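For readers who want to reproduce the two error measures reported in these tables, the following is a minimal pure-Python sketch of Eqs. (14) and (15). It assumes one reading of Eq. (14), in which the square root is taken over the full average across time steps and clusters; the function names and the nested-list layout (time steps by clusters) are choices made for this example only.

```python
import math


def rmsle(pred, truth):
    """Eq. (14), assuming the root covers the whole average.

    pred, truth: lists of T rows, each row holding n cluster-level demands.
    """
    T = len(truth)
    n = len(truth[0])
    total = 0.0
    for p_row, y_row in zip(pred, truth):
        total += sum(
            (math.log(p + 1) - math.log(y + 1)) ** 2
            for p, y in zip(p_row, y_row)
        ) / n
    return math.sqrt(total / T)


def error_rate(pred, truth):
    """Eq. (15): per-time-step absolute error over total demand, averaged over T.

    Each time step must have a non-zero total true demand.
    """
    T = len(truth)
    return sum(
        sum(abs(p - y) for p, y in zip(p_row, y_row)) / sum(y_row)
        for p_row, y_row in zip(pred, truth)
    ) / T


# Toy example: two time steps, two clusters (values are illustrative).
toy_truth = [[10, 20], [30, 40]]
toy_pred = [[12, 18], [30, 44]]
toy_er = error_rate(toy_pred, toy_truth)
toy_rmsle = rmsle(toy_pred, toy_truth)
```

The logarithm with the added 1 keeps the measure defined for hours with zero demand, while ER weights each hour by its total demand across clusters.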
Table 5: Prediction errors comparison of rental and return demands for the second dataset
Model | Rental RMSLE | Rental ER | Return RMSLE | Return ER
HA | 0.500 | 0.450 | 0.501 | 0.449
ARMA | 0.515 | 0.527 | 0.510 | 0.529
ARIMA | 0.447 | 0.498 | 0.444 | 0.500
LHC-GBRT | 0.347 | 0.267 | 0.341 | 0.262
LHC-RF | 0.362 | 0.278 | 0.355 | 0.271
LHC-XGBoost | 0.341 | 0.262 | 0.335 | 0.256

6.5. Sensitivity Analysis of Parameters
6.5.1. Discussion of Parameter k
We examined the impact of parameter $k$ on clustering results. The parameter $k$ plays an important role in the construction of the adjacency graph and represents the number of neighboring stations of a target station. The value of $k$ is varied from 6 to 14, and we employed two clustering evaluation indexes, namely the silhouette coefficient score and the Davies-Bouldin index (DBI), to evaluate the clustering result. The silhouette coefficient score is a metric for clustering analysis that jointly evaluates the cohesion and separation of clusters, and its value ranges from -1 to 1. A larger score denotes that clusters are better separated and have higher cohesion, which indicates better clustering [34]. The DBI is defined as the ratio of the intra-cluster scatter to the inter-cluster separation. A small DBI score means that the scatter within clusters is smaller than the separation among clusters, thereby indicating a better clustering result [34].
Fig. 9 shows the values of the two indexes as $k$ changes. It can be seen that too small a value of $k$ ($k = 6$) results in a lower silhouette coefficient score. The reason is that there are very few neighboring stations in the adjacency graph construction, and such a small number of neighboring stations is insufficient for mining the bike demand correlations among stations. A large value of $k$ (e.g., $k$ equal to 12 or 14) also lowers the silhouette coefficient score. A likely reason is that a large $k$ increases the number of neighbor stations in clusters and therefore the complexity of the adjacency graph, which harms the effectiveness of partitioning the graph into sub-clusters. As shown in Fig. 9, a higher silhouette coefficient score is obtained when $k$ is 8 or 10, which indicates a better clustering result. Moreover, the bike demand fluctuates frequently at an individual station, and the other stations in the same cluster can provide more valuable information for the demand prediction accuracy of the target station. In our experiment, $k$ is set to 10 as the default value.

Fig. 9: Cluster evaluation indexes with different k values. (a) Silhouette coefficient score; (b) DBI score.

6.5.2. Discussion of Edge Weight Computation and Parameter τ
In this section, we study the impact of the neighborhood distance threshold $\tau$ on clustering results. As $\tau$ is a parameter in edge weight computation (see (4) in Section 5.2), we also discuss the edge weight computation. The results are shown in Fig. 10. Fig. 10(a) shows the number of station pairs with different Pearson correlation coefficient (PCC) values. PCC is employed to compute the similarity between two stations in bike riding records, and is one of the two parts of the edge weight computation. It can be found that the PCC values of most station pairs lie between 0.6 and 0.9. The other part of the edge weight computation is the distance between two stations, and distances exceeding $\tau$ are penalized (Section 5.2). We study the clustering performance under different values of the neighborhood distance threshold $\tau$ from 0.5 km to 2 km, evaluated by the Davies-Bouldin index (DBI); a small DBI value means a better clustering result. Fig. 10(b) shows the DBI results, and it can be found that when $\tau$ is 1.5 km, DBI reaches the smallest value. Therefore, $\tau$ is set to 1.5 km as the default value.

Fig. 10: Sensitivity analysis of neighborhood distance threshold τ. (a) The number of station pairs with different PCC (horizontal axis: Pearson correlation coefficient; vertical axis: the number of station pairs); (b) DBI over different τ (km).

7. Conclusion and Future Work
In this paper, we propose a holistic framework for demand prediction for bike-sharing service. Our framework contains collaborative computing, information fusion and demand prediction, where we have novel contributions. We mine the relations between stations via comprehensive feature analysis, and the fused information includes bike riding records, temporal features, meteorological features and geographical locations in 5G IoT. We propose a new similarity computation method that supports the further proposed dynamic clustering algorithm. XGBoost is employed together with our clustering algorithm as the regression model that produces the final demand prediction. The experiments are performed on two large real-world datasets, and the results demonstrate that our clustering algorithm is effective, the studied features are indeed useful and the prediction model yields superior demand prediction performance.
In the future, we intend to mine more features from other types of data, especially social data, such as city events and activities. We plan to study whether such social data have an impact on the QoS and demand of bike-sharing service.

Acknowledgment
This paper is supported by National Natural Science Foundation of China (No. 61902236) and Fundamental Research Funds for the Central Universities (No. JB210311).

References
[1] The Meddin Bike-sharing World Map, Bike-sharing world map, https://bikesharingworldmap.com, 2021 (accessed 18 April 2021).
[2] Citi Bike System, Citi bike system data, https://www.citibikenyc.com/system-data, 2021 (accessed 21 April 2021).
[3] F. Chiariotti, C. Pielli, A. Zanella, M. Zorzi, A bike-sharing optimization framework combining dynamic rebalancing and user incentives, ACM Transactions on Autonomous and Adaptive Systems 14 (11) (2020) 1–30.
[4] C. Contardo, C. Morency, L.-M. Rousseau, Balancing a dynamic public bike-sharing system, Technical report (2012) 1–20.
[5] J. Schuijbroek, R. C. Hampshire, W. van Hoeve, Inventory rebalancing and vehicle routing in bike sharing systems, European Journal of Operational Research 257 (2017) 992–1004.
[6] J. Liu, L. Sun, W. Chen, H. Xiong, Rebalancing bike sharing systems: A multi-source data smart optimization, in: Proceedings of the 22nd ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2016, pp. 1005–1014.
[7] Y. Liang, S. Ke, J. Zhang, X. Yi, Y. Zheng, Geoman: Multi-level attention networks for geo-sensory time series prediction, in: Proceedings of International Joint Conference on Artificial Intelligence (IJCAI), 2018, pp. 3428–3434.
[8] Y. Li, Y. Zheng, H. Zhang, L. Chen, Traffic prediction in a bike-sharing system, in: Proceedings of the 23rd ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, 2015, pp. 1–10.
[9] K. Gebhart, R. B. Noland, The impact of weather conditions on bikeshare trips in Washington, DC, Transportation 41 (6) (2014) 1205–1225.
[10] T. Chen, C. Guestrin, XGBoost: A scalable tree boosting system, in: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2016, pp. 785–794.
[11] I. Frade, A. Ribeiro, Bicycle sharing systems demand, Procedia of Social and Behavioral Sciences 111 (2014) 518–527.
[12] A. Faghih-Imani, R. Hampshire, L. Marla, N. Eluru, An empirical analysis of bike sharing usage and rebalancing: Evidence from Barcelona and Seville, Transportation Research Part A: Policy and Practice 97 (2017) 177–191.
[13] J. Bao, T. He, S. Ruan, Y. Li, Y. Zheng, Planning bike lanes based on sharing-bikes' trajectories, in: Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2017, pp. 1377–1386.
[14] J.-R. Lin, T.-H. Yang, Strategic design of public bicycle sharing systems with service level constraints, Transportation Research Part E: Logistics and Transportation Review 47 (2) (2011) 284–294.
[15] J. Zhang, X. Pan, M. Li, P. S. Yu, Bicycle-sharing systems expansion: station re-deployment through crowd planning, in: Proceedings of the 24th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, 2016, pp. 1–10.
[16] M. H. Beek, S. Gnesi, D. Latella, M. Massink, Towards automatic decision support for bike-sharing system design, in: Proceedings of International Workshop on Software Engineering and Formal Methods, 2015, pp. 266–280.
[17] P. Hulot, D. Aloise, S. D. Jena, Towards station-level demand prediction for effective rebalancing in bike-sharing systems, in: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), 2018, pp. 378–386.
[18] A. Kaltenbrunner, R. Meza, J. Grivolla, J. Codina, R. E. Banchs, Urban cycles and mobility patterns: Exploring and predicting trends in a bicycle-based public transport system, Pervasive Mobile Computing 6 (4) (2010) 455–466.
[19] D. Tomaras, I. Boutsis, V. Kalogeraki, Modeling and predicting bike demand in large city situations, in: Proceedings of IEEE International Conference on Pervasive Computing and Communications (PerCom), 2018, pp. 1–10.
[20] R. A. Rixey, Station-level forecasting of bikesharing ridership: Station network effects in three U.S. systems, Transportation Research Record 2387 (1) (2013) 46–55.
[21] Y. Li, Y. Zheng, Citywide bike usage prediction in a bike-sharing system, IEEE Transactions on Knowledge and Data Engineering 32 (6) (2020) 1079–1091.
[22] L. Chen, D. Zhang, L. Wang, D. Yang, X. Ma, S. Li, Z. Wu, G. Pan, T. M. T. Nguyen, J. Jakubowicz, Dynamic cluster-based over-demand prediction in bike sharing systems, in: Proceedings of the 2016 ACM International Joint Conference on Pervasive and Ubiquitous Computing (UbiComp), 2016, pp. 841–852.
[23] X. Zhou, Understanding spatiotemporal patterns of biking behavior by analyzing massive bike sharing data in Chicago, PLoS ONE 10 (10) (2015) 1731–1750.
[24] J. Bao, C. Xu, P. Liu, W. Wang, Exploring bikesharing travel patterns and trip purposes using smart card data and online point of interests, Networks and Spatial Economics 17 (2017) 1231–1253.
[25] Y. Li, Y. Zheng, Q. Yang, Dynamic bike reposition: A spatio-temporal reinforcement learning approach, in: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), 2018, pp. 1724–1733.
[26] F. Chiariotti, C. Pielli, A. Zanella, M. Zorzi, A dynamic approach to rebalancing bike-sharing systems, Sensors 18 (2) (2018) 512–533.
[27] L. Lin, Z. He, S. Peeta, X. Wen, Predicting station-level hourly demands in a large-scale bike-sharing network: A graph convolutional neural network approach, Transportation Research Part C: Emerging Technologies 97 (2017) 258–276.
[28] Dark Sky by Apple, Dark sky api, https://darksky.net/dev, 2021 (accessed 16 April 2021).
[29] Time and Date, Solstices & Equinoxes for New York, https://www.timeanddate.com/calendar/seasons.html, 2021 (accessed 21 April 2021).
[30] D. Chai, L. Wang, Q. Yang, Bike flow prediction with multi-graph convolutional networks, in: Proceedings of the 26th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, 2018, pp. 397–400.
[31] Y. Akhremtsev, P. Sanders, C. Schulz, High-quality shared-memory graph partitioning, IEEE Transactions on Parallel Distributed Systems 31 (11) (2020) 2710–2722.
[32] Z. Zheng, Y. Zhou, L. Sun, A multiple factor bike usage prediction model in bike-sharing system, in: Proceedings of International Conference on Green, Pervasive, and Cloud Computing, 2018, pp. 390–405.
[33] J. Liu, L. Sun, Q. Li, J. Ming, Y. Liu, H. Xiong, Functional zone based hierarchical demand prediction for bike system expansion, in: Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), ACM, 2017, pp. 957–966.
[34] D. Xu, Y. Tian, A comprehensive survey of clustering algorithms, Annals of Data Science 92 (2015) 165–193.

Declaration of interests
☒ The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
☒ The authors declare the following financial interests/personal relationships which may be considered as potential competing interests: None.