Determining Delivery Demand Area Distribution Using Effective Regions of Movement Clustering

Elmer R. Magsino (Corresponding Author)
Physics and Engineering Department, University of the Fraser Valley, British Columbia, Canada
elmer.magsino@ufv.ca

Gerald P. Arada
Department of Electronics and Computer Engineering, De La Salle University, Manila, Philippines
gerald.arada@dlsu.edu.ph

Catherine Manuela L. Ramos
Department of Manufacturing Engineering and Management, De La Salle University, Manila, Philippines

Abstract: As more transport service providers traverse public roads to provide food delivery, parcel delivery, and ridesharing services, there is a need to analyze their delivery/service points to maximize provider profitability while minimizing harmful environmental effects. In this study, we utilize an urban empirical mobility dataset to extract the Global Positioning System (GPS) locations where most delivery and service transactions happened. In particular, we only utilize two-wheeled vehicular positions, since two-wheeled vehicles offer more services than four-wheeled vehicles. The urban map is uniformly partitioned into grids, which are categorized by vehicular capacity to locate highly demanded points. We then combine closely related grid positions into their corresponding effective regions of movement (ERMs) according to a preset vehicular capacity threshold. We also compute the closeness centrality measure of these highly demanded locations and find that points within each ERM have short distances between them. Given these findings, ERMs are spatially separated, which makes the demand area distribution easy to locate.

Index Terms: Mobility Dataset, Spatiotemporal Characteristics, Effective Regions of Movement, Closeness Centrality

I. INTRODUCTION

With the proliferation of autonomous and intelligent vehicles equipped with location-based sensors, monitoring their movements on urban roads and highways has become accessible because of the availability of huge volumes of vehicular mobility traces. These datasets contain, at a minimum, vehicle trajectory information such as origin, destination, route taken, timestamp, and speed. These vehicular records provide opportunities to understand the behaviors of any transportation mode [1]. Analyzing the dynamics of vehicles on urban roads translates to various applications such as smart mobility, information exchange, hotspot determination, and travel comfort and convenience provision.

In [2], journey stops were classified from truck GPS mobility traces. A stop is identified as a stationary sequence of truck GPS coordinates or a slow-moving trajectory. The stops were categorized as either task stops (detours from the planned travel itinerary) or rest stops. In [3], GPS tracking data were employed to validate the traversed route of origin-destination pairs using travel parameters such as time reliability, route characteristics, and road density, while the work in [4] quantified the influence of main attraction distances and passenger interactions on the stop decisions of urban tourists. The authors in [5] implemented a two-layer clustering technique on Chile taxis to detect the city's general and local travel patterns. In [6], bicycle GPS traces were investigated to identify gender-related behaviors in choosing a travel route. A clustering technique based on grid density and stay points was proposed in [7] to extract and estimate dynamic hotspot areas. That work focused on reducing the effects of noisy data while improving clustering accuracy. Instead of focusing on famous landmarks, the work in [8] detected small-scale local hotspots to be used as pickup and drop-off points, to further understand taxi demand and avoid traffic congestion in a local setup.
The authors in [9] extracted hotspots in an urban setup based on a combination of vehicle and bicycle coordinates. Their detected hotspots are generally the destination of one transport mode and the origin of the other.

In this work, we utilize a two-wheeled vehicle dataset to determine the establishments most frequented by these vehicles at a given time of day. The dataset only contains the origin and destination positions and the corresponding speed. In order to represent the spatiotemporally varying urban map from GPS locations and timestamps, we consolidate the spatiotemporal maps into one map that generally characterizes the urban location. From this spatiotemporal stable map, we then identify delivery locations based on a vehicular capacity threshold, and their corresponding demand area distribution by using the effective regions of movement (ERMs) clustering technique. We also evaluate the distance proximity between these delivery positions.

The major contributions of this work are summarized below.

1) We present a single-day mobility trace dataset comprised of the GPS locations and speeds of two-wheeled delivery and ridesharing vehicles. Given this dataset, we identify highly demanded service points of a dynamic urban map by calculating its spatiotemporal stable network characteristics.
2) We employ the Effective Regions of Movement (ERM) as a new means to group nearby traces and discriminate them from other mobility groups. We then calculate the closeness centrality measure to determine whether the network is distributed or not.

The paper is outlined as follows. We describe the experimental setup and the concepts used to determine the highly demanded points and areas of distribution in Section II. We then present our extensive experimental results and discuss our findings in Section III. Finally, we conclude our study in Section IV.

II. EXPERIMENTAL SETUP

We discuss the mobility dataset and its statistics in this section. We also tackle how we estimate the vehicular density in the origin and destination locations.

A. Grab Mobility Dataset

The Grab mobility traces, related to the works in [10], [11], form a dataset that contains the speed, origin, and destination locations of motorcycles and cars servicing Philippine urban places. Motorcycles are two-wheeled vehicles employed in the food and service businesses, while cars are four-wheeled transports utilized in transporting passengers and ridesharing. The origin and destination pairs are given both as OpenStreetMap node labels and as latitude and longitude coordinates. No vehicle identifiers are included, to preserve privacy, and the timestamp is presented by combining all mobility traces happening in every minute of each hour of the day, from 00:00 (12 midnight) to 23:59 (11:59 PM). Fig. 1 shows the area of collection, which is approximately equal to 3400 km$^2$. It is comprised of the National Capital Region (where the capital city of the Philippines is located) and the nearby provinces surrounding the capital region.

In summary, the mobility trace dataset is characterized as follows.

1) The dataset is a single-day collection of motorcycle and car GPS locations taken on December 16, 2019, which was a pre-pandemic period.
2) Each row presents the origin and destination pair of either a motorcycle or a car. The number of rows determines the quantity of sampled vehicular locations, each having at least five vehicles.
3) The speed, in meter/sec, is the average speed of the corresponding vehicles in the area at the said time.
4) The origin-destination (OD) pairs at different sampling times are independent of each other, since there is no identification number to relate which pairs belong to the same trajectory. Thus, we can consider each OD pair as an incomplete trip.
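To make the row structure summarized above concrete, the following is a minimal sketch of how one dataset row might be represented and how the per-minute rows could be grouped by hour. It is not the authors' code. The class name `ODRecord`, all field names, the function `motorcycle_rows_by_hour`, and the sample values are hypothetical and are not taken from the dataset.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ODRecord:
    """One dataset row; every field name here is a hypothetical choice."""
    minute_of_day: int   # 0 (00:00) to 1439 (23:59)
    vehicle_type: str    # "motorcycle" or "car"
    origin_lat: float
    origin_lon: float
    dest_lat: float
    dest_lon: float
    speed_mps: float     # average speed in meter/sec


def motorcycle_rows_by_hour(records):
    """Keep only motorcycle rows and group them by hour of day (0 to 23)."""
    by_hour = defaultdict(list)
    for rec in records:
        if rec.vehicle_type == "motorcycle":
            by_hour[rec.minute_of_day // 60].append(rec)
    return dict(by_hour)


# Illustrative values only; they are not rows from the actual dataset.
rows = [
    ODRecord(5, "motorcycle", 14.60, 121.00, 14.61, 121.01, 6.2),
    ODRecord(65, "car", 14.55, 121.02, 14.56, 121.03, 8.0),
    ODRecord(70, "motorcycle", 14.58, 120.98, 14.59, 120.99, 4.5),
]
grouped = motorcycle_rows_by_hour(rows)
print(sorted(grouped))  # prints [0, 1]
```

Because the rows carry no vehicle identifier, the sketch stores each OD pair as a standalone record rather than as part of a trajectory.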
In this study, having no explicit relationship between succeeding OD pairs does not affect the evaluation of delivery demand, since locations with high distribution can be checked by visually inspecting the urban map. This process is tedious, but it does not limit determining where delivery locations occur most often, or where food and service orders take place.

Fig. 1. The geographical area where the spatiotemporal parameters of motorcycles and cars are sampled. The blue line from the lower left to the upper right highlights the GPS boundaries of the mobility traces.

B. Dataset Statistics

From the single-day actual mobility dataset, we only extract the average hourly speed and vehicular volume of two-wheeled vehicles, as shown in Fig. 2. The average hourly speed is determined by taking the harmonic mean of the vehicular speeds to include the effect of slow-moving vehicles. The average speed is noticeably slow even during non-peak hours. This is attributed to the fact that the dataset was taken nine days before Christmas Day 2019. In the Philippines, roads are very busy during this period because people are out for reunions, parties, and gift buying. In addition, there is an inverse relationship between speed and motorcycle volume: as more motorcycles, as well as other types of vehicles, are present, vehicular movement becomes slower.

Fig. 2. Grab motorcycle dataset statistics on December 16, 2019. This is a pre-pandemic period and nine days before Christmas Day, a very important occasion in the Philippines.

C. Locating Delivery Area Distribution from OD Vehicular Capacity

The area in Fig. 1 is initially partitioned into $N \times N = 1{,}960{,}000$ grids, where $N = 1400$. Each grid covers an area of approximately 0.0017 km$^2$. The value of $N$ can be increased to allow more discrimination between spatial locations, particularly if minor roads are to be studied. However, this comes at the cost of more computation and memory storage. We model the urban grid's hourly vehicular capacity as

\[ C_{H_t} = \sum_{m=0}^{59} C_m, \quad t = 0, 1, \ldots, 23, \]

where $C_m$ is given in (1),

\[ C_m = \begin{bmatrix} VC_{1,1} & \cdots & VC_{1,q} & \cdots & VC_{1,N} \\ \vdots & \ddots & \vdots & & \vdots \\ VC_{p,1} & \cdots & VC_{p,q} & \cdots & VC_{p,N} \\ \vdots & & \vdots & \ddots & \vdots \\ VC_{N,1} & \cdots & VC_{N,q} & \cdots & VC_{N,N} \end{bmatrix} \tag{1} \]

and $VC_{p,q}$ is the number of OD pairs in grid $g_{p,q}$.

We normalize each $C_{H_t}$ $(t = 0, 1, \ldots, 23)$ by the maximum vehicular capacity determined from all $C_{H_t}$, as shown in (2).

\[ \hat{C}_{H_t} = \frac{C_{H_t}}{\max C_{H_t}}, \quad t = 0, 1, \ldots, 23 \tag{2} \]

To simplify $\hat{C}_{H_t}$, a vehicular capacity threshold value, $0 < \tau_{VC} \le 1$, is employed to remove grids of low capacity and focus only on high-capacity grids. Thus, this procedure emphasizes the grids that contain more OD pairs.

From these $\hat{C}_{H_t}$, we calculate the spatiotemporal stable vehicular capacity characteristic, which provides a single snapshot to represent the dynamic urban map. The spatiotemporal stable network characteristic, $\zeta_{p,q,STS}$, is calculated from the $\hat{C}_{H_t}$ by following (3),

\[ \zeta_{p,q,STS} = \sum_{i=0}^{23} \alpha(iT_S)\,\omega(iT_S), \tag{3} \]

where

\[ \alpha(iT_S) = \frac{\zeta(iT_S) - \min \zeta(iT_S)}{\max \zeta(iT_S) - \min \zeta(iT_S)}, \qquad \omega(iT_S) = \frac{\zeta(iT_S)}{\max \zeta(i = 0, \ldots, I\,T_S)}. \]

At sampling time $t = iT_S$, $\alpha(iT_S)$ is the feature-scaled value, while $\omega(iT_S)$ is the weight, both derived from the $\hat{C}_{H_t}$. This is visually interpreted in Fig. 3.

Fig. 3. An illustrative example to compute the spatiotemporal stable vehicular capacity of the map under study [12].

To determine the delivery area distribution, we then form the Effective Regions of Movement (ERMs) [12]. This is done by grouping adjacent grids that satisfy the vehicular capacity threshold. The clustering [13] of spatiotemporal map grids to form an ERM, $ERM_e$, is governed by (4a) below. In this work, we only combine edge-adjacent grid maps.

\[ ERM_e \equiv g_{p,q} \cup g_{p+\Delta p,\, q+\Delta q} \tag{4a} \]

subject to

\[ \left| \{c_{g_{p,q}}\} \cup \{c_{g_{p+\Delta p,\, q+\Delta q}}\} \right| \le \tau_c \tag{4b} \]

\[ \min\left(\rho_{g_{p,q}},\, \rho_{g_{p+\Delta p,\, q+\Delta q}}\right) \ge \rho_0, \tag{4c} \]

where $\Delta p, \Delta q \in \{-1, 0, 1\}$. $c_{g_{p,q}}$ is the average vehicular capacity in $g_{p,q}$, while $\tau_c$ is the vehicular capacity threshold of each formed $ERM_e$. Constraint (4b) allows the merging of grids $g_{p,q}$ and $g_{p+\Delta p,\, q+\Delta q}$ when the merged grids have a vehicular capacity that does not exceed the vehicular capacity threshold, $\tau_c$, of each formed $ERM_e$. $\rho_{g_{p,q}}$ and $\rho_{g_{p+\Delta p,\, q+\Delta q}}$ in Constraint (4c) are the outbound and inbound vehicular flows, i.e., the flows leaving from and coming to $g_{p,q}$, respectively. If this condition is met, the two map grids are merged; otherwise, they are dropped.

III. RESULTS AND DISCUSSION

In this section, we present the results of our extensive simulations. We determine the delivery demand area distribution and its epicenter based on a varying threshold vehicular density level, $\tau_c$. The values used are $\tau_c = 0$, 0.5, 0.75, and 0.9.

Fig. 4 shows the GPS coordinates of two-wheeled delivery or service points for the various vehicular density levels. It is noticeable that as $\tau_c$ is increased, the delivery/service points are found on main/major roads (depicted by the thick orange lines). From $\tau_c = 0.5$ to $\tau_c = 0.75$, there is an approximately 82% decrease in delivery/service points (from 946 to 168). This is attributed to the fact that two-wheeled riders tend to gather at establishments (food or business) with high foot traffic. Table I shows the exact number of delivery/service points extracted from the mobility dataset for each given vehicular capacity threshold value.

TABLE I
NUMBER OF DELIVERY/SERVICE POINTS AS THE THRESHOLD VALUE IS CHANGED.

| Vehicular Capacity Threshold, $\tau_c$ | Delivery/Service Points |
|---|---|
| 0.0 | 136,964 |
| 0.5 | 946 |
| 0.75 | 168 |
| 0.9 | 67 |

Fig. 5 illustrates the ERMs formed from these delivery/service points. These ERMs are superimposed on the maps found in Fig. 4. We exclude $\tau_c = 0.0$ because each coordinate would then be its own ERM. Here are some observations on Fig. 5.
1) The large ERMs (represented by the yellow and green areas) are treated as essentially non-usable ERMs because very few coordinates are found in these regions. These ERMs have the lowest priority levels.
2) As the vehicular capacity threshold is increased, the number of ERMs decreases drastically. This is because fewer regions contain delivery/service points, so each formed ERM extends to the nearest urban area with delivery/service points.
3) ERMs tend to concentrate on the most populous locations, as depicted by the larger number of GPS coordinates. The two-wheeled service provider can use these ERMs to concentrate its service crew in these locations.

We note that the points in Fig. 4 can be connected to create a complex network. In reality, a two-wheeled service provider tends to travel to nearby places in search of additional orders, i.e., to increase take-home pay. To better understand whether the next points within an ERM or in other ERMs are viable options to travel to for additional deliveries/services, we calculate the average closeness centrality [14]. The closeness centrality states how close one point is to the rest of the points in the network. If a network is close, a service provider can easily go to the other locations to pick up an order or provide a service. The higher the closeness centrality value, the more central, or nearer, a delivery/service point is to the other delivery/service points; a low value indicates the opposite. It is calculated from the sum of the shortest distances from a delivery/service node to every other delivery/service node in the network.

We assume that all points are connected to each other to form the network to be analyzed. Realistically, this connection allows travel from one point to another, regardless of the road taken and the traffic conditions.
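The closeness centrality described above can be sketched as follows. This is a minimal illustration, not the authors' implementation. It assumes the common normalized form, in which the score of a point is the number of other points divided by the sum of its distances to them, and it assumes straight-line Euclidean distances on planar coordinates in a fully connected network. The function name `closeness_centrality` and the sample points are hypothetical.

```python
import math


def closeness_centrality(points):
    """Return one closeness score per point.

    Assumed form: (n - 1) / (sum of Euclidean distances to all other points).
    Every point is treated as directly connected to every other point.
    """
    n = len(points)
    if n < 2:
        raise ValueError("at least two points are needed")
    scores = []
    for j, pj in enumerate(points):
        total = sum(math.dist(pj, pi) for i, pi in enumerate(points) if i != j)
        # total is zero only if all points coincide; that case is not handled.
        scores.append((n - 1) / total)
    return scores


# Three illustrative points on a line; the middle one is the most central.
sample = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
scores = closeness_centrality(sample)
print([round(s, 3) for s in scores])  # prints [0.667, 1.0, 0.667]
print(round(sum(scores) / len(scores), 3), min(scores), max(scores))
```

The last line reports the average, minimum, and maximum scores, which are the three summary statistics used when comparing thresholds.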
In the worst-case scenario, a two-wheeled service provider can go from one point to the farthest point if its current position is not suitable for making a profit. Of course, the service provider needs to determine whether this is worthwhile.

The closeness centrality of a delivery/service point $E_j$ to the other nodes of the network, $C(E_j)$, is given in (5), where $d(E_j, E_i)$ is the separation between two delivery/service points $E_j$ and $E_i$. We note that we use the shortest distance between these two points, i.e., not necessarily the road network distance.

\[ C(E_j) = \frac{N - 1}{\sum_{i \ne j} d(E_j, E_i)} \tag{5} \]

where $d(\cdot)$ refers to the Euclidean distance and $N$ here denotes the number of delivery/service points in the network.

Table II shows the average, minimum, and maximum closeness centrality values when the vehicular capacity threshold is varied. As $\tau_c$ is increased from 0.5 to 0.75, the average closeness centrality rises from 0.1181 to 0.1871, i.e., the network becomes tighter, allowing two-wheeled providers to go to other points; at $\tau_c = 0.9$ it dips slightly to 0.1692. However, these values are not promising since they all remain below 0.5. Practically, the distances between two delivery/service points are still far on average. We also observe from the maximum values that no single delivery point is highly central and in close proximity to the majority of delivery locations in the network. Effectively, the formed network is distributed.

TABLE II
CLOSENESS CENTRALITY VALUES BASED ON THE GPS COORDINATES GIVEN IN FIG. 4.

| Vehicular Capacity Threshold, $\tau_c$ | 0.5 | 0.75 | 0.90 |
|---|---|---|---|
| Average Value | 0.1181 | 0.1871 | 0.1692 |
| Minimum Value | 0.0362 | 0.0620 | 0.0560 |
| Maximum Value | 0.1572 | 0.2499 | 0.2197 |

IV. CONCLUSION

In this work, we have utilized an empirical mobility trace dataset to analyze the most frequently visited GPS locations and then determine the demand area distribution in an urban setup. Given the dynamic spatiotemporal characteristics of the dataset, we presented the average vehicular capacity behavior by computing its equivalent spatiotemporal stable network characteristics, effectively pinpointing the most visited establishment GPS locations.
From these places, the demand area distribution was obtained by employing the effective regions of movement clustering. Our findings have shown that as the vehicular capacity threshold was increased, the ERMs increased in size because fewer locations are frequently visited. This distributed behavior is supported by the closeness centrality measure calculated for all points of the interconnected network. In future research, we can explore applying the findings here to the deployment of electric vehicle charging stations [15], trajectory anomaly detection [16], dynamic parking pricing [17], and resource offloading and utilization in vehicular networks [18].

Fig. 4. The locations where most two-wheeled deliveries or services happen, given vehicular capacity thresholds of (a) $\tau_c = 0$, (b) $\tau_c = 0.5$, (c) $\tau_c = 0.75$, and (d) $\tau_c = 0.9$.

Fig. 5. The superimposed ERMs and delivery/service point locations for two-wheeled deliveries or services, given vehicular capacity thresholds of (a) $\tau_c = 0.5$, (b) $\tau_c = 0.75$, and (c) $\tau_c = 0.9$.

V. ACKNOWLEDGMENT

This work was supported by Mitsubishi Motors, Philippines.

REFERENCES

[1] C. Celes, A. Boukerche, and A. A. Loureiro, “Mobility trace analysis for intelligent vehicular networks: Methods, models, and applications,” ACM Computing Surveys (CSUR), vol. 54, no. 3, pp. 1–38, 2021.
[2] B. Xu, R. Gupta, B. Hashisho, R. Köhn, and S. van de Hoef, “Extracting journeys from truck GPS traces,” in Proceedings of the 15th ACM SIGSPATIAL International Workshop on Computational Transportation Science, 2022, pp. 1–10.
[3] L. Montero and X. Ros-Roca, “Using GPS tracking data to validate route choice in OD trips within dense urban networks,” Transportation Research Procedia, vol. 47, pp. 593–600, 2020.
[4] N. D’Angelo, A. Abbruzzo, M. Ferrante, G. Adelfio, and M. Chiodi, “GPS data on tourists: a spatial analysis on road networks,” AStA Advances in Statistical Analysis, pp. 1–23, 2023.
[5] C. Heredia, S. Moreno, and W. F. Yushimito, “Characterization of mobility patterns with a hierarchical clustering of origin-destination GPS taxi data,” IEEE Transactions on Intelligent Transportation Systems, vol. 23, no. 8, pp. 12700–12710, 2021.
[6] F. Rupi, M. Freo, C. Poliziani, M. N. Postorino, and J. Schweizer, “Analysis of gender-specific bicycle route choices using revealed preference surveys based on GPS traces,” Transport Policy, vol. 133, pp. 1–14, 2023.
[7] X. Wang, Z. Zhang, and Y. Luo, “Clustering methods based on stay points and grid density for hotspot detection,” ISPRS International Journal of Geo-Information, vol. 11, no. 3, p. 190, 2022.
[8] X.-J. Chen, Y. Wang, J. Xie, X. Zhu, and J. Shan, “Urban hotspots detection of taxi stops with local maximum density,” Computers, Environment and Urban Systems, vol. 89, p. 101661, 2021.
[9] A. Keler, J. M. Krisp, and L. Ding, “Extracting commuter-specific destination hotspots from trip destination data–comparing the Boro Taxi service with Citi Bike in NYC,” Geo-spatial Information Science, vol. 23, no. 2, pp. 141–152, 2020.
[10] E. R. Magsino, G. P. Arada, and C. M. L. Ramos, “Investigating data dissemination in urban cities by employing empirical mobility traces,” in 2020 IEEE 12th International Conference on Humanoid, Nanotechnology, Information Technology, Communication and Control, Environment, and Management (HNICEM). IEEE, 2020, pp. 1–5.
[11] X. Huang, Y. Yin, S. Lim, G. Wang, B. Hu, J. Varadarajan, S. Zheng, A. Bulusu, and R. Zimmermann, “Grab-Posisi: An extensive real-life GPS trajectory dataset in Southeast Asia,” in Proceedings of the 3rd ACM SIGSPATIAL International Workshop on Prediction of Human Mobility, 2019, pp. 1–10.
[12] E. R. Magsino and I. W.-H. Ho, “An enhanced information sharing roadside unit allocation scheme for vehicular networks,” IEEE Transactions on Intelligent Transportation Systems, vol. 23, no. 9, pp. 15462–15475, 2022.
[13] Q. Zhao, Y. Shi, Q. Liu, and P. Fränti, “A grid-growing clustering algorithm for geo-spatial data,” Pattern Recognition Letters, vol. 53, pp. 77–84, 2015.
[14] L. Geng and K. Zhang, “Correlation of road network structure and urban mobility intensity: An exploratory study using geo-tagged tweets,” ISPRS International Journal of Geo-Information, vol. 12, no. 1, p. 7, 2022.
[15] C. M. S. Tan and E. R. Magsino, “Demand-based deployment of electric vehicle charging stations employing empirical mobility dataset,” in Pervasive Computing and Social Networking: Proceedings of ICPCSN 2022. Springer, 2022, pp. 285–294.
[16] D. Smolyak, K. Gray, S. Badirli, and G. Mohler, “Coupled IGMM-GANs with applications to anomaly detection in human mobility data,” ACM Transactions on Spatial Algorithms and Systems (TSAS), vol. 6, no. 4, pp. 1–14, 2020.
[17] E. R. Magsino, G. P. Arada, and C. M. L. Ramos, “An evaluation of temporal- and spatial-based dynamic parking pricing for commercial establishments,” IEEE Access, vol. 10, pp. 102724–102736, 2022.
[18] O. Akyıldız, F. Y. Okay, İ. Kök, and S. Özdemir, “Road to efficiency: Mobility-driven joint task offloading and resource utilization protocol for connected vehicle networks,” Future Generation Computer Systems, vol. 156, pp. 157–167, 2024.