Delivery Demand Area Distribution Using Movement Clustering

Determining Delivery Demand Area Distribution
Using Effective Regions of Movement Clustering
Elmer R. Magsino
Gerald P. Arada
Physics and Engineering Department
University of the Fraser Valley
British Columbia, Canada
Corresponding Author
Department of Electronics and Computer Engineering
De La Salle University
Manila, Philippines
Catherine Manuela L. Ramos
Department of Manufacturing Engineering and Management
De La Salle University
Manila, Philippines
Abstract—As more transport service providers traverse public
roads to provide food and parcel delivery and ridesharing
services, there is a need to analyze these delivery/service points
to maximize provider profitability while minimizing harmful
environmental effects. In this study, we utilize an urban empirical
mobility dataset to extract the Global Positioning System (GPS)
locations where most delivery and service transactions happened.
In particular, we only utilize two-wheeled vehicular positions
since these vehicles offer more services than four-wheeled
vehicles. The urban map is uniformly partitioned into grids, each
categorized by its vehicular capacity, to locate highly demanded
points. We then combine closely related grid positions into their
corresponding effective regions of movement (ERMs) according to a
preset vehicular capacity threshold. We also compute the closeness
centrality measure of these highly demanded locations and find
that points within each ERM have short distances between them.
Given these findings, the ERMs are spatially separated, which
makes the demand area distribution easy to locate.
Index Terms—Mobility Dataset, Spatiotemporal Characteristics, Effective Regions of Movement, Closeness Centrality
With the proliferation of autonomous and intelligent vehicles
equipped with location-based sensors, monitoring their movements
on urban roads and highways has become accessible because of the
availability of huge volumes of vehicular mobility traces. These
datasets contain at least the vehicle trajectories, i.e., origin,
destination, and route taken, along with the timestamp and speed.
These vehicular records provide opportunities to understand the
behaviors of any transportation mode [1]. Analyzing the dynamics
of vehicles on urban roads translates to various applications such
as smart mobility, information exchange, hotspot determination,
and the provision of travel comfort and convenience.
In [2], journey stops were classified from truck GPS mobility
traces. A stop is identified as a stationary sequence of truck GPS
coordinates or a slow-moving trajectory. The stops were
categorized as either task stops (detours from the supposed travel
itinerary) or rest stops. In [3], GPS tracking data were employed
to validate a traversed route between origin-destination pairs
based on travel parameters like time reliability, route
characteristics, and road density, while the work in [4]
quantified the influence of main attraction distances and
passenger interactions on stop decisions among urban tourists. The
work in [5] implemented a two-layer clustering technique on Chile
taxis to detect the city’s general and local travel patterns. In
[6], bicycle GPS traces were investigated to identify
gender-related behaviors in choosing a travel route.
A clustering technique based on grid density and stay points
was proposed in [7] to extract and estimate dynamic hotspot
areas. The work focused on reducing the effects of noisy data
while improving clustering accuracy. Instead of focusing on
famous landmarks, the work in [8] detected small-scale local
hotspots to be used as pickup and drop-off points to further
understand taxi demands and avoid traffic congestion in a local
setup. The authors in [9] extracted hotspots in an urban setup
based on a combination of vehicle and bicycle coordinates.
Their detected hotspots are generally the destination of one
transport mode and the origin of the other transport mode.
In this work, we utilize a two-wheeled vehicle dataset to
determine the most frequented establishments at a given time of
day. The dataset only contains the origin and destination
positions and their corresponding speed. To represent the
spatiotemporally varying urban map from GPS locations and
timestamps, we consolidate the spatiotemporal maps into one map
that generally characterizes the urban location. From this
spatiotemporal stable map, we then identify delivery locations
based on a vehicular capacity threshold and their corresponding
demand area distribution by using the effective regions of
movement (ERMs) clustering technique. We also evaluate the
distance proximity between these delivery positions. The major
contributions of this work are summarized as follows:
1) We present a single-day mobility trace dataset comprised
of the GPS locations and speeds of two-wheeled delivery and
ridesharing vehicles. Given this dataset, we identify
highly demanded service points of a dynamic urban
map by calculating its spatiotemporal stable network
characteristic.
2) We employ the Effective Regions of Movement (ERM)
as a new means to group nearby traces and discriminate
them from other mobility groups. We then calculate the
closeness centrality measure to determine if the network
is distributed or not.
The paper outline is described as follows. We describe the
experimental setup and the concepts to determine the highly
demanded points and areas of distributions in Section II. We
then present our extensive experimental results in Section III
and discuss our findings. Finally, we conclude our study in
Section IV.
We discuss the mobility dataset and its statistics in this
section. We also discuss how we estimate the vehicular density
at the origin and destination locations.
A. Grab Mobility Dataset
The Grab mobility traces, related to the works in [10], [11],
form a dataset that contains the speed, origin, and destination
locations of motorcycles and cars servicing Philippine urban
places. Motorcycles are two-wheeled vehicles employed in the food
and service businesses, while cars are four-wheeled transports
utilized in transporting passengers and ridesharing. The origin
and destination pairs are given as both OpenStreetMap node labels
and latitude and longitude coordinates. There is no vehicle
identification, to preserve privacy, and the timestamp is
presented by combining all mobility traces happening in every
minute of each hour of the day, from 00:00 (12:00 midnight) to
23:59 (11:59 PM).
In summary, the mobility trace dataset is characterized as
follows:
1) The dataset is a single-day collection of motorcycle and
car GPS locations taken on December 16, 2019, which
was a pre-pandemic period.
2) Each row presents the origin and destination pair of either
a motorcycle or a car. The number of rows determines
the quantity of sampled vehicular locations having at
least five vehicles.
3) The speed, in meters per second, is the average speed of the
corresponding vehicles in the area at the said time.
4) The origin-destination (OD) pairs at different sampling
times are independent of each other since there is no
identification number to relate which pairs belong to the
same trajectory. Thus, we can consider each OD pair as an
incomplete trip.
In this study, having no explicit relationship between
succeeding OD pairs does not affect the evaluation of delivery
demand, since locations with a high distribution of OD pairs can
be checked by visually inspecting the urban map. This process is
tedious, but it is not a limitation in determining where
deliveries happen most often or where food and service orders
take place.
Fig. 1 shows the area of collection, which is approximately
equal to 3400 km². It is comprised of the National Capital
Region (where the capital city of the Philippines is located)
and the nearby provinces surrounding the capital region.
Fig. 1. The geographical area where the spatiotemporal parameters of
motorcycles and cars are sampled. The blue line from lower left to the upper
right highlights the GPS boundaries of the mobility traces.
B. Dataset Statistics
From the single-day actual mobility dataset, we only extract
the average hourly speed and vehicular volume of two-wheeled
vehicles, as shown in Fig. 2. The average hourly speed is
determined by taking the harmonic mean of the vehicular speeds,
which gives more weight to slow-moving vehicles. There is a
noticeably slow average speed even during non-peak hours.
This is attributed to the fact that the dataset was taken nine
days before Christmas Day 2019. In the Philippines, roads
are very busy during this period because people are out for
reunions, parties, and gift buying. Moreover, there is a clear
inverse relationship between speed and motorcycle volume:
as more motorcycles, as well as other types of vehicles, are
present, the slower the vehicular movement.
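The hourly harmonic-mean speed described above can be sketched as follows. This is a minimal illustration, not the authors' code: the record layout `(hour, speed_mps)`, the function name `hourly_harmonic_speed`, and the sample values are assumptions made for the example. Non-positive speeds are skipped, which is also an assumption.

```python
from collections import defaultdict
from statistics import harmonic_mean, mean


def hourly_harmonic_speed(records):
    """Return {hour: harmonic-mean speed} from (hour, speed_mps) records.

    Hypothetical helper: the record layout is assumed, not taken
    from the dataset description.
    """
    by_hour = defaultdict(list)
    for hour, speed in records:
        if speed > 0:  # assumption: ignore zero or negative readings
            by_hour[hour].append(speed)
    return {hour: harmonic_mean(values) for hour, values in by_hour.items()}


# Made-up sample: three readings at 08:00 and two at 09:00.
records = [(8, 2.0), (8, 4.0), (8, 12.0), (9, 5.0), (9, 5.0)]
hourly = hourly_harmonic_speed(records)

# The slow 2 m/s reading pulls the harmonic mean well below the
# arithmetic mean of the same three speeds.
arithmetic_8 = mean([2.0, 4.0, 12.0])
print(hourly[8], arithmetic_8)
```

For the 08:00 readings, the harmonic mean is 3/(1/2 + 1/4 + 1/12) = 3.6 m/s against an arithmetic mean of 6 m/s, which shows how slow vehicles dominate the hourly figure.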
C. Locating Delivery Area Distribution from OD Vehicular
The area in Fig. 1 is initially partitioned into N × N = 1,960,000
grids, where N = 1400. Each grid covers an area equal to
0.0017 km². The value of N can be increased to allow more
discrimination between spatial locations, particularly if minor
roads are to be studied. However, this comes at the expense of
more computation and memory storage.
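The uniform partitioning can be sketched as below: each GPS point is mapped to a grid index (p, q), and the points falling in each grid are counted. This is only an illustration under stated assumptions. The bounding box values, the function names, and the tiny 4 × 4 grid are hypothetical; the paper uses N = 1400 over the area in Fig. 1.

```python
import numpy as np


def grid_index(lat, lon, bounds, n):
    """Map coordinates to (p, q) indices of an n x n uniform grid."""
    lat_min, lat_max, lon_min, lon_max = bounds
    p = np.floor((np.asarray(lat) - lat_min) / (lat_max - lat_min) * n)
    q = np.floor((np.asarray(lon) - lon_min) / (lon_max - lon_min) * n)
    # Points lying exactly on the upper boundary go to the last grid.
    p = np.clip(p.astype(int), 0, n - 1)
    q = np.clip(q.astype(int), 0, n - 1)
    return p, q


def capacity_matrix(lat, lon, bounds, n):
    """Count the points that fall in each grid."""
    p, q = grid_index(lat, lon, bounds, n)
    vc = np.zeros((n, n), dtype=int)
    np.add.at(vc, (p, q), 1)  # unbuffered add handles repeated indices
    return vc


# Hypothetical bounding box (lat_min, lat_max, lon_min, lon_max).
bounds = (14.0, 15.0, 120.5, 121.5)
lat = [14.1, 14.1, 14.9]
lon = [120.6, 120.6, 121.4]
vc = capacity_matrix(lat, lon, bounds, n=4)

# Grid area implied by the figures in the text: 3400 km^2 over 1400^2 grids.
cell_area_km2 = 3400 / 1400**2
print(vc)
print(round(cell_area_km2, 4))
```

The last two lines reproduce the stated grid area: 3400 km² divided by 1,960,000 grids is roughly 0.0017 km² per grid.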
We model the urban grid’s hourly vehicular capacity as
$C_{H_t} = C_m$ ($t = 0, 1, \ldots, 23$), where $C_m$ is given in (1).

$$
C_m = \begin{bmatrix}
VC_{1,1} & \cdots & VC_{1,q} & \cdots & VC_{1,N} \\
\vdots   &        & \vdots   &        & \vdots   \\
VC_{p,1} & \cdots & VC_{p,q} & \cdots & VC_{p,N} \\
\vdots   &        & \vdots   &        & \vdots   \\
VC_{N,1} & \cdots & VC_{N,q} & \cdots & VC_{N,N}
\end{bmatrix} \tag{1}
$$

where $VC_{p,q}$ is the number of OD pairs in grid $g_{p,q}$.
Fig. 2. Grab motorcycle dataset statistics on December 16, 2019. This is a
pre-pandemic period and nine days before Christmas Day, a very important
occasion in the Philippines.
We normalize each $C_{H_t}$ ($t = 0, 1, \ldots, 23$) based on the
maximum vehicular capacity determined from all $C_{H_t}$
($t = 0, 1, \ldots, 23$), as shown in (2).

$$
\hat{C}_{H_t}\,(t = 0, 1, \ldots, 23) =
\frac{C_{H_t}\,(t = 0, 1, \ldots, 23)}{\max C_{H_t}\,(t = 0, 1, 2, \ldots, 23)} \tag{2}
$$

To simplify $\hat{C}_{H_t}$ ($t = 0, 1, \ldots, 23$), a vehicular
capacity threshold value, $0 < \tau_{VC} \le 1$, is employed to
remove grids of low capacity and focus only on high-capacity
grids. Thus, this procedure emphasizes the grids that contain
more OD pairs.
From these $\hat{C}_{H_t}$ ($t = 0, 1, \ldots, 23$), we calculate
the spatiotemporal stable vehicular capacity characteristic, which
provides a single snapshot to represent the dynamic urban map.
The spatiotemporal stable network characteristic,
$\zeta_{p,q,STS}$, of the $\hat{C}_{H_t}$ ($t = 0, 1, \ldots, 23$)
is calculated by following (3). At sampling time $t = iT_S$,
$\alpha(iT_S)$ is the feature-scaled value, while $\omega(iT_S)$
is the weight derived from the $\hat{C}_{H_t}$
($t = 0, 1, \ldots, 23$). This is visually interpreted in Fig. 3.

\begin{align}
\zeta_{p,q,STS} &= \sum_{i=0}^{I} \alpha(iT_S)\,\omega(iT_S), \tag{3} \\
\alpha(iT_S) &= \frac{\zeta(iT_S) - \min \zeta(iT_S)}{\max \zeta(iT_S) - \min \zeta(iT_S)}, \notag \\
\omega(iT_S) &= \frac{\zeta(iT_S)}{\max_{i = 0, \ldots, I} \zeta(iT_S)} \notag
\end{align}

Fig. 3. An illustrative example to compute the spatiotemporal stable vehicular
capacity of the map under study [12].
To determine the delivery area distribution, we then form
the Effective Regions of Movement (ERMs) [12]. This is done
by grouping adjacent grids that satisfy the vehicular capacity
threshold. The clustering [13] of spatiotemporal map grids
to form an ERM, $\mathrm{ERM}_e$, is governed by (4a) below. In
this work, we only combine edge-adjacent map grids.

\begin{align}
& \mathrm{ERM}_e \equiv g_{p,q} \cup g_{p+\Delta p,\,q+\Delta q} \tag{4a} \\
& \text{subject to} \notag \\
& \left|\{c_{g_{p,q}}\} \cup \{c_{g_{p+\Delta p,\,q+\Delta q}}\}\right| \le \tau_c \tag{4b} \\
& \min\left(\rho_{g_{p,q}},\, \rho_{g_{p+\Delta p,\,q+\Delta q}}\right) \ge \rho_0, \tag{4c}
\end{align}
where $\Delta p, \Delta q \in \{-1, 0, 1\}$, $c_{g_{p,q}}$ is the
average vehicular capacity in $g_{p,q}$, and $\tau_c$ is the
vehicular capacity threshold of each formed $\mathrm{ERM}_e$.
Constraint (4b) allows the merging of grids $g_{p,q}$ and
$g_{p+\Delta p,\,q+\Delta q}$ when the vehicular capacity of the
merged grids does not exceed the vehicular capacity threshold,
$\tau_c$, of each formed $\mathrm{ERM}_e$. $\rho_{g_{p,q}}$ and
$\rho_{g_{p+\Delta p,\,q+\Delta q}}$ in Constraint (4c) are the
outbound and inbound vehicular flows leaving from and coming to
$g_{p,q}$, respectively. If this condition is met, the two map
grids are merged; otherwise, they are dropped.
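The steps of this subsection (normalization, a single stable map, thresholding, and grouping edge-adjacent grids) can be sketched as below. This is a simplified illustration under several assumptions, not the authors' implementation. The minimum and maximum in the stable-capacity step are taken per grid over the 24 hours, which is one possible reading of (3). The stable map is rescaled by its own maximum before thresholding. The flow constraint (4c) is omitted because no flow data are assumed. All function names and sample values are hypothetical.

```python
from collections import deque

import numpy as np


def normalize_hourly(c_h):
    """Divide all hourly capacity maps by the global maximum, as in (2)."""
    c_h = np.asarray(c_h, dtype=float)
    peak = c_h.max()
    return c_h / peak if peak > 0 else c_h


def stable_capacity(c_hat):
    """Per-grid sum over hours of alpha * omega (one reading of (3))."""
    lo = c_hat.min(axis=0)
    hi = c_hat.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)       # avoid division by zero
    alpha = (c_hat - lo) / span                  # feature-scaled value
    omega = c_hat / np.where(hi > 0, hi, 1.0)    # weight
    return (alpha * omega).sum(axis=0)


def label_erms(mask):
    """Group edge-adjacent True grids; returns (label map, ERM count)."""
    labels = np.zeros(mask.shape, dtype=int)     # 0 means "no ERM"
    count = 0
    for start in zip(*np.nonzero(mask)):
        if labels[start]:
            continue
        count += 1
        labels[start] = count
        queue = deque([start])
        while queue:
            p, q = queue.popleft()
            for dp, dq in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                r, c = p + dp, q + dq
                if (0 <= r < mask.shape[0] and 0 <= c < mask.shape[1]
                        and mask[r, c] and not labels[r, c]):
                    labels[r, c] = count
                    queue.append((r, c))
    return labels, count


# Made-up 24-hour capacity maps on a 4 x 4 grid.
hourly = np.zeros((24, 4, 4))
hourly[8:20, 0, 0] = 10   # busy for twelve hours
hourly[8:20, 0, 1] = 8    # edge-adjacent neighbor, also busy
hourly[12, 3, 3] = 9      # isolated grid, busy for one hour only

c_hat = normalize_hourly(hourly)
stable = stable_capacity(c_hat)
mask = stable / stable.max() >= 0.5   # assumed rescaling before thresholding
labels, n_erms = label_erms(mask)
print(n_erms)
print(labels)
```

In this toy input, the two adjacent grids that stay busy for twelve hours end up in a single region, while the grid that is busy for only one hour falls below the threshold and is left unlabeled. Only the four edge neighbors are visited, which mirrors the edge-adjacent merging used in this work.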
In this section, we present the results of our extensive simulations. We determine the delivery demand area distribution
and its epicenter based on a varying threshold vehicular density
level, τc . These values are: τc = 0, 0.5, 0.75 and 0.9.
Fig. 4 shows GPS coordinates of two-wheeled delivery
or service points for various vehicular density levels. It is
noticeable that as τc is increased, the delivery/service points
are found on main/major roads (depicted by the thick orange
lines). From τc = 0.5 to τc = 0.75, it is evident that there is
an 80% decrease in delivery/service points. This is attributed
to the fact that two-wheeled riders tend to gather at
establishments (food or business) with high foot traffic.
Table I shows the exact number of delivery/service points
extracted from the mobility dataset for each given vehicular
capacity threshold value.
Vehicular Capacity Threshold, τc
Fig. 5 illustrates the ERMs formed from these delivery/service
points. These ERMs are superimposed on the maps shown in Fig. 4.
We exclude τc = 0.0 because each coordinate would be its own ERM.
Here are some observations on Fig. 5.
1) The large ERMs (represented by the yellow and green
areas) are treated as basically non-usable ERMs because
there are very few coordinates found in these regions.
These formed ERMs have the lowest priority levels.
2) As the vehicular capacity threshold is increased, the number
of ERMs decreases drastically. This is because fewer regions
contain delivery/service points; thus, the formed ERMs
extend to the nearest urban area with delivery/service
points.
3) ERMs tend to concentrate on the most populous locations as depicted by more GPS coordinates. These
ERMs can be used by the two-wheeled service provider
to concentrate their service crew in these locations.
We note that each point in Fig. 4 can be connected to create
a complex network. In reality, a two-wheeled service provider
tends to travel to nearby places in search of additional orders,
i.e., to increase their take-home pay. To better understand
whether the next points within an ERM or in other ERMs are viable
destinations for obtaining additional deliveries/services, we
calculate the network's average closeness centrality [14].
The closeness centrality states how close one point is to the
rest of the points in the network. If the points of a network
are close to one another, a service provider can easily go to the
other locations to pick up an order or provide a service. The
higher the closeness centrality, the more central, or nearer, a
delivery/service point is to the other delivery/service points;
a low value indicates the opposite. It is calculated from the sum
of the shortest distances from a delivery/service node to every
other delivery/service node in the network.
We assume that all points are connected to each other to
form our network to be analyzed. Realistically, this connection
allows travel from one point to another, regardless of the road
taken and the traffic conditions. In the worst-case scenario,
a two-wheeled service provider can go from one point to the
farthest point if the current position is not suitable for making
a profit. Of course, the service provider needs to weigh whether
such a trip is worthwhile.
The closeness centrality of a delivery/service point to the
other nodes of the network, $C(E_j)$, is given in (5), where
$d(E_j, E_i)$ is the separation between two delivery/service
points $E_j$ and $E_i$, and $N$ is the number of points in the
network. We note that we use the shortest distance between these
two points, i.e., not necessarily the road network distance, with
no other point in between.

$$
C(E_j) = \frac{N - 1}{\sum_{i \neq j} d(E_j, E_i)} \tag{5}
$$

where $d(\cdot)$ refers to the Euclidean distance of $\cdot$.
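A closeness centrality computation on a fully connected set of points can be sketched as follows. This assumes the common normalized form, (N − 1) divided by the sum of Euclidean distances from a point to all other points, and assumes planar coordinates (latitude and longitude would first need to be projected). The function name and sample points are hypothetical.

```python
import numpy as np


def closeness_centrality(points):
    """Closeness of each point: (N - 1) / sum of distances to the others.

    Assumes at least two distinct points in planar coordinates.
    """
    pts = np.asarray(points, dtype=float)
    n = len(pts)
    diff = pts[:, None, :] - pts[None, :, :]      # pairwise differences
    dist = np.sqrt((diff ** 2).sum(axis=-1))      # Euclidean distances
    return (n - 1) / dist.sum(axis=1)             # self-distance is zero


# Three made-up points on a line, one unit apart.
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
centrality = closeness_centrality(points)
print(centrality)                  # the middle point is the most central
print(centrality.mean(), centrality.min(), centrality.max())
```

The middle point has distances 1 and 1 to its neighbors, giving 2/2 = 1, while each end point has distances 1 and 2, giving 2/3. The average, minimum, and maximum over all points correspond to the kind of summary reported per threshold value.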
Table II shows the average, minimum, and maximum closeness
centrality values as the vehicular capacity threshold is varied.
As τc is increased, the network becomes tighter, allowing
two-wheeled providers to go to other points more easily. However,
the values are not promising since they are all less than 0.5;
practically, the distances between delivery/service points are
still far on average. We also observe from the maximum value that
no single delivery point is highly central and in close proximity
to the majority of delivery locations in the network. Effectively,
the formed network is distributed.
Vehicular Capacity Threshold, τc
In this work, we have utilized an empirical mobility trace
dataset to analyze the most frequently visited GPS locations
and then determine the demand area distribution in an urban
setup. Given the dynamic spatiotemporal characteristics of the
dataset, we represented the average vehicular capacity behavior
by computing its equivalent spatiotemporal stable network
characteristics, effectively pinpointing the most visited
establishment GPS locations. From these places, the demand
area distribution was obtained by employing the effective regions
of movement clustering. Our findings have shown that as the
vehicular capacity threshold was increased, the ERMs increased in
size because fewer locations are frequently visited.
Fig. 4. The locations where most two-wheeled delivery or service are happening given the vehicular threshold capacities to be (a) τc = 0, (b) τc = 0.5, (c)
τc = 0.75, and (d) τc = 0.9.
Fig. 5. The superimposed ERMs and delivery/service point locations for two-wheeled delivery or service given the vehicular threshold capacities to be (a)
τc = 0.5, (b) τc = 0.75, and (c) τc = 0.9.
This distributed behavior is supported by the calculation of
the closeness centrality measure for all points made from the
interconnected network.
In future research, we can explore employing the findings
here in deploying electric vehicle charging stations [15],
trajectory anomaly detection [16], application of dynamic
parking pricing [17], and resources offloading and utilization
in vehicular networks [18].
This work was supported by Mitsubishi Motors, Philippines.
