This article has been accepted for publication in IEEE Transactions on Intelligent Vehicles. This is the author's version which has not been fully edited and
content may change prior to final publication. Citation information: DOI 10.1109/TIV.2023.3260902
The AD4CHE Dataset and its Application in
Typical Congestion Scenarios of Traffic Jam
Pilot Systems
Yuxin Zhang, Cheng Wang, Ruilin Yu, Luyao Wang, Wei Quan, Yang Gao, Pengfei Li
Abstract— Autonomous driving has attracted considerable
attention from research and industry communities. Although
prototypes of automated vehicles (AVs) are developed, remaining
safety issues and functional insufficiencies hinder their market
introduction. To obtain reasonably foreseeable scenarios and
to study human driving policies, many naturalistic driving
datasets have been proposed. However, no open-source dataset
dominated by congestion scenarios is publicly available. This paper presents the
Aerial Dataset for China’s Congested Highways & Expressways
(AD4CHE). It contains 5.12 hours of aerial survey data from four
different cities in China, with a total driving distance of 6540.7 km.
Moreover, overlap and non-overlap cut-in scenarios are
distinguished to better describe driver behavior in congestion
scenarios. Both types of cut-in scenarios are extracted and
parameterized. The Kernel Density Estimator (KDE) is utilized to
generate parameter distributions for the scenario-based testing
method. Furthermore, the driving behavior in overlap cut-in
scenarios is intensively analyzed. The results reveal that
drivers perform an evasive maneuver during an overlap cut-in
by a challenging vehicle and that the preferred following distance
varies with the relative longitudinal velocity. Both scenario
parameterization and driving behavior analysis can contribute to
developing and verifying Traffic Jam Pilot (TJP) systems deployed
in Chinese traffic situations. The dataset is available at
https://auto.dji.com/cn.
Index Terms— Aerial dataset, autonomous driving, Chinese
highways and expressways, congestion scenarios, driver behavior,
scenario-based testing.
I. INTRODUCTION
SAFETY assurance of autonomous vehicles (AVs) is currently
a challenge. Although it is nowadays not rare to see
AVs being tested on public roads, such testing is considered
inefficient because critical situations occur rarely. To accelerate
the testing, various simulation tools [1] and environmental sensor
models [2] have received much attention. However, since the
validity of the models used in simulations is limited, simulation-based
testing is only partially effective. Consequently, the scenario-based approach
This research was supported by JLU-DJI Collaborate Research Project
(HQ-RD-201117-02), National Natural Science Foundation of China (No.
52075213), and Industrial Technology Basic Public Service Platform Project
2020 from Ministry of Industry and Information Technology of China (No.
2020-0100-2-1). (Corresponding author: Cheng Wang).
Y.X. Zhang, R.L. Yu and L.Y. Wang are with the State Key Laboratory of
Automotive Simulation and Control, Jilin University, Changchun, 130025
China (e-mail: yuxinzhang@jlu.edu.cn, yurl21@mails.jlu.edu.cn). C. Wang
is with the Autonomous Agents Research Group, University of Edinburgh,
Edinburgh, EH8 9AB United Kingdom (e-mail: cheng.wang@ed.ac.uk).
W. Quan, Y. Gao and P.F. Li are with DJI Automotive, Shenzhen,
518063 China (e-mail: moritz.quan@dji.com, tena.gao@dji.com,
xiaofei.li@dji.com).
is motivated and studied in the German PEGASUS [3] and Japanese
SAKURA [4] projects. The scenario-based approach aims at
abandoning irrelevant scenarios to reduce the test scenario
space resulting from the open world. A similar concept known
as scenario engineering [5] [6] is used to achieve trustworthy
AI by keeping system parameters at reasonable levels.
Obviously, data sources are essential for scenario-based testing.
One way to obtain data for the method comes from naturalistic
driving. To determine the contribution of naturalistic driving
data, the following provides a detailed comparison of traffic
accident data, field operation data and data from expert
knowledge.
Using traffic accident data to study traffic behavior has been
applied for decades. With the advent of AVs, testing AVs in
those accident scenarios has broadened the application scope of
traffic accident data, as it is valuable to determine whether an
AV could prevent a collision in scenarios that result from
human drivers. For instance, criticality phenomena [7]
associated with increased criticality are extracted from the
GIDAS dataset to identify which factors in scenarios influence
traffic accidents.
Despite the numerous applications for traffic accident data,
the amount of data available is limited. Additionally, some
essential attributes, such as lighting conditions to describe a
scenario, are missing because the data was not originally
specialized for AVs. Currently, a popular way to collect testing
data for AVs is field operation data. In this method, an AV
operates in the real world with a safety driver onboard. The
data, such as perception data, is either recorded continuously,
saved when manually triggered by a safety driver, or automatically
saved by a technique like “silent testing” [8]. The data collected
is critical in improving the performance of AVs. However, the
behavior of surrounding traffic participants may be affected by
an AV in mixed traffic because of unusual equipment installed
on the AV.
Moreover, the need for safety drivers makes this approach inefficient
and costly, which limits its widespread application. In contrast, data from
expert knowledge [9] can be generated in a less time-consuming
manner. A developer with expertise in some domains of AVs is
likely to be aware of the limitations. This valuable experience
can be used to create corresponding scenarios. Nevertheless, the
validity of the generated data is a concern. In particular,
unknown scenarios can never be covered, since expert experience is
retrospective.
Unlike traffic accident data, much naturalistic driving data is
available and simple to obtain. By recording the driving data
© 2023 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See https://www.ieee.org/publications/rights/index.html for more information.
Authorized licensed use limited to: JILIN UNIVERSITY. Downloaded on March 27, 2023 at 08:25:58 UTC from IEEE Xplore. Restrictions apply.
during a daily drive, naturalistic driving data is then generated.
The China-FOT dataset [10] belongs exactly to this category:
its authors recruited volunteers to drive their vehicles for one and a
half years. The collected data was then utilized to investigate
drivers’ behavior. Additionally, such data can be collected by
mounting a sensor, such as a camera, in a fixed position. For
instance, a camera mounted on a building records vehicle
trajectory data [11]. A relatively new method of generating
naturalistic driving data is using a drone. This concept was
successfully applied in the PEGASUS project, in which the
traffic data on a section of a highway was recorded by a drone;
the HighD dataset [12] was thus created.
Nevertheless, there are few congestion scenarios in the
HighD dataset, and the average speed is 100.67 km/h. As a
result, it is inapplicable when the focus is to study driver
behavior in traffic jam scenarios in order to develop traffic jam
pilot (TJP) systems. A TJP is a level 3 [13] automated driving
system that can be activated in a traffic jam or in slow-moving
highway traffic up to 60 km/h. When the traffic jam pilot is
activated, the system takes over control, and drivers no
longer need to continuously monitor the vehicle. They only
have to remain alert and be ready to take over the driving
task again whenever the system prompts them.
Several original equipment manufacturers (OEMs) have
introduced their TJP systems into the market. However, there is
currently no TJP available in the Chinese automotive market.
On the one hand, corresponding regulations in China are not yet
in place. On the other hand, the traffic in China is more
complicated than in Europe. Considerable effort must be
expended to adapt the TJP to Chinese traffic situations.
To support the adaptation and investigate how Chinese drivers
behave in traffic jams, this paper proposes a dataset filled with
congestion scenarios on Chinese highways and expressways.
The naturalistic driving data is recorded by a drone. After data
processing, we provide detailed information for each recorded
vehicle and road information, such as lane markings. Moreover,
we distinguish between overlap and non-overlap cut-in
scenarios and discover that overlap cut-in scenarios are
relatively common in Chinese traffic jams. To
compare the two types of cut-in scenarios, scenarios are
extracted and parametrized. Using the kernel density estimator
(KDE) [14], the foreseeable parameter distributions are
obtained, which facilitates the scenario-based testing method
for a TJP system. Meanwhile, the Chinese driver behavior has
been identified and can be used to validate and improve driver
performance models such as the fuzzy safety model (FSM)
[15]. These driver performance models are regarded as a
reference for AVs because an AV is expected to outperform a
careful and competent driver [16]. The contributions of this
paper are:
• an open-source dataset focusing on Chinese highway
and expressway congestion scenarios is proposed.
Unlike other existing naturalistic driving datasets, ours
is the first open-source dataset that focuses specifically
on congestion scenarios. As a result, it can aid in the
development of TJP systems;
• because of congestion, particular cut-in maneuvers
emerge. We define overlap cut-in to distinguish it from
non-overlap cut-in scenarios (common cut-in) and
extract both types of scenarios in order to determine
reasonably foreseeable scenario parameter distributions
in congestion scenarios;
• the driving behavior of Chinese drivers in overlap cut-in
scenarios is intensively analyzed, and new findings are
discovered by answering three research questions in
order to contribute to the development of driver
performance models in congestion scenarios.
The structure of this paper is as follows: Section II introduces
the related datasets and works; Section III focuses on the data
collection process in four Chinese cities and the associated
post-processing; Section IV presents the results of extracted
scenarios and the driving behavior analysis in congestion
scenarios; finally, the discussion is carried out in Section V and
conclusions are given in Section VI.
II. RELATED WORK
This section first investigates currently available naturalistic
driving datasets and compares them to our dataset. Following
that, works on scenario parametrization are introduced. Finally,
the role of driver models in AVs is presented.
A. Related datasets
Various types of datasets [17] have been published with the
emergence of AVs. Traffic accident datasets were originally
collected to analyze traffic safety issues aiming to derive
measures or regulations for vehicles, infrastructures, and
medical aspects. This type of data has recently been used to
construct critical scenarios for AVs [18] [19]. Field operational
datasets, on the other hand, are generated by AVs. Sensor data
such as images and point clouds are recorded in this type of
dataset. As a result, they are rather suitable for offline testing
perception algorithms. The KITTI dataset [20], the Apollo
dataset [21] and the Waymo open dataset [22] are examples of
this type of dataset. Vehicles equipped with sensors are driven
on roads in these datasets, and the data is known as field
operation data. To limit the scope of the literature review, we
concentrate on naturalistic driving (ND) datasets.
A popular dataset recorded by a drone is the HighD dataset
[12], which records the trajectories of approximately 110,000
vehicles on a 420-meter-long highway. Since then, a series of
similar datasets, such as the inD [23], roundD [24] and exitD
[25] datasets, have been published. The inD records the vehicle
trajectories in an intersection, whereas a roundabout is a
location in the roundD dataset. The exitD dataset is motivated
for including merging scenarios. One common aspect of these
datasets is that only the German traffic is recorded.
For ND datasets recorded in America, the NGSIM dataset
[11] is the most popular one to record trajectory data using a
fixed camera. Besides, the INTERACTION dataset [26]
includes trajectory data for intersections, roundabouts and
merging scenarios. The Stanford drone [27] and Interstate-24
MOTION [28] datasets use a drone to generate trajectory data.
The former is primarily concerned with campus scenarios,
whereas the latter is captured on highways at high speeds.
The China-FOT dataset [10] recorded 32 drivers’ daily
driving in Shanghai, China. Vehicle CAN data, pedal data and
video images were recorded. Since the cameras were installed
inside the vehicles, only a minor vehicle modification was required. Since
then, the SIND dataset [29], which was recorded at a signaled
intersection, has emerged. The last dataset is ours, which is
known as the aerial dataset for China’s highways and
expressways (AD4CHE). Compared to other existing
naturalistic driving datasets, as shown in Table I, our dataset
has the following three special properties:
• Open source: both the dataset and the code for this
paper are open-source;
• Congestion: the dataset is filled with congestion
scenarios by recording during rush hours. Congestion
stimulates overlap cut-in; this uncommon type of cut-in
is valuable for developing TJP systems;
• Chinese highways and expressways: the traffic data
focuses on traffic situations on Chinese highways and
expressways, whereas few existing ND datasets take this
into account.
TJP and automated valet parking (AVP) systems are
considered the next generation of automated driving functions
for intelligent vehicles. Studying how Chinese drivers behave
in congestion scenarios to support TJP systems’ design
motivates us to generate the valuable AD4CHE dataset. To the
best of our knowledge, this is the first aerial dataset with these
three special properties.
TABLE I
COMPARISON OF DIFFERENT NATURALISTIC DRIVING DATASETS

Dataset | Country | Recording type | Typical scenarios | Data content | Open-source | Released year
HighD [12] | Germany | drone | fast straight driving | trajectory data | Y | 2018
InD [23] | Germany | drone | intersections | trajectory data | Y | 2019
RoundD [24] | Germany | drone | roundabouts | trajectory data | Y | 2020
ExitD [25] | Germany | drone | entries and exits | trajectory data, digital maps | Y | 2021
Automatum [30] | Germany | drone | straight driving, entries and congestion | trajectory data, digital maps | Partial Y | 2021
pNEUMA [31] | Greece | drone | urban congestion | trajectory data | Y | 2020
NGSIM [11] | USA | static camera | entries and exits | trajectory data | Y | 2006
Stanford drone [27] | USA | drone | campus and straight driving | trajectory data | Y | 2016
INTERACTION [26] | Germany, China, USA | drone | intersections, roundabouts and merging | trajectory data, digital maps | Y | 2019
Interstate-24 MOTION [28] | USA | drone | highway fast driving | trajectory data | N | 2020
China-FOT [10] | China | vehicle sensors | urban driving | vehicle CAN data | N | 2014
SIND [29] | China | drone | intersections | trajectory data, digital maps | Y | 2022
AD4CHE (Ours) | China | drone | highway and expressway congestion | trajectory data, digital maps | Y | 2022
B. Scenario generation
As an aerial dataset, AD4CHE can be applied for different purposes.
Its contributions to AV verification and validation are
particularly noteworthy. One great benefit of aerial datasets is
their application in scenario-based testing for AVs. Because of
the scenarios’ validity, extracting scenarios from those datasets
and using them to verify AVs is credible. In general, there are
three steps to generate concrete scenarios [32] from a dataset.
They are scenario definition [33] [34], scenario
parameterization [35] [36] and parameter space estimation [37]
[38], respectively. Ontology-based approaches are commonly
used for scenario definition. Due to the necessity of prior
knowledge in ontology, searching methods such as stress
testing [39] [40] attempt to discover critical scenarios directly.
In scenario parameterization, common trajectory parameters
such as velocity and position are intuitive but may be
inappropriate for characterizing scenarios; for example, it is
unknown if the ego’s initial velocity or maximum velocity is
better suited to describing its behavior in a cut-in scenario. According to
E. de Gelder et al. [35], defining too few parameters leads to
oversimplification, while defining too many parameters has a
dimensional problem. Therefore, they used the singular value
decomposition (SVD) to choose the optimal parameters.
Nevertheless, the method is only demonstrated by two simple
scenarios, and assumptions are made to prove its effectiveness.
In contrast, Karunakaran et al. [36] derived the parameters
to describe a cut-in scenario by comparing simulated
trajectories governed by control points to real-world trajectories.
Aside from scenario parameterization, the next step in
generating logical scenarios [32] is parameter space estimation.
Many works [41] [42] have been done using publicly available
datasets. For example, Zlocki et al. [38] analyzed the parameter
distributions of five vehicle kinematic variables in the HighD
and the SAKURA datasets to determine their correlations. To
generate final concrete scenarios, sampling techniques like
importance sampling [43] [44] and risk-index-based sampling [45]
are applied to reduce the parameter space. Although the
generated logical scenarios provide valuable guidance for the
scenario-based testing method, their application for safety
verification and validation of AVs is limited due to the dataset
diversity requirement. To complement this, the parameter
analysis in the AD4CHE dataset is meaningful, particularly for
testing TJP systems.
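As an illustration of the parameter space estimation step, a kernel density estimate over one extracted scenario parameter can be fitted and resampled in a few lines. The sketch below uses SciPy on synthetic stand-in data; the parameter name and its values are hypothetical and not taken from the AD4CHE toolchain:

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)

# Synthetic stand-in for one extracted cut-in parameter, e.g. the
# initial relative longitudinal distance (hypothetical values).
d_rel_x0 = rng.normal(loc=8.0, scale=2.0, size=500)  # meters

# Fit a Gaussian KDE (bandwidth chosen by Scott's rule by default).
kde = gaussian_kde(d_rel_x0)

# Evaluate the estimated density on a grid and draw new samples,
# e.g. to generate concrete scenarios for scenario-based testing.
grid = np.linspace(0.0, 16.0, 161)
density = kde(grid)
samples = kde.resample(1000)  # shape (1, 1000)

# The estimated density should integrate to roughly 1 over the grid.
print(density.sum() * (grid[1] - grid[0]))
```

Resampling from the fitted density is one simple way to obtain concrete scenario parameters that respect the observed distribution.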
C. Driver performance models
In addition to scenario generation, analyzing driver behavior
in congestion scenarios is also crucial. On the one hand, it can
guide the parameter determination and optimization for AV
decision-making [46] [47]. For instance, driving data is used in
[48] to optimize vehicle motion state parameters for a generated
trajectory. On the other hand, driver behavior can be used to
calibrate safety metrics such as responsibility-sensitive safety
(RSS) [49] to balance safety and aggression. These safety
metrics are commonly used as a safety checker [50] to ensure
the safety of a planning module. However, safety metrics that
do not use human driver behavior as a calibration reference
typically result in conservative decisions that have lower
acceptance and are intolerable in congestion scenarios due to
frequent cut-in. Consequently, we are motivated to perform the
driver behavior analysis in congestion scenarios.
Additionally, driver performance models are indispensable for
simulation-based testing [51], as they are required for valid
simulation results. Because real-world testing requires enormous test
effort, verifying and validating AVs in simulations is a viable
alternative. Lastly, driver performance models can be used as a
guide when introducing AVs to the public. Since the UNECE
Regulation No.157 [52] introduced a careful and competent
driver to assess the safety of a level 3 automated lane keeping
system (ALKS), various driver models, including the fuzzy
safety model (FSM) [15] and the stochastic cognitive model
(SCM) [53], have been studied to provide a more realistic
reference for releasing AVs. However, the driving behavior in
China differs significantly from that of European countries due
to driving culture, infrastructure, etc. The differences are
thoroughly discussed in [54]. They concluded that adaptation is
necessary when applying automated driving systems in Chinese
traffic. As a result, the driver behavior analysis in the AD4CHE
dataset could help with this adaptation.
Fig. 1. The road network of the recorded data. The number in each lane represents the lane number assigned to each vehicle. Vehicles on exit or entrance ramps
have a lane number greater than 100. The blue lines are road markings. The positions of these lane marking points are available. The coordinate system is located
in the upper left corner, with the x-axis to the right and the y-axis to the bottom.
III. AERIAL DATA PROCESSING
In this section, we introduce the basic steps to process the
recorded data. First, a brief introduction to the recording sites is
given. Then, the techniques used to handle the data are described.
Lastly, we present the available information in the dataset.
A. Recording sites
Because driving behavior is influenced by a variety of
factors, including road structures, lane marking quality and
driving culture, it is necessary to cover several different
areas in China to increase the dataset coverage. As a result, we
chose two cities in the north of China and two in the south of
China due to their different geographies and climate conditions.
The two northern cities are Xi’an and Changchun, whereas the
two southern cities are Hefei and Shenzhen.
Because we aim to support the development and verification
of TJP systems, whose use cases are highway and expressway
scenarios with traffic flow speed from 0 to 80 km/h, four
highway and expressway sections in those four cities are
selected. To capture congestion, we wait for the appropriate
time, such as rush hour, to fly the drone. Flight approvals are
obtained prior to data collection to avoid no-fly zones and data
security risks.
B. Recording and pre-processing
According to the design requirements of an L3 TJP system,
the data must be accurate enough to reasonably model the
driving behavior of traffic participants in traffic jams. Unlike
the HighD dataset’s settings, which have a hover height of over
300 m, we define the hover height as 100 m and focus on
collecting congested road traffic during peak commuting hours.
To improve the accuracy and consistency across cities, we
developed a unified aerial data collection standard to guide the
data collection process, which includes a pre-flight check of the
drone status, battery and camera parameter settings, etc.
Even though the drone has an excellent anti-shake platform,
there are still minor jitters that cause road sections to be
inconsistent between images. To address this issue, we perform
video alignment pre-processing on the captured video data. By
matching the same road features, such as lane markings in the
images of different frames, images in the entire video are
aligned to the beginning frame. Consequently, the road position
will not change while playing a video, and all recorded vehicles
have a unified reference. In addition, the pixel size error
analysis is performed by dimensional calibration in a controlled
environment with ideal ground conditions. The overall position
accuracy is about 5 cm, and the maximum error is below 10 cm.
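The paper does not specify the alignment algorithm; as an illustrative sketch of the underlying idea, an affine transform mapping a jittered frame onto the reference (beginning) frame can be estimated by linear least squares from matched road features such as lane-marking corners. All point sets and values below are hypothetical, and a production pipeline would obtain the correspondences from feature matching:

```python
import numpy as np

def estimate_affine(src, dst):
    """Least-squares 2D affine transform mapping src points onto dst.

    src, dst: (N, 2) arrays of matched feature positions (N >= 3),
    e.g. lane-marking corners in a jittered frame (src) and in the
    reference frame (dst). Returns a 2x3 matrix A with dst ≈ [x, y, 1] @ A.T.
    """
    n = src.shape[0]
    X = np.hstack([src, np.ones((n, 1))])   # homogeneous coordinates (N, 3)
    # Solve X @ coeffs ≈ dst; one column of coefficients per output axis.
    coeffs, *_ = np.linalg.lstsq(X, dst, rcond=None)
    return coeffs.T                          # (2, 3)

def apply_affine(A, pts):
    X = np.hstack([pts, np.ones((pts.shape[0], 1))])
    return X @ A.T

# Hypothetical example: a frame shifted by (3, -2) px and slightly rotated.
theta = 0.01
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
ref_pts = np.array([[100., 50.], [400., 60.], [250., 300.], [50., 280.]])
jit_pts = ref_pts @ R.T + np.array([3., -2.])

A = estimate_affine(jit_pts, ref_pts)
aligned = apply_affine(A, jit_pts)
print(np.abs(aligned - ref_pts).max())  # near zero for exact correspondences
```

Once such a transform is estimated per frame, every detection can be expressed in the unified reference of the first frame, which is what keeps the road position fixed while a video plays.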
Fig. 1 illustrates an example of lane segmentation and the
lane number assigned to each lane. Since entrance and exit ramps
are not the focus, we assign them a large number to distinguish
them from the main roads. In particular, as the blue lines show,
we also provide the detected lane markings, which can be used
for lane-changing analysis and lane width determination. The
coordinate system is in the image’s upper left corner.
Fig. 2. One example to illustrate our detection results. Object ID and velocity are visualized above each tracked object. Cars are visualized by a red bounding
box, while non-cars are visualized by a green bounding box. Because of the accurate yaw angle, objects on the on/off ramps are correctly captured. Moreover,
severe congestion scenarios are observed, distinguishing our dataset from other existing naturalistic driving datasets.
C. Object detection and tracking
After data recording and pre-processing, object detection and
tracking is the remaining task. Due to the superior performance
of the convolutional neural network (CNN) in object detection
[55] [56], a CNN model is utilized to detect objects and segment
lanes. The training data consists of 10k images for lane
markings and 50k images for vehicles, which are obtained
through semi-automatic annotation. Since different object
classes are distinguished during annotation, our model provides
the object class together with the object state. To validate the
model’s performance, test data that differs from the training
data is determined, and two metrics, precision and recall, are
chosen to evaluate the model. By computing the intersection
over union (IoU) [57] for each labeled object and detected
object in the test data, the precision and recall of the detection
results are 0.97 and 0.93. The high values of these two metrics
indicate the superior performance of our model.
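The IoU-based evaluation can be sketched as follows; the greedy matching strategy and the 0.5 threshold are common choices and assumptions here, not values reported by the authors:

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

def precision_recall(detections, labels, thr=0.5):
    """Greedy matching: a detection is a true positive if it overlaps an
    as-yet-unmatched labeled box with IoU >= thr."""
    matched, tp = set(), 0
    for det in detections:
        for i, lab in enumerate(labels):
            if i not in matched and iou(det, lab) >= thr:
                matched.add(i)
                tp += 1
                break
    fp = len(detections) - tp
    fn = len(labels) - tp
    precision = tp / (tp + fp) if detections else 0.0
    recall = tp / (tp + fn) if labels else 0.0
    return precision, recall

# Two labeled vehicles: one detected well, one missed, plus a spurious box.
labels = [(0, 0, 4, 2), (10, 0, 14, 2)]
dets = [(0.2, 0.1, 4.1, 2.0), (20, 20, 24, 22)]
print(precision_recall(dets, labels))  # (0.5, 0.5)
```

With this scheme, precision penalizes spurious detections and recall penalizes missed labeled objects, which matches the reported 0.97/0.93 interpretation.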
To further increase the accuracy of the results, a forward-backward
extended Kalman filter [58] is applied, since offline processing, unlike
online tracking, has no time and computation limits. In addition, because
the size of a vehicle should remain constant during its movement, the
size detected when the vehicle is directly under the drone is chosen to
avoid image projection errors. The position error in the X and Y
directions is within 7 cm for object detection at a length of 5 m.
Fig. 2 shows the detection results. Each vehicle is represented
by a bounding box with its ID and speed above it.
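The cited forward-backward filter is an extended Kalman filter run in both time directions; as a simplified illustration of the offline smoothing idea, the sketch below filters a 1D constant-velocity track forward and then applies a Rauch-Tung-Striebel backward pass. The model, its noise parameters, and the sampling rate are assumptions, not the paper's actual settings:

```python
import numpy as np

def rts_smooth(z, dt=0.04, q=1.0, r=0.05**2):
    """Forward Kalman filter + backward RTS pass on position measurements z.

    State x = [position, velocity], constant-velocity model.
    q: assumed process-noise intensity; r: assumed measurement variance.
    Returns smoothed states, shape (N, 2).
    """
    F = np.array([[1.0, dt], [0.0, 1.0]])
    Q = q * np.array([[dt**3 / 3, dt**2 / 2], [dt**2 / 2, dt]])
    H = np.array([[1.0, 0.0]])
    n = len(z)
    xf = np.zeros((n, 2)); Pf = np.zeros((n, 2, 2))   # filtered
    xp = np.zeros((n, 2)); Pp = np.zeros((n, 2, 2))   # predicted
    x, P = np.array([z[0], 0.0]), np.eye(2)
    for k in range(n):
        if k > 0:                       # time update
            x = F @ x
            P = F @ P @ F.T + Q
        xp[k], Pp[k] = x, P
        S = H @ P @ H.T + r             # measurement update
        K = P @ H.T / S
        x = x + (K * (z[k] - H @ x)).ravel()
        P = P - K @ H @ P
        xf[k], Pf[k] = x, P
    xs = xf.copy()
    for k in range(n - 2, -1, -1):      # backward RTS recursion
        C = Pf[k] @ F.T @ np.linalg.inv(Pp[k + 1])
        xs[k] = xf[k] + C @ (xs[k + 1] - xp[k + 1])
    return xs

# Hypothetical track: 3 m/s constant speed with 0.05 m measurement noise.
rng = np.random.default_rng(2)
t = np.arange(100) * 0.04
true_pos = 3.0 * t
z = true_pos + rng.normal(0.0, 0.05, size=100)
xs = rts_smooth(z)
print(np.sqrt(np.mean((xs[:, 0] - true_pos) ** 2)))  # typically well below 0.05
```

The backward pass uses future measurements to correct past estimates, which is exactly what offline processing permits and online tracking does not.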
D. Data format
Based on the processed results, the state (position and
velocity, etc.) and vehicle class at each frame are provided.
Besides, essential information such as time-to-collision (TTC)
and time-headway (THW) to describe the traffic flow is also
given. Further, we give not only the information that the HighD
dataset provides, but also four additional parameters. They are
the lane angle, vehicle orientation, yaw rate and ego offset. The
ego offset represents the deviation between a vehicle’s and
lane’s center. This information can, for example, be used to
analyze the offset distribution when a vehicle keeps in a lane.
Such distribution plays a vital role in determining when a risk
perception point begins, as described in UNECE Regulation
No.157 [52], which is the first L3 AV regulation in the world.
Besides the ego offset, the orientation could be helpful in
analyzing the subtle driving behavior of human drivers. A lane
segmentation image is also provided along with the data, which
benefits the extraction of lane markings. Generally, the
provided information is sufficient to accurately extract various
functional scenarios and convert them to popular simulation
data formats like OpenSCENARIO [59].
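The TTC and THW quantities provided with each frame follow standard definitions; a minimal sketch is given below, where the function and field names are ours and the sign conventions are assumptions rather than the dataset's exact schema:

```python
def time_headway(gap_m, follower_speed_mps):
    """THW: time for the follower to reach the leader's current rear position."""
    return gap_m / follower_speed_mps if follower_speed_mps > 0 else float("inf")

def time_to_collision(gap_m, follower_speed_mps, leader_speed_mps):
    """TTC: time until bumper contact; defined only while the gap is closing."""
    closing = follower_speed_mps - leader_speed_mps
    return gap_m / closing if closing > 0 else float("inf")

# Congested example: 10 m gap, follower at 10 m/s, leader at 5 m/s.
print(time_headway(10.0, 10.0))            # 1.0 s
print(time_to_collision(10.0, 10.0, 5.0))  # 2.0 s
print(time_to_collision(10.0, 5.0, 10.0))  # inf (gap is opening)
```

Note that THW depends only on the follower's speed, whereas TTC depends on the relative speed, which is why both are needed to characterize congested traffic flow.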
IV. DATASET ANALYSIS
In this section, we first give some statistical information
about the dataset. Then, the overlap cut-in scenario is defined
and extracted. Based on the extracted scenarios, scenario
parameterization and parameter distribution are presented.
Lastly, the driver behavior in those overlap cut-in scenarios is
analyzed to provide guidance in designing human-like TJP
systems.
A. AD4CHE at a Glance
To present a general overview of our dataset and highlight its
differences from the HighD dataset, some crucial statistical
information is given below in Table II. Due to a longer
recording duration in the HighD dataset, the total number of
vehicles is higher. However, when we observe the average
speed attribute, the value in the AD4CHE dataset is 29.25 km/h,
which is much lower than that in the HighD dataset. It implies
that our dataset contains a large number of congestion scenarios
on the recorded highways and expressways. This conclusion
can also be drawn by looking at the “vehicle number per
kilometer” attribute. It indicates that the vehicle density in our
dataset is nearly four times that of the HighD dataset.
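Under our reading of this attribute (inferred from the reported numbers, not stated explicitly by the authors), "vehicle number per kilometer" is the total number of recorded vehicles divided by the total driven distance, which reproduces both table values:

```python
# Reported totals (Table II): recorded vehicles and total driven distance.
highd_vehicles, highd_km = 110_000, 45_000
ad4che_vehicles, ad4che_km = 53_761, 6_540.7

print(round(highd_vehicles / highd_km, 2))    # 2.44
print(round(ad4che_vehicles / ad4che_km, 2))  # 8.22
```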
TABLE II
STATISTICAL INFORMATION OF THE AD4CHE DATASET AND ITS DIFFERENCES
TO THE HIGHD DATASET

Attributes | HighD | AD4CHE
Recording Duration [hours] | 16.5 | 5.12
Lanes (per direction) | 2-3 | 5-6
Road Length [m] | 400-420 | ≈130
Number of Vehicles | 110 000 | 53 761
Number of Cars | 90 000 | 42 516
Number of Trucks | 20 000 | 10 306
Number of Buses | 0 | 939
Driven distance [km] | 45 000 | 6540.7
Vehicle Number per km | 2.44 | 8.22
Average Speed [km/h] | 100.67 | 29.25
Driven time [h] | 447 | 223.65
Fig. 3. The speed distributions in the AD4CHE and the HighD datasets. The speed in the HighD dataset is generally higher than that in the AD4CHE dataset, indicating that the AD4CHE dataset is better suited to analyzing congestion.

For a TJP system, our dataset provides the data required to analyze driver behavior in congestion scenarios, whereas the role of the HighD dataset is limited. In addition, the AD4CHE dataset is recorded on highways and expressways within the operational design domain (ODD) [13] of a TJP system. Meanwhile, rich interactive driving behavior is included due to the congestion, which is valuable for TJP development. Fig. 3 compares the speed distributions of the two datasets.

B. Overlap cut-in
Compared to a car-following scenario, a cut-in scenario is more challenging for a TJP system. In particular, cut-in maneuvers in congestion differ considerably because less space is available. To clearly define an overlap cut-in scenario, we distinguish lane-changing, non-overlap cut-in, and overlap cut-in. As summarized in [60], lane-changing is divided into four phases: keeping, changing, arrival, and adjustment. Cut-in, in contrast to lane-changing, emphasizes the interaction of two vehicles. Fig. 4 illustrates a cut-in maneuver. When driving in a lane, a vehicle usually stays within a wandering zone, which has been statistically studied by experts in Japan and is defined as 0.375 m [52]. When the challenging vehicle leaves this zone, the corresponding timestamp is taken as the beginning of the cut-in maneuver and is denoted 𝑇1. When the challenging vehicle is within the wandering zone again in an adjacent lane, we use 𝑇5 to represent this timestamp. During this period, the moment of crossing the lane marking is denoted 𝑇3. Based on these time variables, as depicted in Fig. 4, we can distinguish overlap cut-in from non-overlap cut-in.

Fig. 4. The illustration of a cut-in maneuver. 𝑇1 is the timestamp when a cut-in begins. 𝑇3 is the timestamp when the lane marking is crossed. 𝑇5 is the end timestamp of a cut-in maneuver. We use these variables to define overlap cut-in scenarios, which are critical in congestion due to limited free space.

TABLE III
PARAMETERS THAT ARE USED TO DESCRIBE CUT-IN SCENARIOS

Parameters                                Variables   Units
Initial ego velocity                      𝑣ego,0      m/s
Initial challenging vehicle velocity      𝑣cha,0      m/s
Initial relative longitudinal distance    𝑑rel,𝑥0     m
Initial relative lateral distance         𝑑rel,𝑦0     m
Lateral challenging vehicle velocity      𝑣cha,𝑦      m/s
Start time of cut-in                      𝑇1          s
Time to cross a lane marking              𝑇3          s
End time of cut-in                        𝑇5          s

The non-overlap cut-in maneuver occurs frequently on highways, where a challenging vehicle cuts in with its rear ahead of the ego's front. However, this type of cut-in occurs rarely in congestion scenarios, where a driver is prone to cutting in even when falling behind. The latter type of cut-in is more critical in congestion scenarios due to the severely limited free space and is more challenging than the non-overlap cut-in. As a result, we concentrate on this type of cut-in scenario in our analysis and define it as follows:
Overlap cut-in: when a cut-in maneuver begins, the rear of the challenging vehicle is still behind the ego's front. This can be mathematically expressed by

|𝑑ego,𝑥 − 𝑑cha,𝑥| ≤ (𝑙ego + 𝑙cha) / 2,    (1)

where 𝑑ego,𝑥 and 𝑑cha,𝑥 are the longitudinal positions of the ego and the challenging vehicle, respectively, and 𝑙ego and 𝑙cha represent their lengths.
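As a minimal sketch (the function and variable names are ours, not the released toolchain), condition (1) can be checked directly from the recorded center positions and vehicle lengths at 𝑇1:

```python
def is_overlap_cutin(d_ego_x, d_cha_x, l_ego, l_cha):
    """Classify a cut-in as 'overlap' per equation (1): at T1 the distance
    between the two vehicle centers is at most half the summed vehicle
    lengths, i.e. the two bodies still overlap longitudinally."""
    return abs(d_ego_x - d_cha_x) <= (l_ego + l_cha) / 2

# Ego centered at 0 m (length 4.5 m); challenger centered 3 m ahead (length 4.8 m)
print(is_overlap_cutin(0.0, 3.0, 4.5, 4.8))   # True  (bodies overlap)
print(is_overlap_cutin(0.0, 10.0, 4.5, 4.8))  # False (challenger fully ahead)
```

All scenarios failing this test are classified as non-overlap cut-in.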
The overlap cut-in scenarios are those that satisfy equation
(1), while the rest are non-overlap cut-in scenarios. One of our
aims is to compare the parameter distributions of these two
types of cut-in scenarios in order to provide design guidance for
TJP systems. To this end, the scenario parameters defined in
Table III are used. Since the variable 𝑣cha,𝑦 changes
continuously during a cut-in process and its maximum value
represents how urgent a lateral movement is, 𝑣cha,𝑦,max is used
to perform the parameter analysis.
C. Scenario extraction
Fig. 5 illustrates the scenario extraction process. First, we
check if a vehicle has 𝑇3 to find lane-changing maneuvers.
Then, we use the wandering zone to determine the start
timestamp of a likely cut-in maneuver. Similarly, the end
timestamp of a potential cut-in maneuver is defined if the
challenging vehicle enters the wandering zone again in an
adjacent lane. Then, we observe which vehicle is behind the
challenging vehicle after cut-in and has no lane-changing
during the cut-in process to identify the ego vehicle.
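The timestamp detection above can be sketched as follows; the per-frame offset array, the 25 Hz sampling period, and the sign convention are illustrative assumptions, not the released extraction code:

```python
WANDERING_ZONE = 0.375  # m, wandering-zone half-width from [52]

def cutin_timestamps(lat_offset, lane_width, dt=0.04):
    """Detect (T1, T3, T5) in seconds from a per-frame lateral offset [m],
    measured from the center of the starting lane, positive toward the
    adjacent lane (sign convention and 25 Hz frame rate are assumptions).
    T1: the vehicle leaves the wandering zone of its own lane.
    T3: the vehicle center crosses the lane marking at lane_width / 2.
    T5: the vehicle settles inside the wandering zone of the adjacent lane."""
    t1 = t3 = t5 = None
    for i, y in enumerate(lat_offset):
        if t1 is None and abs(y) > WANDERING_ZONE:
            t1 = round(i * dt, 6)
        if t3 is None and abs(y) > lane_width / 2:
            t3 = round(i * dt, 6)
        if t3 is not None and abs(y - lane_width) < WANDERING_ZONE:
            t5 = round(i * dt, 6)
            break
    return t1, t3, t5

offsets = [0.1, 0.3, 0.5, 1.0, 1.8, 2.5, 3.0, 3.3, 3.45]
print(cutin_timestamps(offsets, lane_width=3.5))  # -> (0.08, 0.16, 0.28)
```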
Fig. 5. The overlap and non-overlap cut-in scenario extraction process.

To determine whether a lane-changing scenario is a cut-in scenario, the time-to-collision (TTC) metric is usually used to filter out the scenarios where the challenging vehicle is so far ahead that the ego vehicle is unaffected. A TTC threshold of 5 s [15] is used to find hazardous cut-in scenarios in the HighD dataset. However, due to the limited longitudinal space in congestion, the ego vehicle reacts to a challenging vehicle despite large or even negative TTC values. Consequently, the TTC metric is not appropriate in our case. Instead, we use a distance-based metric. A distance of 75 m is utilized in [61], based on 200 observations, to remove the scenarios where a challenging vehicle is irrelevant to the ego vehicle. To avoid assigning such a deterministic value and thereby losing generality, we use the RSS model to calculate the minimum safe distance at which a reaction of the ego vehicle is required. With a set of conservative parameters suggested by [15], as described in Table IV, we will not miss scenarios requiring an ego reaction.

TABLE IV
PARAMETERS OF THE RSS MODEL

Parameters                        Values   Units
Reaction time 𝜌                   0.75     s
Maximum acceleration 𝑎max,accel   3        m/s²
Minimum deceleration 𝑎min,brake   6        m/s²
Maximum deceleration 𝑎max,brake   6        m/s²

If a cut-in scenario is valid, we then investigate whether non-cars, such as trucks or buses, are involved in the cut-in. This paper only considers car-involved cut-in scenarios; non-car-involved cut-in scenarios are ignored. Based on the definition of overlap cut-in, we further divide the cut-in scenarios into overlap and non-overlap cut-in scenarios. Following the steps outlined above, we finally obtained 376 cut-in scenarios. Specifically, 90 overlap and 286 non-overlap cut-in scenarios are found.

According to UNECE Regulation No. 157 [52], one of the critical scenarios that should be used to test TJP systems before approval is the cut-in scenario. Compared to that scenario, the overlap cut-in scenario defined in this paper is more difficult because lateral perception, rather than longitudinal perception, is required to accurately track and predict the behavior of the cut-in vehicle, and lateral perception is typically built with fewer and less powerful sensors. To emphasize this challenge, we utilize one exemplary overlap cut-in scenario to illustrate the utility of this type of scenario in potentially improving existing TJP systems. As illustrated in Fig. 6(b), when the challenging vehicle starts a cut-in maneuver (as 𝑑rel,𝑦 decreases), it is still partially behind the ego vehicle (as 𝑑rel,𝑥 is negative). Fig. 6(a) shows the actual cut-in process. The challenge of detecting adjacent rear vehicles and identifying their intentions must be overcome when deploying TJP systems in regions where drivers tend to cut in even when they fall behind.

Fig. 6. One example of an overlap cut-in scenario to demonstrate the dataset's value to TJP system development.
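For reference, the RSS minimum longitudinal safe distance used for filtering can be sketched with the standard RSS formula and the conservative parameters of Table IV (the function itself is our illustrative implementation, not the released code):

```python
RHO = 0.75          # reaction time [s], Table IV
A_MAX_ACCEL = 3.0   # maximum acceleration [m/s^2]
A_MIN_BRAKE = 6.0   # minimum deceleration [m/s^2]
A_MAX_BRAKE = 6.0   # maximum deceleration [m/s^2]

def rss_min_distance(v_rear, v_front):
    """Minimum longitudinal gap [m] the rear vehicle needs so that it can
    still avoid a collision if the front vehicle brakes at A_MAX_BRAKE
    while the rear vehicle accelerates during its reaction time."""
    v_after_reaction = v_rear + RHO * A_MAX_ACCEL
    d = (v_rear * RHO
         + 0.5 * A_MAX_ACCEL * RHO ** 2
         + v_after_reaction ** 2 / (2 * A_MIN_BRAKE)
         - v_front ** 2 / (2 * A_MAX_BRAKE))
    return max(d, 0.0)

# Two vehicles following each other at 8 m/s in congestion
print(round(rss_min_distance(8.0, 8.0), 2))  # 10.27
```

A cut-in is kept for analysis whenever the actual gap falls below this bound, so no scenario requiring an ego reaction is discarded.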
D. Parameter distribution
The parameters listed in Table III are extracted in both
overlap and non-overlap scenarios. Because the distributions of
these parameters are unknown in advance, it is difficult to
predefine the function form of the signal and then fit parameters
to this function form. Instead, we employ kernel density estimation (KDE) to calculate each parameter's probability density function (pdf). Owing to the non-parametric nature of KDE, the shape of the pdf adapts automatically to the given data, making KDE highly flexible with respect to the actual shape of a pdf.
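As a minimal sketch of the estimator (the sample data are synthetic, not values extracted from the dataset; `scipy.stats.gaussian_kde` offers an equivalent off-the-shelf implementation):

```python
import numpy as np

def gaussian_kde_pdf(samples, grid, bandwidth=None):
    """Evaluate a Gaussian-kernel KDE of 1-D samples on the given grid.
    The bandwidth defaults to Scott's rule, std * n**(-1/5)."""
    samples = np.asarray(samples, dtype=float)
    grid = np.asarray(grid, dtype=float)
    n = samples.size
    if bandwidth is None:
        bandwidth = samples.std(ddof=1) * n ** (-1 / 5)
    z = (grid[:, None] - samples[None, :]) / bandwidth
    # Average of Gaussian kernels centered on each observed sample
    return np.exp(-0.5 * z ** 2).sum(axis=1) / (n * bandwidth * np.sqrt(2 * np.pi))

# Synthetic stand-in for one extracted parameter, e.g. 90 samples of d_rel_x0 [m]
samples = np.random.default_rng(0).normal(loc=-4.0, scale=2.0, size=90)
grid = np.linspace(-12.0, 4.0, 400)
pdf = gaussian_kde_pdf(samples, grid)
print(pdf.sum() * (grid[1] - grid[0]))  # density integrates to ~1
```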
In our case, the Gaussian kernel function is applied to obtain
the pdf of four parameters, as illustrated in Fig. 7. For the
relative longitudinal distance 𝑑rel,𝑥0 at timestamp 𝑇1 , most
challenging vehicles perform cut-in maneuvers with a small
𝑑rel,𝑥0 , which is consistent with the phenomenon in congestions
due to small longitudinal space available. In overlap cut-in
scenarios, the value of 𝑑rel,𝑥0 is less than zero. Apparent
differences between overlap and non-overlap scenarios can be
found in the relative longitudinal velocity 𝑣rel,𝑥0 at timestamp
𝑇1 . The 𝑣rel,𝑥0 are mostly greater than zero in overlap cut-in
scenarios, whereas the 𝑣rel,𝑥0 in non-overlap cut-in scenarios
are normally distributed with a mean value of about -0.5 m/s.
This indicates that the challenging vehicle is faster than the ego
vehicle during the overlap cut-in maneuvers. In contrast, no
preferred 𝑣rel,𝑥0 is identified during non-overlap cut-in
maneuvers because the challenging vehicle is entirely ahead of
the ego.
Regarding the ego velocity 𝑣ego,0 at timestamp 𝑇1, most cut-in scenarios occur when the ego velocity is very low, with only a few cases occurring when 𝑣ego,0 is greater than 15 m/s. This
is because the traffic in several lanes is less congested. In
addition, the velocities of most ego vehicles in overlap cut-in
scenarios are lower than in non-overlap cut-in scenarios. This
is most likely one of the motivations for the challenging vehicle
to perform an overlap cut-in. The maximum lateral velocity of
the challenging vehicle during a cut-in maneuver 𝑣cha,𝑦,max
also shows some distinctions between these two types of cut-in.
The 𝑣cha,𝑦,max in most overlap cases is around 0.3 m/s, which is roughly half of the typical 𝑣cha,𝑦,max in non-overlap cut-in scenarios.
Using the pdf generated by KDE, we can define logical
scenarios for the scenario-based testing method aimed at testing
a TJP system. Because no collision occurs in these reasonably
foreseeable parameter distributions in the dataset, a TJP shall
also have no crash in the generated scenarios derived from the
pdf to achieve a comparable level of safety as human drivers.
To effectively assess a TJP, the importance sampling [43], [44] technique can be applied to emphasize the scenarios with higher criticality based on our findings.
Fig. 7. The parameter distributions extracted from overlap and non-overlap cut-in scenarios. First row: the relative longitudinal distance is negative, and the
challenging vehicle is faster than the ego vehicle in overlap cut-in scenarios. Second row: overlap cut-in occurs when the ego velocity is relatively low; the lateral
velocity of the challenging vehicle is small during the overlap cut-in.
E. Driver behavior analysis
It is essential to understand how the driver behaves when
confronted with an overlap cut-in. Although low longitudinal
velocity is unlikely to cause fatal accidents, as observed in our
dataset, problems still arise when a TJP cannot deal
with this type of cut-in. Therefore, we analyze the following
three research questions:
RQ1: Does the ego driver evade laterally when confronted
with an overlap cut-in?
RQ2: Does the ego driver brake despite positive relative
longitudinal velocity in the event of an overlap cut-in?
RQ3: What is the ego driver’s preferred distance when a
challenging vehicle enters the ego’s lane in overlap cut-in
scenarios?
To answer these three research questions and facilitate the
design of a human-like TJP, we further analyze the 90 overlap cut-in scenarios extracted from the dataset. Regarding RQ1, we analyze the lateral ego offset during the period [𝑇1, 𝑇3] to identify whether the ego evades laterally to allow the challenging vehicle to cut in. We use linear regression to fit the time sequence of lateral offsets, and the slope is used to determine whether the lateral offset shifts or remains relatively constant. Among the 90 overlap cut-in scenarios, 64 ego vehicles show an increasing or decreasing lateral offset depending on the cut-in direction, whereas 19 vehicles keep a constant lateral offset. In the remaining 7 cases, 𝑇3 equals 𝑇1 because the lateral offsets of the challenging vehicles had already reached the position of 𝑇3 when entering the detection zone of the drone, so no general trend can be drawn for them.
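The slope test described above can be sketched with an ordinary least-squares fit; the near-zero slope threshold below is an illustrative assumption, not the paper's calibrated value:

```python
import numpy as np

def offset_trend(t, lateral_offset, slope_threshold=0.02):
    """Classify the ego's lateral offset between T1 and T3 via the slope of
    an ordinary least-squares line fit (threshold in m/s is illustrative)."""
    slope, _intercept = np.polyfit(t, lateral_offset, deg=1)
    if abs(slope) < slope_threshold:
        return "constant"
    return "increasing" if slope > 0 else "decreasing"

t = np.linspace(0.0, 2.0, 50)                 # 2 s between T1 and T3 at 25 Hz
print(offset_trend(t, 0.05 * t))              # steady drift away -> "increasing"
print(offset_trend(t, np.full_like(t, 0.3)))  # no evasion -> "constant"
```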
To determine if the lateral inverse TTC (reciprocal of TTC)
and the longitudinal ego velocity influence the lateral offset, we
illustrate their relationships at timestamp 𝑇1 , because the
challenging vehicle is adjacent to the ego at this timestamp, i.e.,
they are so close to each other in the longitudinal direction that
the ego would evade laterally if the challenging vehicle cuts in.
The results are shown in Fig. 8. The figure shows that shifting
offset occurs more frequently when the ego velocity is within
[5, 20] m/s. 20 m/s is the upper limit of velocity in the
discovered overlap cut-in scenarios due to the congestion. In the
low-speed area, no strong correlation is found when the lateral
inverse TTC is also very low. Due to the low ego velocity in
uncritical overlap cut-in scenarios, some drivers tend to leave
space when a challenging vehicle cuts in, while others attempt
to stop the cut-in of the challenging vehicle. Therefore, drivers
tend to evade laterally to an overlap cut-in vehicle, especially
when the ego velocity is slightly higher.
Fig. 8. The influence of lateral inverse TTC and ego velocity on the lateral ego offset. The ego has a lateral offset when its velocity is larger than 5 m/s. No strong correlation between lateral inverse TTC and lateral ego offset is found.

For RQ2, due to the small relative longitudinal distance, deceleration is applied in some cases, despite the challenging vehicle being mostly faster than the ego, as illustrated in the upper subplot of Fig. 9. During the period [𝑇3, 𝑇5], the ego driver must brake when necessary because the challenging vehicle enters the ego's lane. It turns out that about half of the cases studied exhibit a deceleration process. To determine under what conditions braking is applied, we analyze the distance headway (DHW) at the moment of maximum deceleration. The results are shown in the lower subplot of Fig. 9. Even though the deceleration values are small due to the low velocities, braking maneuvers are preferentially applied at small DHW values.

Fig. 9. The upper subplot shows the average acceleration distribution when the challenging vehicle enters the ego's lane. Braking maneuvers are identified despite positive relative longitudinal velocity. The lower subplot illustrates the DHW values when reaching the maximum deceleration. Braking is more frequent when the corresponding DHW is small.

Fig. 10. The relationship between relative longitudinal velocity and the DHW. A linear relation is identified by analyzing the data points representing the timestamp at which the challenging vehicle completes its cut-in.

To facilitate the calibration of a suitable following distance for a TJP, so that frequent cut-ins by other challenging vehicles are avoided, we study the following distance when the challenging vehicle finishes cutting in, i.e., when timestamp 𝑇5 is reached, to answer RQ3. By plotting the corresponding relative longitudinal velocity against the DHW, as shown in Fig. 10, we found that the DHW tends to be small if the relative longitudinal velocity is small. Conversely, large DHW values occur with large relative longitudinal velocities. As a result, the following distance shows a linear relationship with the relative longitudinal velocity at timestamp 𝑇5, with a correlation coefficient of 0.72. Many works [62], [63] have pointed out that safety metrics must be calibrated to achieve a tradeoff between safety and efficiency; our findings provide the
opportunity to calibrate those metrics to make motion planning
more human-like.
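Such a linear relation can be recovered with a least-squares fit and the Pearson correlation coefficient; the data below are synthetic illustrations, not the extracted 𝑇5 data points:

```python
import numpy as np

# Synthetic illustration of the T5 data points (not the 90 extracted scenarios)
rng = np.random.default_rng(7)
v_rel = rng.uniform(0.0, 3.0, 90)                   # relative longitudinal velocity [m/s]
dhw = 5.0 + 2.5 * v_rel + rng.normal(0.0, 1.0, 90)  # synthetic distance headway [m]

slope, intercept = np.polyfit(v_rel, dhw, deg=1)    # least-squares line DHW(v_rel)
r = np.corrcoef(v_rel, dhw)[0, 1]                   # Pearson correlation coefficient
print(f"DHW ≈ {slope:.2f} * v_rel + {intercept:.2f}, r = {r:.2f}")
```

A TJP could use such a fitted line to set a relative-velocity-dependent following distance instead of a fixed time gap.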
V. DISCUSSION
We generated the first open-source dataset supporting the design and verification of TJP systems. Unlike the HighD dataset, we recorded the trajectory data in four Chinese cities during peak hours. Most of the relative longitudinal distances are smaller than 10 m in congestion scenarios when a cut-in maneuver begins, which contrasts sharply with the results obtained at high driving speeds [64]. By distinguishing overlap from non-overlap cut-in scenarios, we demonstrated their parameter differences. In overlap cut-in scenarios, the challenging vehicle drives faster than the ego to successfully cut in. An overlap cut-in maneuver usually occurs when the ego is very slow. This is probably because the challenging driver feels safe cutting in under this condition. Meanwhile, the challenging vehicles also show smaller lateral velocities in overlap cut-in scenarios. Due to less available longitudinal space, they gamble with the ego drivers to find out whether the ego driver would allow them to cut in.

To interpret how the ego driver behaves in those overlap cut-in scenarios, we analyzed the driver behavior by answering three questions. We discovered that most drivers evade laterally when the challenging vehicles cut in, which means that most drivers are willing to give way. Moreover, despite negative TTC in overlap cut-in scenarios, most drivers apply braking when the DHW is too small, which matches well with the margin distance described in [65] to maintain safety after braking. This following distance also depends on the relative longitudinal velocity. Their incremental relationship reflects the driving policy of drivers. The code for the analysis is available on GitHub (https://github.com/ADSafetyJointLab/AD4CHE).

Although the dataset provides rich information about driving behavior in congestion scenarios, few critical scenarios are found. A future research direction could be to superimpose the driving environment or behavior of traffic accidents on the overlap cut-in scenarios to generate critical overlap cut-in scenarios. A driver model can also be studied to simulate human driver behavior in congestion scenarios.

VI. CONCLUSION
This paper proposed the first open-source Chinese large-scale naturalistic traffic flow dataset focusing on congestion scenarios. During the processing and analysis of the dataset, we have the following findings: 1) the dataset has very high accuracy, and data such as yaw and lane marking positions, which are usually not provided in other open-source datasets, are also available in our dataset; 2) drivers behave differently in overlap and non-overlap scenarios, and a gaming process exists in overlap cut-in scenarios; 3) ego drivers brake when the relative space is small, and they prefer to keep a large following distance when the relative velocity is high.

Our work can benefit not only the development of TJP systems, especially those targeting the Chinese market, but also the modeling of driver behavior in congestion scenarios. Future work includes expanding the dataset, generating critical scenarios based on reasonably foreseeable parameters, and mining unknown unsafe scenarios to improve the safety of AVs by building an automatic closed loop of "record-process-testing".

VII. REFERENCES
[1] G. Rong et al., "Lgsvl simulator: A high fidelity simulator for autonomous driving," in 2020 IEEE 23rd International Conference on Intelligent Transportation Systems (ITSC), 2020, pp. 1–6.
[2] A. Elmquist and D. Negrut, "Methods and models for simulating autonomous vehicle sensors," IEEE Transactions on Intelligent Vehicles, vol. 5, no. 4, pp. 684–692, 2020.
[3] PEGASUS Project Office, PEGASUS METHOD. [Online]. Available: https://www.pegasusprojekt.de/en/home (accessed: Jun. 20 2021).
[4] SAKURA, SAKURA research project. [Online]. Available: https://www.sakura-prj.go.jp/project_info/ (accessed: Nov. 28 2022).
[5] X. Li, Y. Tian, P. Ye, H. Duan, and F.-Y. Wang, "A Novel Scenarios Engineering Methodology for Foundation Models in Metaverse," IEEE Transactions on Systems, Man, and Cybernetics: Systems, pp. 1–12, 2022, doi: 10.1109/TSMC.2022.3228594.
[6] X. Li, P. Ye, J. Li, Z. Liu, L. Cao, and F.-Y. Wang, "From Features Engineering to Scenarios Engineering for Trustworthy AI: I&I, C&C, and V&V," IEEE Intelligent Systems, vol. 37, no. 4, pp. 18–26, 2022, doi: 10.1109/MIS.2022.3197950.
[7] S. Schoenawa and C. Neurohr, Identification and Analysis of Criticality Phenomena within the GIDAS database. [Online]. Available: https://elib.dlr.de/188717/ (accessed: Aug. 10 2022).
[8] C. Wang, K. Storms, and H. Winner, "Online Safety Assessment of Automated Vehicles Using Silent Testing," IEEE Trans. Intell. Transport. Syst., 2021.
[9] G. Bagschik, T. Menzel, and M. Maurer, "Ontology based scene creation for the development of automated vehicles," in 2018 IEEE Intelligent Vehicles Symposium (IV), 2018, pp. 1813–1820.
[10] R. Liu and X. Zhu, "Driving data distribution of human drivers in urban driving condition," in 2017 IEEE 20th International Conference on Intelligent Transportation Systems (ITSC), 2017, pp. 1–6.
[11] Traffic Analysis Tools: Next Generation Simulation - FHWA Operations. [Online]. Available: https://ops.fhwa.dot.gov/trafficanalysistools/ngsim.htm (accessed: Nov. 7 2022).
[12] R. Krajewski, J. Bock, L. Kloeker, and L. Eckstein, "The highd dataset: A drone dataset of naturalistic vehicle trajectories on german highways for validation of highly automated driving systems," in 2018 21st International Conference on Intelligent Transportation Systems (ITSC), 2018, pp. 2118–2125.
[13] SAE J3016, Taxonomy and Definitions for Terms Related to Driving Automation Systems for On-Road Motor Vehicles. [Online]. Available: https://doi.org/10.4271/J3016_202104
[14] M. Rosenblatt, "Remarks on some nonparametric estimates of a density function," The Annals of Mathematical Statistics, pp. 832–837, 1956.
[15] K. Mattas et al., "Driver models for the definition of safety requirements of automated vehicles in international regulations. Application to motorway driving conditions," Accident Analysis & Prevention, vol. 174, p. 106743, 2022, doi: 10.1016/j.aap.2022.106743.
[16] M. A. Nees, "Safer than the average human driver (who is less safe than me)? Examining a popular safety benchmark for self-driving cars," Journal of Safety Research, vol. 69, pp. 61–68, 2019.
[17] Y. Kang, H. Yin, and C. Berger, "Test your self-driving algorithm: An overview of publicly available driving datasets and virtual testing environments," IEEE Transactions on Intelligent Vehicles, vol. 4, no. 2, pp. 171–185, 2019.
[18] E. Esenturk, S. Khastgir, A. Wallace, and P. Jennings, "Analyzing Real-world Accidents for Test Scenario Generation for Automated Vehicles," in 2021 IEEE Intelligent Vehicles Symposium (IV), 2021, pp. 288–295.
[19] X. Zhang, F. Li, and X. Wu, "CSG: Critical Scenario Generation from Real Traffic Accidents," in 2020 IEEE Intelligent Vehicles Symposium (IV), 2020, pp. 1330–1336.
[20] A. Geiger, P. Lenz, C. Stiller, and R. Urtasun, "Vision meets robotics: The KITTI dataset," The International Journal of Robotics Research, vol. 32, no. 11, pp. 1231–1237, 2013.
[21] X. Huang, P. Wang, X. Cheng, D. Zhou, Q. Geng, and R. Yang, "The apolloscape open dataset for autonomous driving and its application,"
IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 42, no. 10, pp. 2702–2719, 2019.
[22] P. Sun et al., "Scalability in perception for autonomous driving: Waymo open dataset," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020, pp. 2446–2454.
[23] J. Bock, R. Krajewski, T. Moers, S. Runde, L. Vater, and L. Eckstein, "The ind dataset: A drone dataset of naturalistic road user trajectories at german intersections," in 2020 IEEE Intelligent Vehicles Symposium (IV), 2020, pp. 1929–1934.
[24] R. Krajewski, T. Moers, J. Bock, L. Vater, and L. Eckstein, "The round dataset: A drone dataset of road user trajectories at roundabouts in germany," in 2020 IEEE 23rd International Conference on Intelligent Transportation Systems (ITSC), 2020, pp. 1–6.
[25] T. Moers, L. Vater, R. Krajewski, J. Bock, A. Zlocki, and L. Eckstein, "The exiD Dataset: A Real-World Trajectory Dataset of Highly Interactive Highway Scenarios in Germany," in 2022 IEEE Intelligent Vehicles Symposium (IV), 2022, pp. 958–964.
[26] W. Zhan et al., "Interaction dataset: An international, adversarial and cooperative motion dataset in interactive driving scenarios with semantic maps," arXiv preprint arXiv:1910.03088, 2019.
[27] A. Robicquet, A. Sadeghian, A. Alahi, and S. Savarese, "Learning social etiquette: Human trajectory understanding in crowded scenes," in European Conference on Computer Vision, 2016, pp. 549–565.
[28] D. Gloudemans et al., "Interstate-24 motion: Closing the loop on smart mobility," in 2020 IEEE Workshop on Design Automation for CPS and IoT (DESTION), 2020, pp. 49–55.
[29] Y. Xu et al., "SIND: A Drone Dataset at Signalized Intersection in China," in 2022 IEEE 25th International Conference on Intelligent Transportation Systems (ITSC), 2022, pp. 2471–2478.
[30] P. Spannaus, P. Zechel, and K. Lenz, "AUTOMATUM DATA: Drone-based highway dataset for the development and validation of automated driving software for research and commercial applications," in 2021 IEEE Intelligent Vehicles Symposium (IV), 2021, pp. 1372–1377.
[31] E. Barmpounakis and N. Geroliminis, "On the new era of urban traffic monitoring with massive drone data: The pNEUMA large-scale field experiment," Transportation Research Part C: Emerging Technologies, vol. 111, pp. 50–71, 2020.
[32] T. Menzel, G. Bagschik, and M. Maurer, "Scenarios for development, test and validation of automated vehicles," in 2018 IEEE Intelligent Vehicles Symposium (IV), 2018, pp. 1821–1827.
[33] E. de Gelder et al., "Towards an ontology for scenario definition for the assessment of automated vehicles: An object-oriented framework," IEEE Transactions on Intelligent Vehicles, vol. 7, no. 2, pp. 300–314, 2022.
[34] G. Bagschik, T. Menzel, and M. Maurer, "Ontology based scene creation for the development of automated vehicles," in 2018 IEEE Intelligent Vehicles Symposium (IV), 2018, pp. 1813–1820.
[35] E. de Gelder et al., "Scenario Parameter Generation Method and Scenario Representativeness Metric for Scenario-Based Assessment of Automated Vehicles," IEEE Trans. Intell. Transport. Syst., vol. 23, no. 10, pp. 18794–18807, 2022, doi: 10.1109/TITS.2022.3154774.
[36] D. Karunakaran, J. S. Berrio, S. Worrall, and E. Nebot, "Parameterisation of lane-change scenarios from real-world data," in 2022 IEEE 25th International Conference on Intelligent Transportation Systems (ITSC), 2022, pp. 2607–2613.
[37] B. Zhu, P. Zhang, J. Zhao, and W. Deng, "Hazardous Scenario Enhanced Generation for Automated Vehicle Testing Based on Optimization Searching Method," IEEE Trans. Intell. Transport. Syst., vol. 23, no. 7, pp. 7321–7331, 2022, doi: 10.1109/TITS.2021.3068784.
[38] A. Zlocki et al., "Logical Scenarios Parameterization for Automated Vehicle Safety Assessment: Comparison of Deceleration and Cut-in Scenarios from Japanese and German Highways," IEEE Access, 2022.
[39] A. Corso, P. Du, K. Driggs-Campbell, and M. J. Kochenderfer, "Adaptive stress testing with reward augmentation for autonomous vehicle validation," in 2019 IEEE Intelligent Transportation Systems Conference (ITSC), 2019, pp. 163–168.
[40] D. Nalic, H. Li, A. Eichberger, C. Wellershaus, A. Pandurevic, and B. Rogic, "Stress Testing Method for Scenario-Based Testing of Automated Driving Systems," IEEE Access, 2020.
[41] E. de Gelder, E. Cator, J.-P. Paardekooper, O. O. den Camp, and B. de Schutter, "Constrained sampling from a kernel density estimator to generate scenarios for the assessment of automated vehicles," in 2021 IEEE Intelligent Vehicles Symposium Workshops (IV Workshops), 2021, pp. 203–208.
[42]
[43]
[44]
[45]
[46]
[47]
[48]
[49]
[50]
[51]
[52]
[53]
[54]
[55]
[56]
[57]
[58]
[59]
[60]
[61]
Y. Xu, Y. Zou, and J. Sun, “Accelerated testing for automated vehicles
safety evaluation in cut-in scenarios based on importance sampling,
genetic algorithm and simulation applications,” Journal of intelligent
and connected vehicles, vol. 1, no. 1, pp. 28–38, 2018.
[43] Z. Huang, H. Lam, D. J. LeBlanc, and D. Zhao, “Accelerated Evaluation of Automated Vehicles Using Piecewise Mixture Models,” IEEE Trans. Intell. Transport. Syst., vol. 19, no. 9, pp. 2845–2855, 2018, doi: 10.1109/TITS.2017.2766172.
[44] S. Jesenski, N. Tiemann, J. E. Stellet, and J. M. Zöllner, “Scalable generation of statistical evidence for the safety of automated vehicles by the use of importance sampling,” in 2020 IEEE 23rd International Conference on Intelligent Transportation Systems (ITSC), 2020, pp. 1–8.
[45] Y. Akagi, R. Kato, S. Kitajima, J. Antona-Makoshi, and N. Uchida, “A risk-index based sampling method to generate scenarios for the evaluation of automated driving vehicle safety,” in 2019 IEEE Intelligent Transportation Systems Conference (ITSC), 2019, pp. 667–672.
[46] C. Xu, W. Zhao, C. Wang, T. Cui, and C. Lv, “Driving Behavior Modeling and Characteristic Learning for Human-Like Decision-Making in Highway,” IEEE Transactions on Intelligent Vehicles, pp. 1–12, 2022, doi: 10.1109/TIV.2022.3224912.
[47] L. Wang, C. Fernandez, and C. Stiller, “High-Level Decision Making for Automated Highway Driving via Behavior Cloning,” IEEE Transactions on Intelligent Vehicles, 2022.
[48] B. Yang, X. Song, Z. Gao, and N. Zhu, “Trajectory planning for vehicle collision avoidance imitating driver behavior,” Proceedings of the Institution of Mechanical Engineers, Part D: Journal of Automobile Engineering, vol. 236, no. 5, pp. 907–926, 2022.
[49] Mobileye, Responsibility-Sensitive Safety (RSS): A Model for Safe Autonomous Driving. [Online]. Available: https://www.mobileye.com/responsibility-sensitive-safety/ (accessed: Jun. 19, 2021).
[50] B. Gassmann, F. Oboril, I. Alvarez, and K.-U. Scholl, “An Online Safety Guard For Intelligent Transportation Systems,” in 2021 IEEE International Intelligent Transportation Systems Conference (ITSC), 2021, pp. 2575–2581.
[51] N. Weber, D. Frerichs, and U. Eberle, “A simulation-based, statistical approach for the derivation of concrete scenarios for the release of highly automated driving functions,” in AmE 2020 – Automotive meets Electronics; 11th GMM-Symposium, 2020, pp. 1–6.
[52] UNECE, UN Regulation No. 157 – Automated Lane Keeping Systems (ALKS). [Online]. Available: https://undocs.org/ECE/TRANS/WP.29/2020/81 (accessed: Nov. 29, 2022).
[53] A. Fries, F. Fahrenkrog, K. Donauer, M. Mai, and F. Raisch, “Driver Behavior Model for the Safety Assessment of Automated Driving,” in 2022 IEEE Intelligent Vehicles Symposium (IV), 2022, pp. 1669–1674.
[54] L. Duan and F. Chen, “The future of advanced driving assistance system development in China,” in Proceedings of 2011 IEEE International Conference on Vehicular Electronics and Safety, 2011, pp. 238–243.
[55] M. Masmoudi, H. Ghazzai, M. Frikha, and Y. Massoud, “Object Detection Learning Techniques for Autonomous Vehicle Applications,” in 2019 IEEE International Conference on Vehicular Electronics and Safety (ICVES), 2019, pp. 1–5.
[56] A. H. Lang, S. Vora, H. Caesar, L. Zhou, J. Yang, and O. Beijbom, “PointPillars: Fast encoders for object detection from point clouds,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019, pp. 12697–12705.
[57] H. Rezatofighi, N. Tsoi, J. Gwak, A. Sadeghian, I. Reid, and S. Savarese, “Generalized intersection over union: A metric and a loss for bounding box regression,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019, pp. 658–666.
[58] Y. Huang, F. Zhu, G. Jia, and Y. Zhang, “A Slide Window Variational Adaptive Kalman Filter,” IEEE Transactions on Circuits and Systems II: Express Briefs, vol. 67, no. 12, pp. 3552–3556, 2020, doi: 10.1109/TCSII.2020.2995714.
[59] ASAM, OpenSCENARIO. [Online]. Available: https://www.asam.net/standards/detail/openscenario/ (accessed: May 10, 2021).
[60] R. Song and B. Li, “Surrounding vehicles’ lane change maneuver prediction and detection for intelligent vehicles: A comprehensive review,” IEEE Trans. Intell. Transport. Syst., 2021.
[61] S. Liu et al., “Calibration and evaluation of responsibility-sensitive safety (RSS) in automated vehicle performance during cut-in scenarios,” Transportation Research Part C: Emerging Technologies, vol. 125, p. 103037, 2021.
[62] N. Nadimi, H. Behbahani, and H. Shahbazi, “Calibration and validation of a new time-based surrogate safety measure using fuzzy inference system,” Journal of Traffic and Transportation Engineering (English Edition), vol. 3, no. 1, pp. 51–58, 2016.
[63] A. Rodionova, I. Alvarez, M. S. Elli, F. Oboril, J. Quast, and R. Mangharam, “How safe is safe enough? Automatic Safety Constraints Boundary Estimation for Decision-Making in Automated Vehicles,” in 2020 IEEE Intelligent Vehicles Symposium (IV), 2020, pp. 1457–1464.
[64] H. Nakamura et al., “Defining reasonably foreseeable parameter ranges using real-world traffic data for scenario-based safety assessment of automated vehicles,” IEEE Access, vol. 10, pp. 37743–37760, 2022.
[65] C. Wang, C. Popp, and H. Winner, “Acceleration-Based Collision Criticality Metric for Holistic Online Safety Assessment in Automated Driving,” IEEE Access, vol. 10, pp. 70662–70674, 2022.
Yuxin Zhang (Member, IEEE) received the joint Ph.D. degree in Vehicle Engineering from Jilin University, China, and UC Berkeley, USA, in 2016. From 2016 to 2019, he worked as a postdoctoral researcher in Systems Engineering at Jilin University, China, and also as a Safety Researcher at UISEE, Beijing, China. Since 2019, he has worked as an Associate Professor at the State Key Laboratory of Automotive Simulation and Control, Jilin University, China, and also as a Safety Researcher at DJI Automotive. His main research interests include automated driving systems safety engineering, functional safety, and safety of the intended functionality.
Prof. Zhang has served as Chair of the SAE International Automated Driving FuSa and SOTIF Seminar since 2021, as an Expert Member of the SAE International Automated Driving Safety Technical Committee since 2020, and, since 2019, as a Technical Representative for the Scenario-based Automated Driving Safety Standard ISO 34502, the Functional Safety and SOTIF Standards ISO 26262 and ISO 21448, the Safety for the Evaluation of Autonomous Products Standard UL 4600, and the STPA Recommended Practices Standard SAE J3187. He has also been a Functional Safety Professional certified by TÜV SÜD, Germany, since 2018.
Cheng Wang received the B.Sc. degree from the School of Automotive Engineering at Wuhan University of Technology, Wuhan, China, in 2014, the M.Sc. degree from the School of Automotive Studies at Tongji University, Shanghai, China, in 2017, and the Ph.D. degree from the Institute of Automotive Engineering at the Technical University of Darmstadt, Darmstadt, Germany, in 2021. He currently works as a research associate at the University of Edinburgh. His research interests include explainable AI and the safety verification and validation of autonomous vehicles.
Ruilin Yu received the B.S. degree in vehicle engineering from Hefei University of Technology, Hefei, China, in 2021, and is currently pursuing a master's degree at Jilin University. His research interests include vehicle trajectory data analysis, SOTIF, and the perception systems of automated vehicles.
Luyao Wang received the B.S. degree in vehicle engineering from Xi'an University of Technology, Xi'an, China, in 2021. He is currently a graduate student majoring in vehicle engineering at the State Key Laboratory of Automotive Simulation and Control, Jilin University. His research interests include automated driving safety and planning.
Wei Quan received the B.Sc. degree in automobile engineering as a double degree from Tongji University and the University of Applied Sciences for Engineering and Economics Berlin, Germany. He then received the M.Sc. degree in Automotive and Engine Engineering from the University of Stuttgart, Stuttgart, Germany. During his studies, he worked on the detection and tracking project for the ARS441 (Advanced Radar Sensor) at Continental AG, on a research project on machine learning optimization processes at the University of Stuttgart, and on an ultrasonic sensor modeling project at Robert Bosch GmbH. After his studies, he joined DJI Automotive and took part in the AD4CHE (Aerial Dataset for China's Congested Highways & Expressways) project.
Yang Gao received the M.Sc. degree from
Huazhong University of Science and
Technology (HUST), Wuhan, China, in
2019. He is now a member of DJI
Automotive. His research interests include
deep learning and monocular SLAM.
Xiaofei Li received the M.Sc. degree in automotive engineering from Chang'an University, Xi'an, China, in 2019. In the same year, he joined DJI Automotive, where he has been engaged in work related to the Safety of the Intended Functionality (SOTIF). His main research interests include the application of the SOTIF methodology based on the V-model in the development process of Autonomous Driving Systems (ADS), the safety analysis of planning and control algorithms and the improvement of algorithm performance, and the scenario-based testing methodology used to complete the verification and validation of ADS.