A Car-Following-Based Method for Vehicle Trajectory Connection

Xiaowei Shi, Dongfang Zhao, Xiaopeng Li*

Department of Civil and Environmental Engineering, University of South Florida, Tampa, FL, 33620, USA
Abstract

High-accuracy, long-coverage vehicle trajectory data can benefit investigations of various traffic phenomena. However, most existing vehicle trajectory datasets miss parts of the trajectories due to sensing limitations and thus contain substantial broken vehicle trajectories. This may restrict analyses of traffic dynamics and the validation of corresponding findings. To address this issue, this paper proposes a car-following-based (CF-based) vehicle trajectory connection method that connects broken vehicle trajectories. The proposed method can not only fill missing data points caused by detection errors but also connect vehicle trajectory datasets from different sensors. To illustrate the performance of the proposed CF-based method, it was employed to process a series of vehicle trajectory datasets extracted from aerial videos recorded over several successive segments of Interstate-75, United States. Compared with several benchmark trajectory connection methods, the results show that the proposed method has advantages in both trajectory connection accuracy and trajectory consistency. To the best of the authors' knowledge, the dataset processed by the proposed CF-based method, named HIGH-SIM, is the longest vehicle trajectory dataset in the literature. The dataset has been published online for public use.
Keywords: Car-Following Model, Vehicle Kinematics, Vehicle Trajectory Connection, Vehicle Trajectory Dataset
1. Introduction
Vehicle trajectories, the positions of a stream of vehicles over time along a guideway (Daganzo, 1997), can provide informative insights for various traffic-related studies, such as traffic flow theory, traffic simulation modeling, traffic safety measures, and traffic management. Kim and Cao (2010) classified existing vehicle trajectory data collection methods into two categories: vehicle-based methods (Anuar and Cetin, 2017; Coifman et al., 2016; Victor, 2014; Zhao et al., 2017) and video-based methods (Babinec et al., 2014; Kim et al., 2019; Xu et al., 2017; Zhao and Li, 2019). Vehicle-based methods collect vehicle trajectory data with probe vehicles. Probe vehicles equipped with position/distance measurement sensors (e.g., Lidar, Radar, GPS) travel along the test road segment, and thus the trajectories of the probe vehicles, as well as the surrounding vehicles, can be obtained. The drawback of vehicle-based methods is apparent: since only the trajectories of the probe vehicle and its surrounding vehicles are collected, the data penetration rate with respect to the entire traffic is often very low. Video-based methods, on the other hand, extract vehicle trajectories from traffic videos recorded by roadside or aerial cameras over the investigated road segments. By tracking each vehicle's motion in the videos, the trajectories of all vehicles traveling along the road segments can be obtained.

Due to the rapid development of aerial video recording technologies, e.g., unmanned aerial vehicles with high-definition cameras, video-based trajectory data are becoming more appealing, with advantages such as scalability, flexibility, economy, and unbiasedness (Kim and Cao, 2010). Thus, in recent years, the collection of video-based trajectory data has attracted wide attention from researchers in both industry and academia (NGSIM, 2006; Apeltauer et al., 2015; Azevedo et al., 2014; Barmpounakis and Geroliminis, 2020; Chen et al., 2020; Krajewski et al., 2018; Punzo et al., 2011; Shi et al., 2021). Despite these merits of video-based data collection with advanced technologies, we would like to point out two types of fundamental issues in such datasets: detection errors and limited ranges, as specified below.
Detection errors can be further divided into two categories in terms of their origin: source errors and extraction errors. Source errors are caused by partial loss of the video feed. Since the videos are recorded from the air, the target sites may be partially blocked by facilities (e.g., bridges, signals, billboards, buildings) around the road, and thus the vehicle motions in the blocked areas are lost. As illustrated in Figure 1, a bridge across the recorded road segment leads to the loss of the trajectories of all vehicles underneath.
Figure 1. Source errors illustration.
Extraction errors are caused by trajectory extraction methods. To obtain trajectory data from video sources, various trajectory extraction methods have been proposed in the literature to track each vehicle's motion in videos. Reliable vehicle tracking is an extremely challenging problem that falls into the computer vision field and has attracted intensive study in the past few years (Jazayeri et al., 2011; Wang et al., 2008; Zhang et al., 2007). Although plenty of high-performance methods have been proposed to tackle this problem, as far as we know, none of them can guarantee a 100% detection rate. In particular, exogenous factors such as weather, light, wind, and camera angle may degrade the quality of the recorded video as well as the detection rates. Thus, missing detections in the trajectory extraction process seem inevitable. Once a missing detection happens, the original long trajectory is broken into shorter trajectories at the missing points, which degrades the quality of the obtained dataset. As illustrated in Figure 2, due to the camera angles, vehicle sizes in the video gradually shrink as vehicles approach the end of the road segment. This decreases the detection rates, and thus the extracted trajectories are broken into small pieces in the downstream.
Figure 2. Extraction errors illustration.
To overcome the detection errors issue, data post-processing studies on trajectory smoothing have been conducted (Lee and Krumm, 2011; Punzo, 2009; Siddique, 2019; Wu, 2018; Xin, 2008). Interested readers can refer to Lee and Krumm (2011) for a detailed review of vehicle trajectory post-processing. Studies investigating vehicle trajectory connection, however, are scarce in the literature. Kim et al. (2019) connected broken trajectories by extending the trajectory of each vehicle for three seconds at a constant speed. Tong et al. (2017) connected broken trajectories by extending the linear interpolation method considering historical data and contextual arrival information. Zhang and Jin (2019) performed vehicle trajectory data cleaning and connection (or stitching) while extracting the data, but they did not specify the adopted trajectory connection method. One can see that although this topic has started to draw attention in recent years with the emergence of aerial video recording technologies, the existing methods in the literature are still fairly simple and may only yield results of limited quality, since they do not capture driving behavior and physics (i.e., car-following behavior). To the best of the authors' knowledge, Sazara et al. (2017) is the only study that connects broken vehicle trajectories considering car-following characteristics. They collected raw vehicle trajectory data by the vehicle-based method using Lidar sensors. To connect the broken trajectories, they first extended the broken trajectories with Gipps' car-following model, and then a simple reshaping operation was proposed to connect the two pieces of each broken trajectory. Despite the success of this study, we would like to point out a strong underlying assumption: the vehicle IDs of the broken trajectories are given, and thus the pieces can be easily matched for each individual vehicle. This assumption is easily satisfied if the data are collected by the vehicle-based method (e.g., Lidar, Radar). However, how to efficiently match two broken trajectories without explicit vehicle ID tags when the extracted dataset is large (as collected by video-based methods) is an intriguing problem. The problem becomes even harder if the recorded videos include multiple lanes, in which broken vehicle trajectories across different lanes may belong to the same vehicle due to possible lane change maneuvers. Figure 3 illustrates the described lane change maneuvers on a simple two-lane road. Figure 3 (a) and (b) show the trajectories from lane 1 and lane 2, respectively. If a vehicle makes a lane change from lane 1 to lane 2, as highlighted in both Figure 3 (a) and (b), the trajectory of this vehicle in each lane is "incomplete". A wrong connection may happen if a broken trajectory is coincidentally near the "incomplete" trajectory, as illustrated in Figure 3 (b). In addition, without further investigation of feasible ranges for vehicle trajectories in the space-time diagram, the reshaping operation may easily generate trajectories that violate vehicle kinematic constraints.
Figure 3. Lane change scenario illustration: (a) lane 1; (b) lane 2 (labels in the figure mark the detection error and the lane changes).
The second fundamental issue that may exist in the data is limited ranges. Compared to vehicle-based collection methods, video-based collection methods have advantages in data collection scale (e.g., the number of collected trajectories). However, the detection ranges (e.g., the length of vehicle trajectories) are greatly constrained by both the detection accuracy and the altitude of the cameras. That is, cameras at a higher altitude can record traffic video over a greater range, yet the requirement on the detection accuracy of the cameras becomes higher. Table 1 lists the lengths of several video-based trajectory datasets reported in the literature. The longest trajectory dataset is the NGSIM dataset, with a length of 640 meters. However, this is still insufficient to observe a full life cycle of traffic phenomena (e.g., traffic bottleneck development and dissipation). To break the constraints on the detection ranges, a direct solution is to either improve the performance of the cameras (detection accuracy) or fly the cameras at a higher altitude. However, these solutions belong to the optics or mechanical field and are out of the knowledge scope of a transportation engineer. To resolve this issue from the transportation engineering perspective, Raju et al. (2021) proposed the concept of stitching trajectory data from different cameras. However, no method for reliably stitching videos from different cameras has been proposed following this seminal concept.
Table 1. Trajectory length comparisons among video-based trajectory datasets.

Publication                              Location                               Length
NGSIM (2006)                             Highways, United States                640 meters
Krajewski et al. (2018)                  Highways, Germany                      420 meters
Azevedo et al. (2014)                    Motorway, Portugal                     500 meters
Kim et al. (2019)                        Expressways, Korea                     188 meters
Babinec et al. (2014)                    Ring road, Czech Republic              300 meters
Xu et al. (2017)                         Freeway and urban roads, China         160 meters
Barmpounakis and Geroliminis (2020)      Urban roads, City of Athens, Greece    350 meters
Our dataset: HIGH-SIM                    Highways, United States                2,438 meters
Overall, one can see that the existing video-based datasets are constrained by the aforementioned two issues, which may restrict the analyses of traffic dynamics and the validations of corresponding findings. To help circumvent these issues, this paper proposes a car-following-based (CF-based) method for vehicle trajectory connection, in which broken vehicle trajectories are connected based on car-following theory. The proposed method can not only fill missing data points caused by detection errors (solving the detection errors issue) but also connect trajectory data from different sensors (solving the limited ranges issue). To illustrate the performance of the proposed CF-based method, we processed a series of vehicle trajectory datasets extracted from aerial videos recorded at several successive segments of Interstate-75 (28°08'37.2"N 82°22'58.8"W to 28°10'16.2"N 82°23'38.0"W), United States (see Shi et al. (2021) for the detailed trajectory extraction method). The results show that the proposed method outperforms several benchmark methods in both trajectory connection accuracy and trajectory consistency. Moreover, the dataset processed by the proposed CF-based method, named HIGH-SIM, has been published online by both the Federal Highway Administration, U.S. Department of Transportation, and the Connected and Autonomous Transportation Systems Lab, University of South Florida, for public use. To the best of the authors' knowledge, the HIGH-SIM dataset is the longest high-resolution vehicle trajectory dataset capturing all vehicles in the traffic stream among the publicly available ones, and it includes a full life cycle of a bottleneck. This paper focuses on the trajectory connection method we adopted to generate the dataset; interested readers can refer to Shi et al. (2021) for a detailed introduction to the HIGH-SIM dataset.
The remainder of this paper is organized as follows. Section 2 describes the trajectory connection problem investigated in this paper. Section 3 presents the proposed CF-based vehicle trajectory connection method. A series of numerical experiments is conducted in Section 4 to demonstrate the performance of the proposed method. Section 5 concludes the paper and discusses the limitations of the proposed method and possible solutions.
2. Problem Statement
The investigated trajectory connection problem is stated as follows. Let $\mathcal{F}$ denote the set of vehicle trajectories in the investigated dataset, and each trajectory is labeled as $f \in \{1, 2, \dots, F\}$. The trajectory data are captured in a spatial range $[0, L]$ within a continuous time period $[0, T]$. In practice, the data are usually only available at discrete time points, and thus the time period is discretized into time points $\mathcal{T} := \{0, 1, 2, \dots, K\}$ with time interval $\Delta := T/K$. Each trajectory $f$ is defined as the composition of a pair of arrays $(t_f, x_f)$, with time point array $t_f = [t_f^1, t_f^2, \dots, t_f^{N_f}]$ denoting the consecutive time points and location array $x_f = [x_f^1, x_f^2, \dots, x_f^{N_f}]$ denoting the corresponding location coordinates (e.g., mileposts) of the trajectory at these time points, where $N_f$ is the total number of data points in trajectory $f$. In addition, $v_f := [v_f^1, \dots, v_f^{N_f}]$, $a_f := [a_f^1, \dots, a_f^{N_f}]$, and $p_f := [p_f^1, \dots, p_f^{N_f}]$ denote the velocities, accelerations, and preceding trajectory's labels of trajectory $f$ at the corresponding time points, respectively. If there is no trajectory preceding trajectory $f$ at time $t_f^i$, then $p_f^i$ is set to 0. The length of the vehicle associated with trajectory $f$ is denoted by $l_f$.

It is expected that a great number of trajectories in dataset $\mathcal{F}$ only capture a portion of the subject vehicles' motions due to the aforementioned issues. These trajectories, referred to as broken trajectories, can be identified if they satisfy either of the following conditions: (1) $t_f^1 > 0$ and $x_f^1 > 0$; or (2) $t_f^{N_f} < T$ and $x_f^{N_f} < L$. We collect all broken trajectories from $\mathcal{F}$ and denote the broken trajectory dataset as $\mathcal{B}$. To indicate the missing segment of each broken trajectory, according to the conditions, we further classify the broken trajectory dataset into three subsets, $\mathcal{B}^{\mathrm{O}}$, $\mathcal{B}^{\mathrm{E}}$, and $\mathcal{B}^{\mathrm{OE}}$. Broken trajectories that satisfy condition (1) are stored in $\mathcal{B}^{\mathrm{O}}$, which means that these trajectories are broken at the origin side (i.e., a segment before $(t_f^1, x_f^1)$ is missing). Broken trajectories that satisfy condition (2) are stored in $\mathcal{B}^{\mathrm{E}}$, which means that these trajectories are broken at the end side (i.e., a segment after $(t_f^{N_f}, x_f^{N_f})$ is missing). Broken trajectories that satisfy both conditions (1) and (2) are stored in $\mathcal{B}^{\mathrm{OE}}$, which means that these trajectories are broken at both sides. Thus, we have $\mathcal{B}^{\mathrm{O}} \cup \mathcal{B}^{\mathrm{E}} = \mathcal{B}$ and $\mathcal{B}^{\mathrm{OE}} = \mathcal{B}^{\mathrm{O}} \cap \mathcal{B}^{\mathrm{E}}$. For the specific example shown in Figure 4, $\mathcal{F} = \{1, 2, \dots, 8\}$, $\mathcal{B} = \{2, 3, 5, 6, 7\}$, $\mathcal{B}^{\mathrm{O}} = \{3, 6, 7\}$, $\mathcal{B}^{\mathrm{E}} = \{2, 5, 6\}$, and $\mathcal{B}^{\mathrm{OE}} = \{6\}$.
Figure 4. Problem statement.
This paper aims to connect these broken trajectories considering vehicle driving characteristics (i.e., car-following behavior) and thus enhance the quality of the dataset. To this end, this paper proposes a CF-based method for vehicle trajectory connection, in which the broken vehicle trajectories are connected based on car-following theory.
3. Methodology

The proposed CF-based vehicle trajectory connection method includes two steps: (1) car-following model calibration; and (2) vehicle trajectory connection.

3.1 Car-following model calibration

To connect the broken vehicle trajectories considering car-following behavior, we first calibrate a car-following model with the dataset. The car-following model adopted in this paper is the Pitt car-following model (Drew, 1968), shown in Equation (1):
$$s = l_l + 3.04878 + k v_f + b k (v_f - v_l)^2, \quad (1)$$

where $s$ is the spacing headway between the leading vehicle and following vehicle, $l_l$ is the length of the leading vehicle, $k$ is a sensitivity factor, $v_f$ is the speed of the following vehicle, $v_l$ is the speed of the leading vehicle, and $b$ is a calibration constant (if $v_f > v_l$, $b = 0.1$; otherwise, $b = 0$). Note that this paper focuses on connecting the broken trajectories considering car-following behavior; we select the Pitt car-following model without further comparing it to other car-following models. The comparison among different car-following models is out of the scope of this paper, and interested readers can refer to van Hinsbergen et al. (2015) and Rahman (2013) for detailed comparisons.
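As a concrete reading of Equation (1), the desired spacing can be computed as follows; the function name and the metric units (meters, m/s) are illustrative, and the constant 3.04878 m corresponds to the Pitt model's fixed 10-ft buffer.

```python
PITT_BUFFER_M = 3.04878  # the Pitt model's fixed 10-ft buffer, in meters

def pitt_spacing(l_lead: float, v_follow: float, v_lead: float, k: float) -> float:
    """Desired spacing headway of the Pitt car-following model, Eq. (1).
    The quadratic term switches on (b = 0.1) only when the follower is
    faster than the leader; otherwise b = 0."""
    b = 0.1 if v_follow > v_lead else 0.0
    return l_lead + PITT_BUFFER_M + k * v_follow + b * k * (v_follow - v_lead) ** 2
```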
With the Pitt car-following model, a series of car-following trajectory pairs, including the time ($t$), location ($x$), and speed ($v$) of both leading and following vehicles, is extracted from the dataset to calibrate the model. By referring to the preceding trajectory label array $p_f$, the car-following trajectory pairs can be obtained easily: whenever $p_f^i \neq 0$, we extract $(t_f^i, x_f^i, v_f^i)$ of the follower together with the corresponding states of its leader. With the car-following trajectory pairs, we calibrate the Pitt car-following model with a greedy algorithm. The error measurement function in Equation (2) is used to evaluate the fitness of the Pitt car-following model:

$$\varepsilon = \sqrt{\frac{1}{J} \sum_{j=1}^{J} \frac{1}{N_j} \sum_{i=1}^{N_j} \left(\hat{x}_i^j - x_i^j\right)^2}, \quad (2)$$

where $\varepsilon$ is the standard error between the estimated locations ($\hat{x}$) and the real locations ($x$), $J$ is the total number of car-following trajectory pairs, and $N_j$ is the number of data points captured for trajectory pair $j$.
The detailed calibration procedure is shown in Figure 5. The calibration starts by initializing the model parameters $b$ and $k$ with random values in $[0, 0.1]$ and $[0, 2]$, respectively, and inputting the convergence criterion $\bar{\varepsilon}$ and step sizes $\Delta_k, \Delta_b$. In each iteration $i$, the parameters are updated as $k_{i+1} = \mathrm{rand}(0,1) \cdot \Delta_k + k_i$ and $b_{i+1} = \mathrm{rand}(0,1) \cdot \Delta_b + b_i$, where $\mathrm{rand}(0,1)$ is a random number in the range $(0,1)$. In the next step, the fitting error is calculated according to Equation (2). These procedures are repeated until the variation of the estimation error satisfies $|\varepsilon_{i+1} - \varepsilon_i| \leq \bar{\varepsilon}$, at which point the optimal parameters $k^*, b^*$ are obtained.
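The greedy random-step loop of Figure 5 can be sketched generically as below; where the extracted text is ambiguous, the symmetric perturbation and the accept-only-if-improved rule are our assumptions.

```python
import random
from typing import Callable, List, Tuple

def calibrate_greedy(error_fn: Callable[[List[float]], float],
                     x0: List[float], step: List[float],
                     tol: float = 1e-4, max_iter: int = 2000,
                     seed: int = 0) -> Tuple[List[float], float]:
    """Greedy random-walk calibration: perturb the parameters by a random
    fraction of each step size, keep the move only if the fitting error
    improves, and stop once the improvement falls below tol."""
    rng = random.Random(seed)
    x, err = list(x0), error_fn(x0)
    for _ in range(max_iter):
        cand = [xi + rng.uniform(-1.0, 1.0) * si for xi, si in zip(x, step)]
        cand_err = error_fn(cand)
        if cand_err < err:
            improved = err - cand_err
            x, err = cand, cand_err
            if improved < tol:
                break
    return x, err
```

With `error_fn` set to Equation (2) evaluated over the extracted car-following pairs, the returned parameters play the role of the calibrated $k$ and $b$.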
Figure 5. Pitt car-following model calibration process.
3.2 Vehicle trajectory connection

With the calibrated Pitt car-following model, the proposed CF-based vehicle trajectory connection method connects the broken trajectories in two sub-steps. Sub-step 1: extending the broken vehicle trajectories with the calibrated car-following model. Sub-step 2: connecting the broken trajectories considering vehicle kinematics constraints.
Sub-step 1:

For broken trajectories in $\mathcal{B}^{\mathrm{O}}$ (or $\mathcal{B}^{\mathrm{E}}$), the trajectories are extended backward (or forward) $n$ time intervals with the calibrated Pitt car-following model. We name the extended trajectories the transition trajectories. The time length of the transition trajectories, $T_t = n\Delta$, is a given value dependent on the quality of dataset $\mathcal{F}$: a large value is set if the time length of the missing trajectories is long, and vice versa. The effects of different $T_t$ values on the trajectory connection are analyzed in Section 4. Assume that there are two broken trajectories labeled by $f \in \mathcal{B}^{\mathrm{O}}$ and $f' \in \mathcal{B}^{\mathrm{E}}$, and their preceding vehicle trajectories are labeled by $p = p_f^1$ and $p' = p_{f'}^{N_{f'}}$, with $x_p(\cdot)$ and $v_p(\cdot)$ returning the preceding trajectory's location and speed at a given time point. The arrays of the transition trajectory for trajectory $f$ (i.e., $t_f^{\mathrm{b}}, x_f^{\mathrm{b}}, v_f^{\mathrm{b}}, p_f^{\mathrm{b}}$) are generated according to Equations (3)-(8), where superscript $\mathrm{b}$ indicates that the trajectory extends backward:

$$t_{f,i}^{\mathrm{b}} = t_f^1 - (n + 1 - i)\Delta, \quad \forall i \in \{1, 2, \dots, n\}, \quad (3)$$
$$x_{f,i}^{\mathrm{b}} = x_p(t_{f,i}^{\mathrm{b}}) - s_{f,i}^{\mathrm{b}}, \quad \forall i \in \{1, 2, \dots, n\}, \quad (4)$$
$$v_{f,n}^{\mathrm{b}} = v_f^1, \quad (5)$$
$$s_{f,i}^{\mathrm{b}} = l_p + 3.04878 + k v_{f,i}^{\mathrm{b}} + b k \left(v_{f,i}^{\mathrm{b}} - v_p(t_{f,i}^{\mathrm{b}})\right)^2, \quad \forall i \in \{1, 2, \dots, n\}, \quad (6)$$
$$v_{f,i}^{\mathrm{b}} = \left(x_{f,i+1}^{\mathrm{b}} - x_{f,i}^{\mathrm{b}}\right)/\Delta, \quad \forall i \in \{1, 2, \dots, n-1\}, \quad (7)$$
$$p_{f,i}^{\mathrm{b}} = p_f^1, \quad \forall i \in \{1, 2, \dots, n\}. \quad (8)$$
Similarly, the arrays of the transition trajectory for trajectory $f'$ (i.e., $t_{f'}^{\mathrm{e}}, x_{f'}^{\mathrm{e}}, v_{f'}^{\mathrm{e}}, p_{f'}^{\mathrm{e}}$) can be calculated according to Equations (9)-(14), where superscript $\mathrm{e}$ indicates that the trajectory extends forward:

$$t_{f',i}^{\mathrm{e}} = t_{f'}^{N_{f'}} + i\Delta, \quad \forall i \in \{1, 2, \dots, n\}, \quad (9)$$
$$x_{f',i}^{\mathrm{e}} = x_{p'}(t_{f',i}^{\mathrm{e}}) - s_{f',i}^{\mathrm{e}}, \quad \forall i \in \{1, 2, \dots, n\}, \quad (10)$$
$$v_{f',1}^{\mathrm{e}} = v_{f'}^{N_{f'}}, \quad (11)$$
$$s_{f',i}^{\mathrm{e}} = l_{p'} + 3.04878 + k v_{f',i}^{\mathrm{e}} + b k \left(v_{f',i}^{\mathrm{e}} - v_{p'}(t_{f',i}^{\mathrm{e}})\right)^2, \quad \forall i \in \{1, 2, \dots, n\}, \quad (12)$$
$$v_{f',i}^{\mathrm{e}} = \left(x_{f',i}^{\mathrm{e}} - x_{f',i-1}^{\mathrm{e}}\right)/\Delta, \quad \forall i \in \{2, 3, \dots, n\}, \quad (13)$$
$$p_{f',i}^{\mathrm{e}} = p_{f'}^{N_{f'}}, \quad \forall i \in \{1, 2, \dots, n\}. \quad (14)$$
We illustrate the obtained transition trajectories with trajectory 6 described previously, which belongs to $\mathcal{B}^{\mathrm{OE}}$ and thus requires both backward and forward extensions. As shown in Figure 6, by referring to the preceding vehicle trajectory, the transition trajectories (colored green) are generated with the Pitt car-following model. Here we denote the set of all transition trajectories as $\mathcal{G}$.
Figure 6. Transition trajectory illustration.
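The backward extension of Sub-step 1 can be sketched as follows; `lead` is an assumed callable returning the preceding trajectory's location and speed at a time point, and the final speed refinement from consecutive positions mirrors Equation (7).

```python
from typing import Callable, List, Tuple

def extend_backward(t1: float, v1: float,
                    lead: Callable[[float], Tuple[float, float]],
                    l_lead: float, k: float, n: int, dt: float
                    ) -> Tuple[List[float], List[float], List[float]]:
    """Extend a trajectory broken at its origin backward n time steps:
    place the follower one Pitt spacing behind its leader at each earlier
    time point, then recover speeds from consecutive positions."""
    ts = [t1 - (n - i) * dt for i in range(n)]   # t1 - n*dt, ..., t1 - dt
    xs, vs = [0.0] * n, [v1] * n
    for i in range(n - 1, -1, -1):               # walk backward in time
        x_lead, v_lead = lead(ts[i])
        b = 0.1 if vs[i] > v_lead else 0.0
        spacing = l_lead + 3.04878 + k * vs[i] + b * k * (vs[i] - v_lead) ** 2
        xs[i] = x_lead - spacing                  # follower sits behind the leader
    for i in range(n - 1):                        # speeds from positions, Eq. (7) style
        vs[i] = (xs[i + 1] - xs[i]) / dt
    return ts, xs, vs
```

The forward extension for trajectories broken at the end side is symmetric, stepping forward from the last observed point instead.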
Sub-step 2:

In Sub-step 2, we propose the criterion for connecting two broken trajectories through the transition trajectories. In addition, vehicle kinematics constraints are considered while connecting the trajectories.

The criterion for trajectory connection is defined as follows. Assume that there is an arbitrary transition trajectory $g \in \mathcal{G}$ ($g$ can be either a backward or a forward extension). For each broken trajectory $f' \in \mathcal{B}$, the location difference between the transition trajectory and the broken trajectory, denoted as $\varepsilon_{gf'}$, can be calculated by

$$\varepsilon_{gf'} := \frac{1}{n} \sum_{i=1}^{n} \left| x_{g,i} - x_{f'}(t_{g,i}) \right|, \quad (15)$$

where the sum runs over the time points shared by the two trajectories. If $\varepsilon_{gf'} < \hat{\varepsilon}$ and $f' = \arg\min_{f''} \varepsilon_{gf''}$, we consider the two trajectories $f$ and $f'$ to be trajectories of the same vehicle, and thus we can connect them. Here $\hat{\varepsilon}$ is a given error term for evaluating the location difference between the transition trajectory and the broken trajectory. Note that a large $\hat{\varepsilon}$ value may cause a wrong connection, while a small value may reject a correct connection due to the estimation errors. In practical implementations of the proposed algorithm, different $\hat{\varepsilon}$ values shall be given depending on the quality of the raw datasets. We further analyze the performance of different $\hat{\varepsilon}$ values in Section 4.
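The matching rule can be sketched as follows; aligning time points by rounding to the sampling interval and using the mean absolute location difference are the assumed conventions here.

```python
from typing import Dict, List, Optional, Tuple

def match_broken(transition: Tuple[List[float], List[float]],
                 candidates: Dict[int, Tuple[List[float], List[float]]],
                 eps_hat: float, dt: float) -> Tuple[Optional[int], float]:
    """Return the broken-trajectory label whose locations best match the
    transition trajectory, provided the mean location difference stays
    below the threshold eps_hat; otherwise return None."""
    tt, tx = transition                       # (times, locations) of the transition
    best, best_err = None, float("inf")
    for label, (ct, cx) in candidates.items():
        lookup = {round(t / dt): x for t, x in zip(ct, cx)}
        diffs = [abs(x - lookup[round(t / dt)])
                 for t, x in zip(tt, tx) if round(t / dt) in lookup]
        if diffs:
            err = sum(diffs) / len(diffs)     # mean |location difference|
            if err < best_err:
                best, best_err = label, err
    return (best, best_err) if best_err < eps_hat else (None, best_err)
```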
Note that the transition trajectory may not perfectly connect to the origin/end point of the other broken trajectory due to the errors of the trajectory estimation, as illustrated in Figure 7: a gap remains between the transition trajectory and the broken trajectory. To connect trajectories $f$ and $f'$ considering vehicle kinematics constraints, we propose a time-space-cone-based trajectory connection method as follows.
Figure 7. Time-space cone illustration.
Assume that the kinematics constraints for each vehicle are given, including the maximum speed and the maximum and minimum accelerations, denoted as $v_{\max}$, $a_{\max}$, and $a_{\min}$ in this paper. Then for each pair of ready-to-connect trajectories $f, f'$, we can uniquely generate two boundary trajectories starting from the end (or the origin) of the trajectories, named the slowest and fastest trajectories, as illustrated in Figure 7. Each pair of boundary trajectories (a slowest trajectory and a fastest trajectory), which starts at the same point (e.g., either $(t_f^{N_f}, x_f^{N_f})$ or $(t_{f'}^1, x_{f'}^1)$), covers all feasible trajectories passing the point and forms a cone-shaped area in the time-space graph. Starting from the end (or the origin) of the trajectory, the slowest trajectory is generated by operating the vehicle forward (or backward) with the minimum acceleration (i.e., $a_{\min}$ while the speed remains positive, and 0 afterward) until time $T$ (or time 0) in the time-space graph. The physical meaning of the slowest trajectory is that it is a lower bound on all feasible trajectories: all trajectories starting from the end (or the origin) of the trajectory operate faster than the slowest trajectory. Similarly, the fastest trajectory is generated by operating the vehicle forward (or backward) with the maximum acceleration (i.e., $a_{\max}$) until the speed of the vehicle reaches the maximum speed (i.e., $v_{\max}$); the fastest trajectory is likewise generated until it reaches time $T$ (or time 0) in the time-space graph. The physical meaning of the fastest trajectory is that it is an upper bound on all feasible trajectories: all trajectories starting from the end (or the origin) of the trajectory operate slower than the fastest trajectory.
The equations for generating the slowest and fastest trajectories for trajectory $f$ are shown in Equations (16)-(23). To avoid repetition, we only show the equations for trajectory $f$; those of trajectory $f'$ can be obtained easily by considering that the vehicle operates reversely in the time-space graph. For the slowest trajectory (superscript $\mathrm{S}$), for all $i \in \{1, 2, \dots, (T - t_f^{N_f})/\Delta\}$ with $v_{f,0}^{\mathrm{S}} := v_f^{N_f}$:

$$x_{f,i}^{\mathrm{S}} = \begin{cases} x_f^{N_f} + v_f^{N_f}\Delta + 0.5\, a_{f,1}^{\mathrm{S}} \Delta^2, & i = 1, \\ x_{f,i-1}^{\mathrm{S}} + v_{f,i-1}^{\mathrm{S}}\Delta + 0.5\, a_{f,i}^{\mathrm{S}} \Delta^2, & i \geq 2, \end{cases} \quad (16)$$
$$t_{f,i}^{\mathrm{S}} = t_f^{N_f} + i\Delta, \quad (17)$$
$$v_{f,i}^{\mathrm{S}} = \max\left\{0,\; v_{f,i-1}^{\mathrm{S}} + a_{f,i}^{\mathrm{S}}\Delta\right\}, \quad (18)$$
$$a_{f,i}^{\mathrm{S}} = \begin{cases} a_{\min}, & v_{f,i-1}^{\mathrm{S}} > 0, \\ 0, & \text{otherwise}. \end{cases} \quad (19)$$

For the fastest trajectory (superscript $\mathrm{F}$), for all $i \in \{1, 2, \dots, (T - t_f^{N_f})/\Delta\}$ with $v_{f,0}^{\mathrm{F}} := v_f^{N_f}$:

$$x_{f,i}^{\mathrm{F}} = \begin{cases} x_f^{N_f} + v_f^{N_f}\Delta + 0.5\, a_{f,1}^{\mathrm{F}} \Delta^2, & i = 1, \\ x_{f,i-1}^{\mathrm{F}} + v_{f,i-1}^{\mathrm{F}}\Delta + 0.5\, a_{f,i}^{\mathrm{F}} \Delta^2, & i \geq 2, \end{cases} \quad (20)$$
$$t_{f,i}^{\mathrm{F}} = t_f^{N_f} + i\Delta, \quad (21)$$
$$v_{f,i}^{\mathrm{F}} = \min\left\{v_{\max},\; v_{f,i-1}^{\mathrm{F}} + a_{f,i}^{\mathrm{F}}\Delta\right\}, \quad (22)$$
$$a_{f,i}^{\mathrm{F}} = \begin{cases} a_{\max}, & v_{f,i-1}^{\mathrm{F}} < v_{\max}, \\ 0, & \text{otherwise}. \end{cases} \quad (23)$$
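A discrete sketch of the two boundary trajectories, with the speed clamped at 0 and $v_{\max}$ as in the slowest/fastest constructions above; the function name and step handling are illustrative.

```python
from typing import List, Tuple

def cone_boundaries(t0: float, x0: float, v0: float, t_end: float, dt: float,
                    a_min: float, a_max: float, v_max: float
                    ) -> Tuple[List[Tuple[float, float]], List[Tuple[float, float]]]:
    """Generate the slowest (decelerate at a_min until stopped) and fastest
    (accelerate at a_max until v_max) boundary trajectories forward from
    the point (t0, x0) with initial speed v0."""
    slow, fast = [(t0, x0)], [(t0, x0)]
    xs, vs, xf, vf, t = x0, v0, x0, v0, t0
    while t < t_end - 1e-9:
        t += dt
        a = a_min if vs > 0 else 0.0          # slowest branch: brake until stopped
        xs += vs * dt + 0.5 * a * dt * dt
        vs = max(0.0, vs + a * dt)
        slow.append((t, xs))
        a = a_max if vf < v_max else 0.0      # fastest branch: accelerate to v_max
        xf += vf * dt + 0.5 * a * dt * dt
        vf = min(v_max, vf + a * dt)
        fast.append((t, xf))
    return slow, fast
```

The two returned position sequences bound the cone-shaped feasible area in the time-space graph; running the same routine backward from the origin of the downstream trajectory yields the second cone.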
With these boundary trajectories, all feasible trajectories that not only satisfy the vehicle kinematics constraints but also connect the two broken trajectories are restricted to the shaded area illustrated in Figure 7. Considering the transition trajectory obtained in Sub-step 1, which we denote as the old transition trajectory $x^{\mathrm{O}}$, the new transition trajectory that connects the two broken trajectories $f$ and $f'$ is the trajectory in the shaded area that has the minimum location difference from the old transition trajectory. We denote the new transition trajectory as $x^{\mathrm{N}}$, obtained according to Equations (24) and (25); the corresponding $v$ and $a$ arrays of the new transition trajectory can then be derived from the location values:

$$x_{i}^{\mathrm{N}} = \min\left\{x_{f,i}^{\mathrm{F}},\; \max\left\{x_{f,i}^{\mathrm{S}},\; x_{i}^{\mathrm{O}}\right\}\right\}, \quad \forall i \in \left\{1, 2, \dots, (t_{f'}^1 - t_f^{N_f})/\Delta\right\}, \quad (24)$$
$$t_{i}^{\mathrm{N}} = t_f^{N_f} + i\Delta, \quad \forall i \in \left\{1, 2, \dots, (t_{f'}^1 - t_f^{N_f})/\Delta\right\}. \quad (25)$$
Figure 8 illustrates the new transition trajectory. With this, the broken trajectories $f$ and $f'$ are connected: the new trajectory is formed by combining the arrays of three trajectories, namely trajectory $f$, the new transition trajectory, and trajectory $f'$. By repeating these two sub-steps, the issues we revealed previously, i.e., detection errors and limited ranges, can be successfully fixed. Nonetheless, we would like to point out one obvious limitation of the proposed algorithm: the new transition trajectory is obtained without considering acceleration variations, which means that the acceleration of the transition trajectory may change dramatically at the intersection point of two trajectories, e.g., the intersection point of the old transition trajectory and the fastest trajectory, as shown in Figure 8. This limitation can be circumvented by trajectory smoothing techniques (Lee and Krumm, 2011); e.g., the merging operation proposed by Li and Li (2019) can connect two quadratic trajectories with smooth acceleration variations. However, the investigation of this technique is out of the scope of this paper, and interested readers can refer to Li and Li (2019) for more details.
Figure 8. New transition trajectory illustration.
4. Numerical Experiment

4.1 Dataset

In the numerical experiment, we demonstrate the proposed CF-based vehicle trajectory connection method with a set of raw vehicle trajectory datasets extracted from aerial videos (Shi et al., 2021). As shown in Figure 9 (a), the aerial videos were collected by three 8K cameras on a helicopter from 4:15 to 6:15 pm on Tuesday, May 14, 2019, over an 8,000-foot (2,438-meter) segment of Interstate-75 in Florida, United States. The segment carries bi-directional traffic flow, and we only use vehicle trajectories traveling from south to north (bottom to top in Figure 9 (a)) in this paper. The extracted trajectory datasets contain vehicle trajectories of three regular travel lanes. From left to right in Figure 9 (a), the three lanes are named Lane 2, Lane 1, and Lane 0, respectively. A sample of the detected vehicles in the trajectory extraction process is shown in Figure 9 (b), in which the detected vehicles are marked with red boxes. The frequency of the extracted datasets is 30 Hz, and the format of the datasets is consistent with the NGSIM dataset for the convenience of further trajectory analysis and public use.

Figure 9. Study area for collecting the aerial videos (Shi et al., 2021): (a) the recorded segment; (b) a sample of detected vehicles.
4.2 Trajectory Connection Result

Before the trajectory connection, the raw datasets include 283,501 broken vehicle trajectories, caused by the issues we mentioned previously, e.g., missing detections and wrong detections. The speed and acceleration ranges of the trajectories in the raw datasets are [0, 150] ft/s ([0, 45.72] m/s, or [0, 165] km/h) and [-20, 20] ft/s² ([-6.10, 6.10] m/s²).

By using the proposed CF-based trajectory connection method to process the extracted raw datasets, the 283,501 broken vehicle trajectories are eventually connected into 2,184 vehicle trajectories. The processed dataset is named the HIGH-SIM dataset and has been published online for public use. We plot the vehicle trajectories both before and after the connection in Figure 10 to help readers understand the performance of the proposed method. It can be seen in Figure 10 (a)-(c) that the raw vehicle trajectories break at common locations, which correspond to the boundaries between the aerial videos shot by different cameras. We can also observe that the raw vehicle trajectories are broken into many small pieces as they approach the end of the segment. This is because of the camera angle issue we described in Figure 2: as vehicles move away from the camera, the vehicle sizes in the video gradually shrink and thus the detection rates decrease. After processing the raw datasets with the proposed method, however, most of the vehicle trajectories are successfully connected, as shown in Figure 10 (d)-(f). The results show that the HIGH-SIM dataset has more reasonable speed and acceleration distributions than a well-known trajectory dataset, the NGSIM US-101 dataset. We explicitly study the quality of the HIGH-SIM dataset in Shi et al. (2021), and interested readers can refer to it for more details.
Figure 10. Comparison between the raw datasets ((a) Lane 2, (b) Lane 1, (c) Lane 0) and the HIGH-SIM dataset ((d) Lane 2, (e) Lane 1, (f) Lane 0).
4.3 Comparison of Different Methods
We compare the performance of the proposed CF-based vehicle trajectory connection method with several benchmark methods, including Kim et al. (2019)'s method (denoted as the linear-based method), a nonlinear-based method (extending the broken trajectory with a quadratic function), and the Kalman filter. The testing dataset is generated by breaking a set of complete vehicle trajectories from the extracted raw dataset. The raw dataset is used to exclude external factors that could potentially influence the results. For example, if we used the HIGH-SIM dataset, the vehicle trajectory connection rate of the proposed method would certainly be higher than those of the other methods. The connection rate is denoted by r = N_c / N_b, where N_c is the number of connected vehicle trajectories and N_b is the number of broken vehicle trajectories. We compare the connection rates of the different methods by varying the mean of the broken time length, μ = E(T), and the variance of the broken time length, σ² = Var(T), where T is the time length of the trajectory segment that we deleted from each trajectory. The mean broken time length varies within [0.1, 3] seconds, and the variance of the broken time length varies within [0, 64].
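The test-set construction described above can be sketched as follows. The gamma distribution used to draw the broken time length is our assumption for matching a target mean and variance; the text does not state which distribution was actually used, and the function names are illustrative.

```python
import random

FPS = 30  # dataset frequency (Hz)

def break_trajectory(traj, mean_T, var_T, rng):
    """Delete one segment of duration T (s) from a complete trajectory
    and return the two resulting broken pieces.  T is drawn from a gamma
    distribution with the requested mean and variance (an assumption):
    shape = mean^2/var, scale = var/mean."""
    if var_T > 0:
        T = rng.gammavariate(mean_T ** 2 / var_T, var_T / mean_T)
    else:
        T = mean_T
    gap = max(1, min(int(T * FPS), len(traj) - 2))
    start = rng.randrange(1, len(traj) - gap)
    return traj[:start], traj[start + gap:]

def connection_rate(n_connected, n_broken):
    """r = N_c / N_b, the share of broken trajectories reconnected."""
    return n_connected / n_broken

rng = random.Random(42)
traj = list(range(300))  # 10 s of frame indices at 30 Hz
head, tail = break_trajectory(traj, mean_T=1.0, var_T=0.5, rng=rng)
```

Applying a connection method to the pieces and checking against the original complete trajectory then yields N_c and the rate r.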
Figure 11 plots the connection rate for each method as the mean broken time length varies. We can observe that the connection rates of the studied methods fall within [0.9, 1], [0, 0.2], [0.2, 0.5], and [0.5, 0.9] for the proposed method, the linear-based method, the nonlinear-based method, and the Kalman filter, respectively. The proposed method outperforms all benchmark methods in terms of the connection rate. One reason for this superior performance is that the proposed method incorporates a car-following model into the trajectory connection and thus captures the driving characteristics well, which benefits the trajectory connection. It can also be observed that as the mean broken time length increases, the connection rate of each method shows a decreasing trend. This indicates that the longer the time length of the missing trajectory is, the harder the trajectory connection becomes. However, the connection rate of the proposed method decreases more slowly than those of the benchmark methods, which further supports the superior performance of the proposed method.
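To illustrate how a car-following model can supply a transition trajectory across a detection gap, the sketch below uses Newell's simplified model as a stand-in; the actual car-following model adopted by the proposed method is not specified in this section, so the function and parameter values are illustrative assumptions.

```python
def newell_transition(leader_pos, tau_frames, d):
    """Newell's simplified car-following model: the follower repeats the
    leader's trajectory shifted by a time lag (tau_frames samples) and a
    space gap d.  Used here only to illustrate how a car-following model
    can generate a transition trajectory across a detection gap."""
    return [leader_pos[t - tau_frames] - d
            for t in range(tau_frames, len(leader_pos))]

# Leader moving 3 ft per frame; the follower is predicted 50 ft behind
# with a 15-frame (0.5 s at 30 Hz) time lag.
leader = [3.0 * k for k in range(60)]
predicted_follower = newell_transition(leader, tau_frames=15, d=50.0)
```

A purely kinematic extrapolation (linear or quadratic) ignores the leader entirely, which is one way to read the performance gap reported in Figure 11.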
Figure 11. Connection rate comparison with varying mean broken time length.
Further, we vary the variance of the broken time length (i.e., σ²) to study the robustness of the proposed method. The variance of the broken time length is varied from 0 to 64, and the connection rates of the methods are shown in Figure 12. We can observe that as the variance of the broken time length increases, the connection rate of each method also shows a decreasing trend. This is because a higher variance of the broken time length yields a higher probability of long broken trajectories and thus leads to a lower trajectory connection rate. Note that although the connection rate of the proposed method degrades with a high variance of the broken time length, the overall connection rate of the proposed method, which varies within [0.6, 1], is still the highest among the four methods. This result indicates that the proposed method is relatively robust when dealing with datasets of different quality, which strengthens the transferability of the proposed method.
Figure 12. Connection rate comparison with varying the variance of broken time length.
Moreover, we are also interested in the effects of two critical parameters on the connection rate of the proposed method: the time length of the transition trajectory (i.e., τ) and the error threshold for evaluating the location difference between the transition trajectory and the broken trajectory (i.e., ε). Note that the default values of τ and ε are set to 1.67 seconds and 5, respectively. When we vary one parameter, we keep the other parameter at its default value. As shown in Figure 13 (a), as τ increases from 0.2 to 3.2 seconds, the connection rate increases from 0.4 to 0.9. There is a significant increase in the connection rate when τ is around 1.5 s, which indicates that most of the broken time lengths of the trajectories (around 80%) are less than 1.5 s. As can be seen in Figure 13 (b), as ε increases, the connection rate first increases quickly (from 2 to 6). However, when ε is greater than 6, further increases in ε have little impact on the connection rate. This is because, for a correct trajectory connection, the calculated distance error between the two trajectories depends on the accuracy of the adopted car-following model. The maximum distance error for a given car-following model is a bounded value regardless of the dataset. Once ε reaches the maximum distance error for a correct trajectory connection, further increases in ε only raise the probability of a wrong connection and do not affect the connection rate, which reflects the percentage of correct connections.
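Our reading of the error-threshold gating can be sketched as follows; the function names are hypothetical, and the acceptance rule (maximum absolute location difference below the threshold) is an assumption consistent with the description above.

```python
def max_location_error(transition, observed):
    """Largest absolute location difference over the overlapping frames."""
    return max(abs(p - q) for p, q in zip(transition, observed))

def accept_connection(transition, observed, eps):
    """Gate a candidate connection by the error threshold eps: accept it
    only if the transition trajectory generated by the car-following
    model stays within eps of the observed broken trajectory."""
    return max_location_error(transition, observed) <= eps

transition = [0.0, 3.0, 6.1, 9.3]  # model-generated positions (ft)
observed = [0.0, 3.2, 6.0, 9.0]    # re-detected positions (ft)
ok = accept_connection(transition, observed, eps=5.0)  # default eps = 5
```

Under this reading, raising eps past the model's bounded prediction error admits no additional correct connections, matching the plateau in Figure 13 (b).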
Figure 13. Sensitivity analysis for the two critical parameters (τ and ε).
5. Conclusion

This paper proposed a car-following-based (CF-based) vehicle trajectory connection method that can connect broken vehicle trajectories based on car-following theory. The proposed method can not only fill missing data points caused by detection errors but also help connect trajectory data from different sensors. To illustrate the performance of the proposed CF-based method, it was employed to process a series of vehicle trajectory datasets extracted from aerial videos recorded at several successive spaces of Interstate-75, United States. Compared with several benchmark methods, the results showed that the proposed method has advantages in both trajectory connection accuracy and trajectory consistency. To the best of the authors' knowledge, the dataset processed by the proposed method, named the HIGH-SIM dataset, is the longest vehicle trajectory dataset in the literature and captures the full life cycle of a traffic bottleneck. The dataset has been published online for public use.

Future research can be conducted in a few directions. It would be interesting to adopt learning-based methods (e.g., machine learning and deep learning) to extend the broken trajectory for trajectory connection. Moreover, it would be interesting to validate existing macroscopic and microscopic traffic characteristics with the extracted datasets.
Acknowledgment

This research is supported by the US National Science Foundation through Grants Crisp 1 and Crisp 2.
References

Anuar, K., Cetin, M., 2017. Estimating Freeway Traffic Volume Using Shockwaves and Probe Vehicle Trajectory Data. Transp. Res. Procedia 22, 183–192.

Apeltauer, J., Babinec, A., Herman, D., Apeltauer, T., 2015. Automatic vehicle trajectory extraction for traffic analysis from aerial video data. Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci. - ISPRS Arch. 40, 9–15.

Azevedo, C.L., Cardoso, J.L., Ben-Akiva, M., Costeira, J.P., Marques, M., 2014. Automatic Vehicle Trajectory Extraction by Aerial Remote Sensing. Procedia - Soc. Behav. Sci. 111, 849–858.

Babinec, A., Herman, D., Cecha, S., 2014. Automatic vehicle trajectory extraction.

Barmpounakis, E., Geroliminis, N., 2020. On the new era of urban traffic monitoring with massive drone data: The pNEUMA large-scale field experiment. Transp. Res. Part C Emerg. Technol. 111, 50–71.

Chen, X., Li, Z., Yang, Y., Qi, L., Ke, R., 2020. High-Resolution Vehicle Trajectory Extraction and Denoising From Aerial Videos. IEEE Trans. Intell. Transp. Syst. 1–13.

Coifman, B., Wu, M., Redmill, K., Thornton, D.A., 2016. Collecting ambient vehicle trajectories from an instrumented probe vehicle: High quality data for microscopic traffic flow studies. Transp. Res. Part C Emerg. Technol. 72, 254–271.

Daganzo, C.F., 1997. Fundamentals of Transportation and Traffic Operations.

Drew, D.R., 1968. Traffic Flow Theory and Control. McGraw-Hill, New York.

Jazayeri, A., Cai, H., Zheng, J.Y., Tuceryan, M., 2011. Vehicle detection and tracking in car video based on motion model. IEEE Trans. Intell. Transp. Syst., 583–595.

Kim, E.J., Park, H.C., Ham, S.W., Kho, S.Y., Kim, D.K., Hassan, Y., 2019. Extracting Vehicle Trajectories Using Unmanned Aerial Vehicles in Congested Traffic Conditions. J. Adv. Transp. 2019.

Kim, Z.W., Cao, M., 2010. Evaluation of feature-based vehicle trajectory extraction algorithms. IEEE Conf. Intell. Transp. Syst. Proceedings, ITSC, 99–104.

Krajewski, R., Bock, J., Kloeker, L., Eckstein, L., 2018. The highD Dataset: A Drone Dataset of Naturalistic Vehicle Trajectories on German Highways for Validation of Highly Automated Driving Systems, in: 2018 21st International Conference on Intelligent Transportation Systems (ITSC). IEEE, pp. 2118–2125.

Li, L., Li, X., 2019. Parsimonious trajectory design of connected automated traffic. Transp. Res. Part B Methodol. 119, 1–21.

Punzo, V., 2009. Estimation of vehicle trajectories from observed discrete positions and Next-Generation Simulation Program (NGSIM) data.

Punzo, V., Borzacchiello, M.T., Ciuffo, B., 2011. On the assessment of vehicle trajectory data accuracy and application to the Next Generation SIMulation (NGSIM) program data. Transp. Res. Part C Emerg. Technol. 19, 1243–1262.

Raju, N., Arkatkar, S., Easa, S., Joshi, G., 2021. Developing extended trajectory database for heterogeneous traffic like NGSIM database. Transp. Lett., 1–10.

Sazara, C., Nezafat, R.V., Cetin, M., 2017. Offline reconstruction of missing vehicle trajectory data from 3D LIDAR. IEEE Intell. Veh. Symp. Proc., 792–797.

Tong, C., Chen, H., Xuan, Q., Yang, X., 2017. A framework for bus trajectory extraction and missing data recovery for data sampled from the internet. Sensors (Switzerland) 17.

van Hinsbergen, C.P.I.J., Schakel, W.J., Knoop, V.L., van Lint, J.W.C., Hoogendoorn, S.P., 2015. A general framework for calibrating and comparing car-following models. Transp. A Transp. Sci.

Victor, T., 2014. Analysis of Naturalistic Driving Study Data: Safer Glances, Driver Inattention, and Crash Risk. Transportation Research Board, Washington, D.C.

Wang, G., Xiao, D., Gu, J., 2008. Review on vehicle detection based on video for traffic surveillance, in: 2008 IEEE International Conference on Automation and Logistics. IEEE, pp. 2961–2966.

Xu, Y., Yu, G., Wu, X., Wang, Y., Ma, Y., 2017. An Enhanced Viola-Jones Vehicle Detection Method from Unmanned Aerial Vehicles Imagery. IEEE Trans. Intell. Transp. Syst. 18, 1845–1856.

Zhang, G., Avery, R.P., Wang, Y., 2007. Video-based vehicle detection and classification system for real-time traffic data collection using uncalibrated video cameras. Transp. Res. Rec., 138–147.

Zhang, T., Jin, P.J., 2019. A longitudinal scanline based vehicle trajectory reconstruction method for high-angle traffic video. Transp. Res. Part C Emerg. Technol. 103, 104–128.

Zhao, D., Li, X., 2019. Real-World Trajectory Extraction from Aerial Videos - A Comprehensive and Effective Solution. 2019 IEEE Intell. Transp. Syst. Conf. (ITSC), 2854–2859.

Zhao, H., Wang, C., Lin, Y., Guillemard, F., Geronimi, S., Aioun, F., 2017. On-Road Vehicle Trajectory Collection and Scene-Based Lane Change Analysis: Part I. IEEE Trans. Intell. Transp. Syst. 18, 192–205.

Next Generation Simulation, 2006. Source:

Shi, X., Zhao, D., Yao, H., Li, X., James, R., Hale, D., Ghiasi, A., 2021. An Open Database Generation with Monte Carlo Based Lane Marker Detection and Critical Analysis of Vehicle Trajectory - High-Granularity Highway Simulation (HIGH-SIM). Preprint.

Wu, M., 2018. Collecting Ambient Vehicle Trajectories from an Instrumented Probe Vehicle and Fusing with Loop Detector Actuations. Dissertation. The Ohio State University.

Lee, W.C., Krumm, J., 2011. Trajectory Preprocessing. In: Zheng, Y., Zhou, X. (Eds.), Computing with Spatial Trajectories. Springer, New York, NY.

Rahman, M., 2013. Application of Parameter Estimation and Calibration Method for Car-Following Models. Clemson University.