A Virtual Procedure for Real-time Monitoring of Intervisibility between Conflicting
Agents at Intersections Using Point Cloud and Trajectory Data
Yang Ma a, b *, Yubing Zheng c, Yiik Diew Wong b, Said Easa d, Jianchuan Cheng a
* Corresponding Author, yangma_93@seu.edu.cn
a School of Transportation, Southeast University, P.R. China, 211189
b School of Civil and Environmental Engineering, Nanyang Technological University, Singapore,
639798
c School of Automobile and Transportation Engineering, Hefei University of Technology, P.R. China,
230009
d Department of Civil Engineering, Ryerson University, Toronto, ON, Canada M5B 2K3.
Y.Z. ybzheng@hfut.edu.cn; Y.D.W. cydwong@ntu.edu.sg; S.E. seasa@ryerson.ca ;
J.C. jccheng@seu.edu.cn.
This is a manuscript accepted for publication in Transportation Research Part C
Abstract
A new procedure is developed for effectively monitoring intervisibility between conflicting agents
in real-time at intersections. Dense light detection and ranging (Lidar) point clouds and time-
stamped trajectory data are used to model static intersection environment and emulate traffic
participants’ dynamic motion states, respectively. The proposed procedure reads trajectory data in
sequence according to their timestamps. An agent-based approach that enables the application of
multi-core parallel computing is applied to estimate conflict points and identify conflicting agents
in pairs. Then, a linear elevation array is created from the Lidar point cloud data, based on which
the elevation of each trajectory point is obtained in a real-time manner. Meanwhile, three-
dimensional bounding cuboids are generated at each path point to represent digital twins of agents.
Once a pair of conflicting agents is identified, a hybrid approach is triggered to examine whether the agents' lines-of-sight are occluded by static (e.g., tree trunks) or dynamic obstacles (e.g., cars). Accordingly, virtual warning signals can be generated. The effectiveness of the procedure is demonstrated through controlled experiments, and the procedure is further tested in two virtual scenarios. The mean processing time at each frame is less than 0.1 s, achieved with limited computational power. With the implementation of the parallel computing technique, the processing time is not sensitive to the number of agents within the intersection. In addition, by enabling the outputs of virtual warning signals, the spatial distribution of conflict points, and individual conflict-related time series data, the procedure can provide substantial insights into intersection safety.
Keywords: conflict points, blind spots, real-time computation, intersection safety, digital twins.
1 Introduction
1.1 Background
Compared with conventional ways of collecting historical collision data to perform crash analyses
at intersections, the use of traffic conflicts as a measure of intersection safety is gaining increasing
acceptance among traffic safety researchers (El-Basyouny and Sayed, 2013; Guo et al., 2020a; Ma
and Zhu, 2021). Traffic conflicts provide a timely depiction of safety conditions amid ongoing operations at intersections, based on which proactive measures can be implemented to prevent or mitigate potential crashes.
According to the manner in which conflict data are used, existing studies can generally be categorized into two types: real-time analysis and post-analysis. Post-analysis of historical data serves to provide insights into how traffic participants interact in conflict areas, thereby gaining a deeper understanding of their behaviors (Chen et al., 2017, 2019; Kumar et al., 2019; Liu et al., 2017; Zeng et al., 2014, 2017) and assessing infrastructure safety improvements (Guo et al., 2020b).
However, like crash data analyses, post-analysis approaches cannot indicate real-time crash risks.
Regarding real-time analysis, much effort in the literature has been devoted to establishing real-time conflict-incorporated crash prediction models that examine crash likelihood and severity using real-time traffic and environmental data (Essa and Sayed, 2019; Guo et al., 2020a; Machiani and Abbas, 2016; Zheng and Sayed, 2020). In these studies, the intersection safety
performance was measured by several conflict indicators extracted from real-time trajectory data
generated by roadside monitoring units (RMUs), usually surveillance cameras. Using some
environmental variables (e.g., weather) and traffic variables (e.g., traffic volume and queue length)
as inputs, crash prediction models can quantify the safety level of an intersection (Essa and Sayed,
2018).
1.2 Intervisibility between Conflicting Agents
However, although existing real-time conflict-related studies can estimate the safety performance of an intersection in real-time, they cannot detect specific conflict events or respond to them instantly. Fig. 1 depicts two typical conflict events that require instant warnings. In Scenario 1, a fast cyclist is crossing the street while a fast car is trying to go straight through the intersection. Scenario 2 shows a right-turn-on-red vehicle conflicting with a running pedestrian. Both scenarios are common in urban street networks in China.
Figure 1. Conflict events that require instant warnings
A significant issue that adds to the severity of the conflict events shown in Fig. 1 is the blind spot created by dynamic obstacles (i.e., the trucks). Due to the presence of a truck, the car and the vulnerable road users are not intervisible. In this case, the car may not yield because the driver fails to visually detect the cyclist or pedestrian, and a collision may thus occur. Such traffic crashes were frequently reported in China over the last decade (BaiduNews, 2021).
Theoretically, when all road users are interconnected in the future, intervisibility between conflicting agents (IvCA) will no longer be an issue because all traffic participants can share motion data with each other via wireless communication. However, during the long transition period from human-driven vehicles to fully connected and autonomous vehicles (CAVs), the constraint on intervisibility is an essential factor that should be seriously considered in the monitoring of conflict points at intersections. Also in this transition phase, RMUs will play a crucial role in monitoring traffic states and emitting warning signals to traffic participants within the intersection (Tsukada et al., 2019).
The IvCA issue is closely associated with operational safety at intersections. More specifically,
if the car in Fig. 1 can detect or be warned of the presence of the conflicting pedestrian or cyclist,
the driver can decelerate expediently to avoid a collision. Nonetheless, the IvCA issue remains
overlooked in current studies related to conflict analyses because of a lack of effective procedures
to examine whether conflicting agents are intervisible in real-time. In addition to adjacent vehicles
as sight obstructions, stationary roadside objects such as bushes, information boards, and building
façades may also adversely affect the drivers’ visual detection of the conflicting agents.
1.3 Point Cloud Data
In recent years, dense point cloud data have been recognized as a promising data source for
creating a precise digital model of real-world road environments (Jung et al., 2018; Soilán et al.,
2018). Point cloud data may either be collected by light detection and ranging (Lidar) devices,
binocular cameras, or derived from multi-view high-resolution drone images. Regarding digital
modeling of intersections, Lidar point clouds collected at street level are more desirable, not only because of their centimeter-level representation of the real-world environment but also because they are insensitive to objects overhanging the road surface (e.g., tree crowns and overpasses) that substantially limit the accuracy of photogrammetric point clouds. Despite the great potential of point
cloud data in mapping static real-world road scenes, they have not been applied in the dynamic
monitoring of IvCA.
1.4 Objectives and Contributions
To address the existing gaps, the main objective of this study is to develop a virtual procedure
that uses point cloud and trajectory data as inputs, for real-time monitoring of the intervisibility
between pair-wise conflicting agents at intersections.
The contributions of this study are as follows. First, an agent-based approach is developed for estimating conflict points, which enables the use of the multi-thread parallel computing technique.
Second, three-dimensional (3D) bounding cuboids are created and aligned with their forward
directions to represent various road users in real-time. Third, a hybrid method is proposed to
estimate IvCA considering both stationary and dynamic obstacles.
2 Related Work
2.1 Conflict Points Detection
Since traffic conflict techniques were first introduced by Perkins and Harris (1967), many studies
have focused on using conflict points as a safety performance measure, especially at intersections.
Nevertheless, there are only limited efforts on monitoring IvCA from a technical perspective. In the connected (yet-to-be-realized) environment, real-time motion data of all traffic participants would be available and would not be restricted by sight-occluding physical obstacles; in that setting, collision avoidance algorithms (Fu et al., 2018) and vehicle control strategies (Zhu and Ukkusuri, 2015) have been developed to boost traffic efficiency rather than to identify conflict events. Hence, the literature on using vehicle-to-everything (V2X) technologies to monitor traffic conflicts (e.g., Ma and Zhu, 2021) is not reviewed in this study.
Table 1 Conflict point estimations

Traffic Participants | References | Description | Issues
Vehicles only | Essa and Sayed, 2018 | Focusing on rear-end conflict events, vehicle trajectories were extracted from fixed camera-captured videos at a signalized intersection. The conflict indicator time-to-collision (TTC) was calculated; lateral deviations between trajectories were not considered. | Considers only rear-end conflict events
Vehicles only | Jang et al., 2012 | Multiple roadside units were used to monitor vehicle trajectories within an unsignalized intersection. Intervisibility between pairwise conflicting vehicles was examined with a sight-triangle method. | Does not consider dynamic obstacles; difficult to address complex obstacles
Vehicles only | Oh et al., 2010 | From videos captured by surveillance cameras at a signalized intersection, conflict points were indirectly estimated by examining whether the stopping points of two conflicting vehicles were across each other. | Does not consider IvCA; indirectly measures conflict points
Vehicles only | Salim et al., 2007 | Learning collision patterns from historical collision data to detect collision-prone areas for different intersections. Then, based on real-time motion data, conflict points were identified with a trigonometric method. | Requires historical collision data; does not consider IvCA
Vehicles and vulnerable road users | Chen et al., 2020 | Traffic agents were identified and tracked from drone image sequences. A safe space was defined for each traffic agent; if the boundary of an agent's safe space was encroached by other agents, a conflict would occur. | Post-analysis; does not consider IvCA
Vehicles and vulnerable road users | Chen et al., 2017, 2019 | Based on trajectories extracted from drone videos, TTC and post-encroachment time (PET) were estimated to assess vehicle-pedestrian conflicts at signalized intersections. | Not in real-time; does not consider IvCA
Vehicles and vulnerable road users | Wang et al., 2009 | A conflict detection model was developed to simulate continuous mixed traffic flow. | Does not map the real-world situation; does not consider IvCA
The related studies are divided into two groups based on the types of traffic participants
involved, as shown in Table 1. Regarding rear-end conflict events, it is unnecessary to consider the
IvCA issue. Among the studies that considered angle conflict events, only Jang et al. (2012)
examined the intervisibility between conflicting vehicles. However, due to the use of sight-triangle
method, the static obstacles in their study were represented by a two-dimensional (2D) rectangle,
which did not provide a close fit to real-world situations. Besides, the authors did not include non-motorized traffic. Also, Jang et al. (2012) did not consider dynamic obstacles. Vulnerable road users
were considered in Chen et al. (2017, 2019), Chen et al. (2020), and Wang et al. (2009). However,
in their studies, most attention was paid to extracting conflict indicators, including time-to-collision
(TTC) and post-encroachment time (PET), to quantify conflict severity between vehicles and
pedestrians or cyclists. None of the studies had considered whether sight obstacles occluded a
conflicting agent.
2.2 Visibility Analysis at Intersections Using Point Cloud Data
It has been well demonstrated that dense point cloud data provide a reliable virtual environment
for performing visibility-related analyses (Gargoum et al., 2018a; Jung et al., 2018; Ma et al., 2019a;
Soilán et al., 2018). Over the past decades, Lidar points have been widely used to profile longitudinal
available sight distance on highways; see for example Castro et al. (2014), Gargoum et al. (2018b),
González-Gómez et al. (2019), and Ma et al. (2019b). In comparison, very few studies have focused
on visibility analyses at the intersections using dense point cloud data.
Three main components are involved in visibility analyses: sight point, obstacles, and target
point. A virtual line-of-sight (LOS) is commonly created from the sight point to the target point. If
the LOS does not intersect with obstacles, the target point is visible to the sight point; otherwise, it
is invisible. In cases when no target points are defined, the virtual LOS terminates automatically at
the obstacles (Jung et al., 2018). A summary of existing studies that used point cloud data to perform
visibility-related analyses at the intersections is presented in Table 2.
Table 2 Visibility analysis using point cloud data

References | Sight and Target Points | Obstacles | Issues
Tsai et al., 2011 | Manually selected sight point; no target point | Obstacles were modeled by a digital surface model (DSM) created from point cloud data | Static obstacles only; inaccuracy of DSM; post-analysis
Jung et al., 2018 | Manually selected sight point; no target point | Modeled by grid cells | Inefficient; static obstacles only; post-analysis
González-Gómez and Castro, 2019 | Predefined virtual vehicle and pedestrian trajectory points | Road surface and off-ground objects are modeled by a digital terrain model and a multi-patch file in ArcGIS | Static obstacles only; post-analysis
González-Gómez et al., 2021 | Predefined virtual rider and pedestrian trajectory points |  | Static obstacles only (intervening vehicles were treated as static obstacles); post-analysis
The main goal of Tsai et al. (2011) and Jung et al. (2018) was to inspect whether sight distance
triangles were clear at intersections. Therefore, their work did not involve the estimation of conflict
points. González-Gómez et al. (2019, 2021) did consider the conflict points between different road
users. However, the trajectory data of road users were not collected from the real world, and the
analyses were not performed in a real-time manner either. As noted in Table 2, no existing study has incorporated dynamic obstacles. Although vehicles were included as sight obstructions by
González-Gómez et al. (2021), they were treated as stationary obstacles because of a lack of
dynamic behavior model.
In essence, the studies related to conflict point estimation did not involve careful visibility analysis, while point cloud-based visibility analyses did not consider the dynamic interactions among road users. Thus, a procedure for real-time monitoring of IvCA at intersections is currently lacking.
3 Methodology
3.1 Overview
The proposed procedure uses trajectory data from RMUs as input and automatically identifies non-
intervisible (i.e., occluded) conflicting agents in pairs in a real-time manner, as illustrated in Fig. 2.
First, the motion data are geocoded based on the reference three-dimensional (3D) point cloud data
of intersections. Second, an agent-based conflict points computation approach is developed in which
multi-thread parallel computing is applicable to substantially reduce the runtime. Third, in addition
to static obstacles, moving agents represented by 3D bounding cuboids of pre-defined sizes are
modeled and considered as obstacles as well. Then, virtual LOS connecting conflicting agents in
pairs are created and a hybrid assessment approach is applied to examine whether each LOS is
obstructed. Finally, virtual warning signals can be disseminated accordingly.
Figure 2. Framework for real-time conflict-points detection
The procedure comprises three main computational layers, as depicted in Fig.3. Using real-
time 2D motion data, the first layer aims to identify all conflict points and the corresponding
conflicting agents. Then, in Layer 2 a digital ground model is created from point cloud data, based
on which the elevation of each agent is obtained dynamically. In this layer, 3D agents are modeled
using different bounding cuboids. Finally, considering both static and moving obstacles, the virtual
LOS are created and a dynamic LOS assessment procedure is implemented in Layer 3 to check the
intervisibility between the conflicting agents.
Figure 3. Computational layers for identifying non-intervisible conflicting agents
3.2 Data Alignment and Fusion
Two types of data are involved in the procedure: point cloud data and trajectory data (see Fig.
4). The dense point cloud data as measured in meters provide an accurate and precise digital twin
of real-world intersection infrastructure. The trajectory data map the agents’ positions and directions
at sequential time steps. The trajectory data may be collected using algorithm-embedded cameras
or static roadside Lidars (both known as edge computing devices). The paths generated from the
video flow and Lidar scan sequences are measured in pixels and meters, respectively.
Figure 4. Feature points matching
Commonly, trajectory or motion data collected by RMUs are not in the same coordinate system
as static 3D infrastructure data. Therefore, the data alignment is a fundamental step that brings
different data streams into the same coordinate frame. In this study, a manual geo-registration
approach is applied to map trajectory data into the coordinate system of the more accurate static
point cloud data layer. Specifically, more than 4 feature points are manually selected in the local
space of trajectory data, as illustrated in Fig.4. Then, their corresponding positions in the frame
of point cloud data are used to derive the transformation matrix. Given a set of matching points,
the algorithms for estimating 2D and 3D geometric transformation have been well established
(Torr and Zisserman, 2000). The transformation process is given by Eq. (1).
$$\mathbf{P}_{pc} = s \cdot \mathbf{T} \cdot \mathbf{P}_{tr} \tag{1}$$
where $s$ = scale factor, $\mathbf{T}$ = geometric transformation matrix, $\mathbf{P}_{tr}$ = a point in the trajectory-data frame, and $\mathbf{P}_{pc}$ = the corresponding point in the point cloud frame.
Regarding 2D video sequences, the feature points should be scattered over a horizontal plane (e.g., the pavement surface). In that case, only 2D coordinates are required for deriving $\mathbf{T}$. In contrast, users should select 3D feature points in Lidar scan sequences for the data alignment.
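To make this alignment step concrete, the following Python/NumPy sketch estimates a 2D similarity transform (scale, rotation, translation) from matched feature points with the Umeyama method; the function name, the sample coordinates, and the choice of a similarity (rather than projective) model are illustrative assumptions, not the implementation behind Eq. (1).

```python
import numpy as np

def estimate_2d_similarity(src, dst):
    """Estimate s, R, t such that dst ≈ s * R @ src + t (Umeyama method).

    src, dst: (N, 2) arrays of matched feature points, e.g., points picked in
    the trajectory-data frame (src) and in the point cloud frame (dst).
    """
    src, dst = np.asarray(src, float), np.asarray(dst, float)
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    src_c, dst_c = src - mu_s, dst - mu_d
    # Cross-covariance and its SVD give the optimal rotation.
    H = src_c.T @ dst_c / len(src)
    U, S, Vt = np.linalg.svd(H)
    D = np.eye(2)
    if np.linalg.det(Vt.T @ U.T) < 0:   # guard against reflections
        D[1, 1] = -1.0
    R = Vt.T @ D @ U.T
    s = np.trace(np.diag(S) @ D) / src_c.var(axis=0).sum()
    t = mu_d - s * R @ mu_s
    return s, R, t

# Hypothetical matched points: pixels in the video frame vs. metres in the cloud.
src_pts = [[100, 240], [620, 250], [610, 470], [120, 460]]
dst_pts = [[3.2, 8.1], [19.5, 8.4], [19.1, 1.2], [3.6, 1.0]]
s, R, t = estimate_2d_similarity(src_pts, dst_pts)
p_cloud = s * R @ np.array([360.0, 355.0]) + t   # geocode one trajectory point
```

If the camera view is oblique rather than near-nadir, a projective transform (homography) estimated from the same matched points would be more appropriate.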
3.3 Conflicting Agents Identification
3.3.1 Agent definition
A struct-type array is constructed for each tracked agent. Different fields and their corresponding
functions are listed in Table 3. During the application of the procedure, agent data are updated in
real-time. For instance, if a conflict point is detected, the coordinates of conflict point, the ID of
conflicting agents, TTC, etc. will be written into ‘ConflictData’. As such, some post analyses can
also be performed in addition to estimating IvCA.
Table 3 Main fields of the struct-type agent data

Field | Function
ID | Store the ID of a tracked agent
TimeStep | Count the number of time steps for a tracked agent; default size: 1000 × 1 array
Type | Store the semantic class of a tracked agent
Shape | Store the dimensions of the predefined bounding cuboid
TimeStamp | Store the time stamps; default size: 1000 × 1 array
Trajectory | Store the coordinates of a tracked agent; default size: 1000 × 3 matrix
Direction | Store the direction of a tracked agent; default size: 1000 × 3 matrix
RotateAngles | Store the rotation angles of a tracked agent for 3D modeling, estimated from the 'Direction' data
IsTurning | Indicate whether the agent is making a turn: 1 - yes, 0 - no
TrackPrediction | Store the predicted (matched) path for a turning vehicle
IsInROI | Indicate whether the agent is within the region of interest: 1 - yes, 0 - no
Speed | Store the velocity of a tracked agent; default size: 1000 × 1 array
ConflictData | Store the conflict-related information. 'Trajectory', 'Direction', 'Speed', and 'ConflictData' are linked with 'TimeStamp'
SenseParameters | Store the visual-field related information: horizontal viewing angle and visual range
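For readers who prefer code to a field listing, a minimal Python dataclass mirroring Table 3 is sketched below; the class name, field types, and defaults are assumptions (the study describes a struct-type array, not this exact class), and the sense parameters default to the values later given in Table 5.

```python
from dataclasses import dataclass, field
import numpy as np

@dataclass
class Agent:
    """Per-agent record mirroring Table 3 (buffers pre-allocated for 1000 steps)."""
    id: int
    type: str                                   # semantic class, e.g. 'car', 'pedestrian'
    shape: tuple                                # (length, width, height) of the cuboid, m
    time_step: int = 0                          # number of recorded time steps
    timestamp: np.ndarray = field(default_factory=lambda: np.zeros(1000))         # 1000 x 1
    trajectory: np.ndarray = field(default_factory=lambda: np.zeros((1000, 3)))   # 1000 x 3
    direction: np.ndarray = field(default_factory=lambda: np.zeros((1000, 3)))    # 1000 x 3
    rotate_angles: np.ndarray = field(default_factory=lambda: np.zeros((1000, 2)))
    is_turning: bool = False                    # 1 - yes, 0 - no in the original table
    track_prediction: object = None             # matched path for a turning vehicle
    is_in_roi: bool = False
    speed: np.ndarray = field(default_factory=lambda: np.zeros(1000))             # 1000 x 1
    conflict_data: list = field(default_factory=list)   # rows linked to timestamps
    sense_parameters: tuple = (180.0, 17.0)     # horizontal viewing angle (deg), range (m)
```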
3.3.2 Conflict points estimation
An agent-based approach is proposed to estimate conflict points at each time step (time step size $\Delta t$). Let $X$-$Y$ be the global coordinate system (i.e., the horizontal projection of the $X$-$Y$-$Z$ frame). A local 2D coordinate frame is established for each agent. Take Agent $i$ for illustration. Suppose $(x_i^t, y_i^t)$ is Agent $i$'s position at time step $t$. Originating at $(x_i^t, y_i^t)$, let its forward direction be the $y'$-axis, and the $x'$-axis is along the rightward vector orthogonal to the $y'$-axis. The visual field of Agent $i$ is modeled by a sector-shaped region whose horizontal viewing angle and radius are $\alpha$ and $R_v$, respectively. Agents outside the visual field are not considered in the computation of conflict points because of their relatively lower impacts.
The transformation from $X$-$Y$ to $x'$-$y'$ is achieved as follows:
$$\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} \cos(90^{\circ}-\theta_i^t) & -\sin(90^{\circ}-\theta_i^t) \\ \sin(90^{\circ}-\theta_i^t) & \cos(90^{\circ}-\theta_i^t) \end{bmatrix} \begin{bmatrix} x - x_i^t \\ y - y_i^t \end{bmatrix} = \begin{bmatrix} \sin\theta_i^t & -\cos\theta_i^t \\ \cos\theta_i^t & \sin\theta_i^t \end{bmatrix} \begin{bmatrix} x - x_i^t \\ y - y_i^t \end{bmatrix} \tag{2}$$
where $\theta_i^t$ = angle between Agent $i$'s direction and the $X$-axis, and $(x, y)$ = points in the $X$-$Y$ frame.
Regarding the estimation of conflict points, different approaches are separately used for the
turning vehicles and the other agents. Vehicles do not move randomly when they are making turns.
Specifically, they will usually follow a motion pattern which can be learned from a set of observed
trajectories (Saunier et al., 2010). The general process is graphically presented in Fig.5. A region of
interest (ROI) is pre-demarcated where turning behaviors may occur to help differentiate turning
vehicles from the other types of agents. As shown in Fig.5, if an agent is determined as a turning
vehicle, its path points inside the ROI are used to match the most similar path from a trajectory set.
Inspired by Sayed et al. (2013), a path matching procedure involving the four main steps shown in Fig. 6 is proposed. First, a set of trajectories is collected on site during high-traffic-volume hours. Second, curved paths are segmented from the observed trajectories. At this stage, natural cubic splines are applied to partition each curved path into uniformly spaced points (Ma et al., 2019c).
Third, a path map is generated from the path points. Finally, a real-time path matching is performed
during the application of the proposed procedure based on the path map. The detailed description of
the matching procedure is presented in Appendix A.
The matched trajectory is then used to detect conflict events for turning vehicles. Otherwise,
the traditional extrapolation method (Svensson and Hydén, 2006) that extrapolates the agents’
movements with constant velocity is applied to estimate conflict points. The algorithms for
computing conflict points between different types of agents are graphically shown in Fig.7.
Figure 5. General process of estimating conflict points
Figure 6. Matching paths for turning vehicles
In this study, the detection of conflict points is conducted in the local space of each agent. Fig. 7a illustrates the transformation from the global to the local coordinate frame. There are four cases regarding the computation of conflict points, as follows:
Case A: Neither Agent $i$ nor the conflicting agent is a turning vehicle (e.g., Agents $j$ and $k$ in Fig. 7a). In this case, the conflict points involving Agent $i$ are all located on the $y'$-axis. Therefore, solving the 2D coordinates of conflict points using the extrapolation method is simplified as:
$$y'_c = \begin{cases} y'_j - \dfrac{x'_j \cos\varphi_j^t}{\sin\varphi_j^t}, & x'_j \sin\varphi_j^t \le 0 \\[4pt] 0, & x'_j \sin\varphi_j^t > 0 \end{cases} \tag{3}$$
where $\varphi_j^t$ = angle between the direction of Agent $j$ and the $y'$-axis at time step $t$, $(x'_j, y'_j)$ = Agent $j$'s position in the local frame, and $y'_c$ = $y'$-value of the conflict point. Note that only the conflict points with $0 < y'_c \le R_v$ are effective. For instance, as shown in Fig. 7b, the conflict point between Agent $i$ and another agent does not fall inside the visual field.
Then, the TTC indicators can be obtained as:
$$\begin{cases} TTC_i^t = \dfrac{y'_c}{v_i^t} \\[6pt] TTC_j^t = \dfrac{-x'_j}{\sin\varphi_j^t \cdot v_j^t} \end{cases} \tag{4}$$
where $v_i^t$ and $v_j^t$ are the speeds of Agents $i$ and $j$ at time step $t$, respectively. Note that if an agent's speed is lower than 0.5 m/s, its TTC will not be computed. Other notations remain unchanged.
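A compact sketch of the Case A computation, following the reconstruction of Eqs. (2)-(4) above, is given below in Python/NumPy; the function signature, the heading convention (angles measured from the global X-axis), and the default visual range are illustrative assumptions.

```python
import numpy as np

def case_a_conflict(pos_i, theta_i, v_i, pos_j, theta_j, v_j, R_v=17.0):
    """Conflict point and TTCs for two non-turning agents (Case A).

    pos_*: global (x, y); theta_*: heading angle from the global X-axis (rad);
    v_*: speed (m/s); R_v: visual range (m). Returns (y_c, ttc_i, ttc_j) or None.
    """
    # Eq. (2): global -> local frame of Agent i (y' forward, x' rightward).
    T = np.array([[np.sin(theta_i), -np.cos(theta_i)],
                  [np.cos(theta_i),  np.sin(theta_i)]])
    xj, yj = T @ (np.asarray(pos_j, float) - np.asarray(pos_i, float))
    phi_j = theta_i - theta_j               # clockwise angle between y' and j's heading
    if abs(np.sin(phi_j)) < 1e-9:           # parallel paths never cross the y' axis
        return None
    # Eq. (3): where Agent j's extrapolated path crosses Agent i's forward axis.
    if xj * np.sin(phi_j) > 0:              # j is moving away from i's path
        return None
    y_c = yj - xj * np.cos(phi_j) / np.sin(phi_j)
    if not (0.0 < y_c <= R_v):              # conflict point outside the visual field
        return None
    # Eq. (4): TTC of each agent (agents slower than 0.5 m/s get no TTC).
    ttc_i = y_c / v_i if v_i >= 0.5 else np.inf
    ttc_j = -xj / (np.sin(phi_j) * v_j) if v_j >= 0.5 else np.inf
    return y_c, ttc_i, ttc_j
```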
Case B: Agent $i$ is not a turning vehicle while the conflicting agent is a turning vehicle (e.g., Agents $i$ and $m$ in Fig. 7b). The predicted path of Agent $m$ is essentially a set of discrete points, and the gap between two consecutive points is $\delta$ (its value is 0.2 m in this case). Let $\vec{u}_k$ be the vector from the origin (local space) to a certain path point of Agent $m$, and $\vec{e}_{y'}$ be the vector along the $y'$-axis. Estimating the conflict point is equivalent to detecting the critical point (see Fig. 7b) where $\mathrm{sign}(\vec{u}_k \times \vec{e}_{y'})$ changes. If the critical point does not exist, it means Agent $m$ does not conflict with Agent $i$. In this case, the TTC indicators are calculated by:
$$\begin{cases} TTC_i^t = \dfrac{d_i}{v_i^t} \\[6pt] TTC_m^t = \dfrac{d_m}{v_m^t} \end{cases} \tag{5}$$
where $d_i$ = vertical distance between the origin and the critical point, $d_m$ = distance from Agent $m$'s current position to the critical point measured along Agent $m$'s path, and other notations remain unchanged.
Figure 7. Agent-based conflict point estimation: (a) Straight-through vehicles, cyclists, and pedestrians, (b) Turning vehicles
Case C: Agent $i$ is a turning vehicle while the conflicting agent is not (e.g., Agents $i$ and $j$ in Fig. 7c). Let $\vec{u}_k$ be the vector from Agent $j$'s position in the $x'$-$y'$ space to a certain path point of Agent $i$, and $\vec{e}_j$ be the local direction of Agent $j$. Then, the estimation of conflict points is similar to that in Fig. 7b. Differently, in this case $d_i$ is measured along Agent $i$'s path and $d_j$ equals the Euclidean distance from Agent $j$'s position to the critical point.
Case D: Both Agent $i$ and the conflicting agent are turning vehicles (e.g., Agents $i$ and $m$ in Fig. 7c). In this case, the pairwise distances between Agent $i$'s and Agent $m$'s path points are calculated. If the two agents' predicted trajectories do not overlap, let $d_{min}$ be the distance between the closest point pair. The two agents are considered as conflicting if $d_{min} \le \delta$. $d_i$ and $d_m$ are then measured along the respective paths.
While multiple conflict points may be detected ahead of Agent $i$, only the closest one (i.e., the one with minimum $d_i$) is considered in subsequent steps; hence, Fig. 7 only shows the case of Agent $i$. During the application of the procedure, every agent within the intersection detected by the RMU must be evaluated, which may pose a significant challenge to computational efficiency during high-traffic-volume hours. In a loop structure, the conflict point estimation module is executed agent by agent in sequence, as depicted in Fig. 8. The processing time may thus increase linearly with the number of agents.
Thanks to the agent-based approach for estimating conflict points, a parallel computing technique is applicable to accelerate the computations. More specifically, a virtual state detector is created to enable the application of multi-core parallel computing. The virtual state detector stores the motion states $\{P^t, D^t, V^t\}$ and the corresponding IDs $\{ID^t\}$ of all agents within the intersection at time step $t$, where $P^t$, $D^t$, and $V^t$ denote the positions, directions, and speeds, respectively; $Tr^t$ denotes the predicted trajectories of the turning vehicles. Note that $P^t$, $D^t$, $V^t$, and $Tr^t$ are all stored in the format of cell arrays. Then, $N_a$ copies of $\{P^t, D^t, V^t, Tr^t\}$ are created, where $N_a$ equals the number of agents within the intersection. Because the data volume of $\{P^t, D^t, V^t, Tr^t\}$ is small, creating copies of them does not consume much computer memory. In this way, the computational task can be divided into $N_a$ identical sub-tasks, which can be performed simultaneously. Owing to the duplication of $\{P^t, D^t, V^t, Tr^t\}$, the access to the motion states in each sub-task is independent, which enables the use of parallel computing. In multi-core or multi-thread parallel computing, the computations are carried out by several separate processing units simultaneously and thus are completed efficiently.
Once an effective conflict point is detected in a sub-task, the ID of the conflicting agent is retrieved from $\{ID^t\}$. In this way, the procedure can identify conflicting agents in pairs. For each pair of conflicting agents, their TTC data are also available via Eqs. (4) or (5). As illustrated in Fig. 7, the conflicting angles are also calculable in the $x'$-$y'$ space.
Figure 8. Use of multi-thread parallel computing
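The per-agent decomposition in Fig. 8 can be sketched with a Python multiprocessing pool as below; the study describes a multi-core implementation operating on cell arrays, so the worker function, the snapshot dictionary, and the reuse of the case_a_conflict helper from the previous sketch are all illustrative assumptions.

```python
from multiprocessing import Pool

def detect_conflicts_for_agent(args):
    """Sub-task: find the nearest effective conflict point for a single agent.

    states is a small snapshot {agent_id: (position, heading, speed)} passed by
    value to each worker, mirroring the duplicated virtual state detector.
    """
    agent_id, states = args
    pos_i, theta_i, v_i = states[agent_id]
    best = None
    for other_id, (pos_j, theta_j, v_j) in states.items():
        if other_id == agent_id:
            continue
        # case_a_conflict: see the Case A sketch above.
        res = case_a_conflict(pos_i, theta_i, v_i, pos_j, theta_j, v_j)
        if res is not None and (best is None or res[0] < best[1]):
            best = (other_id, res[0], res[1], res[2])   # (partner, y_c, ttc_i, ttc_j)
    return agent_id, best

def detect_all_conflicts(states, processes=8):
    """Run the agent-wise sub-tasks of one time step in parallel."""
    tasks = [(agent_id, states) for agent_id in states]
    with Pool(processes) as pool:
        return dict(pool.map(detect_conflicts_for_agent, tasks))
```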
3.4 3D Modeling of Obstacles
The static obstacles are modeled using dense point cloud data. Concerning dynamic obstacles, neither video flows nor Lidar scan sequences provide a good depiction of the 3D agents. Agents in a video sequence are merely 2D. Although the agents in a Lidar scan sequence are 3D in nature, the number of laser points on agents varies greatly. More specifically, the density of laser points on an agent decreases substantially with its distance from the Lidar sensor. Besides, due to the occlusion issue, the points on some agents are incomplete. Note that several Lidar devices can be integrated to achieve fully 3D sensing of intersections, yet the cost is currently too high for deployment at ordinary intersections.
Therefore, in this study all agents are represented by 3D bounding cuboids of predefined sizes. As displayed in Fig. 9, let $l$, $w$, and $h$ be the length, width, and height of each agent, respectively. Regarding the video-derived data, these variables are pre-defined according to agent classes (see Fig. 9). The dimensions of some common agents are listed in Table 4. In Lidar scan sequences, 3D bounding cuboids are usually created along with trajectory data using deep learning techniques (e.g., Lang et al., 2019). However, in some cases, when the laser points on an agent are incomplete due to occlusion, the bounding cuboid may not correspond to the agent's true size. Some studies have considered combining Lidar scans with video sequences to generate more accurate bounding cuboids for detected agents (Pang et al., 2020); however, this is not the focus of this study. Therefore, pre-defined bounding cuboids are an option for both video- and Lidar-derived motion data. These bounding cuboids serve as digital twins of agents in the virtual environment.
Table 4 Sizes of different agents

Class | Pedestrian | Cyclist | Car | Medium vehicle | Truck | Bus
$l$, $w$, $h$ (m) | 0.5, 0.5, 1.7 | 1.5, 0.5, 1.4 | 5.0, 1.8, 1.4 | 6.0, 2.0, 1.8 | 7.2, 2.3, 2.7 | 12.0, 2.55, 3.25
Figure 9. Bounding cuboids of agents
Two steps are required to make the 3D bounding cuboids representing the agents continuously align with their forward directions: (1) creating a digital ground model, and (2) rigid rotation. The digital ground model is created from the dense point cloud data using the approach proposed by Ma et al. (2021) (see Fig. 10). Specifically, the point cloud data are first partitioned into numerous pillars (grid size = $g$). Then, a pillar-wise filtering procedure is carried out whereby the points that are more than $h_g$ above the lowest point in each pillar are eliminated. The remaining points in the pillars are projected onto grid cells on the horizontal plane. Finally, the connected non-void cells are identified as ground points. Note that two grid cells are connected if their corners or edges touch.
Figure 10. Segmenting ground points
Figure 11. Aligning bounding box with forward directions
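A minimal Python/NumPy sketch of the pillar-wise filtering described above is given below; the grid size and height threshold default to the values later listed in Table 5, the final connected-cell check is omitted for brevity, and the function should be read as an assumption rather than the implementation of Ma et al. (2021).

```python
import numpy as np

def segment_ground_points(points, g=0.2, h_g=0.15):
    """Pillar-wise ground filtering.

    points: (N, 3) array of x, y, z coordinates. The cloud is binned into g x g
    pillars and, within each pillar, points more than h_g above the pillar's
    lowest point are discarded; the survivors are candidate ground points.
    """
    xy_idx = np.floor((points[:, :2] - points[:, :2].min(axis=0)) / g).astype(int)
    keys = xy_idx[:, 0] * (xy_idx[:, 1].max() + 1) + xy_idx[:, 1]   # linear pillar key
    order = np.argsort(keys)
    keys_sorted, pts_sorted = keys[order], points[order]
    ground_mask = np.zeros(len(points), dtype=bool)
    start = 0
    boundaries = np.append(np.flatnonzero(np.diff(keys_sorted)) + 1, len(keys_sorted))
    for end in boundaries:
        pillar = pts_sorted[start:end]
        ground_mask[order[start:end]] = pillar[:, 2] <= pillar[:, 2].min() + h_g
        start = end
    return points[ground_mask]
```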
Then, as depicted in Fig. 11, the elevation of each path point is obtained based on the ground points. However, these preliminarily identified ground points are inappropriate for real-time modeling of 3D agents considering their dense and disorganized nature. Therefore, to fill potential data gaps in the identified ground points and to speed up the subsequent interpolation of trajectory points, grid points with a uniform gap of $g$ m are created to cover the ground area. The elevations of the grid points are linearly interpolated using the identified ground data points (see Fig. 12). At this step, the grid points outside the convex hull of the horizontal ground points are void due to a lack of valid elevation information. The valid grid points are essentially an $N_g \times 3$ array, where $N_g$ denotes the total number of valid grid points.
Figure 12. Creation of a linear elevation array: (a) General process, (b) Example
Given a horizontal path point, the general process of elevation interpolation is to find its neighboring grid points first and then compute the elevation. However, neighbor searching at each time step is a computationally intensive task, even with a pre-constructed data structure such as a Kd-tree, which makes it difficult to support real-time estimation. Therefore, a linear elevation array is created from the $N_g \times 3$ array, as illustrated in Fig. 12. Suppose $(x_o, y_o)$ is the lower-left grid point. The horizontal and vertical distances from the upper-right grid point to $(x_o, y_o)$ are $D_x$ and $D_y$, respectively. In this phase, the unit length is $g$. A null $1 \times (D_x+1)(D_y+1)$ array is created. Grid points in the $N_g \times 3$ array are encoded as follows:
$$k = (D_x+1)\cdot d_y + d_x + 1, \quad \text{or} \quad k = (D_y+1)\cdot d_x + d_y + 1 \tag{6}$$
where $d_x$, $d_y$ = horizontal and vertical distances from each valid grid point to $(x_o, y_o)$, respectively (measured in units of $g$), $0 \le d_x \le D_x$, $0 \le d_y \le D_y$, $(d_x, d_y, D_x, D_y) \in \mathbb{N}$, and $k$ = the order of the grid point in the linear array.
Using $k$, the $N_g \times 3$ array is mapped onto a $1 \times (D_x+1)(D_y+1)$ array. Each element of the linear array stores the $Z$-value of the corresponding grid point, and the $X$-$Y$ coordinates are no longer required. Fig. 12b shows a sample in which four grid points are converted to a $1 \times 4$ array; in this case, the first encoding function is used. The correspondence between grid points and linear-array elements is visualized as well. Given a new path point $(x_p, y_p)$, its index $k$ can be obtained by Eq. (7), and the elevation value can then be retrieved efficiently from the linear array. The accuracy of the proposed interpolation method is closely associated with $g$. Therefore, some empirical tests were conducted to investigate the influence of $g$ on interpolation accuracy and efficiency. The results are presented in Appendix B.
$$k = (D_x+1)\cdot\left[\frac{y_p - y_o}{g}\right] + \left[\frac{x_p - x_o}{g}\right] + 1, \quad \text{or} \quad k = (D_y+1)\cdot\left[\frac{x_p - x_o}{g}\right] + \left[\frac{y_p - y_o}{g}\right] + 1 \tag{7}$$
where $[\,\cdot\,]$ = a function rounding a number to its nearest integer, and $(x_p, y_p)$ = the path point in the $X$-$Y$ frame.
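The encoding and lookup of Eqs. (6) and (7) can be sketched as follows in Python/NumPy (using 0-based indices instead of the 1-based indices in the equations); function and parameter names are assumptions.

```python
import numpy as np

def build_linear_elevation_array(grid_points, g=0.2):
    """Encode uniformly spaced grid points (x, y, z) into a flat lookup array (cf. Eq. (6))."""
    x0, y0 = grid_points[:, 0].min(), grid_points[:, 1].min()
    Dx = int(round((grid_points[:, 0].max() - x0) / g))
    Dy = int(round((grid_points[:, 1].max() - y0) / g))
    elev = np.full((Dx + 1) * (Dy + 1), np.nan)            # void cells stay NaN
    dx = np.rint((grid_points[:, 0] - x0) / g).astype(int)
    dy = np.rint((grid_points[:, 1] - y0) / g).astype(int)
    elev[(Dx + 1) * dy + dx] = grid_points[:, 2]           # first encoding of Eq. (6)
    return elev, (x0, y0, Dx, g)

def query_elevation(elev, params, xp, yp):
    """Retrieve the elevation of a path point by the rounded-index lookup of Eq. (7)."""
    x0, y0, Dx, g = params
    k = (Dx + 1) * int(round((yp - y0) / g)) + int(round((xp - x0) / g))
    return elev[k]
```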
The local right-handed $x''$-$y''$-$z''$ coordinate system of each bounding box is shown in Fig. 9. The origin is centered on the bottom center of the bounding box. The $x''$ and $y''$ axes are parallel to the long and short edges, respectively. Let $\{p_1, p_2, \ldots, p_t, \ldots\}$ be the track points of Agent $n$. Let the origin of the $x''$-$y''$-$z''$ frame overlap with $p_t$; then the global coordinates of the eight vertices that compose a bounding box are calculated as:
$$\begin{bmatrix} x_v^t \\ y_v^t \\ z_v^t \end{bmatrix} = \mathbf{R}_z \mathbf{R}_x \begin{bmatrix} x''_v \\ y''_v \\ z''_v \end{bmatrix} + \begin{bmatrix} x_n^t \\ y_n^t \\ z_n^t \end{bmatrix} \tag{8}$$
where
$$\mathbf{R}_z = \begin{bmatrix} \sin\gamma & -\cos\gamma & 0 \\ \cos\gamma & \sin\gamma & 0 \\ 0 & 0 & 1 \end{bmatrix} \tag{9}$$
$$\mathbf{R}_x = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos\beta & -\sin\beta \\ 0 & \sin\beta & \cos\beta \end{bmatrix} \tag{10}$$
where $\beta$, $\gamma$ = rotation angles around the $x$ and $z$ axes, respectively (positive in the clockwise direction), $\mathbf{R}_x$, $\mathbf{R}_z$ = rotation matrices around the $x$ and $z$ axes, respectively, $(x''_v, y''_v, z''_v)$ and $(x_v^t, y_v^t, z_v^t)$ = coordinates of a vertex in the local $x''$-$y''$-$z''$ and global spaces, respectively, at time step $t$, and $(x_n^t, y_n^t, z_n^t)$ = position of Agent $n$ at time step $t$.
In most cases, $\gamma$ and $\beta$ can be derived using the agent's direction vectors. However, at signalized intersections, agents may remain stationary for a while to wait for their green phase. In that case, it is difficult to estimate the forward direction vector using adjacent trajectory points (i.e., adjacent elements of the 'Trajectory' data). To address this issue, if an agent's position does not change, the rotation angles corresponding to its latest movement can be retrieved from 'RotateAngles'. As such, agents can be modeled even when they are stationary.
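For illustration, a Python/NumPy sketch of Eqs. (8)-(10) that returns the eight cuboid vertices in global coordinates is given below; the vertex ordering and the function name are assumptions.

```python
import numpy as np

def cuboid_vertices(position, gamma, beta, l, w, h):
    """Global coordinates of the eight bounding-cuboid vertices (Eqs. (8)-(10)).

    position: (x, y, z) of the agent (cuboid bottom centre); gamma, beta: rotation
    angles about the z and x axes (rad); l, w, h: cuboid length, width, height (m).
    """
    # Vertices in the local x''-y''-z'' frame, origin at the bottom centre.
    local = np.array([[sx * l / 2, sy * w / 2, sz * h]
                      for sx in (-1, 1) for sy in (-1, 1) for sz in (0, 1)]).T
    Rz = np.array([[np.sin(gamma), -np.cos(gamma), 0.0],
                   [np.cos(gamma),  np.sin(gamma), 0.0],
                   [0.0, 0.0, 1.0]])                       # Eq. (9)
    Rx = np.array([[1.0, 0.0, 0.0],
                   [0.0, np.cos(beta), -np.sin(beta)],
                   [0.0, np.sin(beta),  np.cos(beta)]])    # Eq. (10)
    return (Rz @ Rx @ local).T + np.asarray(position, float)   # Eq. (8), shape (8, 3)
```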
3.5 Dynamic Visibility Assessment
After the pairs of conflicting agents are identified, a set of virtual LOS is created accordingly. For pedestrians and cyclists, it is assumed that the sight point overlaps with the trajectory point on the horizontal plane but with an adjustable height; in this case, the vertical offset of the sight point from the bottom center is $0.8h$, as illustrated in Fig. 13a. The eight vertices of the conflicting agent are viewed as target points. For cars and medium vehicles, the horizontal offsets of the sight point from the front side and the left side of the bounding cuboid are empirically set as 1.5 m and 0.5 m, respectively. For trucks and buses, the distance from the eye to the front side is 1.0 m, and the other settings remain the same as for cars. Note that a more specific eye position in a bounding cuboid can be set by defining its offset from the cuboid center. Besides, in countries where vehicles keep to the left, the eye point is on the right side (dashed lines in Fig. 13b).
Figure 13. Creation of the virtual LOS: (a) Basic model, (b) Setting of eye points
The goal of the visibility assessment is to examine whether static or dynamic obstacles obstruct the LOS. Due to the difference between the two types of obstacles, two approaches are separately applied to inspect their relationship with the LOS. The general process of the hybrid visibility assessment method considering both static and dynamic obstacles is graphically presented in Fig. 14. In Fig. 14, $N_c$ is the number of pairwise conflicting agents, $\mathbf{C}$ is an $N_c \times 2$ matrix storing the IDs of conflicting agents, and $\mathbf{C}(q)$ means accessing the $q$-th row of $\mathbf{C}$. $\mathbf{ID}_{tmp}$ is used to store the temporary IDs of the agents within the intersection after excluding $\mathbf{C}(q)$, and $N_{tmp}$ is the size of $\mathbf{ID}_{tmp}$. Similar to $\mathbf{C}(q)$, $\mathbf{ID}_{tmp}(r)$ means accessing the $r$-th element of $\mathbf{ID}_{tmp}$. The algorithm is described in detail next.
Figure 14. Process of hybrid visibility assessment method at each time step
3.5.1 For static obstacles
Regarding static point clouds as obstacles, inspired by voxel-based methods (e.g., Shalkamy et al., 2020), an occupancy-based analysis involving the following five steps is proposed:
Step 1: Let $(x_{ll}, y_{ll}, z_{ll})$ and $(x_{tr}, y_{tr}, z_{tr})$ be the coordinates of the lower-left and top-right corners of the static point cloud data. They are separately estimated as follows:
$$\begin{bmatrix} x_{ll} \\ y_{ll} \\ z_{ll} \end{bmatrix} = \begin{bmatrix} \min(X) \\ \min(Y) \\ \min(Z) \end{bmatrix}, \qquad \begin{bmatrix} x_{tr} \\ y_{tr} \\ z_{tr} \end{bmatrix} = \begin{bmatrix} \max(X) \\ \max(Y) \\ \max(Z) \end{bmatrix} \tag{11}$$
An $N_x \times N_y \times N_z$ binary matrix $\mathbf{M}_{N_x \times N_y \times N_z}$ representing the occupancy map is then constructed as follows:
$$\begin{cases} N_x = \left[\dfrac{x_{tr} - x_{ll}}{g}\right] \\[6pt] N_y = \left[\dfrac{y_{tr} - y_{ll}}{g}\right] \\[6pt] N_z = \left[\dfrac{z_{tr} - z_{ll}}{g}\right] \end{cases} \tag{12}$$
where $[\,\cdot\,]$ = a function rounding a number to an integer, $g$ = grid size, and $(X, Y, Z)$ = coordinates of the static point clouds. The initial value of each element of $\mathbf{M}_{N_x \times N_y \times N_z}$ is zero.
Step 2: Calculate the locations $(\ell, m, n)$ of the static point cloud data in $\mathbf{M}_{N_x \times N_y \times N_z}$, which can be estimated as:
$$\begin{cases} \ell = \left[\dfrac{X - x_{ll}}{g}\right] \\[6pt] m = \left[\dfrac{Y - y_{ll}}{g}\right] \\[6pt] n = \left[\dfrac{Z - z_{ll}}{g}\right] \end{cases} \tag{13}$$
where the notations remain unchanged from Eq. (12).
Step 3: Set the elements at $(\ell, m, n)$ to 1: $\mathbf{M}_{N_x \times N_y \times N_z}(\ell, m, n) = 1$; the occupancy map is then constructed as shown in Fig. 15.
Figure 15. Generation of occupancy matrix: (a) Points to occupancy map, and (b) An example of occupancy matrix
Step 4: Suppose $(x_e, y_e, z_e)$ and $(x_t, y_t, z_t)$ denote the eye and target positions, respectively. The line segment that connects $(x_e, y_e, z_e)$ and $(x_t, y_t, z_t)$ is discretized into points using Eq. (14), as follows:
$$\begin{bmatrix} x_s \\ y_s \\ z_s \end{bmatrix} = \begin{bmatrix} x_e \\ y_e \\ z_e \end{bmatrix} + \frac{s \cdot g}{\lVert L \rVert} \begin{bmatrix} x_t - x_e \\ y_t - y_e \\ z_t - z_e \end{bmatrix} \tag{14}$$
where $(x_s, y_s, z_s)$ = discretized points in the LOS, and
$$\lVert L \rVert = \sqrt{(x_t - x_e)^2 + (y_t - y_e)^2 + (z_t - z_e)^2} \tag{15}$$
$$s \in \{0, 1, 2, \ldots \mid 0 \le s \le \lVert L \rVert / g,\ s \in \mathbb{N}\} \tag{16}$$
where both $(x_e, y_e, z_e)$ and $(x_t, y_t, z_t)$ are included in $(x_s, y_s, z_s)$.
Step 5: The points $(x_s, y_s, z_s)$ are converted to locations $(\ell_s, m_s, n_s)$ using Eq. (13). The elements of $\mathbf{M}_{N_x \times N_y \times N_z}$ are accessed using $(\ell_s, m_s, n_s)$. If there are non-zero elements, the LOS is obstructed by static obstacles; otherwise, it is unobstructed. Note that each vertex of the conflicting agent corresponds to a set of LOS points. To avoid reducing efficiency by evaluating the eight LOS one by one, the discretized points of the eight LOS are combined and converted to locations in the occupancy matrix together. The process is described in detail in Appendix C.
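A condensed Python/NumPy sketch of the occupancy-based check (Steps 1-5) follows; it uses floor-based voxel indices instead of the rounding in Eq. (13) and omits the batched treatment of the eight LOS described in Appendix C, so it is an illustrative assumption rather than the authors' implementation.

```python
import numpy as np

def build_occupancy(points, g=0.2):
    """Voxelise the static point cloud into a binary occupancy matrix (Steps 1-3)."""
    p_min = points.min(axis=0)
    idx = np.floor((points - p_min) / g).astype(int)
    occ = np.zeros(idx.max(axis=0) + 1, dtype=bool)
    occ[idx[:, 0], idx[:, 1], idx[:, 2]] = True
    return occ, p_min

def los_blocked_static(occ, p_min, eye, target, g=0.2):
    """Discretise the LOS (Steps 4-5) and test the samples against the occupancy matrix."""
    eye, target = np.asarray(eye, float), np.asarray(target, float)
    length = np.linalg.norm(target - eye)
    n_steps = int(length / g) + 1
    samples = eye + np.linspace(0.0, 1.0, n_steps + 1)[:, None] * (target - eye)
    idx = np.floor((samples - p_min) / g).astype(int)
    inside = np.all((idx >= 0) & (idx < occ.shape), axis=1)   # drop samples outside the map
    idx = idx[inside]
    return bool(occ[idx[:, 0], idx[:, 1], idx[:, 2]].any())
```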
3.5.2 For dynamic obstacles
If the LOS is not affected by stationary obstacles, the module for assessing dynamic obstacles is triggered. Detecting whether a bounding cuboid obstructs an LOS is achieved by inverting the transformation in Eq. (8). Using the bounding cuboid of Agent $n$ as an illustration, all $(x_s, y_s, z_s)$ are converted to the local $x''$-$y''$-$z''$ coordinate frame as follows:
$$\begin{bmatrix} x''_s \\ y''_s \\ z''_s \end{bmatrix} = (\mathbf{R}_z\mathbf{R}_x)^{-1} \begin{bmatrix} x_s - x_n^t \\ y_s - y_n^t \\ z_s - z_n^t \end{bmatrix} \tag{17}$$
where $(x''_s, y''_s, z''_s)$ = coordinates of $(x_s, y_s, z_s)$ in the local frame (see Fig. 9) of Agent $n$; the other notations remain the same as in Eq. (8).
In the local $x''$-$y''$-$z''$ space, if any point of $(x''_s, y''_s, z''_s)$ is detected within the bounding box by inequality (18), the target point is invisible to the sight point; otherwise, it is visible. If an observer does not see a moving target conflicting with him/her, he/she may not detect the presence of the conflict point, which is a hazardous situation in which a collision may occur.
$$\begin{cases} -\dfrac{l}{2} \le x''_s \le \dfrac{l}{2} \\[6pt] -\dfrac{w}{2} \le y''_s \le \dfrac{w}{2} \\[6pt] 0 \le z''_s \le h \end{cases} \tag{18}$$
where $l$, $w$, $h$ = length, width, and height of Agent $n$, respectively.
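The dynamic-obstacle test of Eqs. (17)-(18) can be sketched as below in Python/NumPy; the samples argument is assumed to be the discretised LOS points from the previous sketch, and the rotation matrices repeat those of the cuboid-vertex sketch.

```python
import numpy as np

def los_blocked_by_cuboid(samples, position, gamma, beta, l, w, h):
    """True if any discretised LOS point lies inside the agent's cuboid (Eqs. (17)-(18)).

    samples: (N, 3) LOS points in global coordinates; the remaining arguments
    describe the obstructing agent's cuboid as in the vertex sketch above.
    """
    Rz = np.array([[np.sin(gamma), -np.cos(gamma), 0.0],
                   [np.cos(gamma),  np.sin(gamma), 0.0],
                   [0.0, 0.0, 1.0]])
    Rx = np.array([[1.0, 0.0, 0.0],
                   [0.0, np.cos(beta), -np.sin(beta)],
                   [0.0, np.sin(beta),  np.cos(beta)]])
    # Eq. (17): bring the LOS points into the cuboid's local x''-y''-z'' frame.
    local = np.linalg.inv(Rz @ Rx) @ (np.asarray(samples, float) - np.asarray(position, float)).T
    x, y, z = local
    # Inequality (18): inside-test against the half length/width and the height.
    inside = (np.abs(x) <= l / 2) & (np.abs(y) <= w / 2) & (z >= 0) & (z <= h)
    return bool(inside.any())
```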
As noted in Fig. 14, during the execution of the visibility assessment modules, the static obstacles are checked first at each time step. If the LOS is obstructed, the subsequent analysis of dynamic obstacles is skipped to avoid wasting computational power. When dynamic obstacles are considered, the LOS connecting two agents are evaluated sequentially. Besides, 3D bounding cuboids are not created for the pair of conflicting agents whose intervisibility is being estimated (the other agents within the intersection are still modeled).
4 Validation
Two controlled experiments shown in Fig. 16a were conducted to validate the proposed procedure.
Specifically, three volunteers were recruited to walk along designated routes on a site with restricted
visibility. During the experiment, each participant used a camera (wide view mode) to record their
views. A GoPro Hero 7 camera (GoPro, 2021) was mounted on top of an erected pole to capture video footage of the three participants' movements. Fig. 16b displays video sequences collected from
different perspectives (POV denotes Point of View) at the same time. It is noteworthy that different
video data were synchronized for subsequent analyses.
The trajectory data of each participant were extracted from the video data using commercial software named DataFromSky (2021). The reliability of DataFromSky has been acknowledged in
the literature (Adamec et al., 2017). Fed with video data, DataFromSky can output each participant’s
path and its corresponding semantic class.
Figure 16. Controlled experiments: (a) Walking routes, (b) Video data from different perspectives
Dense point cloud data of the site were collected with a Leica P40 Scan Station (HEXAGON, 2021) and are visualized in Fig. 17. The Leica P40 Scan Station can capture ultra-dense point clouds at the millimeter level, and the point cloud data were then used to manually geo-register the path data. Next, the digital ground model in grids shown in Fig. 17 was generated using the procedure described in Section 3.4.
Figure 17. Static point cloud data
Table 5 Variable values

Variable | $R_v$ (m) | $\alpha$ (°) | $\delta$ (m) | $h_g$ (m) | $g$ (m) | $\Delta t$ (s)
Value | 17.0 | 180 | 0.2 | 0.15 | 0.2 | 0.03
Figure 18. From a 2D video sequence to a 3D scene
Figure 19. Comparison with ground truth: (a) Experiment 1, (b) Experiment 2, (c) Video sequences
Using the point cloud data and geocoded trajectory data as inputs, the proposed procedure can estimate IvCA in real-time. Variable values are empirically specified in Table 5. As illustrated in Fig. 18, the procedure can effectively reconstruct the dynamic 3D scene corresponding to a video sequence. During the execution, the procedure automatically records the time-stamped IDs of conflicting agents and IvCA information (i.e., the vis indicator in Fig. 19: 0 - invisible; 1 - visible). It is therefore viable to compare the estimated results with the ground truth. Specifically, when a pair of conflicting agents is detected, the corresponding time stamp data can be used to manually retrieve the video data shown in Fig. 16b. Then, it can be manually examined whether each participant could see the conflicting agent. Comparisons between the estimated visibility results and the ground truth in both
experiments are displayed in Fig. 19.
The dashed lines in Fig. 19 refer to the ground truth. The results in Fig. 19 demonstrate that the proposed procedure can effectively assess IvCA because the estimated vis indicator data overlap substantially with the ground truth. Considering the tracking error and the sight point offset, it is understandable that the estimated times at which visibility changes deviate slightly from the real situation. There is a small gap (refer to the arrow mark in Fig. 19a) between the beginning times of the identified and the real conflict events in each case. This results from the definition of the visual range depicted in Fig. 5: at the beginning, the agents did not fall within each other's visual fields and thus conflict events were not detected. Note that the agents' visual ranges and viewing angles are adjustable to accommodate different scenarios.
During the experiments, the participants did not exactly trace the routes shown in Fig. 16a, which accounts for why Agent 3 may conflict with Agent 2 in some cases (see the area marked by the dashed ellipse). Regarding Case I in Fig. 19a, it is estimated that Agent 2 conflicted with Agent 3, which corresponds to the real-life situation (refer to Fig. 19c). In Case II, Agent 3 occasionally conflicted with Agent 2 because of variations in Agent 2's forward direction. The time curves are plotted in Fig. 20. All computations were executed on a computer with 16 GB of RAM and an 8-core Intel® i7-10700 CPU @ 2.9 GHz (the same hereinafter). As noted, the efficiency is satisfactory.
Figure 20. Time curves: (a) Experiment 1, (b) Experiment 2
5 Case Study
5.1 Data Descriptions
The procedure proposed in this study is also tested in two virtual scenarios. The data were collected
at an unsignalized intersection in Southeast University (SEU), Nanjing, P.R. China and a signalized
intersection in Singapore, as shown in Fig.21. The video data on Site 1 (553 seconds) and Site 2
(802 seconds) were captured with a Phantom 4 Pro® (DJI, 2021) and a GoPro Hero 7 camera (GoPro, 2021), respectively. The technical information regarding the two video cameras is presented in Table 6a. Similar to the validation part, the trajectory data were extracted from the video footage using DataFromSky. Regarding the path data, each path point has a timestamp in addition to its coordinates. In both cases, the combination of video data and DataFromSky served as the virtual RMU.
Figure 21. Study sites: (a) Locations, (b) Signal phases on Site 2, (c) Histograms of agent types
Fig.21b shows a three-phase traffic signal cycle on Site 2. As noted, turning vehicles may
conflict with pedestrians in Phases 1 and 3. On Site 1, a significant proportion of agents are non-
motorized road users (see Fig. 21c). Cyclists and pedestrians may move randomly at the
unsignalized intersection, which may give rise to more conflict events. In comparison, most agents
on Site 2 are motorized vehicles.
Table 6 Technical parameters of video cameras and laser scanners

(a) Video cameras
Technical parameters | Phantom 4 Pro® | GoPro Hero 7®
Field of view | 84° | 87.6° (linear mode)
Video resolution | 1920 × 1080 pixels | 1920 × 1080 pixels
Frame rate | 24 FPS | 30 FPS
Height | 20 m | 45 m
Hover accuracy | ±0.3 m horizontally, ±0.1 m vertically | /

(b) Laser scanners
Technical parameters | Livox Mid-40 | Leica P40
Range | <260 m | 0.4–270 m
Ranging accuracy | 2 cm @ 20 m | 1.2 mm + 10 ppm
Field of view | 38.4° × 38.4° | 360° × 290°
Angular resolution | <0.1° | 8" horizontal; 8" vertical
Data rate | 100,000 points/s | 1,000,000 points/s
The static 3D infrastructure data on Site 1 and Site 2 were collected by Livox Mid-40 (LIVOX,
2021) and Leica P40 Scan Station (HEXAGON, 2021), respectively. The related technical
parameters are presented in Table 6b. Due to a limited field of view of Livox Mid-40, the Lidar
device was rotated to capture a 360° view of the intersection on Site 1. As shown in Fig. 22, the
point cloud data enable a fine depiction of the intersection on each study site. Note that all non-stationary noise points (e.g., cyclists, pedestrians, and cars) were manually removed from the point cloud data using CloudCompare (2021).
5.2 Application
Variable values are specified in Table 5, except that the $\Delta t$ values for Site 1 and Site 2 are 0.04 s and 0.067 s, respectively. The execution time of several pre-processing steps before the application of the procedure is presented in Table 7. Note that the time of manual operations (e.g., registration) is not included. Because of the geocoded nature of the point cloud data, several ground feature points were used to manually geo-register the video data, as illustrated in Figs. 23a and 23b. A sample of the original and geo-registered video sequence is also shown in Figs. 23a and 23b. It is noteworthy that only horizontal coordinates were used in this phase. The mean alignment error is within 10 centimeters, which is acceptable for the subsequent conflict point computations.
Table 7 Time of pre-processing steps in seconds

Step | Segment road points | Generate DEM | Generate path map
Site 1 | 2.77 | 3.12 | 1.76
Site 2 | 2.96 | 3.31 | 2.92
Fig. 23c visualizes all tracks after the manual geo-registration. Because the intersection on Site 1 is not controlled by signals, the agents' movements are quite complicated. In comparison, vehicles' movements on Site 2 show a regular pattern. In future real-world cases, the transformation matrix $s \cdot \mathbf{T}$ that aligns video data with point cloud data can be estimated first. Then $s \cdot \mathbf{T}$ can be incorporated into the tracking algorithms embedded in RMUs. In this way, the monitoring devices can output geocoded motion data directly.
Figure 22. Static point cloud data: (a) Site 1, and (b) Site 2
Figure 23. Manual alignment of video sequence and point cloud data: (a) Feature points, (b) Visualization of tracks measured in meters, (c) Trajectory data on both sites
Similar to Fig. 18, the approach by Ma et al. (2021) was applied to identify the ground points from the Lidar data. In addition, to ensure real-time interpolation, the digital ground models in grids were created accordingly for both intersections. To imitate the dynamic process of receiving data from an RMU, the virtual procedure imported path points in sequence according to their timestamps. A 3D bounding cuboid was created at each trajectory point in each time step, as illustrated in Fig. 24a. The creation of the twin bounding cuboids is closely related to the detection accuracy of the tracking algorithms. If an agent failed to be identified by the RMU (see the missed detection in Fig. 24a), no bounding cuboid was created for it. However, tracking techniques are not the focus of this study; hence, their adverse impacts on conflict point detection are not considered here.
Figure 24. 3D modeling of obstacles: (a) Site 1, and (b) Site 2
Figure 25. Time tests: (a) Three computing structures, and (b) Influence of core numbers
As previously mentioned in Section 3.3.2, the parallel computing technique was applied to speed up the estimation of conflict points. Several time tests were conducted to examine the efficiency of different computing structures. Note that the tests were performed on the data of Site 1 considering the more complex interactions among agents there. Three types of structures are available: for-loop, vectorization, and parallel computing. The for-loop and parallel computing structures are illustrated in Fig. 8. Vectorization refers to performing operations on multiple components of a vector at the same time (MATLAB, 2021); in that case, the agents are stored in the form of a cell array where each element corresponds to an agent. The results are plotted in Fig. 25.
Figure 26. Dynamic creation of the LOS
The horizontal and vertical axes refer to the number of agents to be evaluated and the time taken to identify conflicting agents, respectively. Samples of 10, 20, 30, 40, 50, 100, 200, 400, and 600 agents were separately evaluated using the different computing structures. In this case, all cores were used to execute the computations. From Fig. 25, it is noted that the parallel computing structure substantially outperforms the other two in efficiency when the number of agents exceeds 50. The differences between the for-loop and vectorization structures are negligible when there are fewer than 400 agents. The differences between the parallel computing structure and the other two are positively correlated with the number of agents. In these tests, the vectorization structure did not show its usual efficiency advantage because of the complexity of the computational tasks.
Fig. 25b shows the impact of the number of cores on the time performance of parallel computing. Generally, when there are over 100 agents within the intersection, the more cores are used, the less time the computations take. However, when the number of agents is less than 100, more cores do not result in faster computations.
Once a pair of conflicting agents is identified in Layer 1, as illustrated in Fig. 26, a set of virtual
LOS is created in Layer 3. The visibility assessment procedure is then applied to check whether the
conflicting agents are intervisible. A sample of two LOS separately obstructed by a bounding cuboid
and a tree trunk is shown in Fig. 27a. Figs. 27b and 27c illustrate the different ways of detecting obstructions in an LOS for dynamic and static obstacles, respectively. With reference to the close-up view of LOS 1 in Fig. 27b, the procedure can detect the LOS points inside a bounding cuboid. In Fig. 27c, the discretized LOS points that overlap with tree trunk points are marked in red.
Figure 27. Visualization of visibility assessment: (a) Two LOS, (b) Close-up view of LOS 1, and (c) Close-up view of LOS 2
5.3 Outputs
5.3.1 Processing time
The stepwise processing time and the number of agents are both plotted in Fig. 28. As noted, the
procedure takes about 0.05 s and 0.08 s to complete computations at each step (8 threads were used)
on Site 1 and Site 2, respectively. Therefore, the procedure can support real-time monitoring of
conflict points with limited computational power. Besides, thanks to parallel computing, the
processing time is not very sensitive to the number of agents, as indicated by the rectangles in Fig.
28a. Because the intersection on Site 1 is unsignalized, it is understandable that the number of
conflicting agents is relatively large at each time step. In comparison, despite more agents on Site
2, not many conflict events are detected because traffic flow is controlled by traffic signals.
Figure 28. Processing time and number of agents: (a) Site 1, and (b) Site 2
5.3.2 Index
An index that measures the severity of conflict points can be output in real-time using the developed procedure. The index equals the ratio of the number of 'hidden' conflict points to the total number of conflict points at each time step, as shown in Fig. 29. A hidden conflict point corresponds to a pair of non-intervisible conflicting agents. A higher index value may imply a more dangerous situation in which a larger proportion of conflicting agents are not intervisible to each other. In future studies, the index could contribute to real-time models (e.g., Zheng and Sayed, 2020) for predicting crash occurrence at intersections.
Although there are many conflict events on Site 1, most conflicting agents are intervisible, as noted in Fig. 29. A large proportion of the traffic participants on Site 1 are pedestrians and cyclists (see Fig. 21c), whose dimensions do not cause severe occlusion issues. In contrast, motorized vehicles account for a significant share of the agents on Site 2, in which case it is more likely that an agent's view is substantially blocked by a large vehicle in a conflict event. Therefore, in spite of fewer conflict events, the index values are at a higher level on Site 2 than on Site 1.
Figure 29. Index at different time steps
5.3.3 Virtual warning signals
The procedure records all conflicting agents at each time step, as shown in Fig. 30a. The first two columns are the IDs obtained in Layer 1, while the third column stores the indicator denoting any obstruction in the LOS connecting two conflicting agents. More specifically, an indicator of 1 or 0 means that the agent in Column 1 can or cannot see the corresponding agent in Column 2, respectively. If an indicator of 0 is detected, a warning signal can be emitted from the RMU to the corresponding agents to avoid a potential collision. It is noteworthy that in real-world applications, whether it is necessary to emit a warning signal should also consider the TTC information of the conflicting agents.
Fig. 30b visualizes the results of Layers 1 and 2 at the 4740th time step on Site 2. In this case, two turning vehicles (i.e., Agents 252 and 254) are conflicting with Pedestrians 367 and 423. From Fig. 30b, note that no obstructions lie between each pair of conflicting agents, which is in accordance with the indicators shown in Fig. 30a. In addition to real-time monitoring of IvCA, the results shown in Fig. 30 may help reconstruct and understand how an accident occurs at an intersection. Specifically, using the trajectory data of the road users involved in a crash as inputs, the proposed procedure can estimate their frame-wise intervisibility before the collision.
Figure 30. Virtual warning signals: (a) Conflicting agents and indicators, and (b) Reconstructed
scene
5.3.4 Spatial distribution of conflict points
The procedure also enables a post-analysis of conflict events within the intersection. Fig. 31 shows
the spatial distribution of conflict points spanning the entire video sequences. The color bars in Figs.
31a and 31b map the conflicting angle and the TTC difference of each conflict point, respectively. The TTC difference equals the difference between the two conflicting agents' TTC values; any value whose absolute magnitude exceeds 5 s is capped at 5 s. A smaller TTC difference indicates a higher risk level.
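A short sketch of this post-processing is shown below; it assumes the measure is the difference between the two agents' TTC values, and the field name ttc_diff is illustrative.

def capped_ttc_difference(ttc_a, ttc_b, cap=5.0):
    """Difference between the two conflicting agents' TTC values, with its
    absolute value capped at 5 s; a smaller magnitude implies a higher risk."""
    d = ttc_a - ttc_b
    return max(-cap, min(cap, d))

def points_for_plotting(conflict_points, limit=4.0):
    """Keep only the conflict points whose |TTC difference| <= 4 s (the filter used for Fig. 31b)."""
    return [p for p in conflict_points if abs(p["ttc_diff"]) <= limit]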
On Site 1, many conflict points are scattered over the intersection. The result is reasonable, considering that the intersection is not signal-controlled. However, consistent with the results in Fig. 29, the 'hidden' conflict points account for only a small part of all conflict points. Besides, there is no notable pattern in the distribution of 'hidden' conflict points. Only the conflict points whose TTC difference is ≤ 4 s are visualized in Fig. 31b. On Site 2, a majority of 'hidden' conflict points occurred at pedestrian crossings (demarcated by dashed lines). In Phase 2, the inner turning vehicles from Pioneer Rd. North to Jurong West St. may conflict with approaching vehicles on their left side, and their intervisibility may be adversely affected by the outer turning vehicles. This accounts for the conflict points inside the ellipse marked in Fig. 31b. However, because the approaching vehicles are not allowed to go straight through the intersection during Phase 2, the risk level of these conflict points is relatively low.
From Fig. 31, the locations with more hidden conflicts can be identified directly, and specific countermeasures can then be implemented accordingly. For instance, more hidden conflict points are observed in the pedestrian crossing areas on Site 2; therefore, traffic control devices or traffic signs could be installed there to mitigate the potential IvCA issue.
Figure 31. Visualization of conflict points: (a) Site 1, and (b) Site 2
5.3.5 Individual time-series data
For behavioral analyses, the procedure can also output individual conflict- and visibility-related data. Fig. 32 shows several time-series data of Agent 13 (a pedestrian) on Site 1: the speed profile, two conflict-related time measures, the visibility indicator (see Fig. 30), and the IDs of conflicting agents. As shown in Fig. 32a, Agent 13 conflicts with Agent 12 first and then with Agent 14. As indicated by the visibility indicator profile, Agents 12 and 13 are not intervisible at T1 and T2. The traffic scenes at T1 and T2 are reconstructed in Fig. 32b using the proposed procedure. In this case, the intervisibility of Agents 12 and 13 is reduced by static roadside obstacles (i.e., a tree trunk and poles). However, because the roadside obstacles are not continuous, Agents 12 and 13 can still visually detect each other during the interaction. Therefore, Agent 13 can decelerate in time to avoid colliding with Agent 12.
Figure 32. Time series data of Agent 13 in a conflict event: (a) Time series data, and (b)
Reconstructed scene
Figure 33. Time series data of Agent 924 in a conflict event: (a) Time series data, and (b)
Reconstructed scene
Fig. 33 shows a relatively more complex case on Site 2. Agent 924 is a motorcyclist making a turn. During this maneuver, Agent 924 mainly conflicted with Agent 861 (a pedestrian). The visibility indicator profile implies that IvCA was poor in this conflict event. The traffic scenes at three time steps are reconstructed to acquire insights into the IvCA issue. In Fig. 33b, it is noted that at two of these time steps, the presence of Agent 675 (a bus) substantially blocks Agent 924's view. Because Agent 924 decelerated continuously, the poor IvCA did not result in a collision in this case. However, if Agent 924 had been a reckless or aggressive rider, the poor IvCA may have significantly increased the probability of a right-angle collision. The time-series data and reconstructed scenes in Figs. 32 and 33 thus provide a better understanding of the behavioral interactions among conflicting agents.
6 Concluding Remarks
To the authors’ best knowledge, this study is the first attempt to achieve real-time monitoring of
intervisibility between conflicting agents in pairs at intersections. Based on the study, the following
comments are offered:
1. When properly developed, the proposed procedure can be deployed in real-world RMUs and
help improve intersection safety. For instance, the severity level of non-intervisibility can be
quantified based on its duration. If two conflicting agents cannot see each other for longer than a specified time, a warning signal should be emitted to avoid a potential collision; conversely, if the duration of non-intervisibility is relatively short, it may be undesirable to warn the conflicting agents at all. In the future, developing effective strategies for emitting warning signals that account for agents' reactive behaviors and interactions will be crucial to the signals' performance.
2. The intervisibility between conflicting agents may substantially affect the safety performance
of intersections. However, it has not been considered in previous studies on conflict analyses. In this regard, incorporating intervisibility constraints into current crash prediction
models shall improve those models’ accuracy.
3. Considering that the new procedure can output individual time series data in a conflict event, it
can assist in studying road users’ interactions, especially in cases where visibility is limited. For
instance, the conflict events with the non-intervisibility issue can be identified using the
visibility indicators. Then, different road users' reactions to a hidden conflicting agent can be understood by analyzing variations in their motion states.
4. Some limitations remain to be addressed in future work. Trajectory data acquired by roadside Lidar have not been tested in this study. Besides, the sizes of the bounding cuboids representing agents were pre-defined, and it is desirable to use more accurate digital twins to map the real-world situation better. For instance, video data can be combined with Lidar scans to achieve a more accurate estimation of agent sizes (e.g., Xu et al., 2018). In Section 3.3.2, the path matching procedure was mainly designed for motorized vehicles considering their regular turning patterns; a path prediction model for the more erratic movements of cyclists is desired in future studies. The experiments for validation in this study are relatively simple, and more experiments covering various traffic scenarios are needed to measure the quality of the warning signals. Also, some parameters were determined empirically in the case study; more test sites and scenarios should therefore be investigated in a following study to examine their influences on accuracy and efficiency.
Acknowledgements
This work is jointly supported by the National Natural Science Foundation of China [51878163, and
51768063]; the Natural Sciences and Engineering Research Council of Canada (Ryerson-2020-
04667); the Chinese Scholarship Council [202006090200], and the Scientific Research Foundation
of Graduate School of Southeast University [YBPY2038].
Appendix
A. Generation of Path Map
As described in Section 3.3.2, a set of agents’ trajectories is collected on site during busy traffic
hours. The linear fitting method is applied to segment curved paths from the trajectory set.
Specifically, a linear regression model is fitted to each set of discrete path points. If the average
Euclidean distance from path points to the fitted line exceeds 0.5 m, the path is considered as a
curved one. Natural cubic splines are used to fit the segmented curved paths (Ma et al., 2019). As
such, each curved path can be re-partitioned into uniformly spaced points (see Fig.6).
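A minimal sketch of the curved-path screening step is given below, assuming 2-D path points and the 0.5 m threshold stated above; the function name and the least-squares formulation are illustrative.

import numpy as np

def is_curved(path_points, threshold=0.5):
    """Flag a path as curved when the mean Euclidean distance from its x-y points to a
    fitted straight line exceeds 0.5 m.  A simple least-squares fit y = a*x + b is used
    here; near-vertical paths would need the axes swapped."""
    pts = np.asarray(path_points, dtype=float)
    a, b = np.polyfit(pts[:, 0], pts[:, 1], deg=1)
    dist = np.abs(a * pts[:, 0] - pts[:, 1] + b) / np.hypot(a, 1.0)
    return dist.mean() > threshold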
Suppose that there are N curved paths in total. As illustrated in Fig. A1a, each observed path is labeled with a unique path id (1 ≤ id ≤ N). Then, all paths are stacked and rasterized into a path map. Similar to the process of segmenting ground points (see Fig. 10), the labeled path points are partitioned into a number of pillars (with a pre-defined grid size) in the x-y-path id space. Note that each pillar corresponds to a pixel in the path map. The pixel value equals the path id with the highest frequency in the corresponding pillar. If the pillar is void, the pixel value is zero.
During the application of the proposed procedure, if an agent is detected as turning, its path points inside the ROI are converted to locations in the path map to retrieve pixel values. The pixel value with the largest count indicates the id of the most similar path in the trajectory set. Then, the path id can be used to retrieve the matched path from the path set. Fig. A1b shows a histogram of pixel values and their corresponding candidate paths. As noted, the matched path corresponds to the highest bin. Using a path segment as input, the matching procedure can thus predict the agent's future trajectory in real time.
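A minimal sketch of this rasterisation-and-voting idea follows; the grid size, map extent, and array layouts are assumptions, and the per-pillar majority vote is simplified to a last-write rule.

import numpy as np

def build_path_map(points, ids, cell=0.5, extent=100.0):
    """points: (N, 2) x-y coordinates of labelled path points; ids: path id (>= 1) of each point.
    Returns a 2-D array whose non-zero pixels store a path id (0 = void pixel)."""
    n = int(extent / cell)
    path_map = np.zeros((n, n), dtype=int)
    for (x, y), pid in zip(points, ids):
        i, j = int(x / cell), int(y / cell)
        if 0 <= i < n and 0 <= j < n:
            path_map[i, j] = pid        # simplification: the text uses the most frequent id per pillar
    return path_map

def match_path(segment, path_map, cell=0.5):
    """Return the id of the stored path that an observed segment overlaps most often."""
    votes = []
    for x, y in segment:
        i, j = int(x / cell), int(y / cell)
        if 0 <= i < path_map.shape[0] and 0 <= j < path_map.shape[1] and path_map[i, j] > 0:
            votes.append(path_map[i, j])
    return int(np.bincount(votes).argmax()) if votes else None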
Figure A1. Generation of path map for real-time matching: (a) Convert path points to path map,
(b) Matching procedure
B. Influence of Grid Interval on Elevation Interpolation
Data in the case study are used to test the influence of the grid interval on elevation interpolation. Based on the identified ground points, the ground-truth elevation data of all path points were obtained using a conventional 2D linear interpolation. In comparison, the linear elevation array-based approach described in Section 3.4 was also applied to obtain the elevation of each path point. For each site, several grid-interval values were used to generate different estimations, and the results were then compared with the ground-truth data. To quantify the accuracy and time performance of the linear elevation array method, the mean absolute error (MAE) and the mean interpolation time per path point (T) were calculated as in Eqs. (B1) and (B2), respectively. The test results are presented in Table B1.
MAE = \frac{\sum_{i=1}^{N} \sum_{j=1}^{n_i} \left| z_{i,j} - \hat{z}_{i,j} \right|}{\sum_{i=1}^{N} n_i}   (B1)

T = \frac{\sum_{i=1}^{N} \sum_{j=1}^{n_i} t_{i,j}}{\sum_{i=1}^{N} n_i}   (B2)

where N = the number of tracks, n_i = the number of path points of the i-th track, t_{i,j} = the time required for interpolating the j-th path point of the i-th track, and z_{i,j} and \hat{z}_{i,j} = the ground-truth and estimated elevation values of the j-th path point of the i-th track, respectively.
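A compact sketch of Eqs. (B1) and (B2) is shown below; the per-track list-of-arrays layout is an assumption made for illustration.

import numpy as np

def mae_and_mean_time(z_true, z_est, t_interp):
    """Average the per-point absolute elevation errors (B1) and the per-point
    interpolation times (B2) over all path points of all tracks.
    Each argument is a list with one 1-D array per track."""
    errors = np.concatenate([np.abs(np.asarray(zt) - np.asarray(ze))
                             for zt, ze in zip(z_true, z_est)])
    times = np.concatenate([np.asarray(t) for t in t_interp])
    return errors.mean(), times.mean()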
Table B1 Test results: (a) SEU case, (b) Singapore case

Grid interval (m)    0.1     0.2     0.3     0.4     0.5
(a) MAE              0.012   0.0154  0.0247  0.0305  0.0364
    T (10⁻⁸ s)       14.10   13.40   12.03   8.94    8.61
(b) MAE              0.0069  0.0071  0.0122  0.0135  0.0164
    T (10⁻⁸ s)       18.23   16.20   9.70    9.09    8.77
Figure B1. Grid interval-MAE curves: (a) SEU case, (b) Singapore case
As indicated in Table B1, MAE and T are positively and negatively associated with the grid interval, respectively. A small grid interval means a longer processing time but a higher interpolation accuracy. Because the efficiency is satisfactory (the average time per path point is on the order of 10⁻⁸ s), a smaller grid interval is more desirable. Fig. B1 shows the grid interval-MAE curves for the two cases. It is noted that when the grid interval decreases from 0.2 to 0.1 m, the variation in MAE is not substantial. Therefore, a grid interval ranging from 0.1 to 0.2 m shall be accurate enough for the elevation interpolation.
C. Visibility Assessment of Eight Vertices
As mentioned in Section 3.4, each vertex of the conflicting agent corresponds to a set of LOS points. To avoid relatively time-consuming for-loop computations, the LOS points of the eight vertices are stacked into a four-row matrix, in which the first three rows store the point coordinates and the fourth row stores the vertex ID (see Fig. C1). Fed with the first three rows of the matrix, the LOS assessment module described in Section 3.4 outputs the indices of the LOS points overlapping with static obstacles and of those inside bounding cuboids. Then, these indices are used to retrieve the vertex IDs. If the vertex IDs of the obstructed LOS points include all integers from 1 to 8, the conflicting agent is considered invisible; otherwise, it is visible.
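The check can be written compactly as sketched below; the 4 x M layout follows the description above, while the function and variable names are illustrative.

import numpy as np

def target_visible(los_points, obstructed_idx):
    """los_points: 4 x M array (rows: x, y, z, vertex id in 1..8); obstructed_idx:
    column indices of LOS points flagged as occluded by the LOS assessment module.
    The conflicting agent is invisible only when all eight vertex ids appear among
    the obstructed points."""
    blocked_ids = np.unique(los_points[3, obstructed_idx]).astype(int)
    return not np.array_equal(blocked_ids, np.arange(1, 9))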
Figure C1. Visibility assessment of eight vertices
References
Adamec, V., Schullerova, B., Babinec, A., Herman, D., Pospisil, J. 2017. Using the DataFromSky
system to monitor emissions from traffic. In Transport Infrastructure and Systems (pp. 913-918).
CRC Press.
BaiduNews. 2021. Vehicles cause inadequate view.
https://www.baidu.com/s?rtt=1&bsst=1&cl=2&tn=news&rsv_dl=ns_pc&word=%E8%BD%A6
%E8%BE%86%E9%81%AE%E6%8C%A1%E8%A7%86%E7%BA%BF%E4%BA%8B%E6
%95%85 (in Chinese, accessed on Apr. 7th, 2021).
Castro, M., Anta, J.A., Iglesias, L., Sanchez, J.A., 2014. GIS-based system for sight distance
analysis of highways. J. Comput. Civil Eng. 28 (3), 04014005.
Chen, A. Y., Chiu, Y. L., Hsieh, M. H., Lin, P. W., Angah, O., 2020. Conflict analytics through the
vehicle safety space in mixed traffic flows using UAV image sequences. Transp. Res. Pt. C-Emerg.
Technol. 119, 102744.
Chen, P., Zeng, W., Yu, G., Wang, Y. 2017. Surrogate safety analysis of pedestrian-vehicle conflict
at intersections using unmanned aerial vehicle videos. J. Adv. Transp. 2017, 5202150.
Chen, P., Zeng, W., Yu, G. 2019. Assessing right-turning vehicle-pedestrian conflicts at intersections
using an integrated microscopic simulation model. Accid. Anal. Prev. 129, 211-224.
CloudCompare. 2021. 3D point cloud and mesh processing software (open-source project). http://www.cloudcompare.org/ (accessed on Jan. 22nd, 2021).
DataFromSky. 2021. Deep traffic video analysis. https://datafromsky.com/ (accessed on Jan. 13th,
2021).
DJI. 2021. PHANTOM 4 PRO. https://www.dji.com/sg/phantom-4-pro (accessed on Feb. 17th,
2021).
El-Basyouny, K., Sayed, T. 2013. Safety performance functions using traffic conflicts. Saf.
Sci. 51(1), 160-164.
Essa, M., Sayed, T. 2018. Traffic conflict models to evaluate the safety of signalized intersections
at the cycle level. Transp. Res. Pt. C-Emerg. Technol. 89, 289-302.
Essa, M., Sayed, T. 2019. Full Bayesian conflict-based models for real-time safety evaluation of
signalized intersections. Accid. Anal. Prev. 129, 367-381.
Fu, Y., Li, C., Luan, T. H., Zhang, Y., Mao, G. 2018. Infrastructure-cooperative algorithm for
effective intersection collision avoidance. Transp. Res. Pt. C-Emerg. Technol. 89, 188-204.
Gargoum, S. A., Tawfeek, M. H., El-Basyouny, K., Koch, J. C. 2018a. Available sight distance on
existing highways: Meeting stopping sight distance requirements of an aging population. Accid.
Anal. Prev. 112, 56-68.
Gargoum, S. A., El-Basyouny, K., Sabbagh, J. 2018b. Assessing stopping and passing sight distance
on highways using mobile Lidar data. J. Comput. Civil Eng. 32(4), 04018025.
González-Gómez, K., Iglesias, L., Rodríguez-Solano, R., Castro, M. 2019. Framework for 3D point
cloud modelling aimed at road sight distance estimations. Remot. Sens. 11(23), 2730.
González-Gómez, K., Castro, M. 2019. Evaluating pedestrians’ safety on urban intersections: A
visibility analysis. Sustainability, 11(23), 6630.
González-Gómez, K., López-Cuervo Medina, S., Castro, M. 2021. Assessment of intersection
conflicts between riders and pedestrians using a GIS-based framework and portable Lidar. GISci.
Remot. Sens. 1-16.
GoPro. 2021. GoPro Hero 7. https://gopro.com/en/us/shop/cameras (accessed on June 18th, 2021).
Guo, Y., Sayed, T., Essa, M. 2020a. Real-time conflict-based Bayesian Tobit models for safety
evaluation of signalized intersections. Accid. Anal. Prev. 144, 105660.
Guo, Y., Liu, P., Wu, Y., Chen, J. 2020b. Evaluating how right-turn treatments affect right-turn-on-
red conflicts at signalized intersections. J. Transp. Saf. Secur. 12(3), 419-440.
Jang, J. A., Choi, K., Cho, H. 2012. A fixed sensor-based intersection collision warning system in
vulnerable line-of-sight and/or traffic-violation-prone environment. IEEE Trans. Intell. Transp.
Syst. 13(4), 1880-1890.
Jung, J., Olsen, M.J., Hurwitz, D.S., Kashani, A.G., Buker, K., 2018. 3D virtual intersection sight
distance analysis using Lidar data. Transp. Res. Pt. C-Emerg. Technol. 86, 563–579.
Kumar, A., Paul, M., Ghosh, I. 2019. Analysis of pedestrian conflict with right-turning vehicles at
signalized intersections in India. J. Transp. Eng. Pt A: Syst. 145(6), 04019018.
Lang, A. H., Vora, S., Caesar, H., Zhou, L., Yang, J., Beijbom, O. 2019. Pointpillars: Fast encoders
for object detection from point clouds. In Proceedings of the IEEE/CVF Conference on Computer
Vision and Pattern Recognition (pp. 12697-12705).
Liu, M., Zeng, W., Chen, P., Wu, X. 2017. A microscopic simulation model for pedestrian-pedestrian
and pedestrian-vehicle interactions at crosswalks. PLoS one, 12(7), e0180992.
HEXAGON. 2021. Leica ScanStation P40 - High-Definition 3D Laser Scanning Solution. https://leica-geosystems.com/products/laser-scanners/scanners/leica-scanstation-p40--p30 (accessed on July 3rd, 2021).
Ma, Y., Zheng, Y., Cheng, J., Zhang, Y., Han, W. 2019a. A convolutional neural network method to
improve efficiency and visualization in modeling driver’s visual field on roads using MLS data.
Transp. Res. Pt. C-Emerg. Technol. 106, 317-344.
Ma, Y., Zheng, Y., Cheng, J., Easa, S. 2019b. Real-time visualization method for estimating 3D
highway sight distance using Lidar data. J. Transp. Eng. Pt A: Syst.145(4), 04019006.
Ma, Y., Zheng, Y., Hou, M., Easa, S., Cheng, J., 2019c. Automated method for detection of missing road point regions in mobile laser scanning data. ISPRS Int. J. Geo-Inf. 8(12), 525.
Ma, Y., Easa, S., Cheng, J., Yu, B. 2021. Automatic Framework for Detecting Obstacles Restricting
3D Highway Sight Distance Using Mobile Laser Scanning Data. J. Comput. Civil Eng. 35(4),
04021008.
Ma, Y., Zhu, J. 2021. Left-turn conflict identification at signal intersections based on vehicle
trajectory reconstruction under real-time communication conditions. Accid. Anal. Prev. 150,
105933.
Machiani, S. G., Abbas, M. 2016. Safety surrogate histograms (SSH): A novel real-time safety
assessment of dilemma zone-related conflicts at signalized intersections. Accid. Anal. Prev. 96,
361-370.
MATLAB. 2021. Cellfun: Apply Function to Each Cell in Cell Array.
http://www.mathworks.com/help/matlab/ref/cellfun.html (accessed on Feb. 10th, 2021)
Oh, J., Kim, E., Kim, M., Choo, S. 2010. Development of conflict techniques for left-turn and cross-
traffic at protected left-turn signalized intersections. Saf. Sci. 48(4), 460-468.
Pang, S., Morris, D., Radha, H. 2020. CLOCs: Camera-LiDAR object candidates fusion for 3D
object detection. In 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems
(IROS) (pp. 10386-10393). IEEE.
Perkins, S.R., Harris, J.I., 1967. Traffic conflict characteristics: Accident potential at intersections.
Technical Report. General Motors Research Publication GMR-718.
Salim, F. D., Loke, S. W., Rakotonirainy, A., Srinivasan, B., Krishnaswamy, S. 2007. Collision
pattern modeling and real-time collision detection at road intersections. In 2007 IEEE Intelli.
Transp. Syst. Conf. IEEE, pp. 161-166.
Saunier, N., Sayed, T., Ismail, K. 2010. Large-scale automated analysis of vehicle interactions and
collisions. Transp. Res. Rec. 2147(1), 42-50.
Svensson, Å., Hydén, C. 2006. Estimating the severity of safety related behaviour. Accid. Anal. Prev.
38(2), 379-385.
Sayed, T., Zaki, M. H., Autey, J. 2013. Automated safety diagnosis of vehicle–bicycle interactions
using computer vision analysis. Saf. Sci. 59, 163-172.
Shalkamy, A., El-Basyouny, K., Xu, H. Y. 2020. Voxel-based methodology for automated 3D sight
distance assessment on highways using mobile light detection and ranging data. Transp. Res.
Rec. 2674(5), 587-599.
Soilán, M., Riveiro, B., Sánchez-Rodríguez, A., Arias, P. 2018. Safety assessment on pedestrian
crossing environments using MLS data. Accid. Anal. Prev. 111, 328-337.
Torr, P. H., Zisserman, A. 2000. MLESAC: A new robust estimator with application to estimating
image geometry. Comput. Vision Image Understanding, 78(1), 138-156.
Tsai, Y., Yang, Q., Wu, Y., 2011. Use of light detection and ranging data to identify and quantify
intersection obstruction and its severity. Transp. Res. Rec. 2241, 99–108.
Tsukada, M., Kitazawa, M., Oi, T., Ochiai, H., Esaki, H. 2019. Cooperative awareness using
roadside unit networks in mixed traffic. In 2019 IEEE Vehicular Networking Conference (VNC),
IEEE, pp. 1-8.
Wang, L., Mao, B., Chen, S., Zhang, K. 2009. Mixed flow simulation at urban intersections:
Computational comparisons between conflict-point detection and cellular automata models. In
2009 International Joint Conference on Computational Sciences and Optimization, IEEE, Vol. 2,
pp. 100-104.
Xu, D., Anguelov, D., Jain, A. 2018. Pointfusion: Deep sensor fusion for 3d bounding box estimation.
In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 244-253).
Zeng, W., Chen, P., Nakamura, H., Iryo-Asano, M. 2014. Application of social force model to
pedestrian behavior analysis at signalized crosswalk. Transp. Res. Pt. C-Emerg. Technol. 40, 143-
159.
Zeng, W., Chen, P., Yu, G., Wang, Y. 2017. Specification and calibration of a microscopic model for
pedestrian dynamic simulation at signalized intersections: A hybrid approach. Transp. Res. Pt. C-
Emerg. Technol. 80, 37-70.
Zheng, L., Sayed, T. 2020. A novel approach for real-time crash prediction at signalized intersections.
Transp. Res. Pt. C-Emerg. Technol. 117, 102683.
Zhu, F., Ukkusuri, S. V. 2015. A linear programming formulation for autonomous intersection
control within a dynamic traffic assignment and connected vehicle environment. Transp. Res. Pt.
C-Emerg. Technol. 55, 363-378.