A Data-Driven Approach for Taxi-Time Prediction: A Case Study of Singapore Changi Airport

ENRI Int. Workshop on ATM/CNS. Tokyo, Japan. (EIWAC 2019)
[EN-A-9] A Data-driven approach for taxi-time prediction: a case
study of Singapore Changi airport
(EIWAC 2019)
DT. Pham*, M. Ngo**, N. Tran*, S. Alam*, V. Duong*
Air Traffic Management Research Institute (ATMRI)
School of Mechanical and Aerospace Engineering (MAE)
Nanyang Technological University (NTU)
Singapore, Singapore
*[ dtpham | thanhnam.tran | sameeralam | vu.duong ]@ntu.edu.sg
**man.ngo@jvn.edu.vn
Abstract:
In daily operations at an airport, the ground movement of an aircraft is one of the most critical airside operations. The ground
movement problem includes two sub-problems: routing and scheduling, which serve the purpose of guiding aircraft on the
surface of an airport to meet the departure schedule while minimizing overall travel time. Ground-movement controllers
manage the taxi-route assignments and taxi-time estimation for each aircraft in arrival or departure queue. A high-accuracy
taxi-time calculation is required to increase the efficiency of airport operations. In this study, we propose a data-driven
approach to construct a feature set and build predictive models for taxi-time prediction for departure flights. The proposed
approach can both suggest the taxi-route and predict the corresponding taxi-time by analyzing ground movement data. The
controller's operational preferences are extracted and learned by machine learning algorithms for predicting the taxi-route and
taxi-time of a given aircraft. In this approach, we take advantage of taxiing trajectories to learn the controller's decision, which
reflects how the controller decided the routing for a given situation. Two machine learning models, random forest
regression and linear regression, are implemented and show similar performance in estimating the taxi-time. However, from
our observations, the random forest model provides more stable results and better interpretability, which suits real
operations. The predictive model for taxi-time can predict the taxi-out time with high accuracy for a given assigned
taxi-route. The model covers the controller's decision in up to 70% of cases with the top-1 recommendation and 89% with the
top-2 recommendations. The Mean Absolute Error is less than 2.07 minutes for all departure flights and the Root Mean Square
Error is approximately 2.5 minutes. Moreover, the ±3-minute error window covers around 76% of departures, while more than
95% of departures are within the ±5-minute error window.
Keywords: routing, taxi-time prediction, surface movement, machine learning, random forest, linear regression
1. INTRODUCTION
In daily operations at an airport, the ground movement of
aircraft is one of the most critical airside operations. The
ground movement problem includes two sub-problems, routing
and scheduling, which serve the purpose of guiding aircraft
on the surface of an airport to meet the schedule while
minimizing overall travel time. Here, the primary tasks
of the ground-movement controller are taxi-route
assignment and taxi-time estimation for each aircraft in
the arrival or departure queue [1]. A controller may select or
modify taxi-routes based on their operational preferences or
current runway-taxiway constraints, which makes
taxi-time estimation difficult.
Moreover, in Airport Collaborative Decision Making (A-
CDM) [2], a high-accuracy taxi-time calculation is required
to avoid generating and propagating delays in the air traffic
management system because of the gap in time between
estimated and actual taxi-time.
Several studies have focused on tackling the taxi-time
prediction [1][3][4] or taxi-route routing [5] problems.
Previously, the limited availability of ground movement
data such as aircraft surface movements, flight information,
and airside operations information was a challenge for all
studies. Research therefore focused on particular aspects of the
problem, such as taxi-out or taxi-in time prediction,
considering traffic and data from a small set of stands,
taxiways, airlines, and aircraft types. Recently, surface
movement data such as that from the Advanced-Surface Movement
Guidance and Control System (A-SMGCS) [6] has provided
more opportunities for analysis at the system level, for
better understanding, insights, and prediction of routing
problems. However, extracting features from the surface
movement data requires an innovative data structure to
capture the space and time dependency between airport
airside traffic and airport airside infrastructure.
Furthermore, the recommended taxi-routes are usually the
output of a mathematical algorithm that does not consider
controller preferences or operational strategy. Finally, even
though those works have provided sets of useful
explanatory variables for taxi-time prediction, data-driven
features have not been well studied in the literature.
In this study, we focus on data-driven approaches to
construct a feature set and build predictive models for
taxi-route and taxi-time prediction. By analyzing ground movement
data, the controller's operational preferences can be
extracted and learned by machine learning algorithms for
predicting the taxi-route and taxi-time of a given aircraft. A set
of features is also obtained from the airport traffic network,
weather information, and flight information.
The proposed algorithms can capture the existing patterns in
movement data as controller preferences in handling taxiing
and predict the taxi-route and taxi-time for each given
departure aircraft. The approach is applied to Singapore Changi airport
and evaluated with one month of Advanced-Surface
Movement Guidance and Control System (A-SMGCS)
data.
2. OVERVIEW
Our proposed approach is presented in Figure 1. The list of
potential decisions is learned from data. The departure
flights must be assigned routes in sequence. This assumption
ensures that, at the decision time of a given flight,
all previous departures have already been assigned taxi-routes, so that traffic
scores can be computed. Based on the current information
and some predictions about future traffic, the list of options
is ranked and suggested. The taxi-time for each option
or decision can also be computed from the given features.
When one option is chosen (it could be the first-ranked option in
autonomous mode), the traffic is updated for the next
flights.
There are four main steps in this study: (1) standardizing
trajectories data; (2) constructing a list of decision
candidates (options); (3) extracting features from airport
data including traffic conditions, weather, etc.; (4)
developing predictive models for taxi-route and taxi-time
for a given departure. Firstly, a map-matching technique is
applied to standardize the actual trajectories using the airside
graph. After this step, each trajectory is represented by a
sequence of nodes with the corresponding timestamps.
Secondly, we use the Density-based Spatial Clustering of
Applications with Noise (DBSCAN) [7] technique to
cluster trajectories to form the list of common taxi-routes,
called options. These options are what the controller will
consider when making a taxi-route assignment for an aircraft.
Thirdly, features are extracted from the airport traffic
network, weather data, and flight information data. The
spatio-temporal airport network (airside) traffic is
computed for each flight given its departure time. A
traffic score is extracted for each option of that
departure, reflecting how the decision relates to current
traffic. Features extracted from the different sources are
combined to form the set of input features for the machine
learning model. These features are essential in explaining
the controller's decision. Finally, the Random Forest
method is selected as the predictive model because of its
interpretability. Two predictive models are trained on
preprocessed historical data to predict the taxi-route and
taxi-time.
3. OPTION EXTRACTION
3.1 Data Preprocessing
3.1.1 Parsing data
The Advanced Surface Movement Guidance and Control
System (A-SMGCS) is an airport system with a
surveillance infrastructure consisting of Cooperative
Surveillance (e.g. multilateration systems) and
Non-Cooperative Surveillance (e.g. SMR, microwave sensors,
optical sensors, etc.). A-SMGCS data contains information
about the movements of aircraft, including their trajectories.
Table 1: List of extracted features from raw data (9 features)
Gate, Longitude, Latitude, Velocity in Longitude, Velocity in Latitude, Time of Track, Measured flight level, Type of Aircraft, Wake Turbulence Categorization
Figure 1: A schematic illustration of the proposed approach
The provided raw data consists of compressed messages in files with the EDR
extension, which are not suitable for performing analysis or
developing a learning model. Thus, the first processing step
is extracting the data and storing it in an analytical form. One
month of data (October 2017) is extracted and stored in
CSV format. The final data contains more than 30,000 trajectories.
The fields kept in the analytical data are listed in Table 1.
3.1.2 Pre-processing data
The pre-processing includes three steps:
(1) Detect different flights with the same ID: it is
possible to have different flights with the same ID
on one specific day, so we detect and separate
them for further analysis.
(2) Detect whether the flight is an arrival or a departure.
(3) Determine the runway configuration.
Graph of the Singapore Changi Airport network: we construct
and simplify the Changi airport graph from the taxiway and runway
coordinates of the NARSIM simulator.
3.2 Standardizing Trajectory
The original data, based on the coordinates of aircraft
through time, is not structured well enough to be input to a
machine-learning model. The coordinate data is
noisy, and training a model on noisy data can reduce
its accuracy. By matching the aircraft
coordinates to routes on the airport graph (using a
map-matching algorithm), we can represent the trajectory of an
aircraft as a list of edges (taxiways), which helps to reduce the
noise in the training data before it is passed to the model. Besides,
standardizing raw data also reduces the complexity of
the subsequent clustering and learning problems, since we
only need to focus on a shared and well-defined graph with
a finite and limited number of edges and nodes. Finally, the
output of the program will be delivered to air traffic
controllers, who are familiar with the list of taxiways that
an aircraft will follow rather than with coordinates. Map
matching is thus the step that builds the bridge from raw data
to a data format that controllers can understand more easily.
3.2.1 Map-matching algorithm
There are many studies on map matching. The most
commonly used approaches are point-to-point matching [8] [9] and
point-to-curve matching [10], which map each point in the
flight to the closest node or the closest curve on the graph.
Another approach is curve-to-curve matching [9] [10], which
chooses, from a list of candidates (often
generated by point-to-point matching), the curve closest to the
original curve.
We observed that the flight data are dense, so we
implemented point-to-point matching with additional rules to
guarantee that the result is a valid trajectory. The matching
algorithm includes two steps:
(1) In the first step, we assign each point of the
flight to a node in the airport graph if the distance
from this point to the nearest node is less than a
predefined threshold. The result of the map-matching
algorithm is a sequence of node IDs from the gate to the
first exit-gate on the runway for departure flights,
and from the last exit-gate to the gate for arrival
flights.
(2) This approach may lead to some logical mistakes.
For example, after step 1 the chosen trajectory can
be O → n1 → n2 → D, although on the airport graph
the nodes n1 and n2 do not have a connecting edge.
Thus, in the second step, we avoid these errors by
connecting two unconnected nodes with
Dijkstra's shortest path between them. If the
shortest path from n1 to n2 is n1 → n3 → n2, the
final trajectory is the sequence of connected nodes
O → n1 → n3 → n2 → D. The matching result is
illustrated in Figure 2.
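The two matching steps can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the implementation used in the study: the toy graph, the node coordinates, the threshold value and the helper names (nearest_node, shortest_path, map_match) are all hypothetical, and the rule of dropping repeated consecutive nodes is our own simplification.

```python
import heapq
import math


def nearest_node(point, nodes, threshold):
    """Return the id of the closest node, or None if none is closer than threshold."""
    best_id, best_dist = None, threshold
    for node_id, xy in nodes.items():
        d = math.dist(point, xy)
        if d < best_dist:
            best_id, best_dist = node_id, d
    return best_id


def shortest_path(graph, src, dst):
    """Dijkstra's algorithm on a weighted adjacency dict {u: {v: weight}}."""
    dist, prev, heap = {src: 0.0}, {}, [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            break
        if d > dist.get(u, math.inf):
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v], prev[v] = nd, u
                heapq.heappush(heap, (nd, v))
    if dst not in dist:
        raise ValueError(f"no path from {src} to {dst}")
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1]


def map_match(points, nodes, graph, threshold):
    # Step 1: snap each point to its nearest node; skip far points and repeats.
    snapped = []
    for p in points:
        n = nearest_node(p, nodes, threshold)
        if n is not None and (not snapped or snapped[-1] != n):
            snapped.append(n)
    if not snapped:
        return []
    # Step 2: bridge consecutive nodes that share no edge with the shortest path.
    matched = snapped[:1]
    for n in snapped[1:]:
        if n in graph[matched[-1]]:
            matched.append(n)
        else:
            matched.extend(shortest_path(graph, matched[-1], n)[1:])
    return matched


# Hypothetical toy graph: n1 and n2 are not directly connected.
NODES = {"O": (0, 0), "n1": (100, 0), "n3": (150, 50), "n2": (200, 0), "D": (300, 0)}
GRAPH = {
    "O": {"n1": 100.0},
    "n1": {"O": 100.0, "n3": 70.7},
    "n3": {"n1": 70.7, "n2": 70.7},
    "n2": {"n3": 70.7, "D": 100.0},
    "D": {"n2": 100.0},
}
track = [(2, 1), (98, 3), (201, -2), (299, 0)]
print(map_match(track, NODES, GRAPH, threshold=20.0))
# ['O', 'n1', 'n3', 'n2', 'D']
```

In this toy case step 1 yields O, n1, n2, D, and step 2 inserts n3 between n1 and n2, mirroring the example in the text.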
Figure 2: An illustration of matching the trajectory
Figure 3: Distribution of the total distance difference ratio
3.2.2 Evaluating the map-matching process
Noisy trajectories can significantly affect the performance
of the map-matching algorithm. The performance metric for
this step is the distance percent error between the matched
trajectory curves and the original flight curves. The score is
non-negative, and the smaller it is, the better the matching
result. By investigating the matching results, we can
detect and remove abnormal trajectories (Figure 3). To
maintain the quality of the dataset, we set a threshold of 0.4
(40%) and remove the trajectories whose error exceeds it.
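One plausible reading of this metric is the relative difference between the total length of the matched path and that of the raw track. The sketch below follows that reading; the exact formula is not given in the text, so the function names and the definition itself are assumptions made for illustration.

```python
import math


def path_length(points):
    """Total length of a polyline given as a list of (x, y) points."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))


def distance_error_ratio(raw_points, matched_points):
    """Relative difference in total length (assumed definition of the error)."""
    raw_len = path_length(raw_points)
    if raw_len == 0:
        raise ValueError("raw trajectory has zero length")
    return abs(path_length(matched_points) - raw_len) / raw_len


def keep_trajectory(raw_points, matched_points, threshold=0.4):
    """True if the matching error does not exceed the 0.4 threshold from the text."""
    return distance_error_ratio(raw_points, matched_points) <= threshold


raw = [(0, 0), (100, 0), (200, 0)]
good_match = [(0, 0), (100, 0), (150, 50), (200, 0)]  # small detour
bad_match = [(0, 0), (0, 300), (200, 0)]               # implausible detour
print(round(distance_error_ratio(raw, good_match), 3))  # 0.207
print(keep_trajectory(raw, good_match), keep_trajectory(raw, bad_match))
```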
3.3 Clustering Using DBSCAN
We observe from historical data that controllers follow
patterns when assigning the taxiway for each departure in
similar situations. Even though those decisions can be
affected by uncertainties such as weather and current airport
traffic, in general they are limited and form a set of
potential taxiways for each departure. Therefore, we can
extract the patterns in the departure taxiways as the controller's
options from historical data using a clustering method.
We choose DBSCAN as our clustering
algorithm. One notable advantage of DBSCAN is
that it does not require a predefined number of clusters.
This is important because the number of options
differs for each group of flights. Another advantage of
DBSCAN is that it can identify outlier trajectories as
noise, so we can isolate them from the processed data.
After the map-matching step, we can represent a trajectory
as a list of nodes on the airport graph. These
trajectories are then vectorized as input to the DBSCAN
algorithm. We use the Euclidean distance to define the
difference between two trajectories. The neighborhood
threshold is 2, which means that trajectories in one
group do not differ by more than two nodes. We only
keep groups that contain more than three trajectories. An
example of a clustering result can be observed in Figure 4.
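The clustering step can be illustrated with a small, dependency-free DBSCAN. This is a sketch under assumptions: the text does not say how trajectories are vectorized, so a binary node-indicator encoding is assumed here, and min_samples=4 is our reading of "more than three trajectories". Under this binary encoding a Euclidean distance of 2 corresponds to four differing node entries, so the encoding used in the study may differ. In practice a library implementation such as scikit-learn's DBSCAN class would normally be used.

```python
import math


def vectorize(trajectory, node_ids):
    """Binary indicator vector: 1.0 if the trajectory visits the node (assumed encoding)."""
    visited = set(trajectory)
    return [1.0 if n in visited else 0.0 for n in node_ids]


def dbscan(vectors, eps, min_samples):
    """Minimal DBSCAN; returns one label per vector, with -1 marking noise."""
    n = len(vectors)
    # Each point counts as its own neighbour (distance 0).
    neighbors = [
        [j for j in range(n) if math.dist(vectors[i], vectors[j]) <= eps]
        for i in range(n)
    ]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_samples:
            labels[i] = -1  # provisionally noise
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbors[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point previously marked as noise
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbors[j]) >= min_samples:
                queue.extend(neighbors[j])  # expand only from core points
    return labels


NODE_IDS = ["A", "B", "C", "D", "E", "F", "G", "H"]
trajectories = [["A", "B", "C", "D"]] * 4 + [["A", "B", "C", "E"], ["E", "F", "G", "H"]]
vectors = [vectorize(t, NODE_IDS) for t in trajectories]
print(dbscan(vectors, eps=2.0, min_samples=4))
# [0, 0, 0, 0, 0, -1]
```

The first five trajectories form one option, while the last one shares too few nodes with the others and is flagged as noise.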
Figure 4: Example of the clustering result
Figure 5: Percentage of remaining clustered flight
Figure 5 shows the percentage of clustered departures
(after excluding the flights identified as noise) over total
departures in each cluster. The average percentage of
remaining flights is about 75%, which means the extracted
options cover about 75% of the controller's decisions.
4. FEATURE ENGINEERING
The features considered in this study belong to
four categories, which are summarized in Table 2. The
categories are described in detail below.
4.1 Flight basic features
Each flight is specified by the gate, runway and aircraft type
features. Because those features are categorical, we encode
them as one-hot vectors: every value in an original column is
split into a new column with two values, one (present) and zero
(absent).
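As a small illustration of this encoding, the sketch below expands one categorical column into 0/1 indicator columns. The function name and the sample rows are hypothetical; the resulting column names follow the pattern seen in Table 4 (for example aircraft_type_H).

```python
def one_hot_encode(rows, column):
    """Replace a categorical column by one 0/1 column per distinct value."""
    categories = sorted({row[column] for row in rows})
    encoded = []
    for row in rows:
        new_row = {k: v for k, v in row.items() if k != column}
        for cat in categories:
            new_row[f"{column}_{cat}"] = 1 if row[column] == cat else 0
        encoded.append(new_row)
    return encoded


# Hypothetical sample flights.
flights = [
    {"hour": 9, "aircraft_type": "H"},
    {"hour": 14, "aircraft_type": "M"},
]
for row in one_hot_encode(flights, "aircraft_type"):
    print(row)
# {'hour': 9, 'aircraft_type_H': 1, 'aircraft_type_M': 0}
# {'hour': 14, 'aircraft_type_H': 0, 'aircraft_type_M': 1}
```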
Table 2: Summary of extracted features in 4 categories
Category                   Count  Features
Flight basic information   5      gate, runway, dayofweek, hour, aircraft_type
Selected option features   2      estimated_travel_time (s), travel_distance (m)
Traffic features           N      traffic_score0, …, traffic_scoreN-1
Weather features           12     visibility (km), pressure (mbar), temperature (C), dewpoint (C), humidity (%), wind_speed (km/h), wind_dir_degrees (°), fog ([0,1]), tornado ([0,1]), thunder ([0,1]), hail ([0,1]), rain ([0,1])
4.2 Selected option features
In our approach, to predict the taxi-time for a departure flight,
we first assign a taxi-route to each flight and then estimate
the travel time of the flight for the given taxi-route. In this
work, we only consider two features: the expected
travel time and the travel distance.
4.3 Traffic features
Traffic features reflect the density of traffic at the airport at
the time when the controller makes the taxi-route
assignment decision, considering future aircraft movements. Given
the flight plan of a departure aircraft to which we want to
assign a taxi-route, we assume that the taxi-routes of all other
aircraft in that period of time are fixed. Thus, we can compute a
traffic score for each option.
Firstly, we introduce a trajectory prediction model for each
flight on a taxiway, based on combining the average travel
times of road segments. Secondly, given a flight departure
time, the traffic features for each option of that flight are
computed. In the given time window, we group the
trajectory positions into multiple snapshots based on their
timestamps, with a time step of 10 seconds. For each snapshot,
we compute a traffic density map by mapping aircraft
positions onto a grid layer on the airport map with a grid size
of 100 m. The score of each cell is estimated from the number
of aircraft in its area and the impact of neighboring traffic,
modeled by a spreading function.
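The snapshot and grid computation can be sketched as follows. The 10-second time step and the 100 m cell size come from the text; everything else is an assumption made for illustration, in particular the spreading function (here a fixed weight of 0.5 added to the eight neighboring cells) and the way an option's score is aggregated (here a sum over the cells the route crosses).

```python
from collections import defaultdict

TIME_STEP_S = 10   # snapshot length, from the text
CELL_SIZE_M = 100  # grid size, from the text


def snapshots(positions, time_step=TIME_STEP_S):
    """Group (t, x, y) positions into snapshots keyed by t // time_step."""
    groups = defaultdict(list)
    for t, x, y in positions:
        groups[int(t // time_step)].append((x, y))
    return groups


def density_map(points, cell_size=CELL_SIZE_M, neighbor_weight=0.5):
    """Count aircraft per cell and spread a fraction to the 8 neighbours.

    The constant neighbor_weight is a hypothetical spreading function.
    """
    grid = defaultdict(float)
    for x, y in points:
        cx, cy = int(x // cell_size), int(y // cell_size)
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                weight = 1.0 if dx == 0 and dy == 0 else neighbor_weight
                grid[(cx + dx, cy + dy)] += weight
    return grid


def traffic_score(route_cells, grid):
    """Sum the density over the cells an option crosses (assumed aggregation)."""
    return sum(grid.get(cell, 0.0) for cell in route_cells)


# Hypothetical positions (time in s, x and y in m).
positions = [(0, 50, 50), (3, 250, 50), (12, 150, 50)]
snaps = snapshots(positions)
grid = density_map(snaps[0])
print(traffic_score([(0, 0), (1, 0), (2, 0)], grid))  # 3.0
```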
4.4 Weather features
The weather is also a factor that affects the decision of the
controller. We collect weather information at the airport,
which is updated every thirty minutes.
5. PREDICTIVE MODELS
5.1 Random Forest
Random Forest (RF) is an ensemble learning method for
both classification and regression. It constructs multiple
decision trees that are trained on different subsets of
features and samples. The trees learn different knowledge
from the data and then vote for the final prediction. The method is
robust to outliers and noise, which do not skew the prediction
results, and it reduces overfitting thanks to the diversity of trees.
One of the key advantages of Random Forest that suits
our problem is its capability to handle unbalanced datasets
and to work with different types of features and ranges
of feature values. Moreover, the interpretability of the
model helps in understanding the impact of
different features on the prediction.
5.2 Predicting the controller’s decision
A predictive model is built to predict the controller's decision in
assigning a taxi-route to each departure flight. The features
in the Flight Basic Information, Traffic and Weather groups are
used to predict the option selected by controllers in the
historical data. Note that the possible decisions of
controllers for each flight are the N options extracted
in Section 3. The traffic scores are computed for all N
options and capture the relationship between each
option and the surrounding traffic. With this formulation, the
Random Forest Classifier is chosen as the predictive model.
All the categorical features are encoded using a one-hot
vector encoder, which brings the total number of features for
this model to 543.
5.3 Predicting taxi-time for departure flights
A predictive model is built to predict the travel time for each
departure flight with an assigned taxi-route. The features in all
groups (Flight Basic Information, Selected Option, Traffic,
and Weather) are used for training the predictive
model. However, since the option is already decided for a given
flight, only the traffic score of the selected option is
considered. With this formulation, the Random Forest
Regressor is chosen as the predictive model. Similarly, a
one-hot vector encoder is used for encoding the categorical
features, and the total number of features is 153.
6. EXPERIMENTS AND RESULTS
6.1 Dataset for developing predictive models
Multiple processing steps have
been performed on the original dataset for cleaning and standardizing.
Table 3 summarizes the four versions of the dataset that
we have produced. The one-month data contains
11891 departure movements in total. After the pre-processing step, only 8128
samples are kept. A further 2252 movements are
removed after the option extraction step, since they are
considered noise by the clustering algorithm (DBSCAN).
Finally, for each cluster, departures with abnormal taxi-time
are removed. Thus, the final dataset for training predictive
models contains only 4363 samples (≈ 36.7% of the raw data).
One direction for future work is investigating new preprocessing
and clustering algorithms to increase this percentage.
Table 3: Datasets and their size after each processing step
Version of dataset                   Number of samples
Full 1-month departure dataset       11891
Pre-processing departure dataset     8128
Extracted-option departure dataset   5875
Filtered departure dataset           4363
6.2 A predictive model for controllers’ decision
To ensure that departure flights from all pairs of
gate and runway exist in both the training and the testing data, the data
is grouped by pair and then partitioned. We tune
the two main hyper-parameters of the Random Forest
model, the max depth (the depth of each decision tree) and the
number of estimators, by training and testing (on validation data) with
different sets of parameters. In total, 25 sets of
parameters are evaluated for model tuning, which
combine Max_Depth in (50, 100, 150, 200, 250) and
Number_Of_Estimators in (50, 100, 150, 200, 250). The
5-fold cross-validation method is used to assess the model
performance for each set of parameters. Figure 6 shows the
result of the tuning process. When the number of estimators
is greater than 100 and the max depth is greater than 100,
the model performance converges. The best accuracy is
70.4%, obtained when the max depth is 100 and the number of estimators
is 200. This set of parameters is chosen for training our
predictive model. Figure 7 shows the coverage of the
controller's decisions for departure flights in the top recommendations.
More than 70% of the controller's decisions can be found in the
first suggestion, while approximately 90% of the decisions
are covered by the first two suggestions. This high
coverage, which is similar to that of other recommendation
systems, should increase user acceptance, since users can
find the required items quickly and easily.
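The top-K coverage reported here can be computed as the share of flights whose actual option appears among the first K ranked suggestions. The sketch below shows this calculation on made-up data; the function name and the sample rankings are hypothetical and do not reproduce the study's results.

```python
def top_k_coverage(ranked_options, chosen_options, k):
    """Fraction of flights whose chosen option is within the top-k suggestions."""
    hits = sum(
        1
        for ranked, chosen in zip(ranked_options, chosen_options)
        if chosen in ranked[:k]
    )
    return hits / len(chosen_options)


# Hypothetical ranked suggestions (best first) and controller choices.
ranked = [["A", "B", "C"], ["B", "A", "C"], ["C", "A", "B"], ["A", "C", "B"]]
chosen = ["A", "A", "B", "A"]
for k in (1, 2, 3):
    print(k, top_k_coverage(ranked, chosen, k))
# 1 0.5
# 2 0.75
# 3 1.0
```

With a classifier that outputs one probability per option, the ranked list for a flight would simply be the options sorted by decreasing probability.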
Figure 6: Model performance for different sets of parameters
Figure 7: The coverage of the controller's decisions in the top-K recommendations
Table 4: List of most important features for the taxi-route model
Index  Feature               Gini-importance
1      hour                  0.032892
2      pressure (mbar)       0.029549
3      day of week           0.029420
4      wind_dir_degrees (°)  0.028553
5      temperature (C)       0.028047
6      humidity (%)          0.027537
7      wind_speed (km/h)     0.027189
8      dewpoint (C)          0.019692
9      visibility (km)       0.017607
10     aircraft_type_H       0.007380
11     aircraft_type_M       0.007299
12     rain ([0,1])          0.004665
13     thunder ([0,1])       0.004202
The most important features are listed in Table 4.
The features are ordered by Gini-importance, which can be
interpreted as the percentage contribution of each
feature. Since we have many features and none of them
dominates the contribution, the Gini-importance of every
feature is small (at most 3.3%). However, the top 11
features contribute more than the others.
Among them, the hour of day and the day of week are two of the
most important features affecting the controller's
decision. The next ones are weather features such as
pressure, wind direction, temperature, etc. Besides, aircraft
type is also among the top features for predicting the controller's
decision. Even though the rain and thunder features are expected to be
important, their contributions are smaller than the others in our
model. This is due to the nature of our data, in which
fewer than 150 cases are recorded with rain or thunder
conditions. With more data covering other seasons
and weather conditions, we believe their importance could
increase significantly.
6.3 A predictive model for taxi-time
We use the Dead Reckoning (DR) method as the baseline
model. The DR method uses the 10th percentile of the
taxi-time distribution of departures in the same group (in
our case, departures with the same option) as the predicted taxi-time [1].
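A minimal sketch of such a baseline is given below: taxi-times are grouped by option and the 10th percentile of each group becomes the prediction. The percentile interpolation rule (linear interpolation between order statistics), the function names and the sample data are assumptions; the text does not specify them.

```python
import math
from collections import defaultdict


def percentile(values, q):
    """q-th percentile (0 to 100) using linear interpolation between order statistics."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0
    lo, hi = math.floor(rank), math.ceil(rank)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (rank - lo)


def fit_dr_baseline(history, q=10):
    """Map each option to the q-th percentile of its historical taxi-times."""
    groups = defaultdict(list)
    for option, taxi_time in history:
        groups[option].append(taxi_time)
    return {option: percentile(times, q) for option, times in groups.items()}


# Hypothetical history of (option, taxi-time in minutes).
history = [("R1", float(t)) for t in range(10, 21)] + [("R2", 8.0), ("R2", 12.0)]
baseline = fit_dr_baseline(history)
print(baseline["R1"])  # 11.0
```

Because a low percentile is used, this baseline tends to predict shorter taxi-times than those actually observed.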
For the taxi-time predictive model, we also use the 5-fold
cross-validation method to assess the model's performance.
We implement two algorithms: random forest and linear
regression. Table 5 shows the results of models with 4
metrics: Mean Taxi Time Difference (MD), Mean Absolute
Error (MAE), Root Mean Square Error (RMSE), Mean
Absolute Percent Error (MAPE). Different metrics are used
to provide better observation and assessment of model
performance for prediction.
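For reference, the four metrics can be computed as in the sketch below. The sign convention for MD (predicted minus actual) is an assumption, although it is consistent with the negative value reported for a baseline that predicts a low percentile; the function name and the sample values are hypothetical.

```python
import math


def taxi_time_metrics(actual, predicted):
    """Return MD, MAE, RMSE (in the input unit) and MAPE (in percent)."""
    errors = [p - a for a, p in zip(actual, predicted)]  # assumed sign convention
    n = len(errors)
    return {
        "MD": sum(errors) / n,
        "MAE": sum(abs(e) for e in errors) / n,
        "RMSE": math.sqrt(sum(e * e for e in errors) / n),
        "MAPE": 100.0 * sum(abs(e) / a for e, a in zip(errors, actual)) / n,
    }


# Hypothetical taxi-times in minutes.
actual = [10.0, 20.0, 10.0, 20.0]
predicted = [12.0, 18.0, 9.0, 23.0]
for name, value in taxi_time_metrics(actual, predicted).items():
    print(name, round(value, 2))
```

On this toy data the errors are 2, -2, -1 and 3 minutes, giving an MD of 0.5 and an MAE of 2.0 minutes, an RMSE of about 2.12 minutes and a MAPE of about 13.75%.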
The mean absolute percentage errors (MAPE) of the linear
regression (LR) and random forest (RF) models are 22.06%
and 23.46%, respectively, while the DR model error is
higher at 27.55%. The distributions of MAPE for the RF and
LR models (shown in Figure 9) resemble a power-law
shape that is skewed toward zero. LR performs better than
RF on the flights that have an error of less than 10%, but
in general the difference between the two models is not
significant, only 1.4%. The distribution of MAPE for the DR
model is flatter and has a higher variance than those of the RF
and LR models.
For more detail, Table 6 compares the model performance
using the ±k-minute error metric. The taxi-time predictions have an
error within 3 minutes for 76.71% of flights with the LR model and
75.65% with the RF model, significantly better
than DR with 58.9%. When the error range is widened to 5
minutes, the difference is bigger: the window covers almost all taxi-times
predicted by LR and RF (95.36% and 95.29%), while DR reaches
only 78.5%. In conclusion, the models using the random
forest (RF) and linear regression (LR) algorithms have
similar performance and are much better than the baseline
model (DR).
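The ±k-minute metric is simply the percentage of departures whose absolute prediction error falls inside the window. A minimal sketch follows; the inclusive boundary, the function name and the sample values are assumptions.

```python
def within_window(actual, predicted, k_minutes):
    """Percentage of flights with |predicted - actual| <= k_minutes."""
    hits = sum(1 for a, p in zip(actual, predicted) if abs(p - a) <= k_minutes)
    return 100.0 * hits / len(actual)


# Hypothetical taxi-times in minutes; absolute errors are 2, 2, 1 and 3.
actual_min = [10.0, 20.0, 10.0, 20.0]
predicted_min = [12.0, 18.0, 9.0, 23.0]
for k in (1, 2, 3):
    print(k, within_window(actual_min, predicted_min, k))
# 1 25.0
# 2 75.0
# 3 100.0
```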
Table 5: Comparison of performance metrics
Performance metric                  LR     RF     DR
Mean Taxi Time Difference (min)     0.15   0.22   -2.8
Mean Absolute Error (min)           2.01   2.07   2.96
Root Mean Square Error (min)        2.52   2.56   3.91
Mean Absolute Percent Error (%)     22.66  23.46  27.55
Figure 9: Distributions of absolute percent error of taxi-time prediction
Table 6: Departures within ±k-minute error window
Error window   LR      RF      DR
± 2-minute     56.30%  56.02%  45.11%
± 3-minute     76.71%  75.65%  58.90%
± 5-minute     95.36%  95.29%  78.75%
Table 7: List of most important features for taxi-time prediction models
Rank  Random Forest          Linear Regression
1     DR                     Fog
2     Traffic Score          Tornado
3     Hour                   Aircraft Type
4     Pressure               Gate - 303
5     Wind Direction         Hail
6     Estimated Option Time  Gate - 202L
7     Day of Week            Runway
8     Wind Speed             Gate - D40R
9     Visibility             Gate - 462R
10    Temperature            Gate - 462L
The most important features are listed in Table 7.
The features are ordered by Gini-importance for the random
forest model and by weight for the linear regression model.
We noticed that although LR performs slightly better
than RF, the feature importance list of RF is more
explainable and more general than that of LR. This is
reasonable because LR tends to stress specific features
related to rare events, whereas RF aggregates the predictions of
decision trees, which can produce more generalized results.
That characteristic is reflected in its list of important
features. The features that most affect the predicted taxi-time
include the estimated time, the traffic scores of options, the hour, the day
of the week and the weather features.
7. CONCLUSION AND FUTURE WORK
7.1 Conclusion
In this work, we have proposed an approach that can both
suggest a taxi-route and predict the corresponding taxi-time.
The taxi-route model is developed considering controller
preferences, which are learned from historical data. In this
approach, we also take advantage of taxiing trajectories to
form the controller's decision options, which not only limits the
potential options but is also more practical. As a result, the
model covers the controller's decision in up to 70% of cases with the
top-1 recommendation and 89% with the top-2 recommendations. The second predictive
model, for taxi-time, can predict the taxi-out time with high
accuracy for a given assigned taxi-route. The MAE is less
than 2.07 minutes for all departure flights and the RMSE is
approximately 2.5 minutes. Moreover, the ±3-minute error
window covers around 76% of departures, while more than
95% of departures are within the ±5-minute error window.
The two machine learning models, RF and LR, show similar
performance in estimating the taxi-time; however, from our
observations, RF provides more stable results and better
interpretability due to its characteristics.
7.2 Future Work
To increase the performance of both models, the
preprocessing step will be investigated with a better map
matching algorithm for standardizing data. More surface
movement data will be collected and analyzed to propose
new features for predictive models.
8. ACKNOWLEDGMENTS
This research is partially supported by the Air Traffic
Management Research Institute (NTU-CAAS) Grant
No. M4062429.059 (Program 4) and Grant
No. M4062429.052 (Program 1).
9. REFERENCES
[1] H. Lee, W. Malik, and Y. C. Jung, “Taxi-out time
prediction for departures at charlotte airport using
machine learning techniques,” in16th AIAA Aviation
Technology, Integration, and Operations Conference,
2016, p. 3910.
[2] S. Corrigan, L. Martensson, A. Kay, S. Okwir, P.
Ulfvengren, and N. McDonald, “Preparing for airport
collaborative decision making (a-cdm)
implementation: an evaluation and recommendations,”
Cognition, Technology & Work, vol. 17, no. 2, pp.
207–218, 2015.
[3] S. Ravizza, J. Chen, J. A. Atkin, P. Stewart, and E. K.
Burke, “Aircraft taxi time prediction: comparisons and
insights,” Applied Soft Computing, vol. 14, pp. 397–
406, 2014.
[4] S. Ravizza, J. A. Atkin, M. H. Maathuis, and E. K.
Burke, “A combined statistical approach and ground
movement model for improving taxi time estimations
at airports, ”Journal of the Operational Research
Society, vol. 64, no. 9, pp. 1347–1360, 2013.
[5] I. Gerdes and A. Temme, “Taxi routing for aircraft:
Creation and controlling,” The Second SESAR
Innovation Days, 2012.
[6] Organisation de l’aviation civile internationale,
Advanced Surface Movement Guidance and Control
Systems (A-SMGCS) Manual. ICAO, 2004.
[7] M. Ester, H.-P. Kriegel, J. Sander, X. Xu et al., “A
density-based algorithm for discovering clusters in
large spatial databases with noise,” in KDD, vol. 96, no.
34, 1996, pp. 226–231.
[8] W. Y. Ochieng, M. A. Quddus, and R. B. Noland,
“Map-matching in complex urban road networks,”
2003.
[9] D. Bernstein, A. Kornhauser et al., “An introduction to
map matching for personal navigation assistants,”
1996.
[10] C. E. White, D. Bernstein, and A. L. Kornhauser,
“Some map matching algorithms for personal
navigation assistants,” Transportation Research Part C:
Emerging Technologies, vol. 8, no. 1-6, pp. 91–108,
2000.
10. COPYRIGHT
The authors confirm that they, and/or their company or
institution, hold the copyright of all original material
included in their paper. They also confirm they have
obtained permission, from the copyright holder of any third-
party material included in their paper, to publish it as part
of their paper. The authors grant full permission for the
publication and distribution of their paper as part of the
EIWAC2019 proceedings or as individual off-prints from
the proceedings.