A new approach for mobile positioning using the
CDR data of cellular networks
(Preprint version submitted to the MT-ITS'17 Conference)
Artjom Lind
Institute of Computer Science,
University of Tartu
Ulikooli 17
Tartu, Estonia
Email: artjom.lind@ut.ee
Amnir Hadachi
Institute of Computer Science,
University of Tartu
Ulikooli 17
Tartu, Estonia
Email: amnir.hadachi@ut.ee
Oleg Batrashev
Institute of Computer Science,
University of Tartu
Ulikooli 17
Tartu, Estonia
Email: olegus@ut.ee
Abstract—Nowadays, mobile devices are equipped with a
number of radio transceivers which are active every day and
everywhere. As a result, vast amounts of data and technical logs
are collected by mobile operators. For this reason, mobile phones
have a great potential for sensing urban and rural mobility
and population displacement. In this article, we
propose a new approach for estimating the location of mobile
subscribers within the coverage area of a mobile network. The
method is based on an enhanced Kalman filter with
integrated mobility models. The algorithm estimates the
location of mobile subscribers knowing only the network
coverage cell to which they are connected. The results are
encouraging and can benefit applications
in intelligent transportation systems and location-based services
that rely on Call Detail Record (CDR) data.
I. INTRODUCTION
Over the last years, the availability of mobile data has
increased [19] and new mobile patterns that showed noticeable
impact on other fields have been revealed. CDRs are sets of
information about telecom transactions that operators use to
generate bills. A single call detail record (CDR) contains a
number of attributes [24], such as information about
the phone itself, the subscriber, the timestamps, the coverage
area ID, etc. Currently, 5% of mobile data traffic
is created by Machine-to-Machine (M2M) connections due to
fast-growing Internet of Things (IoT) and Intelligent
Transportation Systems (ITS) applications [20].
The ITS field demonstrates good potential for research and
applications. It was estimated [20] that the amount of M2M
traffic will grow up to 29% by 2021. However, this type of
data raises a lot of challenges and one of them is the accuracy
of localizing mobile users in mobile networks. A
number of previous studies have concentrated on CDRs and shown
clear success in mining them for human mobility aspects [21]
or base stations’ characteristics [23].
However, the precision of localization is rarely set as a
primary goal. Instead, the focus is on other mobility aspects
that are clearly derivable from CDRs. One attempt was made to roughly
estimate the travel route [22], assuming that a significant
amount of historical CDR data has been collected.
Besides, the use of cellular data, more precisely call detail
records (CDRs), can be very beneficial in saving energy when
using location-based routing protocols. In addition, this type of
data can be the key to urban mobility sensing, since almost
everyone owns a mobile phone and uses it every day
and everywhere. Therefore, CDR data can be very useful for
understanding the dynamics of human mobility patterns [10] and
people’s means of transportation and displacement.
It is clear, based on the literature [11], that mobile cellular
networks and their data can be used as ubiquitous sensors
for real mobility in space and time. In [11] the authors
demonstrated the use of CDR data by creating an algorithm to
infer vehicle travel time on highways and to detect traffic jams
in real time. Their approach focused on the macroscopic
level of traffic.
Another use of CDR data for macroscopic analysis of mobility
is illustrated in [25], where the authors present a
software system capable of estimating multiple aspects of
travel demand based on CDR data in a flexible and efficient
manner. The system includes old and new algorithms
for generating origin-destination matrices and route trips. In
addition, it contains an interactive graphical web
interface for visualizing the results of the algorithms’ analysis
and estimations.
Following the same theme of human mobility analysis,
it is well established that estimating and measuring population
displacement can help considerably in understanding many urban
phenomena. However, the use of CDR data raises many issues
and challenges, which are clearly stated in [26]. The authors
pointed to the existing problems with mobile-phone-based
measures of mobility patterns and localization. In addition,
they described new alternative approaches that can help in
fixing these issues. Furthermore, they provided a variety of
useful cases that can help in understanding human mobility
patterns by using mobile phone data at different levels:
microscopic and macroscopic.
At this point, it is important to note that
positioning mobile users at a microscopic level still constitutes
a big challenge. Localization in mobile networks has been
investigated for a while now [15], [16], [18]. One of the
techniques that has been used is the centralized localization
algorithm that runs on a base station, where all the participating
cells must forward their measurement data to the base
station [1]. The accuracy of this method is acceptable when
the users are not moving, and it requires deployment at the base
station. Other derivatives of this technique exist [2], but their
positioning process is not fast enough.
Some other methods use the signal strength, triangulation
and the belief propagation algorithm for self-positioning [3]. In
addition, various distributed recursive estimation approaches
are used, such as particle filters [5], the Kalman filter
[4] or sequential Monte Carlo [6], since they are suitable for
non-linear cases. In general, all the existing techniques use the
signal information from neighboring cells for self-positioning.
From the same family of approaches, we can mention the
Bayesian inference approach for locating mobiles in cellular
networks [9]. The approach presented in that article uses the
network layout information and shows how to integrate supplementary
knowledge, such as interference ratio measurements
and round-trip time, to localize mobiles. The authors presented
results showing that their method reduces the localization
error by 20%.
Moreover, in [12] the authors gave a performance comparison
of the self-positioning techniques using fixing methods, such
as Received Signal Strength (RSS) Statistics, Least Squares
(LS) [14], Weighted Least Squares (WLS) and Constrained
Weighted Least Squares (CWLS). Their results show that
CWLS is the most adequate for mobile positioning in urban
areas. The CWLS method performs well in dense urban
locations, where there are multiple multi-storied structures.
In addition, neural networks have also been used [13],
[17]. For example, [13] presents an application
of an artificial neural network to increase the accuracy of
cellular mobile subscribers’ positioning. The technique relies
on neural networks to fuse radio location measurements and
the confidence of measurements to refine the positioning accuracy.
However, the approach was oriented toward locating the mobile
station, not the users. Moreover, the experiment was carried
out on simulated data and no real field experiment was
done.
From this perspective, in our study we focus on localization
within the coverage area, relying on historical CDR data or a
dynamic Visitor Location Register (VLR) feed. The latter allows
us to use our algorithm for real-time dynamic positioning
of mobile subscribers on the operator’s side. Location awareness
is important for the cellular network, since many applications
depend on its data for developing environment monitoring,
road traffic management, control and tracking in emergency
situations, etc. Hence, in this article we propose an
algorithm for estimating the exact location of mobile devices
in the cellular network by using an enhanced Kalman Filter.
The proposed method also includes mobility models that the
Kalman Filter uses for estimating the exact location of the
mobile users, plus a coverage optimization technique.
II. PROBLEM STATEMENT
A CDR is generated at the moment of explicit
phone usage (calling, sending an SMS or using packet data). This
means that while the phone is not in use, no
information is collected. Hence, Call Detail Records are sparsely
sampled spatio-temporal data [7]. The collected records
contain the following information (Table I):
TABLE I
TYPE OF INFORMATION CONTAINED IN CDR DATA
Information Type   Description
IMSI               The International Mobile Subscriber Identity
IMEI               The International Mobile Equipment Identity
Cell ID            Mobile coverage cell area identification number
Timestamp          Time or moment when the event was recorded
Event type         The nature of the action performed that triggered the event
Therefore, the trajectories extracted are a set of cell IDs,
which reflect a coverage zone location of the mobile network
in a chronological order (Fig. 1).
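As an illustration of how such a trajectory can be assembled, the following minimal Python sketch models a record with the fields of Table I and orders one subscriber's events by time. The class name `CdrRecord`, the function `cell_trajectory`, the field types and the sample values are assumptions made for this example; they do not reflect the operator's actual schema.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class CdrRecord:
    """One CDR with the fields of Table I (names and types are assumed)."""
    imsi: str            # subscriber identity
    imei: str            # equipment identity
    cell_id: int         # coverage cell identifier
    timestamp: datetime  # when the event was recorded
    event_type: str      # e.g. "call", "sms", "data" (hypothetical labels)


def cell_trajectory(records, imsi):
    """Return the chronological list of (timestamp, cell_id) for one subscriber."""
    own = sorted((r for r in records if r.imsi == imsi), key=lambda r: r.timestamp)
    return [(r.timestamp, r.cell_id) for r in own]


# Made-up sample data: two events of subscriber "A" and one of subscriber "B".
sample = [
    CdrRecord("A", "e1", 20, datetime(2017, 1, 1, 12, 0), "sms"),
    CdrRecord("A", "e1", 10, datetime(2017, 1, 1, 9, 0), "call"),
    CdrRecord("B", "e2", 30, datetime(2017, 1, 1, 10, 0), "data"),
]
trajectory_a = cell_trajectory(sample, "A")  # cells 10 then 20, in time order
```

The result is only a sequence of cell IDs with timestamps, which is why the position inside each cell still has to be estimated.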
The Cell ID number is used to identify the base transceiver
stations (BTS) or sectors within a BTS. Each base transceiver
sector covers an area, hence the name "coverage area". In
our study we identify coverage areas by the IDs of the
corresponding BTS sectors.
As a result, it is very challenging to know whether the user
is moving or not and also to estimate his location within the
coverage zone. Hence, our main objective is to try to position
the mobile user within the mobile network coverage area by
using CDR data and pre-defined mobility models.
Fig. 1. Illustration of Extracted Trajectory from CDR data - Polygons are the
CDR Trajectory - dots are real GPS trajectory
III. METHODOLOGY
Determining the exact location of a mobile subscriber within
the mobile network coverage area is very challenging when
only CDR data is used. In addition, the literature
discussed in the previous section shows a lack of work on the
use of CDR data to extract the exact location in the mobile
network.
Before describing the methodology adopted, we start by
describing the data used and its purpose. We used three
sources of data for three purposes. First, we collected
GPS data using a customized mobile application capable
of simultaneously recording the CellID to which the phone is
connected and the GPS locations. The information collected
gives the location of the phone based on the GPS
together with the CellID, or coverage area, that the phone is using.
This data was used in the process of optimizing the coverage
area (described in Section III-A), since we noticed that in some
cases the phone was located outside of the coverage area while
still connected to it. The second source is the CDR data, which was
used with our proposed enhanced Kalman filter algorithm for
estimating the location of the phone. Third, in order to evaluate the
outcome of the algorithm, we created a mobile application to
collect GPS data from the mobile phones, and this data was
used as ground truth to evaluate our estimations.
A. Coverage Optimization
The coverage area information in the CDR data
provided by the mobile operators does not reflect the real
coverage zone. A subscriber can still be connected to a
transceiver (BTS) with a certain Cell ID while
being outside of the coverage area declared by the
operator for this BTS. This clearly indicates a need for updating
the coverage area information for this BTS. Therefore, we
propose a solution that enhances the coverage representation by
coupling the GPS data with the cell events.
For this purpose, we defined a function $f(u)$ that penalizes
large distances $d(x, r, x_g)$ between the cell coverage circle
$(x, r)$, which is defined by its center $x$ and radius $r$, and the
GPS coordinates $x_g$ at the time the cell event occurred.
The coverage is then optimized by minimizing the penalty function $f(u)$.
Moreover, we took into account that the coverage area should
not exceed the area that is defined by the GPS. Hence,
the function also penalizes a large coverage radius extension $u_i$. The
penalty function is defined as follows.
Let $C$ be the set of all pairs $(j, x_g)$ of a cell index and its
event GPS coordinates. Then
$$f(u) = \sum_i u_i^2 + w \sum_{(j, x_g) \in C} \left[\min\left(0, d(x_j, r_j, x_g)\right)\right]^2, \qquad (1)$$
where $u_i$ is the radius extension of cell $i$, $d(x, r, x_g) = r - |x - x_g|$ and
we defined $w = 10$ as the weight for the non-coverage penalty. We
used the scipy implementation of the L-BFGS-B algorithm
[8] to minimize the coverage function.
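The optimization of Eq. (1) can be sketched with `scipy.optimize.minimize` as follows. This is a minimal illustration, not the authors' code: it assumes that the optimized radius is the operator-declared radius plus the extension, $r_j = r_j^{0} + u_j$, that extensions are non-negative, and it uses made-up planar coordinates in kilometers. The names `penalty`, `centers`, `base_radii` and `events` are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize

W = 10.0  # weight of the non-coverage penalty, w = 10 as in Eq. (1)


def penalty(u, centers, base_radii, events):
    """f(u) from Eq. (1); `events` is a list of (cell index j, GPS point x_g)."""
    radii = base_radii + u            # assumption: r_j = declared radius + u_j
    total = float(np.sum(u ** 2))     # penalize large radius extensions
    for j, xg in events:
        d = radii[j] - np.linalg.norm(centers[j] - xg)  # d = r - |x - x_g|
        total += W * min(0.0, d) ** 2  # penalize only events outside the circle
    return total


# Made-up example: two cells of radius 1 km.
centers = np.array([[0.0, 0.0], [5.0, 0.0]])
base_radii = np.array([1.0, 1.0])
events = [
    (0, np.array([2.0, 0.0])),  # 1 km outside the declared circle of cell 0
    (1, np.array([5.0, 0.5])),  # inside the declared circle of cell 1
]

result = minimize(
    penalty,
    x0=np.zeros(len(base_radii)),
    args=(centers, base_radii, events),
    method="L-BFGS-B",
    bounds=[(0.0, None)] * len(base_radii),  # assumption: extensions are >= 0
)
u_opt = result.x  # approximately [10/11, 0] for this toy input
```

For this toy input the minimum can be checked by hand: cell 0 minimizes $u^2 + 10(u - 1)^2$, which gives $u = 10/11 \approx 0.91$ km, while cell 1 needs no extension.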
B. Localization estimation in the mobile network
In general, the Kalman Filter (KF) permits only a single
transition matrix at each step $t$. Therefore, the classic Kalman filter is
good at predicting within the scope of one behavior model, for
example stable directed movement or standing. However, in
reality there are many types of behavior to select from, and
the behavior can change multiple times within one trajectory.
Thus, in order to capture this missing mobility
behavior aspect, we add discrete random variables to the
Kalman Filter. This results in the adaptive Kalman Filter [4].
Our discrete random variable $S_t$ defines the model
used for the transition to step $t$. The KF then calculates the
probability $M$ of each model at time $t$, given either the evidence
up to time $t$ or all the evidence, as well as a probability distribution of
the hidden state variable associated with each model:
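$$M_{t|t}(i) = P(S_t = i \mid x_{g,1:t}) \qquad (2)$$
$$M_{t|T}(i) = P(S_t = i \mid x_{g,1:T}) \qquad (3)$$
The consolidated belief state of the hidden variable $X_t$ at time $t$
is represented as the mixture of the Gaussians of all models, scaled
by the probabilities of the models:
$$P(X_t \mid x_{g,1:\tau}) = \sum_i M_{t|\tau}(i) \cdot P(X_t \mid S_t = i, x_{g,1:\tau}) \qquad (4)$$
and
$$P(X_t \mid S_t = i, x_{g,1:\tau}) = \mathcal{N}(\mu^{i}_{t|\tau}, \Sigma^{i}_{t|\tau}), \quad \tau \in \{t, T\} \qquad (5)$$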
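To read a single position estimate off the mixture in Eq. (4), one common option is to moment-match the mixture to one Gaussian. The sketch below shows this standard technique with NumPy; whether the authors collapse the mixture in exactly this way is not stated in the paper, so the function `collapse_mixture` and its inputs should be taken as an assumption.

```python
import numpy as np


def collapse_mixture(weights, means, covs):
    """Moment-match a Gaussian mixture to a single mean and covariance.

    weights: model probabilities M(i), shape (n,), summing to 1
    means:   per-model means mu^i, shape (n, d)
    covs:    per-model covariances Sigma^i, shape (n, d, d)
    """
    weights = np.asarray(weights, dtype=float)
    means = np.asarray(means, dtype=float)
    covs = np.asarray(covs, dtype=float)
    mean = np.einsum("i,ij->j", weights, means)
    diffs = means - mean
    # Within-model covariance plus the spread of the model means.
    cov = np.einsum("i,ijk->jk", weights, covs) + np.einsum(
        "i,ij,ik->jk", weights, diffs, diffs
    )
    return mean, cov


# Two equally likely models with unit covariance, 2 units apart on the first axis.
mean, cov = collapse_mixture(
    [0.5, 0.5],
    [[0.0, 0.0], [2.0, 0.0]],
    [np.eye(2), np.eye(2)],
)
```

In this toy case the collapsed mean lies midway between the two model means, and the variance along the first axis grows from 1 to 2 because the models disagree along that axis.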
Finally, it is necessary to define the model transition probability
matrix $T_z(i, j) = P(S_t = j \mid S_{t-1} = i)$. We define the
transition probability from $S_{t-1} = i$ to $S_t = j$ with a higher
chance of staying in the same model:
$$T_z(i, j) = \begin{cases} 0.8 & \text{if } i = j \\ \dfrac{0.2}{Num - 1} & \text{otherwise} \end{cases} \qquad (6)$$
where $Num$ is the number of models. Besides, one of the
issues with the KF is the exponential growth of the belief state
due to the multiplication of the number of Gaussians by
the number of models at each step. For this reason, filtering
and smoothing are a necessity in the process, and they are
computed using predefined models of behavior.
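The transition matrix of Eq. (6) is straightforward to build for any number of models. The following short sketch is only illustrative; the function name `model_transition_matrix` and its parameter names are chosen for this example.

```python
import numpy as np


def model_transition_matrix(num_models, p_same=0.8):
    """Build T_z from Eq. (6): p_same on the diagonal, the rest shared equally."""
    if num_models < 2:
        raise ValueError("Eq. (6) needs at least two models")
    off_diagonal = (1.0 - p_same) / (num_models - 1)
    T = np.full((num_models, num_models), off_diagonal)
    np.fill_diagonal(T, p_same)
    return T


T_two = model_transition_matrix(2)    # Stay/Move: [[0.8, 0.2], [0.2, 0.8]]
T_three = model_transition_matrix(3)  # off-diagonal entries are 0.1
```

Each row sums to one, so every row is a valid probability distribution over the next model.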
C. Mobility Models
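Taking location $\bar{x}_t$ and velocity $\vartheta_t$ at time $t$ as hidden
variables $x_t = \begin{bmatrix} \bar{x}_t \\ \vartheta_t \end{bmatrix}$ and specifying that a moving user’s
coordinates and velocities satisfy the following equations:
$$\bar{x}_t = \bar{x}_{t-1} + \vartheta_{t-1}\,\delta t + \bar{q}_t \qquad (7)$$
$$\vartheta_t = \vartheta_{t-1} + \dot{q}_t \qquad (8)$$
where $\delta t$ is the time difference from the previous event
resulting from the Bayes Network approach and $\bar{q}_t$, $\dot{q}_t$ are
noise. In general, the KF equations are as follows:
$$x_t = F x_{t-1} + Q_t \qquad (9)$$
$$x_{g,t} = H x_t + R_t, \qquad (10)$$
hence, for each model we have to define a transition matrix
$F$ and a noise variance matrix, $Q_t \sim \mathcal{N}(0, Q^{(M)}_t)$. For
instance, in the case of a user moving on a plain 2D map, the
transition matrix (Move model) is:
$$F^{(M)} = \begin{bmatrix} 1 & 0 & \delta t & 0 \\ 0 & 1 & 0 & \delta t \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \qquad (11)$$
Whereas a user staying at the same location can be characterized
by an identity matrix $F^{(S)} = I$ (Stay model):
$$F^{(S)} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \qquad (12)$$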
Meanwhile, the observation model ($H$ and $R_t$) should
reflect the location of the antenna that the user is connected to,
and the coverage zone of the antenna expresses the observation
error. During testing, all the "antennas" will have the same
model:
$$H = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{bmatrix}, \qquad R_t = \begin{bmatrix} 1.2^2 & 0 \\ 0 & 1.2^2 \end{bmatrix}. \qquad (13)$$
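The matrices of Eqs. (11) to (13) plug directly into the textbook Kalman predict and update equations. The sketch below shows one such step for a single model. The process noise covariance `Q` is not given numerically in the paper, so the value used in the example is a placeholder, and `kf_step` is a generic illustration rather than the authors' implementation.

```python
import numpy as np


def move_matrix(dt):
    """Transition matrix F^(M) of Eq. (11) for a time difference dt."""
    return np.array([
        [1.0, 0.0, dt, 0.0],
        [0.0, 1.0, 0.0, dt],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ])


F_STAY = np.eye(4)                      # Stay model, Eq. (12)
H = np.array([[1.0, 0.0, 0.0, 0.0],
              [0.0, 1.0, 0.0, 0.0]])    # observe position only, Eq. (13)
R = np.diag([1.2 ** 2, 1.2 ** 2])       # observation noise, Eq. (13)


def kf_step(mu, sigma, z, F, Q):
    """One standard Kalman predict + update step for a single model."""
    # Predict the state forward with the model's transition matrix.
    mu_pred = F @ mu
    sigma_pred = F @ sigma @ F.T + Q
    # Correct with the observed cell location z.
    innovation = z - H @ mu_pred
    S = H @ sigma_pred @ H.T + R
    K = sigma_pred @ H.T @ np.linalg.inv(S)
    mu_new = mu_pred + K @ innovation
    sigma_new = (np.eye(len(mu)) - K @ H) @ sigma_pred
    return mu_new, sigma_new


# Toy example: user at the origin moving 1 unit per time step along x, dt = 2.
Q_PLACEHOLDER = 0.01 * np.eye(4)        # assumed value, not from the paper
mu0 = np.array([0.0, 0.0, 1.0, 0.0])
mu1, sigma1 = kf_step(mu0, np.eye(4), np.array([2.0, 0.0]),
                      move_matrix(2.0), Q_PLACEHOLDER)
```

Because the observation in this example coincides with the predicted position, the update leaves the mean at the prediction and only shrinks the covariance.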
As a consequence, the algorithm computes the probabilities
$P(S_t = k \mid x_{g,1:t})$ of each model $k$ at time $t$ and the probability
distribution of the coordinate and the velocity
$P(X_t \mid S_t = k, x_{g,1:t})$ for the given up-to-date evidence $x_{g,1:t}$. Then, the same
process is applied to all the evidence $x_{g,1:T}$, which gives
smoothed results that are more accurate and are therefore used
in the actual testing and validation (Fig. 2).
Fig. 2. Coordinates and probabilities for Stay and Move models
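The filtered model probabilities $P(S_t = k \mid x_{g,1:t})$ can be illustrated with a simplified forward step: the previous probabilities are propagated through the transition matrix of Eq. (6) and reweighted by how well each model explains the new observation. This is a deliberately reduced sketch. A full switching Kalman filter conditions the likelihood on pairs of previous and current models, which is omitted here, and the helper names are hypothetical.

```python
import numpy as np


def gaussian_likelihood(innovation, S):
    """Density of a zero-mean Gaussian with covariance S evaluated at `innovation`."""
    k = innovation.shape[0]
    norm = np.sqrt(((2.0 * np.pi) ** k) * np.linalg.det(S))
    exponent = -0.5 * innovation @ np.linalg.solve(S, innovation)
    return float(np.exp(exponent) / norm)


def update_model_probs(prev_probs, transition, likelihoods):
    """Forward step: predict model probabilities, reweight by likelihood, normalize."""
    predicted = prev_probs @ transition   # sum_i P(S_{t-1}=i) * T_z(i, j)
    unnormalized = predicted * likelihoods
    return unnormalized / unnormalized.sum()


# Toy example with Stay and Move models and made-up likelihoods.
T_z = np.array([[0.8, 0.2], [0.2, 0.8]])
prior = np.array([0.5, 0.5])
likelihoods = np.array([0.9, 0.1])        # Stay explains the observation better
posterior = update_model_probs(prior, T_z, likelihoods)
```

In practice the likelihood of each model would come from `gaussian_likelihood` applied to that model's innovation and innovation covariance in the Kalman update.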
IV. EXPERIMENTATION AND TESTING
A. Data
During the testing process, we use the GPS data
collected as described in Section III as ground truth for evaluating
the algorithm. The Kalman filter uses the CDR data for
estimating the subscribers’ locations.
In addition, we made sure that the CDR data corresponds
exactly to the time period and users from whom we collected
the GPS data (ground truth), so that we can compare
the locations estimated by our algorithm against it and evaluate
the algorithm’s performance.
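A comparison of this kind reduces to computing a distance between each estimated position and its GPS ground truth and then summarizing the errors. The paper does not state which distance formula was used; the sketch below assumes the haversine great-circle distance on a spherical Earth, and the function names are chosen for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def error_summary(estimates, ground_truth):
    """Return (min, max, mean) error in km over paired (lat, lon) points."""
    errors = [haversine_km(*est, *gps) for est, gps in zip(estimates, ground_truth)]
    return min(errors), max(errors), sum(errors) / len(errors)


# Made-up pairs: one perfect estimate and one that is off by one degree of latitude.
summary = error_summary([(0.0, 0.0), (0.0, 0.0)], [(0.0, 0.0), (1.0, 0.0)])
```

The minimum, maximum and mean returned here correspond to the kind of summary statistics reported per model in the results tables.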
B. Results and discussion
The testing of our method was carried out using GPS data
as reference and 271 CDR records from different users. Then,
we compared the estimated positions given by our algorithm
to the real GPS data. For example, Fig. 3 shows a
case where the mobile user is traveling by train. The triangles
represent his/her GPS data during the trip and the circles are
the positions estimated using our method. In addition, every
circle is linked to its corresponding ground truth GPS point
by a segment. The illustration demonstrates interesting results,
considering that the only input to the proposed algorithm was
the mobile network coverage areas
generated by the mobile user. The second example, in Fig. 4,
is a user walking around and stopping from time to time in
the city center. We can clearly see that the algorithm is
capable of getting close to the area where the user is located.
Fig. 3. The algorithm output for the case of a user traveling by train: the
polygons are the optimized mobile network coverage; the circle dots are
the algorithm’s estimations (Estimated Locations) and the triangles are the
GPS data (Real Positions).
For further analysis, we took a first look at the results by
checking the estimation error, comparing the
estimated locations with the real locations provided by the GPS
data. Table II gives the overall output of the estimation
without any coverage optimization.
The algorithm produces an average error of 0.9 kilometers in
the stay model and 1.9 kilometers in the move model.
However, when the coverage optimization is applied to the
algorithm, the results improve (Table III). The algorithm
produces an average error of 0.4 kilometers in the stay model
and 1.3 kilometers in the move model. Thus, the coverage
optimization step has a positive impact on the algorithm’s
estimation. Moreover, the algorithm is better at
estimating the location when the subjects are staying in a
specific location than when they are moving.
Fig. 4. The algorithm output for the case of a user walking and stopping from
time to time: the circle dots are the algorithm’s estimations (Estimated Locations)
and the triangles are the GPS data (Real Positions; red: stay location, green:
walking).
TABLE II
PERFORMANCE OF THE PROPOSED ALGORITHM WITHOUT COVERAGE OPTIMIZATION
Model Type          Estimation Error (km)         Average (km)   Number of CDR Records
Stay Model          [Min: 0.002; Max: 13.543]     0.931          147
Move Model          [Min: 0.027; Max: 13.061]     1.946          124
Total Performance   [Min: 0.002; Max: 13.543]     1.438          271
TABLE III
PERFORMANCE OF THE PROPOSED ALGORITHM WITH COVERAGE OPTIMIZATION
Model Type          Estimation Error (km)         Average (km)   Number of CDR Records
Stay Model          [Min: 0.001; Max: 4.958]      0.432          147
Move Model          [Min: 0.008; Max: 10.749]     1.279          124
Total Performance   [Min: 0.001; Max: 10.749]     0.892          271
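To make the effect of the coverage optimization easier to compare across models, the average errors reported in Tables II and III can be turned into relative reductions. The short calculation below uses only the averages printed in the two tables; the dictionary names are chosen for this example.

```python
# Average errors in km, copied from Table II (without) and Table III (with optimization).
before = {"stay": 0.931, "move": 1.946, "total": 1.438}
after = {"stay": 0.432, "move": 1.279, "total": 0.892}

# Relative reduction of the average error, in percent.
reduction = {
    model: 100.0 * (before[model] - after[model]) / before[model]
    for model in before
}

for model, percent in reduction.items():
    print(f"{model}: {percent:.1f}% lower average error")
# stay: 53.6% lower average error
# move: 34.3% lower average error
# total: 38.0% lower average error
```

By this measure the stay model benefits most from the optimized coverage, which is consistent with the observation that the algorithm performs better for stationary users.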
V. CONCLUSION
In this paper, we have presented a method capable of
estimating the location of mobile users within the cell coverage
area by using only CDR data.
The algorithm uses the centroids of the coverage areas and
mobility models in order to estimate the positions of the
mobile users. These models allow the algorithm to simulate
the movement of the users from the observed CDR data.
The results are encouraging and motivate further investigation aimed
at enhancing the algorithm. In addition, the current results can
be very useful for applications in intelligent transportation
systems or location-based services.
ACKNOWLEDGMENT
The authors gratefully acknowledge the contribution of the
Software Technology and Applications Competence Centre
(STACC) through the Large-scale Mobile Positioning Data Mining
(Demograft) project and all the partners in the Archimedes project
"The Real-time Location-based Big Data Algorithms" for their
help in providing the data.
This research was supported by the IUT34-4 "Data Science Methods
and Applications" (DSMA) project and the European
Regional Development Fund through the Estonian Centre of
Excellence in IT (EXCITE).
REFERENCES
[1] Mao, G., Fidan, B., Anderson, B.D.O.,”Wireless sensor network local-
ization techniques”, in the Computer Networks journal, Vol. 51(10), pp.
2529-2553, 2007.
[2] Kusý, B., Sallai, J., Balogh, G., Lédeczi, A., Protopopescu, V., Tolliver,
J., DeNap, F., Parang, M., ”Radio interferometric tracking of mobile
wireless nodes”, in Proc. of MobiSys, 2007, pp. 139-151.
[3] F. Meyer, O. Hlinka, and F. Hlawatsch, ”Sigma point belief propagation,”
in the IEEE Signal Process. Lett., vol. 21, pp. 145-149, Feb. 2014.
[4] Feng Xiao, Mingyu Song, Xin Guo, et al, ”Adaptive Kalman filtering for
target tracking”, in the China Ocean Acoustics (COA) Conference, 2016.
[5] Carsten Fritsche and Anja Klein ”On the performance of mobile terminal
tracking in urban GSM networks using particle filters”, in The 17th
European Signal Processing Conference, 2009.
[6] Salke Hartung; Ansgar Kellner; Konrad Rieck; Dieter Hogrefe, ”Monte
Carlo Localization for path-based mobility in mobile wireless sensor
networks”, in The IEEE Wireless Communications and Networking
Conference (WCNC), 2016.
[7] Ficek M.; Kencl L, ”Inter-call mobility model: A spatio-temporal re-
finement of call data records using a gaussian mixture model”, in the
Proceedings IEEE INFOCOM, 2012, pp. 469-477.
[8] J.L. Morales and J. Nocedal, ”L-bfgs-b: Remark on algorithm 778: L-
bfgs-b, fortran routines for large scale bound constrained optimization”,
in the ACM Transactions on Mathematical Software Journal, Vol. 38(1),
2011.
[9] H. Zang ; F. Baccelli ; J. Bolot, ” Bayesian Inference for Localization in
Cellular Networks”, in the IEEE Proceedings INFOCOM, 2010 .
[10] Shan Jiang, Joseph Ferreira, Marta C. Gonzalez, ”Activity-Based Human
Mobility Patterns Inferred from Mobile Phone Data: A Case Study of
Singapore”,in the IEEE Transactions on Big Data Journal, Vol:PP, Issue:
99, 2016 .
[11] Andreas Janecek, Danilo Valerio, Karin Anna Hummel, Fabio Ricciato,
Helmut Hlavacs, ”The Cellular Network as a Sensor: From Mobile Phone
Data to Real-Time Road Traffic Monitoring”, in the IEEE Transactions
on Intelligent Transportation Systems Journal, Vol: 16, Issue: 5, 2015.
[12] D. Krishna Reddy, A.D. Sarma and V. Satya Srinivas, ”Mobile Position
Estimation with RSS Based Techniques in an Urban City with Multiple
Multi-storied Structures”, in the Radio Science Meeting (Joint with AP-S
Symposium), 2014.
[13] S. Merigeault ; M. Batariere ; J.N. Patillon, ”Data fusion based on
neural network for the mobile subscriber location”, in the IEEE Vehicular
Technology Conference, 2000.
[14] Chin-Liang Wang; Dong-Shing Wu; Shih-Cheng Chen; Kai-Jie Yang,
”A Decentralized Positioning Scheme Based on Recursive Weighted
Least Squares Optimization for Wireless Sensor Networks”, in the IEEE
Transactions on Vehicular Technology, 2015.
[15] Isaac Amundson and Xenofon D. Koutsoukos, ”A Survey on Local-
ization for Mobile Wireless Sensor Networks ”, in Proc of the Second
International Workshop, 2009, pp 235-254.
[16] Lyudmila Mihaylova, Donka Angelova, David R. Bull, and Nishan
Canagarajah, ”Localization of Mobile Nodes in Wireless Networks with
Correlated in Time Measurement Noise”, in the IEEE Transactions on
Mobile Computing, Volume 10, No. 1, JANUARY 2011.
[17] Mohammad Shaifur Rahman, Youngil Park, and Ki-Doo
Kim,”Localization of Wireless Sensor Network using artificial neural
network”, in the 9th International Symposium on Communications and
Information Technology, ISCIT, 2009.
[18] Fumio Teraoka, Tetsuya Arita, ”PNEMO: a Network-Based Localized
Mobility Management Protocol for Mobile Networks”, in the Third
International Conference on Ubiquitous and Future Networks (ICUFN),
2011.
[19] Jin, Yu and Duffield, Nick and Gerber, Alexandre and Haffner, Patrick
and Hsu, Wen-Ling and Jacobson, Guy and Sen, Subhabrata and
Venkataraman, Shobha and Zhang, Zhi-Li, ”Characterizing Data Usage
Patterns in a Large Cellular Network”, in the Proc. of the ACM SIG-
COMM Workshop on Cellular Networks: Operations, Challenges, and
Future Design, 2012.
[20] Cisco and/or its affiliates, ”Cisco Visual Networking Index (tm), Global
Mobile Data Traffic Forecast (2016 to 2021)”, White paper, Cisco Public,
2017.
[21] Furletti, Barbara and Gabrielli, Lorenzo and Renso, Chiara and
Rinzivillo, Salvatore, ”Identifying Users Profiles from Mobile Calls
Habits”, in the Proceedings of the ACM SIGKDD International Workshop
on Urban Computing, 2012.
[22] H. Kanasugi and Y. Sekimoto and M. Kurokawa and T. Watanabe
and S. Muramatsu and R. Shibasaki, ”Spatiotemporal route estimation
consistent with human mobility using cellular network data”, in the IEEE
International Conference on Pervasive Computing and Communications
Workshops (PERCOM Workshops), 2013.
[23] S. Zhang and D. Yin and Y. Zhang and W. Zhou, ”Computing on Base
Station Behavior Using Erlang Measurement and Call Detail Record”,in
the IEEE Transactions on Emerging Topics in Computing Journal, 2015.
[24] International Telecommunication Union, ”Specification of TMN ap-
plications at the Q3 interface: Call detail recording”, in the ITU-T
Recommendation Q.825,1998.
[25] J. L. Toole, S. Colak, B. Sturt, P. A. Lauren, A. Evsukoff, M.C.
Gonzalez, ”The path most traveled: Travel demand estimation using big
data resources”, in Transportation Research Part C Journal, Vol. 58, pp.
162-177, 2015.
[26] N.E. Williams, T.A. Thomas, M. Dunbar, N. Eagle, A. Dobra, ”Measures
of Human Mobility Using Mobile Phone Records Enhanced with GIS
Data”, in PLoS ONE Journal, Vol. 10(7),2015.
... This means that we have only temporal events. By combining this information with the geographical shape of the coverage area, the data is transformed into spatiotemporal data [25]. Then, we are able to construct geographical trajectories in which the data is very sparse in time and space ( Figure 2) due to the network protocols, as explained previously in sections:II-C,II-B. ...
... This renders positioning unreliable. Therefore, the accuracy of positioning is an issue when using this type of data [25]. As a consequence, all proposed methods in the literature have considerable variance in terms of travel time observations. ...
... Another approach would be to try to use data for developing and exploring a new way of positioning mobile devices within coverage areas by introducing probabilistic and statistical approaches as it was illustrated in [25], [61]. The presented method showed that it is possible to estimate the exact location using only location area data. ...
Article
Full-text available
The fast development of information and communication technologies has engendered a massive generation of mobile data with great potential for extracting mobility patterns and sensing urban dynamics. The data, itself, has unique characteristics depending on the technology and the network coverage design for different zones (urban-rural). Therefore, many research directions investigated the potential of such data to extract some important features of traffic information, such as travel time estimation, traffic flow, etc. From this perspective, our paper focuses on highlighting the advancements done during the last twenty years concerning the usage of mobile phone network data to estimate vehicles' travel time in different geographical conditions. Furthermore, it discusses and covers the existing literature methods and the challenges they face due to the nature of the data, its quality, and its complexity. Finally, we reveal the main trends and directions for future work.
... Then, the next aspect that needs attention is the reduction of spatial uncertainties by estimating the position of the user within the cell-plan. The only study we have found in this area is from Hadachi, Lind et al. [8] [9] where the authors use a version of the Kalman Filter called the Switching Kalman filter with smoothing. The model tries to extract the movement patterns using data-driven exploration and label each record as a Stay, Move, or Jump position. ...
... In [11] it is stated that the linear approaches should satisfy the needs of modelling the human behaviour without the use of complex non-linear models. Hence, we will explore if Particle Filter as a non-linear approach is capable of outperform linear approach such as Switching Kalman filter [8] in positioning and trajectory reconstruction. ...
... The error distribution When it comes to similar applications we already mentioned that there is not extensive related works. However, the authors in [9] [8], have taken the same problem under the consideration and used the Switching Kalman Filter approach to solve it. We have received access to the dataset they used in their study and the results acquired. ...
Conference Paper
Full-text available
Mobile positioning is a key element in many geolocation applications and research fields about human mobility patterns, location-based services, targeted marketing, urban mobility, public health, and transport planning. The commonly used data for understanding the large scale mobility patterns are mobile network data or Call Detail Records (CDRs). However, CDR data has two major drawbacks: temporal and spatial uncertainties and sparseness. Although the first problem is widely covered by trajectory reconstruction techniques, the second problem still remains challenging. Hence, in this paper, we propose the adaptation of a new method based on a particle filter algorithm for mobile positioning and trajectory reconstruction. Our goal is to evaluate if this nonlinear method can out-perform the existent linear methods like Switching Kalman Filter. Therefore, the model performance and the effects of the parameters on accuracy were evaluated in controlled experimental settings. Additionally, the experiments were performed on a real dataset and compared with the results achieved by a linear approach. Source code: https://github.com/salijona/pfLoc
... Earlier, we proposed a mobile positioning method applying Switching Kalman Filter (SKF) on CDR data [14]. Our current research is focused on understanding if the cell coverage approximation method used in SKF based positioning causes more significant errors. ...
... Later, Batrashev et al. demonstrated mobility episode detection based on the multimodal Kalman filter [26]. As a result, Hadachi et al. and later focused explicitly on the subscriber's trajectory reconstruction and positioning [14]. Dyrmishi et al. re-evaluated the method against an approach based on particle filter [27]. ...
... Previously in our research, we proposed a method to estimate subscribers' location within the coverage area of the cellular network [14] using mobile network data. The method showed promising results yet left many open questions. ...
Preprint
Mobile subscriber positioning is an essential feature for multiple applications such as navigation, geo-information systems, and location-based services. Having accurate information about the subscriber's location implicitly affects customer satisfaction. Although satellite-based positioning shows remarkable accuracy on the end-user device, the casual deviations and drift are still present. One of the options to supplement satellite-based positioning is to use cellular network-based positioning. Our studies focus on the critical aspect affecting the accuracy of positioning based on mobile cellular network data. In particular, we illustrate how the coverage area model can affect positioning accuracy. Besides, we propose a robust cell coverage modeling technique that improves mobile positioning accuracy based on mobile network data. This paper has been accepted for publication in ICL-GNSS 2021.
... The numbers of observations for the VPs whose radius is above 8 km are low and the average over these VPs is presented as the last point on both charts. ... varies from several hundred meters to 1-2 km (Forghani, Karimipour, and Claramunt, 2020; Horn, Klampfl, Cik, and Reiter, 2014; Lind, Hadachi, and Batrashev, 2017), while the estimates of the total traversed distance are closer, and the difference does not exceed 20% (Mohamed, Aly, and Youssef, 2017). The closest to ours, the study by Ulm, Widhalm, and Brändle (2015) compares GPS and cell tower location data for 250 volunteers, available from the OpenCellID project (https://opencellid.org/) that collects data on cell tower connections together with the GPS positions. ...
Article
The partition of the Mobile Phone Network (MPN) service area into the cell towers' Voronoi polygons (VP) may serve as a coordinate system for representing the location of the mobile phone devices, as demonstrated by numerous papers that exploit mobile phone data for studying human spatial mobility. In these studies, the user is assumed to be located inside the VP of the connected antenna. We investigate the credibility of this view by comparing volunteers' empirical data of two kinds: (1) VP of the connected 3G and 4G cell towers and (2) GPS tracks of these users at the time of connection. In more than 60% of connections, the user's mobile device was found outside the VP of the connected cell tower. We demonstrate that the area of a possible device's location is many times larger than the area of the cell tower's VP. To comprise 90% of the possible locations of the device connected to a specific cell tower, one has to consider the tower's VP together with the two adjacent rings of VPs. An additional, third, ring of the adjacent VPs is necessary to comprise 95% of the possible locations of the device connected to the cell tower. The revealed location uncertainty is in the nature of the MPN structure and service and entails essential overlap between the cell towers' service areas. We discuss the far-reaching consequences of this uncertainty for estimating locational privacy and urban mobility - population flows and individual trajectories. Our results undermine today's dominant opinion that an adversary, who obtains access to the database of the Call Detail Records maintained by the MPN operator, can identify a mobile device without knowing its number based on a very short sequence of time-stamped field observations of the user's connection.
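The containment question raised in this abstract can be illustrated with a minimal sketch. It relies only on the defining property of a Voronoi partition: a point lies inside a tower's VP exactly when that tower is the nearest one. The tower coordinates, sample points, and function names below are hypothetical and use planar coordinates; this is not the authors' evaluation code.

```python
import math

def nearest_tower(point, towers):
    """Return the id of the tower closest to `point` (planar coordinates)."""
    return min(towers, key=lambda tid: math.dist(point, towers[tid]))

def inside_connected_vp(point, connected_id, towers):
    """A point is inside a tower's Voronoi polygon iff that tower is the nearest one."""
    return nearest_tower(point, towers) == connected_id

# Hypothetical tower positions and (GPS point, connected tower) samples.
towers = {"A": (0.0, 0.0), "B": (4.0, 0.0), "C": (0.0, 4.0)}
samples = [((1.0, 1.0), "A"), ((3.0, 0.5), "A"), ((0.5, 3.5), "C")]

# Flag every connection whose GPS point falls outside the connected tower's VP.
outside = [not inside_connected_vp(p, cid, towers) for p, cid in samples]
share_outside = sum(outside) / len(samples)
```

With real data the same check would need projected coordinates (or a geodesic distance) instead of the plain Euclidean distance assumed here.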
... A recent and more extensive study by Pappalardo (2021) reports larger differences between the CDR- and GPS-based estimates of the home location, which vary, depending on the location, between 1 and 5 km and, just as observed in our study, the antenna that serves the connection is not necessarily the closest one. The average difference between CDR- and GPS-based trajectories varies from several hundred meters to 1-2 km (Horn et al., 2014; Lind et al., 2017), while the estimates of the total traversed distance are closer, and the difference does not exceed 20% (Mohamed et al., 2017). ...
Preprint
The partition of the Mobile Phone Network (MPN) service area into the cell towers' Voronoi polygons (VP) may serve as a coordinate system for representing the location of the mobile phone devices. This view is shared by numerous papers that exploit mobile phone data for studying human spatial mobility. We investigate the credibility of this view by comparing volunteers' locational data of two kinds: (1) the cell towers that served volunteers' connections and (2) the GPS tracks of the users at the time of connection. In more than 60% of connections, the user's mobile device was found outside the VP of the cell tower that served the connection. We demonstrate that the area of a device's possible location is many times larger than the area of the cell tower's VP. To comprise 90% of the possible locations of the devices that may be connected to the cell tower, one has to consider the tower's VP together with the two rings of VPs adjacent to the tower's VP. An additional, third, ring of the adjacent VPs is necessary to comprise 95% of the possible locations of the devices that can be connected to a cell tower. The revealed location uncertainty is in the nature of the MPN structure and service and entails essential overlap between the cell towers' service areas. We discuss the far-reaching consequences of this uncertainty for estimating locational privacy and urban mobility. Our results undermine today's dominant opinion that an adversary, who obtains access to the database of the Call Detail Records maintained by the MPN operator, can identify a mobile device without knowing its number based on a very short sequence of time-stamped field observations of the user's connection.
... LBTU data can be obtained from a mesh of fixed sensors on the road network with unique vehicle feature identification capability. For instance, Automatic number plate recognition systems (ANPR) can trace the unique number plate of the vehicles on the network [11]; Bluetooth MAC Scanners (BMS) can trace the unique MAC-ID of the Bluetooth equipped electronic devices in the car [12]; Radio Frequency Identification Detector (RFID) traces the unique RFID number attached to the vehicle [13]; and Location Area Update from Call Detail Record (CDR) data includes the information for the cell location of the mobile network [14]. The trajectory obtained from the LBTU data is a coarse trajectory, where the intra-movement between two successive updates of the locations is to be estimated. ...
Article
Availability of the big data from Bluetooth MAC Scanners (BMS) over the network provides opportunities to trace the movement of the individual Bluetooth-equipped vehicles on the network. However, BMS might not perfectly detect all the devices within its detection zone. For dense urban networks, the scanner scanning zones can significantly overlap, which complicates the detailed reconstruction of the vehicle trajectory. Addressing the need, this paper proposes a Slit based Trajectory Reconstruction (STATER) algorithm where for each BMS, a slit is defined that considers the overlap and connectivity with other BMS, and thereafter the trajectory is reconstructed considering the shortest path over the observed sequence of slits. A numerical simulation framework is proposed to thoroughly test the proposed STATER algorithm at various levels of ambiguity and randomness in the input dataset. The testing results indicate that the reconstructed trajectories could capture more than 90% (true positive) of the actual path and an average error (false positive) of 11.3% at different randomness levels considered in the experiments. As proof of concept, STATER is applied on one-day data from the entire Brisbane network with 0.56m trips, the computational performance for which supports its practical applicability.
... In the last year, we witnessed an increase in the scientific research contribution concerning the usage of mobile data or call detail records (CDRs) in many applications related to mobility and intelligent transportation systems, such as trajectory reconstruction [1], [2], mobility hubs discovery [3], traffic monitoring [5], mobility episodes detection [4], travel time estimation [6], etc. From this perspective, it is clear that mobile data has a great potential in depicting mobility patterns [8] from a macroscopic perspective and, to some extent, at a microscopic level [7]. However, during the process of manipulating the CDR data and extracting trajectories from it, many researchers noticed that the handover phenomenon, which is the process of transferring a connection from one cell to another without disrupting the session, creates anomalies and misleading information about the movement of the mobile users, which affects the quality of the mobility patterns [9]. ...
Preprint
Call Detail Records (CDRs) coupled with the coverage area locations provide the operator with an incredible amount of information on its customers' whereabouts and movement. Due to the non-static and overlapping nature of the antenna coverage areas, situations commonly arise where cellphones geographically close to each other are connected to different antennas because of the handover rule: the operator hands over a certain cellphone to another antenna to spread the load between antennas. Hence, this aspect introduces a ping-pong handover phenomenon in the trajectories extracted from the CDR data, which can be misleading in understanding the mobility pattern. To reconstruct accurate trajectories, it is necessary to reduce the number of those handovers appearing in the dataset. This letter presents a novel approach for filtering ping-pong handovers from CDR-based trajectories. Primarily, the approach is based on an anchors model utilizing different features and parameters extracted from the coverage areas and reconstructed trajectories mined from the CDR data. Using this methodology, we can significantly reduce the ping-pong handover noise in the trajectories, which gives a more accurate reconstruction of the customers' movement pattern.
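To make the ping-pong pattern concrete, the sketch below applies a deliberately naive time-window rule: a short excursion A, B, A that completes within `max_gap_s` seconds is collapsed back to A. This is an assumed illustration only and is not the anchors model described in the letter; the function name, the event format `(timestamp_s, cell_id)`, and the 60-second threshold are all hypothetical.

```python
def drop_ping_pong(events, max_gap_s=60):
    """Collapse A -> B -> A excursions that complete within `max_gap_s` seconds.

    `events` is a time-sorted list of (timestamp_s, cell_id) tuples.
    """
    cleaned = []
    for t, cell in events:
        is_ping_pong = (
            len(cleaned) >= 2
            and cleaned[-2][1] == cell          # returning to the cell seen two steps ago
            and cleaned[-1][1] != cell          # after a visit to a different cell
            and t - cleaned[-2][0] <= max_gap_s  # and the round trip was short
        )
        if is_ping_pong:
            cleaned.pop()   # drop the brief excursion to the other cell
            continue        # the device is treated as having stayed on the first cell
        cleaned.append((t, cell))
    return cleaned

# Hypothetical trace: a quick A -> B -> A bounce, then a real move to B and C.
trace = [(0, "A"), (20, "B"), (35, "A"), (50, "B"), (400, "C")]
filtered = drop_ping_pong(trace)

# A slow A -> B -> A round trip is kept, since it may be genuine movement.
slow = drop_ping_pong([(0, "A"), (100, "B"), (200, "A")])
```

A fixed time threshold ignores the coverage geometry, which is exactly the kind of information the abstract says the anchors model exploits.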
... Different approaches have been proposed in the literature for reducing positioning errors. One possible direction is using data for positioning the mobile devices within the coverage area, using probabilistic and statistical approaches (Lind et al., 2017a). Although this approach is not capable of determining the exact location, experiments show that, in some cases, it is possible to estimate the location of the mobile device within a couple of meters. ...
Article
In this study, with Estonia as an example, we established an approach based on a Hidden Markov Model to extract large-scale commuting patterns at different geographical levels using a massive amount of mobile phone cellular network data, referred to as Call Detail Records (CDRs). The proposed model is designed for reconstructing and transforming the trajectories extracted from the CDR data. This step allowed us to perform origin-destination matrix extraction among different geographical levels, which helped in depicting the commuting patterns. Besides, we introduced different techniques for analyzing commuting at the urban level. Our results unveiled that there is great potential behind mobile data of the cellular networks after transforming it into meaningful mobility patterns, which can easily be used for understanding urban dynamics, large-scale daily commuting and mobility. The aggressive development and growth of ubiquitous mobile sensing have generated valuable data that can be used with our approach for providing answers and solutions to the growing problems of transportation, urbanization and sustainability.
Article
In this study, with Singapore as an example, we demonstrate how we can use mobile phone call detail record (CDR) data, which contains millions of anonymous users, to extract individual mobility networks comparable to the activity-based approach. Such an approach is widely used in the transportation planning practice to develop urban micro simulations of individual daily activities and travel; yet it depends highly on detailed travel survey data to capture individual activity-based behavior. We provide an innovative data mining framework that synthesizes the state-of-the-art techniques in extracting mobility patterns from raw mobile phone CDR data, and design a pipeline that can translate the massive and passive mobile phone records to meaningful spatial human mobility patterns readily interpretable for urban and transportation planning purposes. With growing ubiquitous mobile sensing, and shrinking labor and fiscal resources in the public sector globally, the method presented in this research can be used as a low-cost alternative for transportation and planning agencies to understand the human activity patterns in cities, and provide targeted plans for future sustainable development.
Article
Mobile cellular networks can serve as ubiquitous sensors for physical mobility. We propose a method to infer vehicle travel times on highways and to detect road congestion in real-time, based solely on anonymized signaling data collected from a mobile cellular network. Most previous studies have considered data generated from mobile devices active in calls, namely Call Detail Records (CDR), an approach that limits the number of observable devices to a small fraction of the whole population. Our approach overcomes this drawback by exploiting the whole set of signaling events generated by both idle and active devices. While idle devices contribute with a large volume of spatially coarse-grained mobility data, active devices provide finer-grained spatial accuracy for a limited subset of devices. The combined use of data from idle and active devices improves congestion detection performance in terms of coverage, accuracy, and timeliness. We apply our method to real mobile signaling data obtained from an operational network during a one-month period on a sample highway segment in the proximity of a European city, and present an extensive validation study based on ground-truth obtained from a rich set of reference datasources—road sensor data, toll data, taxi floating car data, and radio broadcast messages.
Article
In the past decade, large scale mobile phone data have become available for the study of human movement patterns. These data hold an immense promise for understanding human behavior on a vast scale, and with a precision and accuracy never before possible with censuses, surveys or other existing data collection techniques. There is already a significant body of literature that has made key inroads into understanding human mobility using this exciting new data source, and there have been several different measures of mobility used. However, existing mobile phone based mobility measures are inconsistent, inaccurate, and confounded with social characteristics of local context. New measures would best be developed immediately as they will influence future studies of mobility using mobile phone data. In this article, we do exactly this. We discuss problems with existing mobile phone based measures of mobility and describe new methods for measuring mobility that address these concerns. Our measures of mobility, which incorporate both mobile phone records and detailed GIS data, are designed to address the spatial nature of human mobility, to remain independent of social characteristics of context, and to be comparable across geographic regions and time. We also contribute a discussion of the variety of uses for these new measures in developing a better understanding of how human mobility influences micro-level human behaviors and well-being, and macro-level social organization and change.
Conference Paper
Kalman filtering is widely used in target tracking. However, conventional Kalman filtering may fail to track the target when it accelerates, decelerates, or turns. In this paper, these maneuvers are characterized by two orthogonal components of the changed velocity in the ℓ1-norm. The adaptive factors to adjust the Kalman gain are then generated through a mapping function based on the characterization. Sea experiment results show that the proposed adaptive Kalman filtering is better at tracking the maneuvering target.
Conference Paper
Localization in Wireless Sensor Networks (WSNs) denotes the procedure by which a single sensor node determines its geographical position in space. As these nodes are limited in computational power, battery lifetime and communication range, efficient localization algorithms are required, which is an ongoing research topic. Nearly all algorithms are based on the usage of seed nodes which are aware of their location and help other nodes approximate their own position. In this paper we extend an existing Monte Carlo particle filter approach (Monte Carlo Localization, MCL) to account for situations where the degree of seed nodes is low, i.e. the location estimation of a node cannot be updated. For this purpose we make use of comparatively cheap sensors to determine the movement direction and velocity of a node. With this information we can update a node's most recent position estimate even in the absence of seed nodes. We simulate our approach and compare our results to the originally proposed algorithm, MCL.
Article
With the impressive development of wireless devices and the growth in mobile users, telecommunication operators are eager to understand the characteristics of mobile network behavior. Based on the big data generated in the telecommunication networks, telecommunication operators are able to obtain substantial insights by using big data analysis and computing techniques. This paper introduces the important aspects of this topic, including data set information, data analysis techniques and two case studies. We categorize the data sets in the telecommunication networks into two types, user-oriented and network-oriented, and discuss the potential applications. Then several important data analysis techniques are summarized and reviewed, from temporal and spatial analysis to data mining and statistical tests. Finally, we present two case studies, using Erlang measurements and call detail records, respectively, to understand the base station behavior. Interestingly, the 'Night Burst' phenomenon of college students is revealed by comparing the base station locations with a real-world map, and we conclude that it is not proper to model voice call arrivals as a Poisson process.
Conference Paper
Continuous personal position information has been attracting attention in a variety of service and research areas. In recent years, many studies have applied the telecommunication histories of mobile phones (CDRs: call detail records) to position acquisition. Although large-scale and long-term data are accumulated from CDRs through everyday use of mobile phones, the spatial resolution of CDRs is lower than that of existing positioning technologies. Therefore, interpolating spatiotemporal positions of such sparse CDRs in accordance with human behavior models will facilitate services and research. In this paper, we propose a new method to compensate for CDR drawbacks in tracking positions. We generate as many candidate routes as possible in the spatiotemporal domain using trip patterns interpolated using road and railway networks and select the most likely route from them. Trip patterns are feasible combinations of stay places that are detected from individual location histories in CDRs. The most likely route can be estimated by comparing candidate routes to the CDRs observed during a target day. We also present an assessment of our method using CDRs and GPS logs obtained in an experimental survey.