Figure 4 - uploaded by Patrick Jaillet
The 50 busiest inter-district connections, depending on the time of the day. The width of the lines represents the number of people traveling between the connected districts. Blue lines indicate the number of people taking public transport, and red lines the number of people taking private transport. The highest mode share of private traffic can be observed between Bukit Timah and the city center. Moreover, many weak public transport connections can be found around Bedok and Tampines.
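The ranking described in the caption can be sketched as follows. This is only an illustration, not the authors' pipeline: the function name `busiest_connections`, the trip tuple layout, the mode labels "public" and "private", and the choice to treat connections as undirected are all assumptions made for the example.

```python
from collections import defaultdict


def busiest_connections(trips, k=50):
    """Return the k district pairs with the most trips, with private mode share.

    trips: iterable of (origin, destination, mode) tuples, where mode is
    "public" or "private" (hypothetical labels chosen for this sketch).
    """
    counts = defaultdict(lambda: {"public": 0, "private": 0})
    for origin, destination, mode in trips:
        if origin == destination:
            continue  # keep only inter-district connections
        # Assumption: a connection is undirected, so A->B and B->A are pooled.
        pair = tuple(sorted((origin, destination)))
        counts[pair][mode] += 1

    # Rank pairs by total number of travelers (the line width in the figure).
    ranked = sorted(
        counts.items(),
        key=lambda item: item[1]["public"] + item[1]["private"],
        reverse=True,
    )

    result = []
    for pair, c in ranked[:k]:
        total = c["public"] + c["private"]
        result.append(
            {"pair": pair, "total": total, "private_share": c["private"] / total}
        )
    return result
```

A pair with a high `private_share` and a large `total` would correspond to a thick, mostly red line in the figure.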


Source publication
Article
Full-text available
Securing public transportation ridership is critical for developing a sustainable urban future. However, many modern and growing cities are facing declines in public transport usage. Existing systems for analyzing and identifying weaknesses in public transport connections face major limitations. In cities, origin-destination (OD) matrices--which me...

Citations

... Isaacman et al. (2011) used CDRs at various times of the day to analyze the commuter population and employment space in the New York suburban areas. Holleczek et al. (2014) adopted a data mining approach based on 24-hour phone-based data from Singapore and visualized the country's mobility and connectivity. Lambiotte et al. (Scellato et al., 2011) examined a dataset of mobile phone CDRs from a mobile phone communication network in Belgium. ...
... So, it was difficult to obtain comprehensive results due to limitations of data amount and coverage (Mikkelsen and Christensen, 2009). Nowadays, the emergence of fine-grained spatio-temporal mobility traces (timestamped location information) has extended the studies on human mobility dynamics (Holleczek et al., 2014; Legara and Monterola, 2018). Moreover, it is thanks to well-developed mobile communication technology that one can collect and analyze large volumes of fine-grained data. ...
... Moreover, it is thanks to well-developed mobile communication technology that one can collect and analyze large volumes of fine-grained data. As an example, recent studies using cellular network data or smart card data have allowed capturing commuting patterns of virtually the entire population of travellers (Holleczek et al., 2014; Poonawala et al., 2016; Wang et al., 2018). This was hardly imaginable before. ...
... The rapid growth of information technology and communication (ICT), and Internet of Things (IoT) deployments in recent years has allowed researchers to use mobility information to infer spatiotemporal features. While Global Positioning System (GPS) technology (e.g., social media data or mobile phone data) is the most widely implemented due to its mature technology, simple operation, and low cost, other alternatives such as WiFi hotspot based localization, Global System for Mobiles (GSM) or smart card data have all proven valuable for detecting transport modes and discovering regularities in human mobility (Schuessler and Axhausen, 2009; Holleczek et al., 2014; Legara and Monterola, 2018; Wang et al., 2018). Transport operators and road authorities are no exception to the use of ICT. ...
Article
This study examines the school travel mode of children and youth students (ages 7 to 18) in Singapore. Using a large crowdsensing dataset, the paper focuses on minute-by-minute decision-making of those students living within 2.5km of school. Data-driven methods are employed in order to identify students' chosen transport mode (car, walking, taking bus or riding metro). Furthermore, we present attributes of travel mode alternatives computed by a replicable framework that utilises open sources. New algorithms are developed to identify proxies for walking access and public transport access. We found that about 19% of students in the sample live up to a distance of 2.5km from the school. Of these students' trips, about 45% are made by public transit (e.g., bus and metro), and only 13% are made by walking. The empirical results suggest that the public transport modes of bus and metro are not distinct. Consistent with past research based on traditional survey data, walking time and walking distance are the most influential factors in the decision to walk to school. Interestingly, schools' connectivity to the street network is found to play a key role in the shift from public transport to walking. Likewise, for departures at peak hours, the odds of choosing public transport modes are about 40-45% lower than the odds of walking.
... Big data methods can also be used to evaluate the network's performance by monitoring and analyzing the demand and actual passenger count, delays, and disruptions [14]. Using historical data, patterns of broken connections can, for example, be analyzed to reveal flaws in the timetable and network plans [15]. Duty scheduling can also be optimized. ...
Article
Full-text available
The planning and implementation of public transport involves many data sources. These data sources in turn generate a high volume of data, in a wide variety of formats and data rates. This phenomenon is reinforced by the ongoing digitization of public transport; new data sources have continuously emerged in public transport in recent years and decades. This results in a great potential for the application and utilization of data science methods in public transport. Using big data methods and sources can, or in some cases already does, contribute to a better understanding and the further optimization of public transport networks, public transport service and public transport in general. This paper classifies data sources in the field of public transport and examines systematically for which use cases the data are used or can be used. These steps contribute by structuring ongoing discussions about the application of data science in the public transport domain and illustrate the potential of the application of data science for public transport. We present several use cases in which we applied data science methods, such as machine learning and visualization to public transport data. Several of these projects use data from automated passenger information systems, a data source that has not been widely studied to date. We report our findings for these use cases and discuss the lessons learned, to inform future research on these use cases and discuss their potential. This paper concludes with a summary of the typical problems that occur when dealing with big public transport data and a discussion of solutions for these problems. This discussion identifies future work and topics worth investigating for public transport companies as well as for researchers. 
Working on these topics will, in our opinion, support the improvement of public transport towards the efficiency and attractiveness that is needed for public transport to play its essential role in future sustainable mobility. The application of these methods in public transport requires the collaboration of domain experts with researchers and data scientists, calling for a mutual understanding. This paper also contributes to this understanding by providing an overview of the methods that are already used, potential new use cases, data sources, challenges and possible solutions.
... Other heterogeneous big data such as Global System for Mobile Communications (GSM) data (White and Wells, 2002), high-frequency mobile phone location data (Calabrese et al., 2011), and social media data, e.g., Twitter data (Lee et al., 2015) are rarely explored for the validation purpose of tOD estimation (Liu and Zhou, 2019). There are studies which have loosely coupled smartcard and mobile phone data for the tOD analysis, such as Holleczek et al. (2014), and Regt et al. (2017) but their work doesn't directly build on the tOD estimation or validation problem. Studies such as Gu et al. (2017), where smartphone data is used to detect short activities, can be potentially used to validate the transfer inference algorithm. ...
Article
Full-text available
In public transport, smartcards are primarily used for automatic fare collection purpose, which in turn generate massive data. During the last two decades, a tremendous amount of research has been done to employ this big data for various transport applications from transit planning to real-time operation and control. One of the smart card data applications is the estimation of the public transit origin-destination matrix (tOD). The primary focus of this article is to critically analyse the current literature on essential steps involved in the tOD estimation process. The steps include processes of data cleansing, estimation of unknowns, transfer detection, validation of developed algorithms, and ultimately estimation of zone level transit OD (ztOD). Estimation of unknowns includes boarding and alighting information estimation of passengers. Transfer detection algorithms distinguish between a transfer or an activity between two consecutive boarding and alighting. The findings reveal many unanswered critical research questions which need to be addressed for ztOD estimation using smartcard data. The research questions are primarily related to the conversion of stop level OD (stOD) to ztOD, transfer detection, and a few miscellaneous problems.
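As a rough illustration of the transfer detection step mentioned in this abstract, the sketch below labels the gap between consecutive trip legs of one card as a transfer or an activity. It uses a simple time threshold, which is only one possible heuristic and not necessarily the approach favoured in the review; the function name `label_gaps` and the 30-minute default are assumptions.

```python
from datetime import datetime, timedelta


def label_gaps(legs, max_transfer_gap=timedelta(minutes=30)):
    """Label each gap between consecutive legs as "transfer" or "activity".

    legs: list of (boarding_time, alighting_time) datetime pairs for a single
    smartcard, sorted by boarding time. The threshold is an assumed value.
    """
    labels = []
    for (_, alight), (board, _) in zip(legs, legs[1:]):
        gap = board - alight  # time between alighting and the next boarding
        labels.append("transfer" if gap <= max_transfer_gap else "activity")
    return labels
```

Legs joined by "transfer" gaps would then be merged into one journey before the origin-destination matrix is aggregated. Real systems often add spatial checks (for example, walking distance between stops), which this sketch leaves out.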
... residential areas) to a destination zone (e.g. parks) in Singapore (Holleczek et al., 2014;Jiang et al., 2017). These data are developed from a call digital record, which is collected for the billing purpose by the company and indicates the location of the cell tower that each mobile phone connects to when the phone is used for either a phone call, text messages, or the use of the internet including emails (Holleczek et al., 2014). ...
... The literature demonstrates that these origin-destination matrices data can robustly reveal daily trip patterns of the phone users in comparison to the Singapore Household Interview Travel Survey data (Holleczek et al., 2014;Jiang et al., 2017). The origin and destination zones of the data were defined by the subzone system of the Master Plan 2014 developed by the Urban Redevelopment Authority in Singapore ( Figure 1). ...
Article
Full-text available
Big data have the potential to improve nonmarket valuation, but their application has been scarce. To test this potential, we apply mobile phone data to the zonal travel cost method and measure recreational ecosystem services from Bukit Timah (representing an urban protected area) and Jurong Lake Gardens (an urban recreational park) in Singapore. The study results show that the annual recreational benefits of the recreational park (S$54,698,761 to S$66,805,454) outweighed the benefits of the protected area (S$6,947,974 to S$9,068,027). The count data structure reduced the flexibility of the mobile phone data application. Compared to survey data, however, mobile phone data could prevent random errors and visitor memory biases; monitor impacts of site quality changes over time; count visitors from multiple entrances; and be cost-efficient. Overall, these results highlight the potential of mobile phone data application to improve travel cost analysis. (Free access: https://authors.elsevier.com/a/1afvw14Z6tehIY)
... These approaches are indeed innovative and capture in detail individual travel behaviour, but are limited by their sample sizes (e.g. number of volunteers) and currently face scaling difficulties (Holleczek et al., 2014). Also, a low penetration of smartphones on a global scale and limited access to GPS related information from Telecom Operators because of user privacy policies also hinders this to be an effective mode for calculating travel times going forward. ...
... Some studies using mobile phone data have been done. For instance, Xu, Shaw, Fang, and Yin (2016), Kujala et al. (2016), Holleczek et al. (2014), Calabrese, Di Lorenzo, Lui, and Ratti (2011), and Jarv, Ahas, Saluveer, Derudder, and Witlox (2012) all show that cell phone data can be used to describe people's movement pattern. However, most studies have not focused on the rail in particular, but typically addressing travel in general. ...
Article
Full-text available
Several studies have pointed to the difficulties of obtaining good data on train ridership. This paper is a literature review on how the number of travellers on trains is measured, including technologies and practices for measuring actual ridership. There are a number of publications and practical work done on estimating ridership. We find there are several technologies that can be applied for measuring ridership on trains. The technologies and approaches include (1) Manual counts and surveys, (2) On-board sensors such as door passing, weight, CCTV and Wi-Fi-use, (3) Ticketing systems, ticket sales or ticket validation, and (4) Tracking of travellers for a larger part of the journey, e.g. by mobile phones and payments. Data from on-board sensors and ticketing systems are both managed by public transportation providers. By contrast, surveys, payments statistics and mobile phone data may be available to stakeholders outside the public transportation system, which can be an advantage, as access to ridership data can be an issue for business reasons. Furthermore, mobile phone data appear to be an interesting option, as they can track complete journeys. New technologies, and especially mobile phone data, are therefore of special interest in future uses of ridership data for evaluations and quality assessments.
... However, extracting required information (e.g., travel mode and purpose) from the data captured by smartphone applications is relatively complex. Holleczek et al. (2014) showed that urban mobility patterns and transport mode choices can be derived from mobile phone CDR coupled with public transport data. This public transport dataset consists of trips made by 4.4 million anonymized users of Singapore's public transport system. ...
Article
Several studies have pointed to the difficulties of obtaining good data on train ridership. There are at least two challenges regarding these data. First, train operators consider such data confidential business information, especially in high resolution. Second, the data that actually are available vary in quality and coverage. This paper studies mobile phone data as an alternative measure to obtain data about train ridership. Handset counts were obtained from one telecom operator for selected mobile phone base stations and compared with timetable data and APC. The selected base stations are located so that it is likely that a large share of the mobile phone traffic is generated by train passengers. The number of units connected to a base station is found to correspond relatively well with the trains that pass close to the base stations. A ratio between the handset count and APC data appears promising for using handset counts to calculate train ridership, with ratios around one in the rush hours. We discuss preliminary results as well as methodological and technical challenges. To make sure that we do not violate privacy concerns, the data used in the study have been approved by personal privacy representatives.
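The handset-to-APC ratio mentioned in this abstract can be illustrated with a small sketch. The hourly dictionaries and the function name `handset_apc_ratio` are assumptions made for the example; the abstract does not describe the actual aggregation.

```python
def handset_apc_ratio(handset_counts, apc_counts):
    """Return the handset-count / APC ratio for each hour present in both inputs.

    handset_counts, apc_counts: dicts mapping hour of day to a count
    (a hypothetical data layout for this sketch).
    """
    ratios = {}
    for hour, handsets in handset_counts.items():
        passengers = apc_counts.get(hour)
        if passengers:  # skip hours with a missing or zero APC count
            ratios[hour] = handsets / passengers
    return ratios
```

A ratio close to one in a given hour would mean the handset count tracks the automatic passenger count closely, which is what the abstract reports for the rush hours.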
... We now give an overview of studies comparable to ours: • The works of [Holleczek et al., 2015; Lee and Kam, 2014; Sun et al., 2012] or [Poonawala et al., 2016], produced in Singapore, are based on smart card data or GSM (cellular phone) data and focus only on metro commuter trips, while we recognize different modes of transportation. • [Holleczek et al., 2014]Kam, 2014]). ...
Article
Full-text available
Routing games are one of the most successful domains of game theory. It is well understood that simple dynamics converge to equilibria, whose performance is nearly optimal regardless of the size of the network or the number of agents. These strong theoretical assertions prompt a natural question: How well do these pen-and-paper calculations agree with the reality of everyday traffic routing? We focus on a semantically rich dataset from Singapore's National Science Experiment that captures detailed information about the daily behavior of thousands of Singaporean students. Using this dataset, we can identify the routes as well as the modes of transportation used by the students, e.g. car (driving or being driven to school) versus bus or metro, estimate source and sink destinations (home-school) and trip duration, as well as their mode-dependent available routes. We quantify both the system and individual optimality. Our estimate of the Empirical Price of Anarchy lies between 1.11 and 1.22. Individually, the typical behavior is consistent from day to day and nearly optimal, with low regret for not deviating to alternative paths.
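To make the two quantities in this abstract concrete, the sketch below computes a ratio of observed to optimal total travel cost and a per-traveler regret. It assumes the common definition of the price of anarchy as observed system cost divided by optimal system cost; the paper's exact estimator is not given in the abstract, and both function names are hypothetical.

```python
def empirical_price_of_anarchy(observed_costs, optimal_costs):
    """Ratio of total observed cost to total cost under an optimal assignment.

    Both arguments are sequences of per-traveler costs (e.g. trip durations).
    A value of 1.0 would mean the observed behavior is system-optimal.
    """
    return sum(observed_costs) / sum(optimal_costs)


def regret(chosen_cost, alternative_costs):
    """Cost a traveler could have saved by switching to the best alternative."""
    best = min(alternative_costs, default=chosen_cost)
    return max(0.0, chosen_cost - best)
```

For example, observed durations of 12, 10 and 15 minutes against optimal durations of 10, 10 and 13 minutes give a ratio of 37/33, or roughly 1.12.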
... Furthermore, this does not address the fact that spatially, building energy use changes throughout the day as people go to and from work and home. Future work might attempt to quantify the spatial ebb and flow of people using a combination of surveys, census data, and methods using call detail records to derive home versus work locations as shown in Holleczek et al. (2014) . Building energy use intensity might be modelled by season and diurnally based on factors such as building occupancy, building age, form, and function. ...
Article
Full-text available
A method for directly measuring carbon dioxide (CO2) emissions using a mobile sensor network in cities at fine spatial resolution was developed and tested. First, a compact, mobile system was built using an infrared gas analyzer combined with open-source hardware to control, georeference, and log measurements of CO2 mixing ratios on vehicles (car, bicycles). Second, two measurement campaigns, one in summer and one in winter (heating season) were carried out. Five mobile sensors were deployed within a 1 × 12.7 km transect across the city of Vancouver, BC, Canada. The sensors were operated for 3.5 h on pre-defined routes to map CO2 mixing ratios at street level, which were then averaged to 100 × 100 m grid cells. The averaged CO2 mixing ratios of all grids in the study area were 417.9 ppm in summer and 442.5 ppm in winter. In both campaigns, mixing ratios were highest in the grid cells of the downtown core and along arterial roads and lowest in parks and well vegetated residential areas. Third, an aerodynamic resistance approach to calculating emissions was used to derive CO2 emissions from the gridded CO2 mixing ratio measurements in conjunction with mixing ratios and fluxes collected from a 28 m tall eddy-covariance tower located within the study area. These measured emissions showed a range of -12 to 226 kg CO2 ha⁻¹ h⁻¹ in summer and of -14 to 163 kg CO2 ha⁻¹ h⁻¹ in winter, with an average of 35.1 kg CO2 ha⁻¹ h⁻¹ (summer) and 25.9 kg CO2 ha⁻¹ h⁻¹ (winter). Fourth, an independent emissions inventory was developed for the study area using building energy simulations from a previous study and routinely available traffic counts. The emissions inventory for the same area averaged to 22.06 kg CO2 ha⁻¹ h⁻¹ (summer) and 28.76 kg CO2 ha⁻¹ h⁻¹ (winter) and was used to compare against the measured emissions from the mobile sensor network.
The comparison on a grid-by-grid basis showed linearity between CO2 mixing ratios and the emissions inventory (R² = 0.53 in summer and R² = 0.47 in winter). Also, 87 % (summer) and 94 % (winter) of measured grid cells show a difference within ±1 order of magnitude, and 49 % (summer) and 69 % (winter) show an error of less than a factor 2. Although associated with considerable errors at the individual grid cell level, the study demonstrates a promising method of using a network of mobile sensors and an aerodynamic resistance approach to rapidly map greenhouse gases at high spatial resolution across cities. The method could be improved by longer measurements and a refined calculation of the aerodynamic resistance.
... As can be seen from the literature, several studies have focused on the analysis of OD flows in public transport and have explored the possibility to use mobile phone data to identify the weak links of the road network (i.e., Holleczek et al., 2014) or GPS coordinates to analyze and describe the patterns that characterize people behavior (i.e., Jiang, Yin, & Zhao, 2009;Liu, Kang, Gao, Xiao, & Tian, 2012). Positioning data have been deployed to have information on the current and historical position and predict the place a user will visit (i.e., Noulas, Scellato, Lathia, & Mascolo, 2012) or to analyze travelers' behavior between different origins and destinations of the city based on mobility purposes such as work and shopping (i.e., Liu, Janssens, Wets, & Cools, 2013;Yuan, Raubal, & Liu, 2012). ...
Article
With the increasing use of Intelligent Transport Systems, large amounts of data are created. Innovative information services are introduced and new forms of data are available, which could be used to understand the behaviour of travellers and the dynamics of people flows. This work analyses the requests for real time arrivals of bus routes at stops in London made by travellers using Transport for London's LiveBus Arrivals system. The available dataset consists of about one million requests for real time arrivals for each of the 28 days under observation. These data are analysed for different purposes. LiveBus Arrivals users are classified based on a set of features and using K-Means, Expectation Maximization, Logistic regression, One-level decision tree, Decision Tree, Random Forest, and Support Vector Machine (SVM) by Sequential Minimal Optimization (SMO). The results of the study indicate that the LiveBus Arrivals requests can be classified into six main behaviours. It was found that the classification-based approaches produce better results than the clustering-based ones. The most accurate results were obtained with the SVM-SMO methodology (Precision of 97%). Furthermore, the behaviour within the six classes of users is analysed to better understand how users take advantage of the LiveBus Arrivals service. It was found that 37% of users can be classified as interchange users. This classification could form the basis of a more personalised LiveBus Arrivals application in future, which could support management and planning by revealing how public transport and related services are actually used or update information on commuters.