Article

... In recent decades, social media platforms have become valuable sources of mobility data, offering real-time insights into human movement, transportation preferences, and traffic conditions (Zheng et al., 2015). Geo-referenced social media plays a crucial role in ITS by providing dynamic information that improves traffic management and operation, enhances road safety, and optimizes public transit (Torbaghan et al., 2022). ...
... Geotagged posts are social media posts with geographic coordinates, offering precise snapshots of user locations and movements. Platforms like Twitter generate vast geospatial data, valuable for understanding popular destinations, tracing movement patterns, and analyzing urban flow (Torbaghan et al., 2022;Zheng et al., 2015). For ITS, geotagged data helps identify high-traffic areas, optimize routes, and manage crowd dynamics, leading to more efficient traffic management. ...
Preprint
Full-text available
Leveraging recent advances in generative AI, multi-agent systems are increasingly being developed to enhance the functionality and efficiency of smart city applications. This paper explores the transformative potential of large language models (LLMs) and emerging Retrieval-Augmented Generation (RAG) technologies in Intelligent Transportation Systems (ITS), paving the way for innovative solutions to address critical challenges in urban mobility. We begin by providing a comprehensive overview of the current state-of-the-art in mobility data, ITS, and Connected Vehicles (CV) applications. Building on this review, we discuss the rationale behind RAG and examine the opportunities for integrating these Generative AI (GenAI) technologies into the smart mobility sector. We propose a conceptual framework aimed at developing multi-agent systems capable of intelligently and conversationally delivering smart mobility services to urban commuters, transportation operators, and decision-makers. Our approach seeks to foster an autonomous and intelligent approach that (a) promotes science-based advisory to reduce traffic congestion, accidents, and carbon emissions at multiple scales, (b) facilitates public education and engagement in participatory mobility management, and (c) automates specialized transportation management tasks and the development of critical ITS platforms, such as data analytics and interpretation, knowledge representation, and traffic simulations. By integrating LLM and RAG, our approach seeks to overcome the limitations of traditional rule-based multi-agent systems, which rely on fixed knowledge bases and limited reasoning capabilities. This integration paves the way for a more scalable, intuitive, and automated multi-agent paradigm, driving advancements in ITS and urban mobility.
... With Big data, we can think of various innovative services by mining it. There are studies which show that Big data can be helpful in finding better routes, estimating travel delays, and identifying poor road infrastructure [19]. However, the four characteristics of Big data, i.e., volume, velocity, value, and variety, become a bottleneck in integrating it with transportation. ...
... Supervised learning, unsupervised learning, ontology-based, and reinforcement learning methods help mine and analyze the data. AI-empowered Big data analytics assists in developing ITS services like traffic flow prediction, route planning, asset maintenance, traffic anomaly detection, and signal control [15], [19], [44], [45]. ...
Article
Advances in the connected vehicle and cloud computing technologies, Big data, and artificial intelligence techniques have opened new research opportunities. We can integrate them to work out the issues originating from transportation complexities and offer improved services. In this work, we present a seamless multi-module multi-layer vehicular cloud computing system developed using resources of parked vehicles, cloud computing facilities, and vehicular networking technologies. It can offer transportation-specific AI and Big data-empowered services to on-road vehicles. As use cases, we present two innovative and improved services, vehicular Big data mining and vehicular route optimization. A physical testbed is formed to show the feasibility of this work. Results analysis shows that the systems perform better than the standalone systems and servers under different scenarios. Relevant fundamental challenges and future outlooks are also highlighted in this work.
... Such devices often face challenges related to performance, reliability, security, and privacy. Cloud computing, in turn, runs on powerful infrastructure with virtually unlimited resources for storing and processing data [8]. ...
Article
The research work is aimed at studying, developing and optimising a system for integrating diagnostic data with cloud platforms to implement remote monitoring of motor vehicles. The results of this research will not only help to increase the efficiency and availability of vehicle maintenance, but also identify new areas of development in the field of transport technology and cloud technologies. The paper also addresses the issues of data security, information transfer efficiency, and scalability of solutions, which are key to the reliable operation of remote monitoring systems. Ensuring the confidentiality and integrity of data is a top priority, requiring the implementation of advanced encryption and access control methods. The efficiency of information transmission plays a crucial role given the large amount of data coming from vehicles, and the scalability of the systems allows them to adapt to the growing needs of enterprises. Future systems will allow specialized diagnostic methods for troubleshooting to be downloaded from a remote service centre when necessary, and will standardize the functionality and interfaces of on-board vehicle monitoring systems from different manufacturers to reduce the range of test and diagnostic equipment required. In addition, the emphasis is placed on the practical aspects of applying cloud technologies in real-world transportation systems. The practical approach involves analysing specific cases and examples of the use of cloud platforms for monitoring various types of vehicles. In particular, the paper considers the implementation of cloud solutions in road transport companies, railway companies and sea carriers. The conclusions of the paper include recommendations for the implementation and optimisation of cloud-based solutions for vehicle monitoring, which can reduce maintenance costs and improve the safety and efficiency of transport systems.
Combining diagnostic data with cloud-based platforms for remote maintenance is becoming a response to the challenges of the modern automotive industry. The integration of these technological solutions is aimed at improving the quality of service, ensuring operational safety, and reducing maintenance time and costs.
... They have the potential to revolutionize urban transportation systems by enabling seamless communication among vehicles and between vehicles and roadside infrastructure. This real-time exchange of critical information, as illustrated in Fig. 1, can substantially enhance traffic efficiency, road safety, and the overall commuting experience [1], [2]. ...
Article
Named Data Networking (NDN) has proven to be a suitable candidate for Vehicular Ad-hoc Networks (VANETs) because of its data-centric nature and as a worthy replacement for IP addressing, particularly for those with high mobility like VANETs. This has led to the emergence of Vehicular Named Data Networking (VNDN). With the blockchain's ability to ensure immutability, transparency, accountability, and trust, the combination of blockchain and NDN in the VANETs environment has the propensity to alleviate many security issues in VANETs. This paper introduces a blockchain-enabled NDN framework that guarantees a trustworthy and secure data-sharing network in VNDN. Moreover, we utilized an effective reputation mechanism to facilitate a trustworthy and honest data provision in our mobility network. We also adopted a collaborative caching mechanism to improve our system performance, as caching is one of the pivotal reasons for VNDN. We simulated our work using SUMO and ndnSIM and tested our framework against other related systems. The findings show that our proposed approach enhances performance depending on the parameters used.
... SMN BD on social transportation has created previously unheard-of chances to construct the intelligent transportation networks of the future. Zheng et al. [97] examined several strategies, such as data sources, analytical techniques, and application systems that are required for using SMNs BD in social transportation systems. For instance, a real-time monitoring system for traffic incident detection via Twitter stream analysis was proposed by D'Andrea et al. [98]. ...
Article
Full-text available
Social media networks (SMNs) serve as global communication platforms where users can share content, images, and videos as well as post comments, follow friends, and share their thoughts. However, developing countries are lagging behind in understanding the techniques, challenges, and opportunities associated with mining and analytics of SMNs Big Data (BD). The study's objective was to review relevant literature to establish awareness and understanding in developing countries about these techniques, opportunities, and challenges associated with mining and analytics of SMNs BD. A systematic literature review analysis was used to address the study objective. The SMNs BD mining and analytics techniques resulting from the review include, but are not limited to, data mining, value chain technique, infosphere big insights, and SMNs BD sentiment analysis. The categories of challenges discovered on the subject under investigation are process challenges, data challenges, management challenges, and infrastructure challenges. Opportunities discovered during the review include, but are not limited to, business improvements and adjustments, constructing intelligent networks, and customer engagement boosts, among others. Based on the review results, the study proposed SMNs BD management, mining, and analytics steps to guide developing countries in any endeavors aiming at utilizing SMNs BD mining and analytics initiatives.
... In terms of safety applications, any maliciously fed information may lead to false trajectory predictions and wrong estimation of a vehicle's neighboring information in the near future, which could prove quite fatal for passengers traveling in both semiautonomous and fully autonomous vehicles. In terms of non-safety applications, this could not only lead to a considerable delay in the requested services but also expose a user's private data to extreme vulnerabilities (Singh et al. 2016; Zheng et al. 2015). Thus, any secure IoV information collection scheme should adhere to the following requirements so as to guarantee secure Big Data collection: ...
Preprint
Full-text available
The evolution of Big Data in large-scale Internet-of-Vehicles has brought forward unprecedented opportunities for a unified management of the transportation sector, and for devising smart Intelligent Transportation Systems. Nevertheless, such form of frequent heterogeneous data collection between the vehicles and numerous applications platforms via diverse radio access technologies has led to a number of security and privacy attacks, and accordingly demands for a secure data collection in such architectures. In this respect, this chapter is primarily an effort to highlight the said challenge to the readers, and to subsequently propose some security requirements and a basic system model for secure Big Data collection in Internet-of-Vehicles. Open research challenges and future directions have also been deliberated.
... Rapid economic growth and the surge in vehicle numbers have intensified traffic congestion and parking challenges in urban areas globally. To address these challenges, numerous countries have been investing in the development of Intelligent Transportation Systems (ITS), harnessing advances in data collection and mobile computing technologies [1], [2], [3]. Modeling and analyzing spatio-temporal dynamic systems are applicable to various prediction scenarios, and research in this field has received sustained attention over the past few decades [4], [5]. ...
Preprint
Predicting spatio-temporal traffic flow presents significant challenges due to complex interactions between spatial and temporal factors. Existing approaches often address these dimensions in isolation, neglecting their critical interdependencies. In this paper, we introduce the Spatio-Temporal Unitized Model (STUM), a unified framework designed to capture both spatial and temporal dependencies while addressing spatio-temporal heterogeneity through techniques such as distribution alignment and feature fusion. It also ensures both predictive accuracy and computational efficiency. Central to STUM is the Adaptive Spatio-temporal Unitized Cell (ASTUC), which utilizes low-rank matrices to seamlessly store, update, and interact with space, time, as well as their correlations. Our framework is also modular, allowing it to integrate with various spatio-temporal graph neural networks through components such as backbone models, feature extractors, residual fusion blocks, and predictive modules to collectively enhance forecasting outcomes. Experimental results across multiple real-world datasets demonstrate that STUM consistently improves prediction performance with minimal computational cost. These findings are further supported by hyperparameter optimization, pre-training analysis, and result visualization. We provide our source code for reproducibility at https://anonymous.4open.science/r/STUM-E4F0.
... Unstructured data encompasses multimedia (audio and video) and documents (Shorfuzzaman, 2017). The rise of real-time applications, social networking, and IoT contributes to the growing amounts of organizational data (Curry, 2016;Jain, 2017;Zheng et al., 2015;Mourtzis et al., 2016). This surge presents challenges for organizations using traditional systems to manage it (Haddad et al., 2018). ...
... CAV and VANET data are collected from the coordinates, speed, acceleration, and safety data which are used in online vehicle diagnosis, smart charging planning, travel delay reduction, safety performance enhancement, congestion, accident detection, and traffic flow prediction [36,37]. Passive collection data are collected from the matrix of origin/destination, and time of travel which are used in real-time congestion avoidance routing, and OD estimation [38,39]. Other sources are collected from the electric and energy consumption, location, and channel data which are used in traffic forecasting, performance, efficiency improvement, and dashboard analysis [31,40]. ...
Article
Full-text available
Accurate and timely forecasting of critical components is pivotal in intelligent transportation systems and traffic management, crucially mitigating congestion and enhancing safety. This paper aims to comprehensively review deep learning algorithms and classical models employed in traffic forecasting. Spanning diverse traffic datasets, the study encompasses various scenarios, offering a nuanced understanding of traffic forecasting methods. Reviewing 111 seminal research works since the 1980s, encompassing both deep learning and classical models, the paper begins by detailing the data sources utilized in transportation systems. Subsequently, it delves into the theoretical underpinnings of prevalent deep learning algorithms and classical models prevalent in traffic forecasting. Furthermore, it investigates the application of these algorithms and models in forecasting key traffic characteristics, informed by their utility in transport and traffic analyses. Finally, the study elucidates the merits and drawbacks of proposed models through applied research in traffic forecasting. Findings indicate that while deep learning algorithms and classic models serve as valuable tools, their suitability varies across contexts, necessitating careful consideration in future studies. The study underscores research opportunities in road traffic forecasting, providing a comprehensive guide for future endeavors in this domain.
... SmartApps cater to the modern traveler's demand for convenience and personalization (Kapiki, 2021). They aggregate data from various sources, including user-generated content, social media, and travel databases, to offer relevant and timely information (Zheng et al., 2015). Key features of SmartApps include itinerary planning, language translation, augmented reality for sightseeing, instant booking services for flights, buses, ferries, and hotels, and comprehensive mapping services. ...
Conference Paper
Full-text available
This study aims to formulate a strategic plan to enhance tourists' behavioral intention towards visiting Aceh through the utilization of smart tourism applications. The research employs SWOT Analysis, encompassing Strengths, Weaknesses, Opportunities, and Threats, combined with Internal Factor Evaluation (IFE) and External Factor Evaluation (EFE) matrices to systematically assess the internal and external factors impacting smart tourism development in Aceh. The analysis results position Aceh's tourism sector in Quadrant 1 of the TOWS Matrix, indicating a strong strategic position for aggressive growth. Through qualitative and quantitative data collection methods, the study identifies key internal strengths such as unique cultural heritage and natural beauty, alongside weaknesses like limited technological infrastructure and language barriers. Externally, opportunities include growing global interest in smart tourism and technological advancements, while threats consist of regional instability and competition from other tourist destinations. The IFE and EFE matrices are used to score and prioritize these factors, providing a comprehensive view of the strategic positioning of Aceh's tourism sector. The findings indicate that leveraging smart applications can significantly enhance tourist experiences by providing real-time information, personalized recommendations, and interactive guides. Strategic initiatives proposed include improving digital infrastructure, creating multilingual content, and promoting collaboration between local authorities and technology providers. In conclusion, the study provides actionable insights and a strategic roadmap for stakeholders in Aceh's tourism industry. By addressing identified weaknesses and threats while capitalizing on strengths and opportunities, the adoption of smart tourism applications is poised to boost tourist engagement and satisfaction, ultimately increasing their behavioral intention to visit Aceh. 
The positioning in Quadrant 1 highlights the potential for aggressive strategies to maximize these advantages and achieve significant growth in the tourism sector.
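The IFE and EFE scoring mentioned in this abstract is conventionally a weighted sum: each factor receives a weight (weights summing to 1) and a rating, and the matrix score is the sum of weight times rating. The following is a minimal sketch of that arithmetic only. The function name `weighted_score`, the weights, and the ratings are hypothetical and invented for illustration; they are not values reported by the study.

```python
def weighted_score(factors):
    """Return the sum of weight * rating over (name, weight, rating) tuples.

    Raises ValueError if the weights do not sum to 1.0.
    """
    total_weight = sum(weight for _, weight, _ in factors)
    if abs(total_weight - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1.0")
    return sum(weight * rating for _, weight, rating in factors)


# Hypothetical internal factors: names echo the abstract, numbers are invented.
internal_factors = [
    ("unique cultural heritage", 0.30, 4),
    ("natural beauty", 0.30, 4),
    ("limited technological infrastructure", 0.25, 2),
    ("language barriers", 0.15, 1),
]

ife_score = weighted_score(internal_factors)
print(round(ife_score, 2))
```

The same function could be applied to a list of external factors to obtain an EFE score; how the two scores map to a TOWS quadrant depends on the convention the authors adopted, which the abstract does not spell out.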
... In the past few years, spurred by the rapid advancement of intelligent transportation systems [1] and the widespread availability of diverse data sources [2] such as GPS trajectories, traffic cameras, and mobile applications [3], there has been an increasing demand for advanced traffic prediction models that can effectively utilize these data for accurate predictions. At the same time, real-time changes in dynamic transportation networks [4,5] are affected by many factors, such as time, season, and weather. ...
Preprint
Accurate traffic flow prediction can assist in traffic management, route planning, and congestion mitigation, which holds significant importance in enhancing the efficiency and reliability of intelligent transportation systems (ITS). However, existing traffic flow prediction models suffer from limitations in capturing the complex spatial-temporal dependencies within traffic networks. In order to address this issue, this study proposes a multi-segment fusion tensor graph convolutional network (MS-FTGCN) for traffic flow prediction with the following three-fold ideas: a) building a unified spatial-temporal graph convolutional framework based on Tensor M-product, which captures the spatial-temporal patterns simultaneously; b) incorporating hourly, daily, and weekly components to model multi-temporal properties of traffic flows, respectively; c) fusing the outputs of the three components by attention mechanism to obtain the final traffic flow prediction results. The results of experiments conducted on two traffic flow datasets demonstrate that the proposed MS-FTGCN outperforms the state-of-the-art models.
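Step (c) of this abstract, fusing the hourly, daily, and weekly outputs by attention, can be pictured as a softmax-weighted sum of the three component predictions. The sketch below is an assumption-laden illustration, not the MS-FTGCN implementation: the functions `softmax` and `fuse`, the attention scores, and the flow values are all hypothetical, and in a trained model the scores would presumably be learned rather than fixed.

```python
import math


def softmax(scores):
    """Convert raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(components, scores):
    """Weighted sum of equal-length prediction vectors, one score per component."""
    weights = softmax(scores)
    length = len(components[0])
    return [
        sum(w * comp[i] for w, comp in zip(weights, components))
        for i in range(length)
    ]


# Hypothetical per-component flow predictions for two future time steps.
hourly = [120.0, 130.0]
daily = [110.0, 125.0]
weekly = [100.0, 120.0]

# Equal scores give equal weights, so the result is the plain average.
fused = fuse([hourly, daily, weekly], scores=[0.0, 0.0, 0.0])
print(fused)
```

Raising one component's score shifts the fused prediction toward that component, which is the behaviour an attention-based fusion is meant to provide.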
... Smart cities are a form of modern cities, representing an advanced stage of development, and they have become a new driver for the growth of the urban economy [1]. The industrial brain, as an important component of the smart city framework, refers to a system that utilizes technologies such as AI, big data, and cloud computing to connect up and intelligently manage the industrial chain [2]. The core of the industrial brain lies in achieving deep data mining and value transformation through technological means, promoting industrial upgrading and innovation. ...
Article
Full-text available
This article explores the mechanism for constructing and the path for implementing an industrial brain in the development of smart cities, with a focus on the case of the Yiwu knitting industry platform in China. Accordingly, our study involved a literature review, questionnaire survey, data analysis, qualitative comparative analysis (QCA), and discussion. Our key finding was that the manufacturing brain evolves in three distinct stages: platform creation, growth, and expansion. The mechanisms of implementing these are functional development, trust creation, and value co-creation, respectively. Specifically, functional development marks the commencement of the industrial brain’s construction, which involves enterprise demand analysis, capability bottleneck identification, data value formation, and platform architecture simplification. Trust building serves as the central mechanism of evolving the manufacturing brain, comprising institutional, relational, and computational trust. Lastly, value co-creation proceeds, which is pivotal for a business paradigm revolution, encompassing connection, linkage, and integration. The main theoretical contribution of this article is to propose a normative analytical framework for revealing the mechanism of construction and the path of implementation of industrial platforms in smart city development. Meanwhile, in its practical contribution, this article provides policy guidance, as developed through our analysis of how an industrial platform can promote the transformation and upgrading of the urban manufacturing industry, to realize smart city construction and the economy and society’s coordinated development.
... tion in recent decades. These studies have been applied to various domains, including healthcare [21]-[23], transportation [24], [25], sports [26], and manufacturing [19], [20], [27]. Among these visual analytics approaches for time-series data, the visualization of multivariate temporal data is most relevant to HPPP [28], [29]. ...
Article
Full-text available
Efficient monitoring of production performance is crucial for ensuring safe operations and enhancing the economic benefits of the Iron and Steel Corporation. Although basic modeling algorithms and visualization diagrams are available in many scientific platforms and industrial applications, there is still a lack of customized research in production performance monitoring. Therefore, this article proposes an interactive visual analytics approach for monitoring the heavy-plate production process (iHPPPVis). Specifically, a multicategory aggregated monitoring framework is proposed to facilitate production performance monitoring under varying working conditions. In addition, a set of visualizations and interactions is designed to enhance analysts’ analysis, identification, and perception of abnormal production performance in heavy-plate production data. Ultimately, the efficacy and practicality of iHPPPVis are demonstrated through multiple evaluations.
... This comprehensive review undertakes an examination of pivotal scholarly contributions and prevalent trajectories within the domain of traffic flow prediction, centering its attention on the utilization of deep learning methodologies. Within this context, a multitude of strategies has been propounded by researchers to substantiate and assess the veracity of traffic prognostications [14,15,16]. Traffic flow prediction models are commonly categorized into two primary classes: parametric and non-parametric models [17]. ...
Article
Full-text available
Efficient traffic flow prediction is paramount in modern urban transportation management, contributing significantly to energy efficiency and overall sustainability. Traditional traffic prediction models often struggle in complex urban traffic networks, especially at multi-intersection junctions. In response to this challenge, this research paper presents a pioneering approach that not only enhances traffic flow prediction accuracy but also indirectly supports energy efficiency. This study leverages deep learning techniques, specifically the Gated Recurrent Unit (GRU), to analyze traffic patterns simultaneously at multiple intersections within a city. By treating the entire traffic network as a distributed system, the model provides real-time predictions, allowing for better traffic management and reduced fuel consumption. Moreover, the incorporation of data fusion techniques, which integrate data from various sources, including traffic sensors and historical traffic information, bolsters the accuracy and robustness of predictions. By predicting traffic flows with precision, this research aids in optimizing traffic signal timing, reducing congestion, and ultimately promoting more efficient transportation systems, which, in turn, reduces fuel wastage and emissions. This study, therefore, advances intelligent transportation systems and offers a promising pathway toward improved energy efficiency in urban mobility.
... Businesses may more effectively personalize their services and solutions to each client profile and increase return on investment by analyzing this data. (Zheng et al., 2015) Fraud Detection: Big data has revolutionized the methods and technologies used to detect fraud. In the past, fraud detection methods have been based on linear analytics that use simple algorithms to identify potential fraud. ...
Article
Full-text available
Innovative and sophisticated technologies have been rapidly developing in recent years. These cutting-edge advancements encompass a wide spectrum of devices like mobile phones, PCs and social media trackers. As a consequence of their widespread usage, these technologies have engendered the generation of vast volumes of unstructured data in diverse formats, spanning terabytes (TB) to petabytes (PB). This vast and varied data is called big data. It holds great promise for both public and private industries. Many organizations utilize big data to uncover useful insights, whether for marketing choices, monitoring specific actions, or identifying potential threats. This kind of data processing is made possible using different methods known as Big Data Analytics. It allows significant advantages to be gained by quickly handling large amounts of unorganized, organized, and partially organized information, which would be impossible with traditional database techniques. While big data offers considerable benefits for businesses and decision-makers, it also puts consumers at risk. This risk results from the use of analytics technologies, which need the preservation, administration, and thorough analysis of enormous volumes of data gathered from many sources. Consequently, individuals face the risk of their personal information being compromised as a result of the collection and revelation of behavioral data. Put simply, the excessive accumulation of data may lead to multiple breaches in security and privacy. Scholars from various disciplines are actively engaged in addressing these concerns. The study will concentrate on big data applications, substantial security hurdles, and privacy concerns. We discuss potential methods for enhancing confidentiality and safety in problematic big data scenarios, and we also analyze present security practices.
... ITSs have also brought great interest in developing big data based on the movement of passengers in public transportation stations. Zheng, Xinhu, et al. [13] mentioned that characterizing the overall state of a transportation system is very challenging. Systematic approaches with different features and precisions are required to combine big data in social transportation systems. ...
Article
Full-text available
Robust methods are needed to detect how people are moving in smart public transportation systems. This paper proposes and characterizes effective means to accurately detect passengers. We analyze a public WiFi-based activity recognition (WiAR) dataset to extract human activity features from Channel State Information (CSI) data. To do so, CSI power changes caused by nearby human activity are analyzed. Our method first extracts multi-dimensional features using a Short-Time Fourier Transform (STFT) of CSI data to capture the relevant signal features. Since the environment of a transportation system changes dynamically and non-deterministically, we propose analyzing these changes with a heuristic algorithm that leverages a decision tree to automate a decision-making solution for feature selection. Principal Component Analysis (PCA) is performed before the decision tree algorithm. Reported results are compared with those obtained from the existing methods. Based on these results, we explore the effectiveness of various features such as the chirp rate, delta band power, spectral flux, and frequency of movement. This allows identifying and recommending the most effective features for the explored detection task according to observed variability, information gain, and correlation between features. The reported classification results show that using only the chirp rate generated from CSI information as a feature, we achieve precision = 83%, True Positive (TP)=94%, True Negative (TN)=91% and F1-score = 87%. Considering delta band power as an additional feature adds more information and allows getting higher performance with precision = 100%, TP=97%, TN=95% and F1-score = 95%.
... Considering other works, in [85] the authors used classification, regression, dimensionality reduction, clustering, and density estimation to classify the good, the bad, and the ugly use for cybersecurity and cyber physical systems. On the other hand, the authors in [86] analyzed big data for social transportation. The authors concluded that social data contain abundant information and evolve with time. ...
Chapter
A massive amount of data is generated at an ever-increasing rate. Social media, mobile phones, sensors, and medical imaging, among others, are examples of data sources. An important characteristic of the data generated by these sources is that the data is commonly either unstructured or semi-structured. Big data analytics comprises software systems that are able to analyze vast amounts of data to uncover information such as patterns and correlations that help decision-makers in making better decisions. Traditional approaches such as data warehousing and the use of a classic relational database management system (RDBMS) have become impractical to analyze such unstructured and semi-structured data. On the other hand, machine learning (ML) algorithms have proven to be successful in analyzing such vast amounts of data. In this chapter, we present some of the most widely used ML algorithms in big data analytics as well as the distributed platforms typically employed for processing the data. We also present a selection of three important application domains where ML algorithms have been applied to perform big data analytics. These application domains include healthcare, weather forecasting, and social networking. Finally, we review relevant approaches used in each domain area, the most commonly used ML algorithms per area, and specific domain area issues that need further research in big data analytics.
... The analysis of big data from urban sources has been applied in many works oriented towards characterizing the mobility behavior of citizens, as well as to evaluate the efficacy of the service provided by public transportation systems [23,24]. The application of big data analysis for studying public transportation systems has been presented in reviews by Zheng et al. [25] and Welch et al. [26]. ...
Article
Full-text available
In this article, we introduce a model based on big data analysis to characterize the travel times of buses in public transportation systems. Travel time is a critical factor in evaluating the accessibility of opportunities and the overall quality of service of public transportation systems. The methodology applies data analysis to compute estimations of the travel time of public transportation buses by leveraging both open-source and private information sources. The approach is evaluated for the public transportation system in Montevideo, Uruguay using information about bus stop locations, bus routes, vehicle locations, ticket sales, and timetables. The estimated travel times from the proposed methodology are compared with the scheduled timetables, and relevant indicators are computed based on the findings. The most relevant quantitative results indicate a reasonably good level of punctuality in the public transportation system. Delays were between 10.5% and 13.9% during rush hours and between 8.5% and 13.7% during non-peak hours. Delays were similarly distributed for working days and weekends. In terms of speed, the results show that the average operational speed is close to 18 km/h, with short local lines exhibiting greater variability in their speed.
... In recent times, research in several domains is heavily focusing on large-scale data analysis utilizing sophisticated computing capabilities and machine learning (Golbeck et al. 2011; Zheng et al. 2015; Erickson et al. 2017; Nguyen et al. 2018; Chakraborty et al. 2020, 2021a, 2022). However, causal analysis of data for prediction purposes has received limited attention in the literature. ...
Article
Full-text available
With the growing spread of COVID-19 in the U.S., which had the highest number of confirmed cases and deaths in the world as of September 2020, most states in the country have enforced travel restrictions, resulting in sharp reductions in mobility. However, the overall impact and long-term implications of this crisis for travel and mobility remain uncertain. To this end, this study presents an analytical framework that determines and analyzes the most dominant factors impacting human mobility and travel in the U.S. during this pandemic. In particular, the study uses Granger causality to determine the important predictors influencing daily vehicle miles traveled (VMT) and utilizes linear regularization algorithms, including Ridge and LASSO techniques, to model and predict mobility. State-level time-series data were obtained from various open-access sources for the period from March 1, 2020, through June 13, 2020, and the entire data set was divided into two parts for training and testing purposes. The variables selected by Granger causality were used to train three different reduced-order models by ordinary least squares regression, Ridge regression, and LASSO regression algorithms. Finally, the prediction accuracy of the developed models was examined on the test data. The results indicate that factors including the number of new COVID cases, social distancing index, population staying at home, percent of out-of-county trips, trips to different destinations, socioeconomic status, percent of people working from home, and statewide closure, among others, were the most important factors influencing daily VMT. Also, among all the modeling techniques, Ridge regression provided the best performance with the least error, while LASSO regression also performed better than the ordinary least squares model.
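To make the difference between the three regression variants named in this abstract concrete, here is a minimal sketch for a single predictor with an unpenalized intercept. It assumes the objectives "sum of squared errors + alpha * b^2" (Ridge) and "sum of squared errors + alpha * |b|" (LASSO); the function names and data are hypothetical and do not reproduce the study's models or variables.

```python
import math


def centered_sums(x, y):
    """Return the centered cross-product sum and the centered sum of squares of x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy, sxx


def ridge_slope(x, y, alpha):
    """Slope minimizing squared error + alpha * slope**2 (alpha=0 gives OLS)."""
    sxy, sxx = centered_sums(x, y)
    return sxy / (sxx + alpha)


def lasso_slope(x, y, alpha):
    """Slope minimizing squared error + alpha * |slope| (soft-thresholding)."""
    sxy, sxx = centered_sums(x, y)
    shrunk = max(abs(sxy) - alpha / 2, 0.0)
    return math.copysign(shrunk, sxy) / sxx


# Hypothetical data with an exact linear relation y = 2x.
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]
print(ridge_slope(x, y, alpha=0))    # OLS slope: 2.0
print(ridge_slope(x, y, alpha=10))   # shrunk toward zero: 1.0
print(lasso_slope(x, y, alpha=10))   # shrunk by a constant offset: 1.5
print(lasso_slope(x, y, alpha=50))   # penalty large enough to zero the slope: 0.0
```

Ridge shrinks the coefficient proportionally and never sets it exactly to zero, whereas LASSO subtracts a fixed amount and can remove a predictor entirely.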
... Businesses may more effectively personalize their services and solutions to each client profile and increase return on investment by analyzing this data. (Zheng et al., 2015) Fraud Detection: Big data has revolutionized the methods and technologies used to detect fraud. In the past, fraud detection methods have been based on linear analytics that use simple algorithms to identify potential fraud. ...
Article
Full-text available
Innovative and sophisticated technologies have been rapidly developing in recent years. These cutting-edge advancements encompass a wide spectrum of devices like mobile phones, PCs and social media trackers. As a consequence of their widespread usage, these technologies have engendered the generation of vast volumes of unstructured data in diverse formats, spanning terabytes (TB) to petabytes (PB). This vast and varied data is called big data. It holds great promise for both public and private industries. Many organizations utilize big data to uncover useful insights, whether for marketing choices, monitoring specific actions, or identifying potential threats. This kind of data processing is made possible using different methods known as Big Data Analytics. It allows organizations to gain significant advantages by handling large amounts of unorganized, organized, and partially organized information quickly, which would be impossible with traditional database techniques. While big data offers considerable benefits for businesses and decision-makers, it also puts consumers at risk. This risk results from the use of analytics technologies, which need the preservation, administration, and thorough analysis of enormous volumes of data gathered from many sources. Consequently, individuals face the risk of their personal information being compromised as a result of the collection and revelation of behavioral data. Put simply, the excessive accumulation of data may lead to multiple breaches in security and privacy. Big data thus raises concerns pertaining to security and privacy, and scholars from various disciplines are actively engaged in addressing them. The study will concentrate on large data applications, substantial security hurdles, and privacy concerns. We discuss potential methods for enhancing confidentiality and safety in problematic big data scenarios, and we also analyze present security practices.
... tandard text by correcting spelling, grammar, and other errors (Neto et al., 2020;Dirkson et al., 2019). The results of text normalization can help improve the accuracy of collecting correct and consistent words for further analysis. The data obtained from social media was still unstructured which still needed to be improved (H. Zheng et al., 2020;X. Zheng et al., 2015). Several studies had been carried out on social media data in many languages such as Indian (Tanna et al., 2020;Roshini et al., 2019;Kumar et al., 2021) Chinese (Xuanyuan et al., 2021;Liu & Chen, 2019). The conducted research focused on improving the technique of the preprocessing process and the completion of nonstandard and unstructur ...
Article
Full-text available
Twitter is one of the most important data sources in social data analysis. However, the text contained on Twitter is often unstructured, resulting in difficulties in collecting standard words. Therefore, in this research, we analyze Twitter data and normalize text to produce standard words that can be used in social data analysis. The purpose of this research is to improve the quality of data collection on standard words on social media from Twitter and facilitate the analysis of social data that is more accurate and valid. The method used is natural language processing techniques using classification algorithms and text normalization techniques. The result of this study is a set of standard words that can be used for social data analysis with a total of 11430 words, of which 4075 are structural or formal words and 7355 are informal words. Informal words are corrected by trusted sources to create a corpus of formal and informal words obtained from social media tweet data @fullSenyum. The contribution of this research is that the method developed can improve the quality of social data collection from Twitter by ensuring the words used are standard and accurate, and the text normalization method used in this study can be used as a reference for text normalization in other social data, thus facilitating collection and better-quality social data analysis. This research can assist researchers or practitioners in understanding natural language processing techniques and their application in social data analysis. This research is expected to assist in collecting social data more effectively and efficiently.
... It is worth noting that it is common to find research similar to Zheng et al. [2016], where the usage of Big Data on Social Transportation is reviewed, mentioning several topics clearly related to context-awareness, but without even mentioning the word "context". A careful researcher in this area must be aware of this fact because most of the projects related to ITS have some use of computational context. ...
Article
Full-text available
Design and development of context-aware Intelligent Transportation Systems (ITS) are not trivial due to the large number of possible context elements that may be relevant to the application and the lack of structured information to guide system designers in this task. This paper proposes that context elements with common characteristics can be grouped into categories, and these categories can be organized in a taxonomy. This taxonomy could help system designers with the task of modeling and developing new context-aware ITS. We performed a literature review of 68 articles describing 70 ITS applications with context-aware features to identify context elements used in this type of application. Furthermore, we also analyzed three commercial ITS applications. We used data collected from the analysis of these 73 projects to define the categories and identify their relationships. We propose a taxonomy with 79 categories, with 57 leaf categories (a category without children subcategories). We also performed two experiments to validate whether the exposure to this taxonomy could improve the quality of an ITS application during its design, with favorable results showing a 2.7 times increase in the average amount of relevant context elements used in the application. Finally, we compiled a knowledge base of which context element categories are used in the 73 analyzed projects. It is another companion information that can be used to help system designers. The proposed taxonomy of context element categories organizes the information of the context-aware ITS domain in a way that can ease the task of designing such systems and improve the usage of context-aware features. The overall methodology used in this work to create the taxonomy for the ITS domain could be applied to other popular domains of context-aware applications.
... One of the most traditional problems in transportation systems is the management of traffic flow. In [31], the authors apply the concept of Cyber-Physical-Social Systems (CPSS) to achieve signals from both the physical and social spaces. The proposed CPSS-based Transportation System (CTS) is a software-defined transportation system that creates an environment where human factors, transportation systems, and computing technologies are integrated and interact to provide intelligent responses that affect the real world (Fig. 3). ...
Article
Full-text available
The big data concept has been gaining strength over the last few years. With the rise and spread of social media and easy access to information through applications, service providers of all kinds need to collect and analyze data to improve the quality of their services and products. In this regard, the relevance and coverage of this niche of study are evident. It is not a coincidence that governments, supported by companies and startups, are investing in platforms to collect and analyze data, aiming to improve the efficiency of the services provided to citizens. Considering the aforementioned aspects, this work contextualizes the Big Data and ITS (Intelligent Transportation System) concepts by gathering recently published articles, from 2017 to 2021, considering a survey and case studies to demonstrate the importance of those themes today. Within the scope of big data applied to ITS, this study proposes a database for public transportation in the city of Campinas (Brazil), enabling its improvement according to the population's demands. Finally, this study tries to present clearly and objectively the methodology employed with the maximum number of characteristics, applying statistical analyses (box-and-whisker diagrams and Pearson correlation), highlighting the limitations, and expanding the studied concepts to describe the application of an Advanced Traveler Information System (ATIS), a branch of Intelligent Transportation System (ITS), in a real situation. Therefore, besides the survey of the applied concepts, this work develops a specific case study, highlighting the identified deficiencies and proposing solutions. Future works are also contemplated to expand this study and improve the accuracy of the achieved results.
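The Pearson correlation mentioned in this abstract measures the strength of the linear relationship between two variables. A minimal sketch follows; the function name and the sample values (hourly boardings and average waiting time) are hypothetical and are not drawn from the Campinas database.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Hypothetical observations: boardings per hour and average wait in minutes.
boardings = [120, 150, 90, 200, 170]
wait_minutes = [8, 10, 6, 14, 11]
print(round(pearson_r(boardings, wait_minutes), 3))
```

Values near +1 or -1 indicate a strong linear relationship, while values near 0 indicate little linear association; the coefficient says nothing about causation.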
... Existing work leverages social media data to analyze travel behaviors, including activity pattern classification [11], location inference [12], travel activity estimation [13], and longitudinal travel behavior inference [14]. A forecasting model is proposed to predict mode choices according to the check-in information of individual tweets [15]. ...
Article
Full-text available
This paper aims to leverage Twitter data to understand travel mode choices during the pandemic. Tweets related to different travel modes in New York City (NYC) are fetched from Twitter in the two most recent years (January 2020–January 2022). Building on these data, we develop travel mode classifiers, adapted from natural language processing (NLP) models, to determine whether individual tweets are related to some travel mode (subway, bus, bike, taxi/Uber, and private vehicle). Sentiment analysis is performed to understand people’s attitudinal changes about mode choices during the pandemic. Results show that a majority of people had a positive attitude toward buses, bikes, and private vehicles, which is consistent with the phenomenon of many commuters shifting away from subways to buses, bikes and private vehicles during the pandemic. We analyze negative tweets related to travel modes and find that people were worried about those who did not wear masks on subways and buses. Based on users’ demographic information, we conduct regression analysis to analyze what factors affected people’s attitude toward public transit. We find that the attitude of users in the service industry was more easily affected by MTA subway service during the pandemic.
... Or finally, groups of users can be exploited for targeted advertising: advertising companies create targeted ad groups, based on interests, lifestyles, demographics, geolocation or mobility patterns. Similar examples can be found in mobile alerting systems or social transportation systems [9] where mobility groups are utilized for ride-sharing or to react to disruptions of transportation services in real-time. ...
Preprint
The proliferation of smartphone devices has led to the emergence of powerful user services from enabling interactions with friends and business associates to mapping, finding nearby businesses and alerting users in real-time. Moreover, users do not realize that continuously sharing their trajectory data with online systems may end up revealing a great amount of information in terms of their behavior, mobility patterns and social relationships. Thus, addressing these privacy risks is a fundamental challenge. In this work, we present TP^3, a Privacy Protection system for Trajectory analytics. Our contributions are the following: (1) we model a new type of attack, namely 'social link exploitation attack', (2) we utilize the coresets theory, a fast and accurate technique which approximates well the original data using a small data set, and running queries on the coreset produces similar results to the original data, and (3) we employ the Serverless computing paradigm to accommodate a set of privacy operations for achieving high system performance with minimized provisioning costs, while preserving the users' privacy. We have developed these techniques in our TP^3 system that works with state-of-the-art trajectory analytics apps and applies different types of privacy operations. Our detailed experimental evaluation illustrates that our approach is both efficient and practical.
... Big data is a general term that refers to the technologies and techniques for processing and analysing large amounts of data, whether structured, semi-structured, or unstructured [26]. Big data is commonly used in all fields, including transportation [27,28]. The ability to process this data is used for decision making, from big data to analysis and prediction. ...
Article
Full-text available
The need to solve public transport planning challenges using 5G is demanding. In 2019, the world started using 5G technology. Unfortunately, many countries have no equipment that is compatible with 5G infrastructures. There are two main deployment options for countries willing to accept 5G. They can directly venture to install relatively expensive infrastructure, called 5G SA (standalone access). However, more countries use the 5G NSA (non-standalone access) alternative, a 5G network supported by existing 4G infrastructure. One of the considerations for choosing NSA 5G is that it still performs 4G equalisation in its area. The data throughput is faster but still uses the leading 4G network. Interestingly, there are three types of 5G: low-band (sub-6), middle-band (sub-6), and high-band (millimetre-wave (mmWave)). The problem is determining the kind of 5G needed for public transport planning. Meanwhile, mobile network big data (MNBD) requires robust and stable internet access, with broad coverage in real time. MNBD movement includes the movement of people and vehicles, as well as logistics. GPS and internet connections track the activity of private vehicles and public transportation. The difference between mmWave and sub-6 5G can complement transportation planning needs. The density and height of buildings in urban areas and the affordability of the range of the connections determine 5G. This study examines the literature on 5G and then, using the bibliographic method, matches the network coverage obtained in Indonesia using nPerf data services. According to the data, urban areas are becoming more densely populated. Thus, this could show the differences in the data quality outside of metropolitan areas. This study also discusses the current conditions in terms of market potential and the development of smart cities and provides an overview of how real-time mobile data can support public transport planning. 
This article provides beneficial insight into the stability and adjustment of 5G, where the connectivity can be adequately maintained so that the MNBD can deliver representative data for analysis.
... The effect of online facilities on transport devices is examined in who signified the possibility of GPS furnished phones as widely obtainable web traffic congestion observing methods to offer dependable visitors information instantly (Keskar et al., 2021). Yang et al. (2017) suggested a visitor observing technique contingent on GPS enabled cell devices, applying the comprehensive scope offered by the connected network in numerous cities. Zheng et al. (2015) unveiled a visitor observing technique dependent on virtual transport channels and executed it with a GPS phone program to capitalize on the authenticity of instant traffic congestion approximation and resolve security issues. ...
Article
Full-text available
With increasing urbanization across the world, the demand for smart transportation methods to support everyone, as well as freight, becomes more vital. To tackle the challenges of growing congestion on the roads, big data analytics (BDA) strategies can be used to offer insights for real decision-making and policy design. This study has two primary goals. First, it evaluates academic literature regarding BDA for smart commuter routes programs; next, based upon those studies, it suggests a framework that is effective yet comprehensive in making recommendations to drive down congestion and increase the efficiency of shared transportation systems. The study argues that the suggested framework is solid, versatile, and adaptive enough to be implemented in transportation systems in large cities. Using the framework, the system will be managed centrally, allowing much more efficient transportation across cities. Further studies should be conducted over a long period, and in smaller cities as well, to improve the framework.
Article
Urban commuters in India often face transportation challenges during their daily travels. Traditional feedback methods, such as surveys and hotlines, struggle to scale effectively due to the large population in Indian cities. In this context, social media platforms such as Twitter/X present a practical alternative, where commuters' complaints are often voiced through short, informal posts. These commuter complaints represent various ongoing issues in India's urban transportation sector. Hence they are important for urban planners, policymakers, and transportation authorities to gain real-time insights into public concerns. However, an efficient framework is needed to automatically identify transportation-related concerns from the vast pool of social media posts and then generate a concise summary highlighting the most pressing issues, so that policymakers and authorities can understand the key challenges and respond effectively to them. This study proposes a framework that utilizes generative AI and Natural Language Processing (NLP) to automatically identify and summarize transportation-related complaints from social media posts. To improve the quality of summarization, a novel prompt is developed for systematically summarizing transportation-related concerns and grievances. Findings indicate that this prompt significantly enhances summarization performance with the GPT-4 Turbo LLM. Notably, GPT-4 Turbo using the proposed prompt achieves a ROUGE score of 0.86, surpassing the widely used LexRank algorithm, which scores 0.45.
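ROUGE scores such as those quoted above are based on n-gram overlap between a generated summary and a reference summary. The abstract does not state which ROUGE variant was used, so the sketch below assumes ROUGE-1 (unigram overlap, F-measure) with simple whitespace tokenization; the function name and example sentences are hypothetical.

```python
from collections import Counter


def rouge1_f(candidate, reference):
    """ROUGE-1 F-measure: harmonic mean of unigram precision and recall."""
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())
    # Clipped overlap: each word counts at most as often as it appears in both texts.
    overlap = sum((cand_counts & ref_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


# Hypothetical summaries of commuter complaints.
generated = "buses are late and crowded"
reference = "buses are often late"
print(round(rouge1_f(generated, reference), 3))  # 0.667
```

Production evaluations usually rely on an established ROUGE implementation with stemming and proper tokenization; this sketch only illustrates the overlap idea behind the metric.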
Article
Purpose: The pivotal role of artificial intelligence (AI) technology in industrial upgrading has necessitated an understanding of its evolving competitive landscape and technological trends. This paper proposes a patent data analysis framework based on fine-grained knowledge units, which is applied to provide a landscape analysis of enterprises' technological competitive advantage in the AI field. Design/methodology/approach: AI patent data were collected from the Derwent Innovation Index (DII) database. The competitive patterns of the AI industry are investigated through an analysis of patent applications, regional distributions and a social network analysis of the International Patent Classification (IPC) and Derwent Manual Code (MC). Findings: The study found that China, Japan and the United States are leading in AI technologies in terms of technological prowess and market potential. Emerging sectors include intelligent education, biological identification, intelligence servers, intelligent terminals, big data analysis and information security. Companies like IBM, Panasonic, Microsoft and Google demonstrate unique strategic orientations, with IBM notable for having the highest number of patents and citations. The analysis indicates a growing, concentrated and multi-path development profile for AI patenting, with interorganizational technical flows primarily. Research limitations/implications: The research is limited by the scope of patent data and the methodologies employed, which may not capture the full spectrum of AI technological advancements and competitive dynamics. Future studies should consider incorporating additional data sources for a more comprehensive analysis. Given the particularity of AI technology, elements like open resources deserve special attention.
Practical implications: The study provides actionable strategic recommendations for AI industry stakeholders, emphasizing the importance of focused technological development, international collaboration and strategically designed patent portfolio management. These insights can help stakeholders navigate the complexities of the AI industry and adapt their strategies to fit evolving technological trends. Originality/value: Most previous studies on AI patent data analysis have used a macro (whole) perspective at the country level. However, this study narrows its focus to research from a micro perspective, that is, at the enterprise level. This paper studies the technological competition pattern of enterprises in the AI industry from the perspectives of competitive environment and competitive strength.
Article
Full-text available
Big Data applications have transformed Intelligent Transport Systems (ITS), enabling improvements in traffic management, safety, and efficiency. This study presents a bibliometric analysis and review of the recent advancements of big data applications in ITS. For this bibliometric analysis, the Scopus database was utilized due to its extensive resources. Various tools, such as RStudio, VOSviewer, Excel, and Python, were used to analyze data and identify trends, patterns, and relationships in the selected articles through performance analysis and science mapping. The study examined 447 articles published between 2014 and 2023. The analysis indicates that research in this field has experienced exponential growth annually, although it experienced setbacks in 2020 due to the global pandemic before regaining momentum. While the number of research publications has risen sharply, the slower growth in citations highlights the need for a greater focus on producing higher-quality research. Our investigation revealed that the most significant research efforts focused on traffic flow prediction, traffic anomaly prediction, traffic safety, and the integration of big data with the Internet of Things (IoT) and the Internet of Vehicles (IoV). China, the United States, and Canada were the primary contributors to this field, with China conducting the majority of studies. We summarized and critically reviewed the most cited papers, as well as those that present the most significant innovations in this field. We found that ethical, privacy, and security concerns related to the use of Big Data in ITS have received limited attention. This work aims to serve as a valuable resource for researchers and practitioners, encouraging innovation and the development of more effective and sustainable transportation solutions.
Article
This letter reports the insights gained during a Distributed/Decentralized Hybrid Workshop on Foundation/Infrastructure Intelligence (FII), where we discussed the evolving role of Foundation Models in the field of intelligent vehicles. These models, pre-trained on multimodal data, have emerged as pivotal in the landscape of intelligent vehicles by leveraging their capabilities for high-level reasoning. Ongoing research focuses on these models to further improve scene perception and decision-making, aiming to develop adaptive systems for robot navigation and autonomous driving. However, for smart mobility across the Cyber-Physical-Social space, foundation intelligence should learn human-level knowledge to perform sophisticated interactions and collaborations based on human feedback. Agent-based Foundation Models, as the new training paradigm, can generate cross-domain actions consistent with perception information, paving the way to realize interactive and collaborative agents. This letter discusses the challenges of enhancing and leveraging the scene understanding and spatial reasoning capabilities of the pre-trained foundation model for smart mobility. It also offers insights into the embodied employment of foundation and infrastructure intelligence in enhancing multimodal interactions between robots, environments, and humans.
Conference Paper
Full-text available
In the past, transportation planners and engineers primarily relied on supply management tactics to address people's transportation requirements. However, numerous countries noticed that despite fully utilizing the transportation network, it remained challenging to accommodate the escalating demand. Consequently, there is a growing importance placed on addressing these needs strategically through demand management strategies. Among the leading strategies is the implementation of Intelligent Transportation Systems (ITS) which enables the efficient utilization of existing infrastructure. To ensure the successful implementation of ITS into the existing transportation framework, the use of advanced data collection technologies is an absolute necessity. As cities continue to grow, traffic issues in regular operations and management become increasingly difficult. Solving these issues, enhancing traffic performance, and fruitful planning of future transportation infrastructure requires the incorporation and wide use of modern technology. Numerous smart data collection technologies have already been developed and are being utilized around the world but there still exists an inertia among the general public in adopting these techniques. Proper knowledge about the technologies and their offered benefits can bring about a change in this perspective. This paper aims to provide a comprehensive overview of some of the available data collection technologies and their utilization techniques in ITS that can help in battling imminent transportation issues and aid in modernizing and improving the overall performance of the transportation sector of any country. The organized, structured form of information showcased in this paper will aid the existing literature and help decision-makers in making crucial transportation development conclusions.
Article
As data-driven decision-making becomes prevalent, research needs to provide more evidence to direct user decision-making, particularly concerning transportation systems. In concurrence, ridesharing has been touted to reduce driving-while-intoxicated fatalities, albeit prior studies have provided inconsistent findings. A limitation of prior research on this topic is lacking adequate experimental controls while addressing the impact of potential confounds. This issue may affect potential assumptions and conclusions on whether the deployment of ridesharing services has led to a considerable reduction in driving-while-intoxicated fatalities. The present study leverages statistical modeling to control age, education, vehicle miles traveled, and metropolitan size. It reveals that ridesharing represented a 13.8% decline in driving-while-intoxicated fatalities among youths ages 17-34, but without significantly affecting drivers ages 35-65. Also, the results suggest that city population, vehicle miles traveled, and educational attainment can affect younger adults, whereas the same features were not significant for older adults. Further, the study suggests that the initiation of UberX can serve as a ride-planning option to reduce driving-while-intoxicated fatalities among younger rather than older drivers. Based on the analysis results, multiple implications for transportation platform and software engineering managers are provided, especially in the areas of dispatch algorithms, requirement analysis, and ridesharing security and safety.
Article
Full-text available
Big Data is still gaining attention as a fundamental building block of the Artificial Intelligence and Machine Learning world. Therefore, a lot of effort has been pushed into Big Data research in the last 15 years. The objective of this Systematic Literature Review is to summarize the current state of the art of the previous 15 years of research about Big Data by providing answers to a set of research questions related to the main application domains for Big Data analytics; the significant challenges and limitations researchers have encountered in Big Data analysis, and emerging research trends and future directions in Big Data. The review follows a predefined procedure that automatically searches five well-known digital libraries. After applying the selection criteria to the results, 189 primary studies were identified as relevant, of which 32 were Systematic Literature Reviews. Required information was extracted from the 32 studies and summarized. Our Systematic Literature Review sketched the picture of 15 years of research in Big Data, identifying application domains, challenges, and future directions in this research field. We believe that a substantial amount of work remains to be done to align and seamlessly integrate Big Data into data-driven advanced software solutions of the future.
Conference Paper
Shared mobility is an innovative approach built on sharing community resources rather than on traditional private car use. The concept allows people to use vehicles jointly, reducing the number of privately used cars on the road and making more efficient use of the existing vehicle fleet. We conducted a systematic literature review to understand which factors motivate consumers to use sharing-based services. We reviewed a total of 43 publications and identified three main themes: environmental awareness, economic considerations, and safety. Most studies focused on the younger generation and did not distinguish between new and experienced users of shared mobility. Based on the results, we found that environmental awareness plays a prominent role in consumer decisions in the field of shared mobility. Consumers increasingly strive to use resources more sustainably and are therefore willing to share means of transport with others. Environmental awareness motivates them to reduce car use and to increasingly choose shared solutions. As a direction for further research, however, a more comprehensive quantitative approach could be applied, enabling results based on larger volumes of data.
Preprint
Full-text available
(To appear in a revised form on Journal of Big data - Springer Nature) Big Data is still gaining attention as a fundamental building block of the Artificial Intelligence and Machine Learning world. Therefore, a lot of effort has been pushed into Big Data research in the last 15 years. The objective of this Systematic Literature Review is to summarise the current state of the art of the previous 15 years of research about Big Data by providing answers to a set of research questions related to the main application domains for Big Data analytics; the significant challenges and limitations researchers have encountered in Big Data analysis, and emerging research trends and future directions in Big Data. The review follows a predefined procedure that automatically searches five well-known digital libraries. After applying the selection criteria to the results, 189 primary studies were identified as relevant, of which 32 were Systematic Literature Reviews. Required information was extracted from the 32 studies and summarised. Our Systematic Literature Review sketched the picture of 15 years of research in Big Data, identifying application domains, challenges, and future directions in this research field. We believe that a substantial amount of work remains to be done to align and seamlessly integrate Big Data into data-driven advanced software solutions of the future.
Conference Paper
Intelligent transportation systems (ITS) research is increasingly focusing on big data, as seen in numerous global projects. A lot of data will be generated by intelligent transportation systems. The creation and usage of intelligent transportation systems (ITS) will be significantly impacted by big data, which will lead to their safer use and increase of efficiency. The utilization of inferences drawn using analysis performed on big data in ITS is flourishing. Initial part of the paper includes discussion around parameters of Big Data in the context of ITS. Then a system model for Big Data analytics aimed at ITS along with different entities of different planes is discussed. The different phases of Big Data analytic process are discussed in detail along with existing work done in the domain. The final section of this paper discusses some unresolved issues with analysis of big data in ITS.
Article
Cyber–physical–social connectivity is a key element in intelligent transportation systems (ITSs) due to the ever-increasing interaction between human users and technological systems. Such connectivity translates the ITSs into dynamical systems of socio-technical nature. Exploiting this socio-technical feature to our advantage, we propose a cyber-attack detection scheme for ITSs that focuses on cyber-attacks on freeway traffic infrastructure. The proposed scheme combines two parallel macroscopic traffic model-based partial differential equation (PDE) filters whose output residuals are compared to make decision on attack occurrences. One of the filters utilizes physical (vehicle/infrastructure) sensor data as feedback whereas the other utilizes social data from human users’ mobile devices as feedback. The Social Data-based Filter is aided by a fake data isolator and a social signal processor that translates the social information into usable feedback signals. Mathematical convergence properties are analyzed for the filters using Lyapunov’s stability theory. Finally, we validate our proposed scheme by presenting simulation results.
Article
Despite significant advancements in Multi-Agent Deep Reinforcement Learning (MADRL) approaches for Traffic Light Control (TLC), effectively coordinating agents in diverse traffic environments remains a challenge. Studies in MADRL for TLC often focus on repeatedly constructing the same intersection models with sparse experience. However, real road networks comprise Multi-Type of Intersections (MTIs) rather than being limited to intersections with four directions. In the scenario with MTIs, each type of intersection exhibits a distinctive topology structure and phase set, leading to disparities in the spaces of state and action. This paper introduces Adaptive Multi-agent Deep Mixed Reinforcement Learning (AMDMRL) for addressing tasks with multiple types of intersections in TLC. AMDMRL adopts a two-level hierarchy, where high-level proxies guide low-level agents in decision-making and updating. All proxies are updated by value decomposition to obtain the globally optimal policy. Moreover, the AMDMRL approach incorporates a mixed cooperative mechanism to enhance cooperation among agents, which adopts a mixed encoder to aggregate the information from correlated agents. We conduct comparative experiments involving four traditional and four DRL-based approaches, utilizing three training and four testing datasets. The results indicate that, on the three training datasets, the AMDMRL approach achieves average reductions in travel time of 41% compared to traditional approaches and 16% compared to DRL-based approaches. During testing, the AMDMRL approach exhibits a 37% improvement in reward compared to the MADRL-based approaches.
Article
Summary form only. Includes abstracts of articles presented in this issue of the publication. Also presents a brief editorial.
Chapter
Today, the problem of processing big data in real time is observed not only in unstructured big data but also in dealing with structured data in databases of small businesses and organizations due to the rapid increase in data volume. Traditional methods and approaches are not considered effective to solve the problem. Moreover, most of the modern effective approaches are based on the cooperation of several computers and are costly, so they are not suitable for small organizations. The approach proposed in this chapter aims to effectively process big data in real time, bypassing the shortcomings above. The proposed approach is based on the use of a distributed computing mechanism on a single server. The chapter reveals the architecture of this approach, the functional scheme, the essence of the approach, and the effectiveness of the approach. Moreover, the chapter discusses improving the effectiveness of the approach through machine learning. Experimental results have been obtained based on the approach and compared with the traditional approach.
Article
This paper discusses the recent insights on the Big Data role in the sustainability of smart mobility. Systematic Literature Review is applied to scientific publications web repositories retrieving 2,000+ records (years 2010-2022). 83 selected publications are analyzed and discussed in detail considering methods, tools, pros, cons, solved challenges, and pending limitations. The final picture shows significant attention given to Big Data handling/modeling, while yet there is marginal consideration of how such solutions effectively consider the environmental concerns. These, instead, represent the leading priority for improving and ameliorating the smart mobility system sustainably. In this regard, possible research directions are proposed.
Article
Full-text available
The construction of transportation 5.0 or the so-called society-centered intelligent transportation systems (ITS) has aroused higher requirements for the intelligent sensing capability to seamlessly integrate Cyber-Physical-Social Systems (CPSS). Crowd Sensing Intelligence (CSI), as a promising paradigm, leverages the collective intelligence of heterogeneous sensing resources to gather data and information from CPSS. Our first Distributed/Decentralized Hybrid Workshop on Crowd Sensing Intelligence (DHW-CSI) has been focused on principles and high-level processes of organizing and operating CSI. This letter reports the discussion results of the second DHW-CSI addressing the participants, methods, and stages of CSI for ITS. We categorized sensing participants into three kinds, i.e., biological, digital, and robotic. Then we summarized three methods to enable sensing intelligence, i.e., foundation models, scenarios engineering, and human-oriented operating systems. Finally, we anticipated that the progression of CSI will experience three stages, from algorithmic intelligence to linguistic intelligence, and eventually to imaginative intelligence.
Article
The booming development of the Internet of Vehicles (IoV) has brought new vitality to the construction of intelligent transportation systems (ITS). At the same time, a huge amount of data has been generated as the IoV gradually becomes larger in scale, more complex, and more diversified. These data are owned by the companies the vehicles belong to or by service providers; for example, taxi companies own taxi data. Due to interest and privacy considerations, data owners are not willing to share data, which creates a serious data-island problem that is detrimental to the development of ITS. Therefore, this paper focuses on how to prevent privacy disclosure of vehicles while sharing vehicle data to improve the service. Considering the amount of interactive data and privacy disclosure during data release, vehicle data is abstracted from text form into graph-structured data form. At the same time, graph differential privacy together with anonymity protection is proposed to protect vehicle privacy. Moreover, to address the high complexity of big data graph structure transformation, an Accelerated nodes and edges Combined Graph Differential Privacy algorithm (ACGDP) is proposed. Simulations on real-world data combining electric and non-electric taxis verify that the proposed scheme achieves a tradeoff between information availability and privacy protection. With the graph differential privacy processed data, the proposed scheme reduces the average wasted mileage for charging by 3.87% and achieves a 44.28% increase in drivers’ income. Drivers’ satisfaction with order receiving and charging preference reaches 68% after the graph-structured data reuse.
Article
Full-text available
A cyber-physical system (CPS) is composed of a physical system and its corresponding cyber systems that are tightly fused at all scales and levels. CPS helps improve the controllability, efficiency and reliability of a physical system, such as vehicle collision avoidance and zero-net energy building systems. It has become an active area of R&D and practice in the US, the EU and other countries. In fact, most physical systems and their cyber systems are designed, built and used by human beings in social and natural environments. So, social systems must be of the same importance as their CPSs. The indivisible cyber, physical and social parts constitute the cyber-physical-social system (CPSS), a typical complex system, and controlling and managing it under traditional theories and methods is a challenging problem. An artificial systems, computational experiments and parallel execution (ACP) methodology is introduced, based on which data-driven models are applied to the social system. Artificial systems, i.e., cyber systems, are applied for the equivalent description of the physical-social system (PSS). Computational experiments are applied for control plan validation. Parallel execution then realizes the stepwise control and management of the CPSS. Finally, a CPSS-based intelligent transportation system (ITS) is discussed as a case study, and its architecture, three parts, and application are described in detail.
Conference Paper
Full-text available
Agents in many transportation systems prefer to use the shortest paths, which may not guarantee the optimal transporting efficiency. Given that passengers in some big urban rail transit (URT) systems experience severe congestions, here we developed a congestion avoidance routing model. Based on the Beijing Subway data, the usage patterns of the URT network were comprehensively analyzed under the scenarios of shortest path (SP) routing and minimum cost (MC) routing. We found that MC routing can considerably reduce congestion in the URT network with a tiny increase of travel time. Interestingly, encouraging a small fraction of passengers who experience the most congestion to adopt MC routing achieves nearly the same effect with pure MC routing on mitigating congestion. Hence, a hybrid routing model was proposed to offer practical solutions for alleviating passenger congestion in the URT networks.
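The contrast between shortest path (SP) and minimum cost (MC) routing described above can be illustrated with a small sketch. The abstract does not define the cost function, so the congestion-weighted cost used here (travel time scaled by `1 + alpha * load`), the parameter `alpha`, and the toy network are all assumptions for illustration, not the authors' model.

```python
import heapq

def dijkstra(graph, source, target, cost):
    """Return (total_cost, path) minimizing the sum of cost(edge) from source to target."""
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    visited = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in visited:
            continue
        visited.add(u)
        if u == target:
            break
        for v, edge in graph.get(u, {}).items():
            nd = d + cost(edge)
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if target not in dist:
        return float("inf"), []
    # Walk predecessors back from the target to rebuild the path.
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return dist[target], path[::-1]

# Hypothetical toy network: each edge has a travel time (minutes) and a load
# (fraction of capacity in use). These values are invented for illustration.
graph = {
    "A": {"B": {"time": 5, "load": 0.9}, "C": {"time": 6, "load": 0.1}},
    "B": {"D": {"time": 5, "load": 0.9}},
    "C": {"D": {"time": 6, "load": 0.1}},
}

def sp_cost(edge):
    # Shortest path routing: cost is travel time only.
    return edge["time"]

def mc_cost(edge, alpha=2.0):
    # Assumed congestion-aware cost: travel time inflated by load.
    return edge["time"] * (1 + alpha * edge["load"])

sp_total, sp_path = dijkstra(graph, "A", "D", sp_cost)
mc_total, mc_path = dijkstra(graph, "A", "D", mc_cost)
mc_travel_time = sum(graph[u][v]["time"] for u, v in zip(mc_path, mc_path[1:]))
```

In this toy network the SP route uses the two crowded links, while the MC route takes the lightly loaded detour at the price of two extra minutes, which mirrors the qualitative trade-off the abstract reports.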
Article
Full-text available
In this paper, we present a visual analysis system to explore sparse traffic trajectory data recorded by transportation cells. Such data contains the movements of nearly all moving vehicles on the major roads of a city. Therefore it is very suitable for macro-traffic analysis. However, the vehicle movements are recorded only when they pass through the cells. The exact tracks between two consecutive cells are unknown. To deal with such uncertainties, we first design a local animation, showing the vehicle movements only in the vicinity of cells. Besides, we ignore the micro-behaviors of individual vehicles, and focus on the macro-traffic patterns. We apply existing trajectory aggregation techniques to the dataset, studying cell status pattern and inter-cell flow pattern. Beyond that, we propose to study the correlation between these two patterns with dynamic graph visualization techniques. It allows us to check how traffic congestion on one cell is correlated with traffic flows on neighbouring links, and with route selection in its neighbourhood. Case studies show the effectiveness of our system.
Article
Full-text available
Recent advances in localization techniques have fundamentally enhanced social networking services, allowing users to share their locations and location-related contents, such as geo-tagged photos and notes. We refer to these social networks as location-based social networks (LBSNs). Location data bridges the gap between the physical and digital worlds and enables a deeper understanding of users’ preferences and behavior. This addition of vast geo-spatial datasets has stimulated research into novel recommender systems that seek to facilitate users’ travels and social interactions. In this paper, we offer a systematic review of this research, summarizing the contributions of individual efforts and exploring their relations. We discuss the new properties and challenges that location brings to recommender systems for LBSNs. We present a comprehensive survey analyzing 1) the data source used, 2) the methodology employed to generate a recommendation, and 3) the objective of the recommendation. We propose three taxonomies that partition the recommender systems according to the properties listed above. First, we categorize the recommender systems by the objective of the recommendation, which can include locations, users, activities, or social media. Second, we categorize the recommender systems by the methodologies employed, including content-based, link analysis-based, and collaborative filtering-based methodologies. Third, we categorize the systems by the data sources used, including user profiles, user online histories, and user location histories. For each category, we summarize the goals and contributions of each system and highlight the representative research effort. Further, we provide comparative analysis of the recommender systems within each category. We also discuss the available datasets and the popular methods used to evaluate the performance of recommender systems. Finally, we point out promising research topics for future work.
This article presents a panorama of the recommender systems in location-based social networks with a balanced depth, facilitating research into this important research theme.
Conference Paper
Full-text available
Social networks such as Twitter and Facebook are popular, personal, and real-time in nature. We found that a significant amount of traffic information, such as traffic congestion, incidents, and weather, exists on Twitter. However, an algorithm is needed to extract and classify the traffic information before it is published (re-tweeted) and becomes useful to others. Traffic information was extracted from Twitter using syntactic analysis and then further classified into two categories: point and link. This method can classify 2,942 traffic tweets into the point category with 76.85% accuracy and classify 331 traffic tweets into the link category with 93.23% accuracy. Our system can report traffic information in real time.
Article
Full-text available
Vehicle electrification is envisioned to be a significant component of the forthcoming smart grid. In this paper, a smart grid vision of the electric vehicles for the next 30 years and beyond is presented from six perspectives pertinent to intelligent transportation systems: 1) vehicles; 2) infrastructure; 3) travelers; 4) systems, operations, and scenarios; 5) communications; and 6) social, economic, and political.
Conference Paper
Full-text available
The popularity of location-based social networks provides us with a new platform to understand users' preferences based on their location histories. In this paper, we present a location-based and preference-aware recommender system that offers a particular user a set of venues (such as restaurants) within a geospatial range with the consideration of both: 1) User preferences, which are automatically learned from her location history and 2) Social opinions, which are mined from the location histories of the local experts. This recommender system can facilitate people's travel not only near their living areas but also to a city that is new to them. As a user can only visit a limited number of locations, the user-locations matrix is very sparse, leading to a big challenge to traditional collaborative filtering-based location recommender systems. The problem becomes even more challenging when people travel to a new city. To this end, we propose a novel location recommender system, which consists of two main parts: offline modeling and online recommendation. The offline modeling part models each individual's personal preferences with a weighted category hierarchy (WCH) and infers the expertise of each user in a city with respect to different categories of locations according to their location histories using an iterative learning model. The online recommendation part selects candidate local experts in a geospatial range that matches the user's preferences using a preference-aware candidate selection algorithm and then infers a score of the candidate locations based on the opinions of the selected local experts. Finally, the top-k ranked locations are returned as the recommendations for the user. We evaluated our system with a large-scale real dataset collected from Foursquare. The results confirm that our method offers more effective recommendations than baselines, while remaining efficient at providing location recommendations.
Article
Full-text available
Massive amounts of movement data, such as daily trips made by millions of passengers in a city, are widely available nowadays. They are a highly valuable means not only for unveiling human mobility patterns, but also for assisting transportation planning, in particular for metropolises around the world. In this paper, we focus on a novel aspect of visualizing and analyzing massive movement data, i.e., the interchange pattern, aiming at revealing passenger redistribution in a traffic network. We first formulate a new model of circos figure, namely the interchange circos diagram, to present interchange patterns at a junction node in a bundled fashion, and optimize the color assignments to respect the connections within and between junction nodes. Based on this, we develop a family of visual analysis techniques to help users interactively study interchange patterns in a spatiotemporal manner: 1) multi-spatial scales: from network junctions such as train stations to people flow across and between larger spatial areas; and 2) temporal changes of patterns from different times of the day. Our techniques have been applied to real movement data consisting of hundreds of thousands of trips, and we also present two case studies on how transportation experts worked with our interface.
Article
Full-text available
In this work, we present an interactive system for visual analysis of urban traffic congestion based on GPS trajectories. For these trajectories we develop strategies to extract and derive traffic jam information. After cleaning the trajectories, they are matched to a road network. Subsequently, traffic speed on each road segment is computed and traffic jam events are automatically detected. Spatially and temporally related events are concatenated in, so-called, traffic jam propagation graphs. These graphs form a high-level description of a traffic jam and its propagation in time and space. Our system provides multiple views for visually exploring and analyzing the traffic condition of a large city as a whole, on the level of propagation graphs, and on road segment level. Case studies with 24 days of taxi GPS trajectories collected in Beijing demonstrate the effectiveness of our system.
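The speed computation and jam detection steps mentioned above can be sketched as follows. The abstract does not state how jams are detected, so the rule used here (mean speed below a fixed fraction of a segment's free-flow speed), the `ratio` parameter, and the sample data are assumptions for illustration; map matching and trajectory cleaning are taken as already done.

```python
from collections import defaultdict

def mean_speeds(samples):
    """Average speed per (segment, time bin).

    samples: iterable of (segment_id, time_bin, speed_kmh) tuples,
    assumed to come from GPS points already matched to road segments.
    """
    sums = defaultdict(lambda: [0.0, 0])
    for seg, tbin, speed in samples:
        acc = sums[(seg, tbin)]
        acc[0] += speed
        acc[1] += 1
    return {key: total / count for key, (total, count) in sums.items()}

def detect_jams(speeds, free_flow, ratio=0.5):
    """Flag (segment, time bin) pairs whose mean speed is below
    ratio * free-flow speed of that segment (an assumed threshold rule)."""
    return sorted(
        key for key, speed in speeds.items()
        if speed < ratio * free_flow[key[0]]
    )

# Hypothetical matched samples and free-flow speeds (km/h).
samples = [
    ("s1", 0, 50.0), ("s1", 0, 40.0),
    ("s1", 1, 10.0), ("s1", 1, 20.0),
    ("s2", 1, 15.0),
]
free_flow = {"s1": 60.0, "s2": 30.0}

speeds = mean_speeds(samples)
jams = detect_jams(speeds, free_flow)
```

Linking flagged events that are adjacent in space and time into propagation graphs, as the abstract describes, would be a further step on top of this output and is not shown here.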
Article
Full-text available
This article presents the Mobile Century field experiment, performed on February 8, 2008, to demonstrate the feasibility of a prototype location-based service: real-time traffic estimation using GPS data from cellular phones only. Mobile Century consisted of 100 vehicles carrying a GPS-equipped Nokia N95 cell phone driving loops on a 10-mile stretch of I-880 between Hayward and Fremont, California. The data obtained in the experiment was processed in real-time and broadcast on the internet for 8 hours. Travel time and velocity contour estimates were shown in real-time using a privacy-preserving architecture developed to provide this new service in an environment acceptable to users and participants. The quality of the data proves to be accurate against data obtained independently from the experiment. The experiment also shows that it is not necessary to have a high proportion of equipped vehicles to obtain accurate results, confirming that GPS-enabled cell phones can realistically be used as traffic sensors, while preserving individuals' privacy.
1. Background
The convergence of sensing, communication and multi-media platforms has enabled a key capability: mobility tracking using GPS. Major cellular phone manufacturers plan to embed GPS receivers in most of their phones in the near future. This trend has major implications for the traffic engineering community. Currently, traffic monitoring is most commonly based on fixed detectors, which provide vehicle counts, roadway occupancy, and often speed. Unfortunately, their high installation and maintenance costs prohibit more widespread deployments, particularly in developing countries. Moreover, the reliability and accuracy of this type of detectors vary. GPS-equipped mobile phones can provide speed and position measurements to the transportation engineering community by leveraging infrastructure deployed by phone manufacturing companies and network providers. Because this technology is market driven, it will penetrate transportation networks at a very rapid pace, soon covering rural areas with a significant impact in developing countries where there is a lack of public traffic monitoring infrastructure. The present study describes a field experiment to assess the feasibility of this new traffic monitoring system based on GPS equipped phones.
Article
Full-text available
Optimizing paths on networks is crucial for many applications, ranging from subway traffic to Internet communication. Because global path optimization that takes account of all path choices simultaneously is computationally hard, most existing routing algorithms optimize paths individually, thus providing suboptimal solutions. We use the physics of interacting polymers and disordered systems to analyze macroscopic properties of generic path optimization problems and derive a simple, principled, generic, and distributed routing algorithm capable of considering all individual path choices simultaneously. We demonstrate the efficacy of the algorithm by applying it to: (i) random graphs resembling Internet overlay networks, (ii) travel on the London Underground network based on Oyster card data, and (iii) the global airport network. Analytically derived macroscopic properties give rise to insightful new routing phenomena, including phase transitions and scaling laws, that facilitate better understanding of the appropriate operational regimes and their limitations, which are difficult to obtain otherwise.
Article
Full-text available
In this paper, we combine the most complete record of daily mobility, based on large-scale mobile phone data, with detailed Geographic Information System (GIS) data, uncovering previously hidden patterns in urban road usage. We find that the major usage of each road segment can be traced to its own, surprisingly few, driver sources. Based on this finding we propose a network of road usage by defining a bipartite network framework, demonstrating that, in contrast to traditional approaches, which define road importance solely by topological measures, the role of a road segment depends on both its betweenness and its degree in the road usage network. Moreover, our ability to pinpoint the few driver sources contributing to the major traffic flow allows us to create a strategy that achieves a significant reduction of the travel time across the entire road system, compared to a benchmark approach.
Article
Full-text available
Crowd-powered search is a new form of search and problem solving scheme that involves collaboration among a potentially large number of voluntary Web users. Human flesh search (HFS), a particular form of crowd-powered search originated in China, has seen tremendous growth since its inception in 2001. HFS presents a valuable test-bed for scientists to validate existing and new theories in social computing, sociology, behavioral sciences, and so forth. In this research, we construct an aggregated HFS group, consisting of the participants and their relationships in a comprehensive set of identified HFS episodes. We study the topological properties and the evolution of the aggregated network and different sub-groups in the network. We also identify the key HFS participants according to a variety of measures. We found that, as compared with other online social networks, HFS participant network shares the power-law degree distribution and small-world property, but with a looser and more distributed organizational structure, leading to the diversity, decentralization, and independence of HFS participants. In addition, the HFS group has been becoming increasingly decentralized. The comparisons of different HFS sub-groups reveal that HFS participants collaborated more often when they conducted the searches in local platforms or the searches requiring a certain level of professional knowledge background. On the contrary, HFS participants did not collaborate much when they performed the search task in national platforms or the searches with general topics that did not require specific information and learning. We also observed that the key HFS information contributors, carriers, and transmitters came from different groups of HFS participants.
Article
Full-text available
Using an algorithm to analyze opportunistically collected mobile phone location data, the authors estimate weekday and weekend travel patterns of a large metropolitan area with high accuracy.
Article
Full-text available
Agent-based traffic management systems can use the autonomy, mobility, and adaptability of mobile agents to deal with dynamic traffic environments. Cloud computing can help such systems cope with the large amounts of storage and computing resources required to use traffic strategy agents and mass transport data effectively. This article reviews the history of the development of traffic control and management systems within the evolving computing paradigm and shows the state of traffic control and management systems based on mobile multiagent technology.
Article
Full-text available
Finding 10 balloons across the U.S. illustrates how the Internet has changed the way we solve highly distributed problems.
Chapter
Timely updates, increased citizen engagement, and more effective marketing are just a few of the reasons transportation agencies have already started to adopt social media networking tools. Best Practices for Transportation Agency Use of Social Media offers real-world advice for planning and implementing social media from leading government practitioners, academic researchers, and industry experts. The book provides an overview of the various social media platforms and tools, with examples of how transportation organizations use each platform. It contains a series of interviews that illustrate what creative agencies are doing to improve service, provide real-time updates, garner valuable information from their customers, and better serve their communities. It reveals powerful lessons learned from various transportation agencies, including a regional airport, city and state departments of transportation, and municipal transit agencies. Filled with examples from transportation organizations, the text provides ideas that can apply to all modes of transportation including mass transit, highways, aviation, ferries, bicycling, and walking. It describes how to measure the impact of your social media presence and also examines advanced uses of social media for obtaining information by involving customers and analyzing their social media use. The book outlines all the resources you will need to maintain a social media presence and describes how to use social media analytical tools to assess service strengths and weaknesses and customer sentiment. Explaining how to overcome the digital divide, language barriers, and accessibility challenges for patrons with disabilities, it provides you with the understanding of the various social media technologies along with the knowhow to determine which one is best for a specific situation and purpose.
Conference Paper
With the ubiquity of mobile communication devices, people experiencing traffic jams share real-time information and interact with each other on social media sites, which provide new channels to monitor, estimate and manage traffic flows. In this paper, we use natural language processing and data mining technologies to extract traffic jam related information from Tianya.cn, analyze the content of people's talk to discover the 'talking point' of people when facing traffic jams, and to provide data support for relevant authorities to make successful and effective decisions for real-time traffic jam response and management.
Article
Studying human movement citywide is important for understanding mobility and transportation patterns. Rather than investigating the trajectories of individuals, we employ an Eulerian approach to analyze the crowd flows among a geographical network and a social network, which are extracted from mobile phone data. We design a suite of visualization techniques to illustrate the dynamic evolutions of the flow over the networks. We contribute the design and implementation of a visual analytics system, which is called Mobility Viewer, that supports situation-aware understanding and visual reasoning of human mobility. We exemplify our approach with a real citywide data set of seven million users in two months.
Article
Transport assessment plays a vital role in urban planning and traffic control, which are influenced by multi-faceted traffic factors involving road infrastructure and traffic flow. Conventional solutions can hardly meet the requirements and expectations of domain experts. In this paper we present a data-driven solution by leveraging a visual analysis system to evaluate the real traffic situations based on taxi trajectory data. A sketch-based visual interface is designed to support dynamic query and visual reasoning of traffic situations within multiple coordinated views. In particular, we propose a novel road-based query model for analysts to interactively conduct evaluation tasks. This model is supported by a bi-directional hash structure, TripHash, which enables real-time responses to the data queries over a huge amount of trajectory data. Case studies with a real taxi GPS trajectory dataset (> 30GB) show that our system performs well for on-demand transport assessment and reasoning.
Article
Intelligent transportation systems (ITS) are becoming a crucial component of our society, and reliable and efficient vehicular communications are a key enabler of a well-functioning ITS. To meet a wide variety of ITS application needs, vehicle-to-vehicle and vehicle-to-infrastructure communications have to be jointly considered, configured, and optimized. The effective and efficient coexistence and cooperation of the two give rise to a dynamic spectrum management problem. One recently emerged and rapidly adopted solution to a similar problem in cellular networks is device-to-device (D2D) communications. Its potential in vehicular scenarios, which pose unique challenges, has however not been thoroughly investigated to date. In this paper, we carry out, for the first time, a feasibility study of D2D for ITS based on both the features of D2D and the nature of vehicular networks. In addition to demonstrating the promising potential of this technology, we also propose novel remedies necessary to make D2D technology practical as well as beneficial for ITS.
Article
Urban transportation is an important factor in energy consumption and pollution, and is of increasing concern due to its complexity and economic significance. Its importance will only increase as urbanization continues around the world. In this article, we explore drivers' refueling behavior in urban areas. Compared to questionnaire-based methods of the past, we propose a complete data-driven system that pushes towards real-time sensing of individual refueling behavior and citywide petrol consumption. Our system provides the following: detection of individual refueling events (REs) from which refueling preference can be analyzed; estimates of gas station wait times from which recommendations can be made; an indication of overall fuel demand from which macroscale economic decisions can be made, and a spatial, temporal, and economic view of urban refueling characteristics. For individual behavior, we use reported trajectories from a fleet of GPS-equipped taxicabs to detect gas station visits. For time spent estimates, to solve the sparsity issue along time and stations, we propose context-aware tensor factorization (CATF), a factorization model that considers a variety of contextual factors (e.g., price, brand, and weather condition) that affect consumers' refueling decision. For fuel demand estimates, we apply a queue model to calculate the overall visits based on the time spent inside the station. We evaluated our system on large-scale and real-world datasets, which contain 4-month trajectories of 32,476 taxicabs, 689 gas stations, and the self-reported refueling details of 8,326 online users. The results show that our system can determine REs with an accuracy of more than 90%, estimate time spent with less than 2 minutes of error, and measure overall visits in the same order of magnitude with the records in the field study.
Article
The participation of a large and varied group of people in the planning process has long been encouraged to increase the effectiveness and acceptability of plans. However, in practice, participation by affected stakeholders has often been limited to small groups, both because of the lack of reach on the part of planners and because of a sense of little or no ownership of the process on the part of citizens. Overcoming these challenges to stakeholder participation is particularly important for any transportation planning process because the success of the system depends primarily on its ability to cater to the requirements and preferences of the people whom the system serves. Crowdsourcing uses the collective wisdom of a crowd to achieve a solution to a problem that affects the crowd. This paper proposes the use of crowdsourcing as a possible mechanism to involve a large group of stakeholders in transportation planning and operations. Multiple case studies show that crowdsourcing was used to collect data from a wide range of stakeholders in transportation projects. Two distinct crowdsourcing usage types are identified: crowdsourcing for collecting normally sparse data on facilities such as bike routes and crowdsourcing for soliciting feedback on transit quality of service and real-time information quality. A final case study exemplifies the use of data quality auditors for ensuring the usability of crowd-sourced data, one of many potential issues in crowdsourcing presented in the paper. These case studies show that crowdsourcing has immense potential to replace or augment traditional ways of collecting data and feedback from a wider group of a transportation system's users without creating an additional financial burden.
Article
Social networks have been recently employed as a source of information for event detection, with particular reference to road traffic congestion and car accidents. In this paper, we present a real-time monitoring system for traffic event detection from Twitter stream analysis. The system fetches tweets from Twitter according to several search criteria; processes tweets, by applying text mining techniques; and finally performs the classification of tweets. The aim is to assign the appropriate class label to each tweet, as related to a traffic event or not. The traffic detection system was employed for real-time monitoring of several areas of the Italian road network, allowing for detection of traffic events almost in real time, often before online traffic news web sites. We employed the support vector machine as a classification model, and we achieved an accuracy value of 95.75% by solving a binary classification problem (traffic versus nontraffic tweets). We were also able to discriminate if traffic is caused by an external event or not, by solving a multiclass classification problem and obtaining an accuracy value of 88.89%.
Article
In this paper, we propose a novel centralized time-division multiple access (TDMA)-based scheduling protocol for practical vehicular networks based on a new weight-factor-based scheduler. A roadside unit (RSU), as a centralized controller, collects the channel state information and the individual information of the communication links within its communication coverage, and it calculates their respective scheduling weight factors, based on which scheduling decisions are made by the RSU. Our proposed scheduling weight factor mainly consists of three parts, i.e., the channel quality factor, the speed factor, and the access category factor. In addition, a resource-reusing mode among multiple vehicle-to-vehicle (V2V) links is permitted if the distances between every two central vehicles of these V2V links are larger than a predefined interference interval. Compared with the existing medium-access-control protocols in vehicular networks, the proposed centralized TDMA-based scheduling protocol can significantly improve the network throughput and can be easily incorporated into practical vehicular networks.
Article
Two microscopic simulation methods are compared for driver behavior: the Gazis-Herman-Rothery (GHR) car-following model and a proposed agent-based neural network model. To analyze individual driver characteristics, a back-propagation neural network is trained with car-following episodes from the data of one driver in the naturalistic driving database to establish action rules for a neural agent driver to follow under perceived traffic conditions during car-following episodes. The GHR car-following model is calibrated with the same data set, using a genetic algorithm. The car-following episodes are carefully extracted and selected for model calibration and training as well as validation of the calibration rules. Performances of the two models are compared, with the results showing that at less than 10-Hz data resolution the neural agent approach outperforms the GHR model significantly and captures individual driver behavior with 95% accuracy in driving trajectory.
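To make the baseline in the abstract above concrete, here is a minimal sketch of the GHR car-following rule in its commonly written form, a = c · v^m · Δv / Δx^l, where Δv is the leader's speed minus the follower's and Δx is the gap between them. The function names, the default parameter values, and the omission of the driver reaction delay are assumptions made for illustration; the paper's calibrated parameters and its neural agent model are not reproduced here.

```python
def ghr_acceleration(v_follower, v_leader, gap, c=1.0, m=0.0, l=1.0):
    """Follower acceleration under the GHR rule.

    a = c * v_follower**m * (v_leader - v_follower) / gap**l
    The parameters c, m and l are illustrative defaults, not calibrated values.
    """
    if gap <= 0:
        raise ValueError("gap must be positive")
    return c * (v_follower ** m) * (v_leader - v_follower) / (gap ** l)


def simulate_follower(leader_speeds, v0, gap0, dt=0.1, **params):
    """Euler-integrate one follower behind a leader with a given speed profile.

    Simplification: the reaction delay of the full model is ignored, so the
    follower responds to the current speed difference and gap.
    """
    v, gap = float(v0), float(gap0)
    speeds = []
    for v_lead in leader_speeds:
        a = ghr_acceleration(v, v_lead, gap, **params)
        gap += (v_lead - v) * dt      # gap grows when the leader is faster
        v = max(0.0, v + a * dt)      # speed cannot become negative
        speeds.append(v)
    return speeds
```

Calibrating such a model, as the abstract describes, amounts to searching for the c, m and l that minimize the mismatch between simulated and recorded follower trajectories.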
Article
Ubiquitous mobile devices, such as smartphones, led to an increased popularity of pedestrian-related routing applications over the past few years. Because pedestrians typically aim to minimize their walking distance, especially in nonrecreational and multimodal trips, pedestrian routing systems will be fully used only if they can find the correct shortest path and thus help to avoid unnecessary detours. The standard equipment of car navigation systems based on the Global Positioning System several years ago led to the availability of accurate street network data for car-based routing applications. However, pedestrian routing applications should consider pedestrian-related network segments besides those used by motorized traffic, including footpaths and pedestrian bridges. The authors of this paper performed a shortest-path analysis of pedestrian routes for cities in Germany and the United States. For a set of 1,000 randomly generated origin-destination pairs, the authors compared the lengths of pedestrian routes that were computed by different freely available network sources, such as OpenStreetMap and TIGER/Line data, and proprietary data sets, such as TomTom, NAVTEQ, and ATKIS. The results showed that freely available data sources such as OpenStreetMap provided a relatively comprehensive option for cities in which commercial pedestrian data sets were not yet available.
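The comparison described in the abstract above can be sketched as running the same shortest-path query on two network data sets and comparing the route lengths per origin-destination pair. The sketch below uses Dijkstra's algorithm on toy graphs; the graph layout, node names, and edge lengths are hypothetical and do not come from OpenStreetMap, TIGER/Line, or any of the proprietary data sets named in the abstract.

```python
import heapq


def shortest_length(graph, source, target):
    """Dijkstra's algorithm; graph maps node -> {neighbor: edge length in metres}.

    Returns float('inf') when the target cannot be reached.
    """
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == target:
            return d
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for nxt, length in graph.get(node, {}).items():
            nd = d + length
            if nd < dist.get(nxt, float("inf")):
                dist[nxt] = nd
                heapq.heappush(heap, (nd, nxt))
    return float("inf")


def compare_sources(graph_a, graph_b, od_pairs):
    """Return, per origin-destination pair, the route length in each data set."""
    return [
        (shortest_length(graph_a, o, d), shortest_length(graph_b, o, d))
        for o, d in od_pairs
    ]


# Hypothetical toy data: the second network adds a footbridge from B to C.
road_only = {"A": {"B": 100, "C": 400}}
with_footpaths = {"A": {"B": 100, "C": 400}, "B": {"C": 100}}
lengths = compare_sources(road_only, with_footpaths, [("A", "C")])
```

In this toy case the pedestrian-aware network yields a 200 m route where the road-only network yields 400 m, which is the kind of detour the study sets out to quantify.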
Article
In this paper, we propose a citywide, real-time model for estimating the travel time of any path (represented as a sequence of connected road segments) in a city, based on the GPS trajectories of vehicles received in the current time slot and over a period of history, as well as map data sources. Though this is a strategically important task in many traffic monitoring and routing systems, the problem has not yet been well solved, given the following three challenges. The first is the data sparsity problem, i.e., many road segments may not be traveled by any GPS-equipped vehicle in the present time slot. In most cases, we cannot find a trajectory exactly traversing a query path either. Second, for the fragment of a path with trajectories, there are multiple ways of using (or combining) the trajectories to estimate the corresponding travel time. Finding an optimal combination is a challenging problem, subject to a tradeoff between the length of a path and the number of trajectories traversing the path (i.e., support). Third, we need to instantly answer users' queries, which may occur in any part of a given city. This calls for an efficient, scalable and effective solution that can enable citywide, real-time travel time estimation. To address these challenges, we model different drivers' travel times on different road segments in different time slots with a three-dimensional tensor. Combined with geospatial, temporal and historical contexts learned from trajectories and map data, we fill in the tensor's missing values through a context-aware tensor decomposition approach. We then devise and prove an objective function to model the aforementioned tradeoff, with which we find the optimal concatenation of trajectories for an estimate through a dynamic programming solution. In addition, we propose using frequent trajectory patterns (mined from historical trajectories) to scale down the candidates for concatenation and a suffix-tree-based index to manage the trajectories received in the present time slot. We evaluate our method based on extensive experiments, using GPS trajectories generated by more than 32,000 taxis over a period of two months. The results demonstrate the effectiveness, efficiency and scalability of our method beyond baseline approaches.
Article
Provides an overview of the technical articles and features presented in this issue.
Article
Data dissemination is a promising application for the vehicular network. Existing data dissemination schemes are generally built upon some random-access protocol, which results in the unavoidable collision problem. To address this problem, in this paper we design a novel data dissemination strategy from the scheduling perspective. A data dissemination scheduling framework is then proposed. In the proposed framework, the main challenge is how best to assign the transmission opportunity to nodes with maximum dissemination utility and to avoid the collision problem. We then propose a novel and practical relay selection strategy and adopt space–time network coding (STNC) with low detection complexity and space–time diversity gain to improve the dissemination efficiency. Compared with random-access dissemination such as CodeOn-Basic and with noncooperative transmission, our proposed data dissemination strategy performs better in terms of dissemination delay. In addition, the proposed strategy works even better in dense networks than in sparse ones, benefiting from the space–time diversity gain of STNC and collision-free transmissions. This is in sharp contrast to the CodeOn-Basic method.
Article
Performance evaluation is considered an important part of unmanned ground vehicle (UGV) development; it helps to discover research problems and improves driving safety. In this paper, a task-specific performance evaluation model of UGVs applied in the Intelligent Vehicle Future Challenge (IVFC) annual competitions is discussed. It is defined in functional levels with a formal evaluation process, including metrics analysis, metrics preprocessing, weights calculation, and the technique for order of preference by similarity to ideal solution (TOPSIS) and fuzzy comprehensive evaluation methods. IVFC 2012 is selected as a case study, and the overall performances of five UGVs are evaluated on specifically analyzed autonomous driving tasks: environment perception, structured on-road driving, unstructured zone driving, and dynamic path planning. The model proved helpful in evaluating UGV performance across the IVFC competition series.
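The ranking step named in the abstract above, the technique for order of preference by similarity to ideal solution (TOPSIS), can be sketched in a few lines. This is a generic textbook version with crisp inputs, not the paper's evaluation model: the fuzzy comprehensive evaluation step is left out, and the function name, weights, and sample scores are hypothetical.

```python
import math


def topsis(matrix, weights, benefit):
    """Return one closeness score in [0, 1] per alternative (higher is better).

    matrix  -- rows are alternatives, columns are criteria
    weights -- one weight per criterion
    benefit -- True where larger values are better, False where smaller are
    """
    n_crit = len(weights)
    # Vector-normalize each column, then apply the criterion weights.
    norms = [
        math.sqrt(sum(row[j] ** 2 for row in matrix)) or 1.0
        for j in range(n_crit)
    ]
    weighted = [
        [weights[j] * row[j] / norms[j] for j in range(n_crit)]
        for row in matrix
    ]
    cols = list(zip(*weighted))
    # Ideal and anti-ideal points, taking the criterion direction into account.
    ideal = [max(c) if benefit[j] else min(c) for j, c in enumerate(cols)]
    anti = [min(c) if benefit[j] else max(c) for j, c in enumerate(cols)]
    scores = []
    for row in weighted:
        d_best = math.dist(row, ideal)
        d_worst = math.dist(row, anti)
        total = d_best + d_worst
        scores.append(d_worst / total if total else 0.5)
    return scores


# Hypothetical example: two vehicles scored on task points (larger is better)
# and number of manual interventions (smaller is better).
scores = topsis([[9, 2], [5, 6]], weights=[0.6, 0.4], benefit=[True, False])
```

Because the first vehicle is better on both criteria, it coincides with the ideal point and receives a closeness of 1.0, while the second receives 0.0.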
Article
Significance: Recent advances in information technologies have increased our participation in “sharing economies,” where applications that allow networked, real-time data exchange facilitate the sharing of living spaces, equipment, or vehicles with others. However, the impact of large-scale sharing on sustainability is not clear, and a framework to assess its benefits quantitatively is missing. For this purpose, we propose the method of shareability networks, which translates spatio-temporal sharing problems into a graph-theoretic framework that provides efficient solutions. Applying this method to a dataset of 150 million taxi trips in New York City, our simulations reveal the vast potential of a new taxi system in which trips are routinely shareable while keeping passenger discomfort low in terms of prolonged travel time.
Article
Presents summary abstracts of the papers in this issue of IEEE Transactions on Intelligent Transportation Systems.
Article
Presents abstracts of the articles included in this issue of IEEE Transactions on Intelligent Transportation Systems.
Conference Paper
The advances in mobile computing and social networking services enable people to probe the dynamics of a city. In this paper, we address the problem of detecting and describing traffic anomalies using crowd sensing with two forms of data, human mobility and social media. Traffic anomalies are caused by accidents, control, protests, sport events, celebrations, disasters and other events. Unlike existing traffic-anomaly-detection methods, we identify anomalies according to drivers' routing behavior on an urban road network. Here, a detected anomaly is represented by a sub-graph of a road network where drivers' routing behaviors significantly differ from their original patterns. We then try to describe the detected anomaly by mining representative terms from the social media that people posted when the anomaly happened. The system for detecting such traffic anomalies can benefit both drivers and transportation authorities, e.g., by notifying drivers approaching an anomaly and suggesting alternative routes, as well as supporting traffic jam diagnosis and dispersal. We evaluate our system with a GPS trajectory dataset generated by over 30,000 taxicabs over a period of 3 months in Beijing, and a dataset of tweets collected from WeiBo, a Twitter-like social site in China. The results demonstrate the effectiveness and efficiency of our system.
Conference Paper
Path prediction is useful in a wide range of applications. Most of the existing solutions, however, are based on eager learning methods where models and patterns are extracted from historical trajectories and then used for future prediction. Since such approaches are committed to a set of statistically significant models or patterns, problems can arise in dynamic environments where the underlying models change quickly or where the regions are not covered with statistically significant models or patterns. We propose a "semi-lazy" approach to path prediction that builds prediction models on the fly using dynamically selected reference trajectories. Such an approach has several advantages. First, the target trajectories to be predicted are known before the models are built, which allows us to construct models that are deemed relevant to the target trajectories. Second, unlike the lazy learning approaches, we use sophisticated learning algorithms to derive accurate prediction models with acceptable delay based on a small number of selected reference trajectories. Finally, our approach can be continuously self-correcting, since we can dynamically reconstruct models if the predicted movements do not match the actual ones. Our prediction model can construct a probabilistic path whose probability of occurrence is larger than a threshold and which is furthest ahead in terms of time. Users can control the confidence of the path prediction by setting a probability threshold. We conducted a comprehensive experimental study on real-world and synthetic datasets to show the effectiveness and efficiency of our approach.
Conference Paper
In the surveillance of road tunnels, video data plays an important role for a detailed inspection and as an input to systems for an automated detection of incidents. In disaster scenarios like major accidents, however, the increased amount of detected incidents may lead to situations where human operators lose a sense of the overall meaning of that data, a problem commonly known as a lack of situation awareness. The primary contribution of this paper is a design study of AlVis, a system designed to increase situation awareness in the surveillance of road tunnels. The design of AlVis is based on a simplified tunnel model which enables an overview of the spatiotemporal development of scenarios in real-time. The visualization explicitly represents the present state, the history, and predictions of potential future developments. Concepts for situation-sensitive prioritization of information ensure scalability from normal operation to major disaster scenarios. The visualization enables an intuitive access to live and historic video for any point in time and space. We illustrate AlVis by means of a scenario and report qualitative feedback by tunnel experts and operators. This feedback suggests that AlVis is suitable to save time in recognizing dangerous situations and helps to maintain an overview in complex disaster scenarios.
Conference Paper
This paper presents a research effort aimed at modeling normal and safety-critical driving behavior in traffic under naturalistic driving data using agent based modeling techniques. Neuro-fuzzy reinforcement learning was used to train the agents. The developed agents were implemented in the VISSIM simulation platform and were evaluated by comparing the behavior of vehicles with and without agent behavior activation. The results showed very close resemblance of the behavior of agents to driver data.
Article
This paper investigates the resource-sharing problem in vehicular networks, including both vehicle-to-vehicle (V2V) and vehicle-to-infrastructure (V2I) communication links. A novel underlaying resource-sharing communication mode for vehicular networks is proposed, in which different V2V and V2I communication links are permitted to access the same resources for their individual data transmission. To solve the resource-sharing problem in vehicular networks, we, for the first time, apply graph theory and propose the following two interference graph-based resource-sharing schemes: 1) the interference-aware graph-based resource-sharing scheme and 2) the interference-classified graph-based resource-sharing scheme. Compared with the traditional orthogonal communication mode in vehicular networks, the two proposed resource-sharing schemes achieve a better network sum rate. The utility of the proposed V2V and V2I underlaying communication mode and of the two proposed interference graph-based resource-sharing schemes is verified by simulations.
Article
Consider real-time exploration of large multidimensional spatiotemporal datasets with billions of entries, each defined by a location, a time, and other attributes. Are certain attributes correlated spatially or temporally? Are there trends or outliers in the data? Answering these questions requires aggregation over arbitrary regions of the domain and attributes of the data. Many relational databases implement the well-known data cube aggregation operation, which in a sense precomputes every possible aggregate query over the database. Data cubes are sometimes assumed to take a prohibitively large amount of space, and to consequently require disk storage. In contrast, we show how to construct a data cube that fits in a modern laptop's main memory, even for billions of entries; we call this data structure a nanocube. We present algorithms to compute and query a nanocube, and show how it can be used to generate well-known visual encodings such as heatmaps, histograms, and parallel coordinate plots. When compared to exact visualizations created by scanning an entire dataset, nanocube plots have bounded screen error across a variety of scales, thanks to a hierarchical structure in space and time. We demonstrate the effectiveness of our technique on a variety of real-world datasets, and present memory, timing, and network bandwidth measurements. We find that the timings for the queries in our examples are dominated by network and user-interaction latencies.
Article
As increasing volumes of urban data are captured and become available, new opportunities arise for data-driven analysis that can lead to improvements in the lives of citizens through evidence-based decision making and policies. In this paper, we focus on a particularly important urban data set: taxi trips. Taxis are valuable sensors, and information associated with taxi trips can provide unprecedented insight into many different aspects of city life, from economic activity and human behavior to mobility patterns. But analyzing these data presents many challenges. The data are complex, containing geographical and temporal components in addition to multiple variables associated with each trip. Consequently, it is hard to specify exploratory queries and to perform comparative analyses (e.g., compare different regions over time). This problem is compounded by the size of the data: there are on average 500,000 taxi trips each day in NYC. We propose a new model that allows users to visually query taxi trips. Besides standard analytics queries, the model supports origin-destination queries that enable the study of mobility across the city. We show that this model is able to express a wide range of spatio-temporal queries, and it is also flexible in that not only can queries be composed but also different aggregations and visual representations can be applied, allowing users to explore and compare results. We have built a scalable system that implements this model and supports interactive response times; makes use of an adaptive level-of-detail rendering strategy to generate clutter-free visualization for large results; and shows hidden details to the users in a summary through the use of overlay heat maps. We present a series of case studies motivated by traffic engineers and economists that show how our model and system enable domain experts to perform tasks that were previously unattainable for them.
Article
Intelligent Transportation Systems (ITS) and their applications are attracting significant attention in research and industry. ITS makes use of various sensing and communication technologies to assist transportation authorities and vehicle drivers in making informed decisions and to provide a leisurely and safe driving experience. Data collection and dispersion are of utmost importance for the proper operation of ITS applications. Numerous standards, architectures and communication protocols have been proposed for ITS applications. However, existing schemes are based on the assumption that vehicles and roadside devices are equipped with sensing and communication capabilities. One of the major gaps of these approaches is their inability to capture events that can easily be logged by drivers using their mobile phones. In this paper, we propose to fill the gap by the use of crowdsourcing in ITS, namely CrowdITS. In CrowdITS, human inputs, along with available sensory data, are collected and communicated to a processing server using mobile phones. The basic idea is to use the crowd with smart mobile phones to enable certain ITS applications without the need for any special sensors or communication devices, either in-vehicle or on-road. Conceptually, the major change is that human inputs are integrated and aggregated with multiple information sources, and the aggregated information is then selectively disseminated based on the driver's geo-location. We describe the design of CrowdITS, report on the development of key ITS applications using Android and iPhone mobile phones, and outline future work in the development of crowdsourcing-based applications for intelligent transportation systems.
Conference Paper
Microblogs are increasingly gaining attention as an important information source in emergency management. Nevertheless, it is still difficult to reuse this information source during emergency situations because of the sheer amount of unstructured data. Especially for small-scale events like car crashes, only small bits of information are available, which complicates the detection of relevant information. We present a solution for real-time identification of small-scale incidents using microblogs, thereby increasing situational awareness by harvesting additional information about incidents. Our approach is a machine learning algorithm combining text classification and semantic enrichment of microblogs. An evaluation shows that our solution enables the identification of small-scale incidents with an accuracy of 89% as well as the detection of all incidents published in real-time Linked Open Government Data.
Article
Background: Crowdsourcing is a novel process of data collection that can provide insight into the effectiveness of acne treatments in real-world settings. Little is known regarding the feasibility of crowdsourcing as a means of collecting dermatology research data, the quality of collected data, and how the data compare to the published literature. Objective: The objective of this analysis is to compare acne data collected from a medical crowdsourcing site with high-quality controlled studies from peer-reviewed medical literature. Methods: Crowdsourced data was collected from 662 online acne patients. Online patients reported data in a Likert-type format to characterize their symptom severity (740 total responses) and their treatment outcomes (958 total responses). The crowdsourced data were compared with meta-analyses and reviews on acne treatment from August 20, 2010 to August 20, 2011. Results: We compared topical, oral systemic, alternative, phototherapy, and physical acne treatments of crowdsourced data to published literature. We focused on topical tretinoin due to the large number of online patient responses. While approximately 80% of tretinoin users observed clinical improvement after a 12-week treatment period in clinical trials, 46% of online users reported improvement in an unspecified time period. For most topical treatments, medication with high efficacy in clinical trials did not produce high effectiveness ratings based on the crowdsourced online data. Conclusion: While limitations exist with the current methods of crowdsourced data collection, with standardization of data collection and use of validated instruments, crowdsourcing will provide an important and valuable platform for collecting high-volume patient data in real-world settings.
Article
The agent computing paradigm is rapidly emerging as one of the powerful technologies for the development of large-scale distributed systems to deal with the uncertainty in a dynamic environment. The domain of traffic and transportation systems is well suited for an agent-based approach because transportation systems are usually geographically distributed in dynamic changing environments. Our literature survey shows that the techniques and methods resulting from the field of agent and multiagent systems have been applied to many aspects of traffic and transportation systems, including modeling and simulation, dynamic routing and congestion management, and intelligent traffic control. This paper examines an agent-based approach and its applications in different modes of transportation, including roadway, railway, and air transportation. This paper also addresses some critical issues in developing agent-based traffic control and management systems, such as interoperability, flexibility, and extendibility. Finally, several future research directions toward the successful deployment of agent technology in traffic and transportation systems are discussed.
Article
Mobile-agent technology has been adopted in many transportation fields to take advantage of different agents in dealing with dynamic changes and uncertainty in traffic environments. However, few research studies have been conducted in urban-transportation systems on deciding what kinds of agents should be used to cope with specific traffic states. With the increasing availability of control and service agents for agent-based urban-transportation systems, an agent recommendation system is necessary to manage and select those agents so that the original objectives can be fulfilled. In this article, the authors address issues related to the creation of such a platform.
Article
The practice of crowdsourcing is transforming the Web and giving rise to a new field.