Conference Paper

Using Web data to enhance traffic situation awareness

Authors:

Abstract

With the ubiquity of mobile communication devices, people experiencing traffic jams share real-time information and interact with each other on social media sites, which provide new channels to monitor, estimate, and manage traffic flows. In this paper, we use natural language processing and data mining technologies to extract traffic-jam-related information from Tianya.cn and analyze the content of people's posts to discover their 'talking points' when facing traffic jams. The results provide data support for relevant authorities to make effective decisions for real-time traffic jam response and management.

... Conventionally, traffic data is collected by physical sensors such as floating cars, closed-circuit television cameras, and loop detectors [7]. As such platforms have grown popular, people and authoritative agencies often post transportation information on them, so social media have been regarded as a potential source of social sensors for extracting traffic information [8], [9]. The term social transportation was first introduced in [10]. ...
Article
Mining traffic-relevant information from social media data has become an emerging topic due to the real-time and ubiquitous features of social media. In this paper, we focus on a specific problem in social media mining which is to extract traffic relevant microblogs from Sina Weibo, a Chinese microblogging platform. It is transformed into a machine learning problem of short text classification. First, we apply the continuous bag-of-word model to learn word embedding representations based on a data set of three billion microblogs. Compared to the traditional one-hot vector representation of words, word embedding can capture semantic similarity between words and has been proved effective in natural language processing tasks. Next, we propose using convolutional neural networks (CNNs), long short-term memory (LSTM) models and their combination LSTM-CNN to extract traffic relevant microblogs with the learned word embeddings as inputs. We compare the proposed methods with competitive approaches, including the support vector machine (SVM) model based on a bag of n-gram features, the SVM model based on word vector features, and the multi-layer perceptron model based on word vector features. Experiments show the effectiveness of the proposed deep learning approaches.
... Therefore, SNSs have played an important role in real time analysis and have been used for faster trend predictions in many areas [2,3]. The areas include traffic [4][5][6][7], disaster prediction [8][9][10][11][12], management [13][14][15], networking [16,17], news [18][19][20][21][22] and so on. In the public health area, SNS provides an efficient resource for disease surveillance and also an efficient way to communicate to prevent disease outbreaks [23]. ...
Article
Early prediction of seasonal epidemics such as influenza may reduce their impact on daily life. Nowadays, the web can be used for disease surveillance. Search engines and social networking sites can be used to track trends of different diseases seven to ten days faster than government agencies such as the Centers for Disease Control and Prevention (CDC). The CDC uses the Influenza-Like Illness Surveillance Network (ILINet), a program that monitors Influenza-Like Illness (ILI) reports sent by thousands of health care providers in order to detect influenza outbreaks. It is a reliable tool; however, it is slow and expensive. For that reason, many studies aim to develop methods that perform real-time analysis to track ILI using social networking sites. Social media data, such as Twitter data, can be used to predict the spread of flu in the population and can help provide early warnings. Today, social networking sites (SNS) are widely used to share thoughts and even health status. Therefore, SNS provide an efficient resource for disease surveillance and a good way to communicate in order to prevent disease outbreaks. The goal of this study is to review existing alternative solutions that track flu outbreaks in real time using social networking sites and web blogs. Many studies have shown that social networking sites can be used to conduct real-time analysis for better predictions.
... As studies of big data sources and applications in the transport sector have increased tremendously, the authors believe that incorporating additional studies into this review in the future will bring new information for better understanding and new possibilities for more efficient management of transport systems. ...
Chapter
The development of Information and Communications Technology (ICT) and the Internet provide Intelligent Transport Systems (ITS) with a huge amount of real-time data. These data are the so-called “Big Data” which can be collected, interpreted, managed and analyzed in a proper way in order to improve the knowledge around the transport system. The use of these technologies has greatly enhanced the efficiency and user friendliness of ITS, providing significant economic and social impacts, contributing positively to the management of sustainable mobility.
... The second most productive institutions are the Institute of Automation, Chinese Academy of Sciences; the Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences; and Virginia Tech, all of which have 3 papers in the dataset. It is worth mentioning that some papers are produced by research branches of one institute, such as IBM (including IBM Dublin Research Centre [16]-[20], IBM Research - India [21], and IBM Rio Research Centre [16]), and the Chinese Academy of Sciences (including the Institute of Automation [13], [22], [23] and the Institute of Geographic Sciences and Natural Resources Research [24]-[26]). ...
Article
Recently, there has been an increased interest in the use of social media data as important traffic information sources. In this paper, we review social media based transportation research with social network analysis methods. We summarize main research topics in this field, and report collaboration patterns at levels of researchers, institutions, and countries, respectively. Finally, some future research directions are identified.
... Using online chat messages from Tianya.cn, Wang et al. [38] used an NLP approach and data mining techniques to detect traffic jams. The authors discovered people's talking points when encountering traffic jams, which can offer data support for relevant authorities to make effective decisions. Fig. ...
Article
Reasoning aiming at inferring implicit facts over knowledge graphs (KGs) is a critical and fundamental task for various intelligent knowledge-based services. With multiple distributed and complementary KGs, the effective and efficient capture and fusion of knowledge from different KGs is becoming an increasingly important topic, which has not been well studied. To fill this gap, we propose to explore cross-KG relation paths with the anchor links identified by entity alignment for the knowledge fusion and collaborative reasoning of multiple KGs. To address the heterogeneity of different KGs, this paper proposes a novel reasoning model named HackRL based on the reinforcement learning framework, which incorporates the long short-term memory and hierarchical graph attention in the policy network to infer indicative cross-KG relation paths from the history trajectory and the heterogeneous environment for predicting corresponding relations. Meanwhile, an entity alignment-oriented representation learning method is utilized to embed different KGs into a unified vector space based on the anchor links to reduce the impact of distinct vector spaces, and two training mechanisms, action mask and retrain with sampled paths, are proposed to optimize the training process to learn more successful indicative paths. The proposed HackRL is validated on three cross-lingual datasets built from DBpedia on the link prediction and fact prediction tasks. Experimental results demonstrate that HackRL achieves better performance on most tasks than existing methods. This work provides an industrially-applicable framework for fusing distributed KGs to make better decisions.
Article
Traffic conditions are among the issues of greatest concern to the general public, and the freeway is a large-scale Internet of Things application. In addition to obtaining real-time road usage information, analysis of local road usage habits is crucial in evaluating the implementation of government policy. Using road usage data provided by the Electronic Toll Collection (ETC) system, we investigated the road usage history on the freeways. In this paper, the ELK Stack was employed to construct a platform for visualizing real-time road usage information and history; the platform is named the Local Transportation Knowledge (LTK) platform. By analyzing more than 500 million records, the LTK platform proposed in this study efficiently visualized road usage data and facilitated the acquisition of local road usage knowledge. We verified that residents of other counties and cities commuted to Taichung each day. We also discovered that a considerable number of Taichung City residents were employed in Hsinchu Science Park and commuted between the two places. The LTK platform can present real-time freeway traffic conditions, facilitate in-depth analysis of local road usage data, and provide data to verify the relevant information.
Article
In this paper, the current network experiment platforms are investigated, introduced and compared. The most popularly used simulator and its main advantages and disadvantages are demonstrated. To solve the problem of lacking flexibility and service capability of traditional network experiment platforms, based on the parallel network architecture, the corresponding computational experiment platform is proposed. The proposed platform is data-oriented, and by using the computational experiments and analysis, an optimized control strategy can be continuously updated and tracked, thus the self optimization of network systems is achieved. In the end, a computational experiment based on retweet analysis of Wechat Moments is proposed, and the effectiveness of this method is evaluated.
Conference Paper
Recently, low power wide area networks (LPWANs) have been widely researched and deployed due to their excellent performance in supporting large coverage, low power consumption, and massive capacity. LPWAN might offer brand-new solutions for vehicle-to-anything (V2X) communications, which face the challenge of supporting massive connections due to the dramatically increasing number of vehicles. In this paper, after surveying the existing V2X communication technologies, we compare the traditional technologies with representative LPWAN technologies according to their performance metrics. After careful comparison and selection, Long-Range (LoRa) and enhanced machine type communication (eMTC) are introduced to V2X communication due to their support for mobility. Moreover, their performance is evaluated in both V2I (vehicle-to-infrastructure) and V2V (vehicle-to-vehicle) communication environments via Monte Carlo simulations.
Article
Radio resource management (RRM), which aims to satisfy the requirements of both mobile users and service providers, can be seen as a typical cyber-physical-social system problem, since social factors, that is, the requirements and priorities of users, are extremely important in heterogeneous networks. In this paper, we propose a novel resource allocation and access control mechanism based on a parallel network architecture, which provides high-bandwidth connectivity with guaranteed quality of service (QoS) for mobile users in a seamless manner. In this mechanism, users are classified into several types according to social properties such as their priorities and bandwidth requirements. Compared with the general received signal strength (RSS)-based method, the proposed user priority (UP)-based method has three main advantages: 1) it further balances the load of base stations (BSs) when resources are sufficient; 2) it provides high-priority users with higher QoS than the RSS-based method when the network is heavily loaded; and 3) it hands over a few users from a heavily loaded BS to a lightly loaded one to allow more users to access the network. The simulation results confirm the advantages of the proposed UP-based mechanism and show that the simulation results of the Q-learning method are consistent with its theoretical analysis.
Chapter
Social Networking Sites (SNS) such as Twitter are widely used by people of diverse ages. The rate at which data is generated on SNS has made them an efficient resource for real-time analysis. Thus, SNS data can effectively be used to track disease outbreaks and provide necessary warnings earlier than official agencies such as the U.S. Centers for Disease Control and Prevention. In this study, we show that sentiment analysis features and weighting techniques such as Term Frequency-Inverse Document Frequency (TF-IDF) can improve the accuracy of flu tweet classification. Various machine learning algorithms were evaluated for classifying tweets as either flu-related or unrelated, and the one with the best accuracy was adopted. The results show that the proposed approach is useful for flu surveillance models and systems.
Article
The aim of this research is to increase traffic to a Website through multiple techniques, since higher traffic is needed to improve ranking. The primary effort in this research is therefore to explore the factors and techniques that contribute to a page ranking highly on a Website. We examine major changes in Web traffic characteristics during this period and also investigate how traffic can be enhanced. Traffic enhancement poses great challenges for achieving a high Website ranking. The research is divided into two parts: the first analyzes the current level of ranking, and the second analyzes the ranking of the Website after technical or non-technical changes. We will measure the ranking through search engine (SE) hits. The results will be reported in the following terms: bandwidth, number of visitors, number of visitors after changes, number of hits, and number of followers [8]
Article
In China, the traffic police's microblogs provide instant information for travelers and can help drivers avoid congested roads. Management rules and laws help to maintain road traffic order, improve traffic flow, and prevent traffic accidents, so building reasonable rules and laws is very important, and the analysis of traffic flow supports this. In this paper, congestion time, congestion location, and congestion cause are analyzed on the traffic police's microblog. The theories of linguistic dynamic systems based on a multifactor time-varying universe and fuzzy comprehensive evaluation are used to analyze traffic flow, and dynamic fuzzy rules on the time-varying universe are built to provide the corresponding traffic management rules. As an example, the Shenzhen traffic police's microblog is used to study traffic congestion information, including congestion period, congestion location and causes, and disposal methods. The results are presented in language form, i.e., keyword walls; then, the traffic flow on Labor Day is discussed.
Conference Paper
OCR of low-resolution documents is uncommon because it poses many problems. However, there are now several archives of digital documents that were scanned at low resolution to consume less storage. These documents, which usually have a resolution of 100 to 150 dpi, need to be converted to searchable documents. This paper presents a new method for clustering low-quality printed Persian sub-words. This is necessary to reduce the number of sub-word classes in order to improve the overall recognition rate. Two popular clustering methods, hierarchical and k-means, are implemented and compared. Local binary patterns (LBP) and zoning algorithms are used for feature extraction. Both features are fast and represent the global shape information very well. Moreover, we used different distance measures to find the similarity of feature vectors. We applied our algorithms to a dataset of 10,700 images of distinct Persian sub-words at 96 dpi resolution. Experimental results show that hierarchical clustering with the correlation distance measure performs best among the clustering methods and distance measures tested.
Conference Paper
At least for the last decade, South African transport policies have focused on providing mobility for all. For example, the vision of the White Paper on National Transport Policy [1] is to 'provide safe, reliable, effective, efficient, and fully integrated transport operations and infrastructure, which will best meet the needs of freight and passenger customers at improved level of service'. Government has identified that the provision of public transport plays a crucial role in working towards this vision. Independent of the physical infrastructure and vehicle type, the provision of information has been identified as an important element that provides, or improves, customer satisfaction [2]. Internationally, the trend is to provide real-time information through the implementation of Advanced Traveller Information Systems (ATIS). This paper describes the implementation of an ATIS system for the Jammie Shuttle service at the University of Cape Town (UCT).
Article
The proposed social media crisis mapping platform for natural disasters uses locations from gazetteer, street map, and volunteered geographic information (VGI) sources for areas at risk of disaster and matches them to geoparsed real-time tweet data streams. The authors use statistical analysis to generate real-time crisis maps. Geoparsing results are benchmarked against existing published work and evaluated across multilingual datasets. Two case studies compare five-day tweet crisis maps to official post-event impact assessment from the US National Geospatial Agency (NGA), compiled from verified satellite and aerial imagery sources.
Article
Besides knowing that a problem exists, traffic managers and prediction systems need to know its context. Here, the authors discuss how to extend current ITS technologies to capture and process such information.
Article
Recent research on pattern discovery has progressed from mining frequent patterns and sequences to mining structured patterns, such as trees and graphs. Graphs as general data structure can model complex relations among data with wide applications in web exploration and social networks. However, the process of mining large graph patterns is a challenge due to the existence of large number of subgraphs. In this paper, we aim to mine only frequent complete graph patterns. A graph g in a database is complete if every pair of distinct vertices is connected by a unique edge. Grid Complete Graph (GCG) is a mining algorithm developed to explore interesting pruning techniques to extract maximal complete graphs from large spatial dataset existing in Sloan Digital Sky Survey (SDSS) data. Using a divide and conquer strategy, GCG shows high efficiency especially in the presence of large number of patterns. In this paper, we describe GCG that can mine not only simple co-location spatial patterns but also complex ones. To the best of our knowledge, this is the first algorithm used to exploit the extraction of maximal complete graphs in the process of mining complex co-location patterns in large spatial dataset.
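The completeness condition stated in the abstract above can be illustrated with a minimal Python sketch. It only checks the definition (every pair of distinct vertices joined by an edge) for a small undirected graph; it is not the GCG algorithm, and the function name `is_complete` and the toy graphs are assumptions made for illustration.

```python
from itertools import combinations


def is_complete(vertices, edges):
    """Return True if every pair of distinct vertices is joined by an edge.

    `edges` is an iterable of 2-tuples treated as undirected, so (u, v)
    and (v, u) denote the same edge.
    """
    edge_set = {frozenset(e) for e in edges}
    distinct = list(set(vertices))
    return all(frozenset(pair) in edge_set for pair in combinations(distinct, 2))


# Hypothetical toy graphs: a triangle is complete, a 3-vertex path is not.
triangle = is_complete([1, 2, 3], [(1, 2), (2, 3), (1, 3)])
path = is_complete([1, 2, 3], [(1, 2), (2, 3)])
```

Storing edges as `frozenset` objects makes the check independent of the order in which each edge's endpoints are listed.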
Article
In this work, we present an interactive system for visual analysis of urban traffic congestion based on GPS trajectories. For these trajectories we develop strategies to extract and derive traffic jam information. After cleaning the trajectories, they are matched to a road network. Subsequently, traffic speed on each road segment is computed and traffic jam events are automatically detected. Spatially and temporally related events are concatenated in, so-called, traffic jam propagation graphs. These graphs form a high-level description of a traffic jam and its propagation in time and space. Our system provides multiple views for visually exploring and analyzing the traffic condition of a large city as a whole, on the level of propagation graphs, and on road segment level. Case studies with 24 days of taxi GPS trajectories collected in Beijing demonstrate the effectiveness of our system.
Conference Paper
There are a number of ways to monitor traffic and help people navigate through or avoid traffic jams. A promising way is to use GPS-enabled smartphones as traffic sensors that complement existing sensors. This paper highlights a number of progressive steps in the effort to build an integrated ITS that harnesses smartphones as intelligent agents. However, a number of questions should be addressed first: How can smartphones avoid the map mismatching phenomenon, which is a common problem in navigation devices? What if compromised agents attempt to invalidate the gathered data? And how should detectors be placed in such a system? Consequently, three possible solutions are discussed in this paper: the use of non-overlapping zones in a Virtual Detection Zone (VDZ), a filtering algorithm to ignore compromised agents, and the use of macroscopic simulation to aid the placement of VDZs on selected roads.
Article
Crowd-powered search is a new form of search and problem solving scheme that involves collaboration among a potentially large number of voluntary Web users. Human flesh search (HFS), a particular form of crowd-powered search originated in China, has seen tremendous growth since its inception in 2001. HFS presents a valuable test-bed for scientists to validate existing and new theories in social computing, sociology, behavioral sciences, and so forth. In this research, we construct an aggregated HFS group, consisting of the participants and their relationships in a comprehensive set of identified HFS episodes. We study the topological properties and the evolution of the aggregated network and different sub-groups in the network. We also identify the key HFS participants according to a variety of measures. We found that, as compared with other online social networks, HFS participant network shares the power-law degree distribution and small-world property, but with a looser and more distributed organizational structure, leading to the diversity, decentralization, and independence of HFS participants. In addition, the HFS group has been becoming increasingly decentralized. The comparisons of different HFS sub-groups reveal that HFS participants collaborated more often when they conducted the searches in local platforms or the searches requiring a certain level of professional knowledge background. On the contrary, HFS participants did not collaborate much when they performed the search task in national platforms or the searches with general topics that did not require specific information and learning. We also observed that the key HFS information contributors, carriers, and transmitters came from different groups of HFS participants.
Article
Presents abstracts of the articles included in this issue of IEEE Transactions on Intelligent Transportation Systems.
Article
Provides an overview of the technical articles and features presented in this issue.
Conference Paper
Parallel web pages contain noisy, non-parallel sentences, and web page structure imposes some limitations on the sentence alignment task for web page text. The most straightforward way of aligning sentences is to use a translation lexicon. However, a major obstacle to this approach is the lack of a dictionary for training. This paper presents a method for automatically aligning Mongolian-Chinese parallel text on the Web via a vector space model. The vector space model is an algebraic model for representing any object as a vector of identifiers, such as index terms. In the statistically based vector space model, a sentence is conceptually represented by a vector of keywords extracted from the text. The extracted keywords consist of content words, known as terms, and the weight of a term in a sentence vector can be determined by the tf-idf method. CHI is used to compute the association between bilingual words. Once the term weights are determined, the similarity between sentence vectors is computed via the cosine measure. The experimental results indicate that the method is accurate and efficient enough to apply without human intervention.
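The tf-idf weighting and cosine measure named in the abstract above can be sketched in a few lines of Python. This is a minimal monolingual illustration under stated assumptions, not the paper's method: it omits the CHI-based bilingual word association entirely, it uses one common tf-idf variant (relative term frequency times log inverse sentence frequency), and the helper names and toy sentences are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(sentences):
    """Return one sparse tf-idf vector (dict: term -> weight) per tokenized sentence."""
    n = len(sentences)
    # Sentence frequency: in how many sentences each term occurs.
    df = Counter(term for sent in sentences for term in set(sent))
    vectors = []
    for sent in sentences:
        tf = Counter(sent)
        vectors.append(
            {t: (c / len(sent)) * math.log(n / df[t]) for t, c in tf.items()}
        )
    return vectors


def cosine(u, v):
    """Cosine similarity of two sparse vectors; 0.0 if either has zero norm."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical toy sentences, already tokenized.
sentences = [
    ["traffic", "jam", "on", "the", "highway"],
    ["highway", "traffic", "jam", "today"],
    ["flu", "season", "starts", "early"],
]
vecs = tfidf_vectors(sentences)
sim_related = cosine(vecs[0], vecs[1])    # sentences sharing several terms
sim_unrelated = cosine(vecs[0], vecs[2])  # sentences sharing no terms
```

In this sketch, sentences that share terms receive a positive similarity, while sentences with no terms in common score zero; an aligner would pair each sentence with its highest-scoring candidate.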
Article
With the boom in social media, sentiment analysis has developed rapidly in recent years. However, only a few studies have focused on the field of transportation, which is not enough to meet the stringent requirements of safety, efficiency, and information exchange of intelligent transportation systems (ITSs). We propose traffic sentiment analysis (TSA) as a new tool to tackle this problem, which provides a new perspective for modern ITSs. Methods and models in TSA are proposed in this paper, and the advantages and disadvantages of rule- and learning-based approaches are analyzed based on web data. Practically, we applied the rule-based approach to deal with real problems, presented an architectural design, constructed related bases, demonstrated the process, and discussed the online data collection. Two cases were studied to demonstrate the efficiency of our method: the “yellow light rule” and “fuel price” in China. Our work will help the development of TSA and its applications.
Conference Paper
Motivated by the public debate over whether highway tolls should continue, as well as by traffic congestion in megalopolises, the paper first applies principles of economics and social utility to analyze the characteristics of highways and the feasibility of implementing a traffic jam charging policy. Second, the paper gives a simulation analysis of a highway traffic jam charging policy using a System Dynamics model, uses Vensim software to draw a system flow diagram, and puts forward traffic jam charging measures for urban sections of the highway. Finally, the paper concludes that implementing a traffic jam charging policy on the urban highway sections of a megalopolis can effectively relieve urban traffic jams and public pressure.
Article
It is a challenge to design efficient routing protocols for vehicular ad hoc networks (VANETs) because of their highly dynamic properties. We address the vehicular communication problem in urban hybrid networks and present a hybrid routing scheme for data dissemination in VANETs. Location-based crowdsourcing of nearby roadside units (RSUs) has been applied to the infrastructural support of inter-vehicle, vehicle-to-roadside, and inter-roadside communications in hybrid VANETs. The combination of RSU resources and ad hoc networks involves an online probabilistic RSU retrieval algorithm that uses coarse- and fine-grained localization to estimate the number and location of available RSUs; a network coding based multicast routing for dense VANETs using maximum distance separation (MDS) code and local topology information from the forwarding set to achieve robust communication and max-flow min-cut data dissemination; an application of opportunistic routing, using a carry-and-forward scheme to solve the forwarding disconnection problem in sparse VANETs; and a routing switch mechanism to guarantee quality of service (QoS) under various network connectivity and deployment configurations. The performance of our hybrid routing scheme is evaluated using both simulations and real testbed experiments.
Article
The described system uses natural language processing and data mining techniques to extract situation awareness information from Twitter messages generated during various disasters and crises.
Conference Paper
We propose a method for community discovery with side information based on a regularized modularity eigenmap. Community discovery has mostly been conducted based solely on the connectivity relation among nodes in a social network. However, the connectivity may change over time, or some links might be missing. Even when the connectivity relation in a network is only partially available, if other information about the network is available, it can be exploited as auxiliary or side information for community discovery. Our approach constructs a graph structure based on the side information so that both the connectivity relation and the side information can be dealt with uniformly in terms of graph representation. An objective function with a regularization term for the side information is proposed based on the modularity matrix of a network. Extensive experiments are conducted over social network datasets, and a comparison with several state-of-the-art methods is reported.
Research on Traffic Jam Charging Policy for Vehicles on Highway in Chinese Megalopolis
  • Yuan-Hua Xi-Hui Yin
  • Zhong-Hai Jia
  • Niu