About
Publications: 228
Reads: 48,218
Citations: 7,040
Additional affiliations
January 2018 - present
January 2013 - November 2020
November 2008 - December 2012
Research Center "ATHENA"
Position
- Senior Researcher
Publications (228)
We demonstrate the Patterns of Life Simulation to create realistic simulations of human mobility in a city. This simulation has recently been used to generate massive amounts of trajectory and check-in data. Our demonstration focuses on two ways of using the simulation: (1) using the graphical user interface (GUI), and (2) running the simulation headle...
This paper presents a novel approach for trajectory anomaly detection using an autoregressive causal-attention model, termed LM-TAD. This method leverages the similarities between language statements and trajectories, both of which consist of ordered elements requiring coherence through external rules and contextual variations. By treating trajecto...
Understanding urban mobility patterns and analyzing how people move around cities helps improve the overall quality of life and supports the development of more livable, efficient, and sustainable urban areas. A challenging aspect of this work is the collection of mobility data by means of user tracking or travel surveys, given the associated priva...
Human mobility data science using trajectories or check-ins of individuals has many applications. Recently, we have seen a plethora of research efforts that tackle these applications. However, research progress in this field is limited by a lack of large and representative datasets. The largest and most commonly used dataset of individual human tra...
The relationship between urban form and function is a complex challenge that can be examined from multiple perspectives. In this study, we propose a method to characterize the urban function of U.S. metropolitan areas by analyzing trip patterns extracted from the 2017 National Household Travel Survey (NHTS). To characterize urban form, we employ me...
Having accurate building information is paramount for a plethora of applications, including humanitarian efforts, city planning, scientific studies, and navigation systems. While volunteered geographic information from sources such as OpenStreetMap (OSM) has good building geometry coverage, descriptive attributes such as the type of a building are...
The range query is one of the most important query types in spatial data processing. Geographic information systems use it to find spatial objects within a user-specified range, and it supports data mining tasks, such as density-based clustering. In many applications, ranges are not computed in unrestricted Euclidean space, but on a network. While...
We introduce the Urban Life agent-based simulation used by the Ground Truth program to capture the innate needs of a human-like population and explore how such needs shape social constructs such as friendship and wealth. Urban Life is a spatially explicit model to explore how urban form impacts agents’ daily patterns of life. By meeting up at place...
Agent-based geospatial simulations have become very popular and widely used in examining the social and cultural characteristics of populations. Well-known toolkits such as NetLogo or MASON generally have scalability limitations, especially when the model and underlying spatial infrastructure become complex. This paper presents a framework for sim...
Map construction algorithms attempt to derive a spatial graph representing a road network from GPS-sampled movement trajectories. Existing methods commonly use trajectories without considering the specific sampling methodology. Hence, the movement information is not preserved in the map construction results. The proposed map-construction method con...
Deep generative models for graphs have exhibited promising performance in ever-increasing domains such as design of molecules (i.e., graph of atoms) and structure prediction of proteins (i.e., graph of amino acids). Existing work typically focuses on static rather than dynamic graphs, which are actually very important in applications such as pro...
Trajectory data generation is an important domain that characterizes the generative process of mobility data. Traditional methods heavily rely on predefined heuristics and distributions and are weak in learning unknown mechanisms. Inspired by the success of deep generative neural networks for images and texts, a fast-developing research topic is de...
Human mobility and social networks have received considerable attention from researchers in recent years. What has been sorely missing is a comprehensive data set that not only addresses geometric movement patterns derived from trajectories, but also provides social networks and causal links as to why movement happens in the first place. To some ex...
Recently deep generative models for static networks have been under active development and achieved significant success in application areas such as molecule design. However, many real-world problems involve temporal graphs whose topology and attribute values evolve dynamically over time, such as in the cases of protein folding, human mobility netw...
The sudden outbreak of the Coronavirus disease (COVID-19) swept across the world in early 2020, triggering the lockdowns of several billion people across many countries, and most states of the U.S. The transmission of the virus accelerated rapidly with the most confirmed cases in the U.S., and New York City became an epicenter of the pandemic by th...
The sudden outbreak of the Coronavirus disease (COVID-19) swept across the world in early 2020, triggering the lockdowns of several billion people across many countries, including China, Spain, India, the U.K., Italy, France, Germany, and most states of the U.S. The transmission of the virus accelerated rapidly with the most confirmed cases in the...
Urban areas provide us with a treasure trove of available data capturing almost every aspect of a population's life. This work focuses on mobility data and how it will help improve our understanding of urban mobility patterns. Readily available and sizable farecard data captures trips in a public transportation network. However, such data typically...
Space has long been acknowledged by researchers as a fundamental constraint which shapes our world. As technological changes have transformed the very concept of distance, the relative location and connectivity of geospatial phenomena have remained stubbornly significant in how systems function. At the same time, however, technology has allowed us...
Data generators have been heavily used in creating massive trajectory datasets to address common challenges of real-world datasets, including privacy, cost of data collection, and data quality. However, such generators often overlook social and physiological characteristics of individuals and as such their results are often limited to simple moveme...
Traditional navigation systems compute the quantitatively shortest or fastest route between two locations in a spatial network. In practice, a problem resulting from all drivers using the shortest path is the congregation of individuals on routes having a high in-betweenness. To this end, several works have proposed methods for proposing alternativ...
Zoning Improvement Plan (ZIP) Codes provide a sub-division of space. Interestingly, the ZIP code area polygons for different data sources do not match, resulting in uncertainty for a range of services that rely on such data. This paper presents a system that employs traditional classification methods to map a given spatial coordinate to a distribut...
Fixed-route bus systems are an important part of the urban transportation mix. A considerable disadvantage of buses is their slow speed, which is in part due to frequent stops, but also due to the lack of segregation from other vehicles in traffic. As such, assessing bus routes is an important aspect of route planning, scheduling, and the creation...
Congested traffic wastes billions of liters of fuel and is a significant contributor to greenhouse gas (GHG) emissions. Although convenient, ride sharing services such as Uber and Lyft are becoming a significant contributor to these emissions, not only because of added traffic but also because their vehicles spend time on the road while waiting for passengers. To help i...
In the research field of spatiotemporal data discovery, how to utilize the semantic characteristics of spatiotemporal datasets is an important topic. This paper presents a content-based recommendation method that applies Bayesian networks and ontologies to the vocabulary recommendation process for spatiotemporal data discovery. The source data o...
Location-based social networks (LBSNs) have been studied extensively in recent years. However, utilizing real-world LBSN datasets in such studies has severe weaknesses: sparse and small datasets, privacy concerns, and a lack of authoritative ground-truth. Our vision is to create a large scale geo-simulation framework to simulate human behavior and...
Spatial regression models are widely used in numerous areas, including detecting and predicting traffic volume, air pollution, and housing prices. Unlike conventional regression models, which commonly assume independent and identical distributions among observations, existing spatial regression requires the prior knowledge of spatial dependency amo...
User-generated content is a valuable resource for capturing all aspects of our environment and lives, and dedicated Volunteered Geographic Information (VGI) efforts such as OpenStreetMap (OSM) have revolutionized spatial data collection. While OSM data is widely used, comparatively little attention has been paid to the quality of its Point-of-intere...
Knowledge about human systems usually comes from deliberate, organized efforts. These efforts are increasingly collaborative with partnering among diverse teams of experts. This has spawned research in the means for data integration and data lineage, or provenance, to better enable sharing of data and workflows. Such research has focused on specifi...
The problem of traffic prediction is paramount in a plethora of applications, ranging from individual trip planning to urban planning. Existing work mainly focuses on traffic prediction on road networks. Yet, public transportation contributes a significant portion to overall human mobility and passenger volume. For example, the Washington, DC metro...
Effective road traffic assessment and estimation is crucial not only for traffic management applications, but also for long-term transportation and, more generally, urban planning. Traditionally, this task has been achieved by using a network of stationary traffic count sensors. These costly and unreliable sensors have been replaced with so-called...
This paper addresses the problem of matching and clustering users based on their geolocated posts. Individual posts are matched according to spatial distance and textual similarity thresholds. Then, user similarity is defined as the ratio of their posts that match each other. Based on these criteria, we introduce efficient algorithms for identifyin...
Geocrowdsourcing is a significant new focus area in mapping for people with disabilities. It utilizes public data contributions that are difficult to capture with traditional mapping workflows. Along with the benefits of geocrowdsourcing are critical drawbacks, including reliability and accuracy. A geocrowdsourcing testbed has been designed to expl...
Social media are often heralded as offering cancer campaigns new opportunities to reach the public. However, these campaigns may not be equally successful, depending on the nature of the campaign itself, the type of cancer being addressed, and the social media platform being examined. This study is the first to compare social media activity on Twit...
Cancer awareness campaigns compete with other health and social issues for public attention. We examined whether public engagement with breast cancer and prostate cancer declined in 2016 during the U.S. presidential election compared to 2015 on Twitter and Google Trends. We found that attention to breast cancer and prostate cancer declined in 2016...
Shortest-path computation on graphs is one of the most well-studied problems in algorithmic theory. An aspect that has only recently attracted attention is the use of databases in combination with graph algorithms, so-called distance oracles, to compute shortest-path queries on large graphs. To this purpose, we propose a novel, efficient, pure-SQL...
In current navigation systems quantitative metrics such as distance, time and energy are used to determine optimal paths. Yet, a “best path”, as judged by users, might take qualitative features into account, for instance the scenery or the touristic attractiveness of a path. Machines are unable to quantify such “soft” properties. Crowdsourced data...
In the current data-centered era, there are many highly diverse data sources that provide information about movement on networks, such as GPS trajectories, traffic flow measurements, farecard data, pedestrian cameras, bike-share data and even geo-social movement trajectories. The challenge identified in this vision paper is to create a unified fram...
The emergence of global networking capabilities (e.g. social media) has provided newfound mechanisms and avenues for information to be generated, disseminated, shaped, and consumed. The spread and evolution of online information represents a unique narrative ecosystem that is facilitated by cyberspace but operates at the nexus of three dimensions:...
Nowadays, large amounts of tracking data are generated via GPS-enabled devices and other advanced tracking technologies. These constitute a rich source for inferring the structure of transportation networks. In this work, we present a novel methodology for revealing a road network map from vehicle trajectories. Specifically, we propose an enhanced...
Background: The recent Zika outbreak witnessed the disease evolving from a regional health concern to a global epidemic. During this process, different communities across the globe became involved in Twitter, discussing the disease and key issues associated with it. This paper presents a study of this discussion in Twitter, at the nexus of location...
Novel Web technologies and resulting applications have led to a participatory data ecosystem that, when utilized properly, will lead to more rewarding services. In this work, we investigate the case of Location-Based Services, specifically how to improve the typical location-based Point-of-Interest (POI) request processed as a k-Nearest-Neighbor qu...
With the percentage of Twitter users approaching 20% of the US population by 2019, tweets provide a good sample of the public's sentiment and opinion. Consequently, such data has been extensively used in commercial and research efforts. While works have analyzed the content of tweets in relation to the underlying social network of a discussion, some...
The “crowd” has become a very important geospatial data provider. Specifically, nonexpert users have been providing a wealth of quantitative geospatial data (e.g., geotagged tweets or photos, online). With spatial reasoning being a basic form of human cognition, textual narratives expressing user travel experiences (e.g., travel blogs) would provid...
The availability of GPS-enabled devices has generated massive amounts of GPS tracking data produced by vehicles traversing the road-network. While initially used for improving traffic estimation and routing, only recently has this data been used for map-construction efforts. This work focuses on the specific aspect of identifying turning restrictio...
Map construction methods automatically produce and/or update street map datasets using vehicle tracking data. Enabled by the ubiquitous generation of geo-referenced tracking data, there has been a recent surge in map construction algorithms coming from different computer science domains. This chapter gives a comprehensive overview and comparison of...
This chapter presents the TraceBundle algorithm, which is a representative of the intersection linking category of map construction algorithms. The main approach is to first detect intersection nodes, then “bundle” trajectories around them in order to construct edges. Changes in movement direction and speed are used as turn indicators, and similar...
Map construction algorithms are usually evaluated by comparing the constructed map to a ground-truth map. In this chapter, several quality measures that have been used for map comparison are described. While these quality measures are related to the comparison of abstract graphs, map comparison is a more specialized problem since the common spatial...
This chapter presents an incremental track insertion algorithm for map construction that is based on partial map-matching of the trajectories to the graph. The Fréchet distance is used as part of the map-matching algorithm and to provide quality guarantees for the constructed map. One of the contributions of this work is to separate edge regions fr...
This chapter presents a density-based algorithm pipeline to construct a road map from a set of input trajectories. In the first step of the pipeline a data density function is computed and an undirected skeleton graph is constructed using grayscale skeletonization of the density. In several different steps the pipeline then uses the trajectory data...
Although a visual inspection allows for a simple intuitive assessment of map construction results, providing a quantifiable assessment of the quality has been a considerable challenge. This chapter summarizes the results of a study that compares three map construction algorithms for three different datasets and using four quality measures. The resu...
This chapter introduces resources that complement the scientific discussion of map construction algorithms and provide the interested researcher with the simplest possible means to start experimenting with map construction algorithms. The Map Construction Web Portal and its content are briefly discussed and user guides are provided for several map...
The best way to get an impression of the capabilities of the various map construction algorithms is to visualize constructed maps side-by-side with the input trajectory data. This chapter showcases map construction results from three different cities and also presents the characteristics of the respective datasets.
Map construction algorithms are useful beyond GPS-derived trajectory datasets. This chapter gives some examples of early-stage research towards novel applications. In recent years an ever increasing amount of social media data has become available. When considering the geospatial dimension of this data, using geocoded tweets for example, one can u...
Directions and paths, as commonly provided by navigation systems, are usually derived considering absolute metrics, e.g., finding the shortest or the fastest path within an underlying road network. With the aid of Volunteered Geographic Information (VGI), i.e., geo-spatial information contained in user generated content, we aim at obtaining paths t...
In this demonstration we re-visit the problem of finding an optimal route from location A to B. Currently, navigation systems compute shortest, fastest, most economic routes or any combination thereof. More often than not users want to consider “soft” qualitative metrics such as popularity, scenic value, and general appeal of a route. Routing algor...
Shortest-path computation is a well-studied problem in algorithmic theory. An aspect that has only recently attracted attention is the use of databases in combination with graph algorithms to compute distance queries on large graphs. To this end, we propose a novel, efficient, pure-SQL framework for answering exact distance queries on large-scale g...