Conference Paper

Intermodal public transit routing using Linked Connections



Ever since public transit agencies have found their way to the Web, they inform travelers using route planning software made available on their website. These travelers also need to be informed about other modes of transport, for which they have to consult other websites, or for which they have to ask the transit agency's server maintainer to implement new functionalities. In this demo, we introduce an affordable publishing method for transit data, called Linked Connections, that can be used for intermodal route planning, by allowing user agents to execute the route planning algorithm. We publish paged documents containing a stream of hops between transit stops sorted by departure time. Using these documents, clients are able to perform intermodal route planning in a reasonable time. Furthermore, such clients are fully in charge of the algorithm, and can now also route in different ways by integrating datasets of a user’s choice. When visiting our demo, conference attendees will be able to calculate intermodal routes by querying the Web of data using their phone’s browser, without expensive server infrastructure.
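The publishing idea in the abstract (paged documents, each holding hops sorted by departure time) can be illustrated with a small sketch. Everything below is an assumption made for illustration: the `Connection` and `Page` types, the page identifiers, and the in-memory `PAGES` dictionary that stands in for documents a client would normally fetch over HTTP. It is not the actual Linked Connections vocabulary or server interface.

```python
from typing import Dict, Iterator, List, NamedTuple, Optional


class Connection(NamedTuple):
    """One hop between two stops (hypothetical field names)."""
    departure_stop: str
    arrival_stop: str
    departure_time: int  # minutes since midnight
    arrival_time: int    # minutes since midnight


class Page(NamedTuple):
    """One paged document: connections sorted by departure time plus a next link."""
    connections: List[Connection]
    next_page: Optional[str]  # hypothetical identifier of the following page


# In-memory stand-in for the documents a client would download.
PAGES: Dict[str, Page] = {
    "page/0800": Page(
        [Connection("A", "B", 480, 490), Connection("B", "C", 495, 510)],
        "page/0830",
    ),
    "page/0830": Page([Connection("C", "D", 515, 530)], None),
}


def stream_connections(start_page: str, not_before: int) -> Iterator[Connection]:
    """Follow next-page links and yield connections departing at or after not_before."""
    page_id: Optional[str] = start_page
    while page_id is not None:
        page = PAGES[page_id]
        for connection in page.connections:
            if connection.departure_time >= not_before:
                yield connection
        page_id = page.next_page
```

Because the stream arrives already ordered by departure time, a client can feed it directly into a route planning algorithm and stop downloading pages as soon as it has an answer.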


... The OpenStreetMap data for example can be used to build very specific route planning applications, such as a cycling route planning for the city of Brussels 1 or a route planner for motorcyclists that want scenic and curvy roads. 2 Public transit data is a noteworthy success story of Open Data, in large part due to the General Transit Feed Specification (GTFS): the preferred data format of the Google Transit APIs, which many people interact with through Google Maps. This is an example of a rising tide lifting all boats; most operators just want to get their data into the Google APIs, but the same data can be used by others for any use case. ...
... The Linked Connections specification [2] proposes a Linked Data alternative to GTFS data dumps, in which the public transit connections are published as a paginated collection, and each page contains data from a certain time interval. Applications that only need data from a specific interval to answer a query can thus be more selective in the data they have to process. ...
... The Linked Connections specification [2] defines a way to publish transit data that falls somewhere in the middle of the Linked Data Fragments axis. Connections are defined as vehicles going from one stop to another without an intermediate halt. ...
Public transit operators often publish their open data in a data dump, but developers with limited computational resources may not have the means to process all this data efficiently. In our prior work we have shown that geospatially partitioning an operator’s network can improve query times for client-side route planning applications by a factor of 2.4. However, it remains unclear whether this works for all network types, or other kinds of applications. To answer these questions, we must evaluate the same method on more networks and analyze the effect of geospatial partitioning on each network separately. In this paper we process three networks in Belgium: (i) the national railways, (ii) the regional operator in Flanders, and (iii) the network of the city of Brussels, using both real and artificially generated query sets. Our findings show that on the regional network, we can make query processing 4 times more efficient, but we could not improve the performance over the city network by more than 12%. Both the network’s topography, and to a lesser extent how users interact with the network, determine how suitable the network is for partitioning. Thus, we come to a negative answer to our question: our method does not work equally well for all networks. Moreover, since the network’s topography is the main determining factor, we expect this finding to apply to other graph-based geospatial data, as well as other Link Traversal-based applications.
... Synthetic public transport datasets are particularly important and needed in cases where public transport route planning algorithms are evaluated. The Linked Connections framework [11] and Connection Scan Algorithm [12] are examples of such public transport route planning systems. Because of the limited availability of real-world datasets with desired properties, these systems were evaluated with only a very low number of datasets, respectively one and three datasets. ...
... A connection is the actual departure time at a stop and an arrival at the next stop. These connections can be given an IRI, and described using RDF, using the Linked Connections [11] ontology. For this base algorithm and its derivatives, a connection object is the smallest building block of a transit schedule. ...
... This is serialized to a JSON format (https:/ / linkedconnections/ benchmark-belgianrail#transit-schedules) that was introduced for benchmarking the Linked Connections route planner [11]. ...
... Choosing a use-case specific fragmentation strategy, can result in a cost-efficient data publishing interface while still being able to evaluate queries. The Linked Connections (LC) framework [2] enables third parties to develop route planning algorithms that evaluate queries over different data sources, taking into account multiple modes of transport (e.g., train, bus or tram). To achieve this, the server interface only exposes a paged collection of an ordered public transit connections list. ...
... In our approach we use GTFS and GTFS-RT feeds provided by public transport agencies as input data for our system. A connection in the LC framework is defined as a hop from one transit stop to another, indicating location and time of departure, and location and time of arrival [2]. Moreover, to provide context to HTTP user agents and link the terms and identifiers of GTFS and Connections to the Linked Open Data cloud, the Linked GTFS 2 and LC 3 ontologies were introduced. ...
Using Linked Data based approaches, public transport companies are able to share their time tables and their updates in an affordable way while allowing user agents to perform multimodal route planning algorithms. Providing time table updates, usually published as data streams, means that data is being constantly modified, and if there is a large analytical query, its response might be affected by the changing data. In this demo we introduce a mechanism to tackle this problem by guaranteeing that a user agent will always receive version-based responses, therefore ensuring data consistency. Such a mechanism also enables access to historical data that could be used for deep analysis of transport systems. However, how this data shall be archived in order to keep this approach scalable and inexpensive is still a matter of study. In a demonstrator, we published and queried data from the Belgian national train system (SNCB) and the Madrid Regional Transport Consortium (CRTM). This paper represents the first step towards establishing an affordable framework to publish reliable transport data.
... Different ticks on the axis illustrate client effort versus server expressivity: on the far left, data dumps offer high server availability, yet the effort needed by data reusers is high, and query logs do not reach the server. Moving to the right, we identify Linked Connections [8] as an in-between solution. On the far right, route planning services require high server effort, leading to detailed query logs. ...
... In Figure 6 we illustrate these two options as two extremes, with other options that are yet to be discovered. When a server only allows setting, e.g., a departure station and a departure time, then the server cannot log the arrival station, yet the client is still able to plan a route by executing the algorithm on the client-side [8]. We would be able to fully rely on the query logs if the expressivity were maximal (extreme right) and caching were turned off. ...
Conference Paper
In the field of smart cities, researchers need an indication of how people move in and between cities. Yet, getting statistics of travel flows within public transit systems has proven to be troublesome. In order to get an indication of public transit travel flows in Belgium, we analyzed the query logs of the iRail API, a highly expressive route planning API for the Belgian railways. We were able to study ∼100k to 500k requests for each month between October 2012 and November 2015, which is between 0.56% and 1.66% of the amount of monthly passengers. Using data visualizations, we illustrate the commuting patterns in Belgium and confirm that Brussels, the capital, acts as a central hub. The Flemish region appears to be polycentric, while in the Walloon region, everything converges on Brussels. The findings correspond to the real travel demand, according to experts of the passenger federation Trein Tram Bus. We conclude that query logs of route planners are of high importance in getting an indication of travel flows. However, better travel intentions would be acquirable using dedicated HTTP POST requests.
... To cope with the shortcomings of SPARQL endpoints, the Semantic Web Community has created some technologies, such as Triple Pattern Fragments (TPFs), which divides the query processing between clients and servers and allows to restrict the kinds of queries the client can send to the server [13]. Linked Connections are an example of a customized query interface for the consumption of open data in the Transportation area [2]. Linked Connections implement HTTP content negotiation 7 [1]. ...
Conference Paper
Linked Open Universities apply Linked Data to publish information about universities. In 2010, the Open University in the UK was launched as the first initiative to expose public information from the university in an accessible, open, integrated, and Web-based format. Since then, universities around the world have been joining that initiative by deploying their own linked open data platforms. However, when publishing their data using the linked data principles, universities face challenges such as the lack of a unified, well-accepted vocabulary; the need to cope with the heterogeneity of datasets; the high cost of hosting the existing SPARQL endpoints; the performance shortcomings of federated queries over current SPARQL endpoints; and the incompleteness of datasets. The aim of this Ph.D. research proposal is to advance the Linked Open University context with a proposal for the University of Informatics Sciences from Cuba addressing some of these challenges.
... The Linked Connections specification [9] defines a way to publish transit data that falls somewhere in the middle of the Linked Data Fragments axis. Connections are defined as vehicles going from one stop to another without an intermediate halt. ...
Conference Paper
Public transit operators often publish their open data as a single data dump, but developers with limited computational resources may not be able to process all this data. Existing work has already focused on fragmenting the data by departure time, so that data consumers can be more selective in the data they process. However, each fragment still contains data from the entire operator’s service area. We build upon this idea by fragmenting geospatially as well as by departure time. Our method is robust to changes in the original data, such as the deletion or the addition of stops, which is crucial in scenarios where data publishers do not control the data itself. In this paper we explore popular clustering methods such as k-means and METIS, alongside two simple domain-specific methods of our own. We compare the effectiveness of each for the use case of client-side route planning, focusing on the ease of use of the data and the cacheability of the data fragments. Our results show that simply clustering stops by their proximity to 8 transport hubs yields the most promising results: queries are 2.4 times faster and download 4 times less data. More than anything though, our results show that the difference between clustering methods is small, and that engineers can safely choose practical and simple solutions. We expect that this insight also holds true for publishing other geospatial data such as road networks, sensor data, or points of interest.
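The best-performing method named in this abstract, clustering stops by their proximity to transport hubs, can be sketched as a nearest-hub assignment. This is only an illustration under stated assumptions: the abstract does not say how proximity is measured, so great-circle (haversine) distance is assumed here, and the hub and stop names with their approximate coordinates are illustrative rather than taken from the paper.

```python
import math
from typing import Dict, Tuple

Coordinate = Tuple[float, float]  # (latitude, longitude) in degrees

EARTH_RADIUS_KM = 6371.0


def haversine_km(a: Coordinate, b: Coordinate) -> float:
    """Great-circle distance between two coordinates in kilometres."""
    lat1, lon1 = math.radians(a[0]), math.radians(a[1])
    lat2, lon2 = math.radians(b[0]), math.radians(b[1])
    h = (
        math.sin((lat2 - lat1) / 2) ** 2
        + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def assign_to_hubs(
    stops: Dict[str, Coordinate], hubs: Dict[str, Coordinate]
) -> Dict[str, str]:
    """Map every stop to the identifier of its closest hub."""
    return {
        stop_id: min(hubs, key=lambda hub_id: haversine_km(coord, hubs[hub_id]))
        for stop_id, coord in stops.items()
    }


# Illustrative, approximate coordinates (assumptions, not data from the paper).
HUBS = {
    "Brussels": (50.8466, 4.3528),
    "Ghent": (51.0543, 3.7174),
    "Antwerp": (51.2194, 4.4025),
}
STOPS = {"Leuven": (50.8798, 4.7005), "Bruges": (51.2093, 3.2247)}

clusters = assign_to_hubs(STOPS, HUBS)
```

Each resulting cluster could then be published as its own fragment, so that a client only downloads the regions relevant to its query. Adding or deleting a stop only changes that stop's assignment, which is consistent with the robustness to data changes that the abstract emphasises.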
... For instance, the ORCA project aims at automatically assigning nurses to patient calls in a hospital based on their context [1]. Linked Connections define a way to publish raw transit data, to be used for intermodal route planning [3]. In projects like the aforementioned, it is common that the Semantic Web is not solely used as a means to publish data, but also as a catalyst to execute other actions, e.g., calling a real-world nurse, or executing a route planning algorithm. ...
Conference Paper
Applications built on top of the Semantic Web are emerging as a novel solution in different areas, such as decision making and route planning. However, to connect results of these solutions – i.e., the semantically annotated data – with real-world applications, this semantic data needs to be connected to actionable events. A lot of work has been done (both semantically and non-semantically) to describe and define Web services, but there is still a gap on a more abstract level, i.e., describing interfaces independent of the technology used. In this paper, we present a data model, specification, and ontology to semantically declare and describe functions independently of the used technology. This way, we can declare and use actionable events in semantic applications, without restricting ourselves to programming language-dependent implementations. The ontology allows for extensions, and is proposed as a possible solution for semantic applications in various domains.
Route planning providers manually integrate different geo-spatial datasets before offering a Web service to developers, thus creating a closed world view. In contrast, combining open datasets at runtime can provide more information for user-specific route planning needs. For example, an extra dataset of bike sharing availabilities may provide more relevant information to the occasional cyclist. A strategy for automating the adoption of open geo-spatial datasets is needed to allow an ecosystem of route planners able to answer more specific and complex queries. This raises new challenges such as (i) how open geo-spatial datasets should be published on the Web to raise interoperability, and (ii) how route planners can discover and integrate relevant data for a certain query on the fly. We republished OpenStreetMap’s road network as “Routable Tiles” to facilitate its integration into open route planners. To achieve this, we use a Linked Data strategy and follow an approach similar to vector tiles. In a demo, we show how client-side code can automatically discover tiles and perform a shortest path algorithm. We provide four contributions: (i) we launched an open geo-spatial dataset that is available for everyone to reuse at no cost, (ii) we published a Linked Data version of the OpenStreetMap ontology, (iii) we introduced a hypermedia specification for vector tiles that extends the Hydra ontology, and (iv) we released the mapping scripts, demo and routing scripts as open source software.
Conference Paper
A large amount of public transport data is made available by many different providers, which makes RDF a great method for integrating these datasets. Furthermore, this type of data provides a great source of information that combines both geospatial and temporal data. These aspects are currently undertested in RDF data management systems, because of the limited availability of realistic input datasets. In order to bring public transport data to the world of benchmarking, we need to be able to create synthetic variants of this data. In this paper, we introduce a dataset generator with the capability to create realistic public transport data. This dataset generator, and the ability to configure it on different levels, makes it easier to use public transport data for benchmarking with great flexibility.
Conference Paper
While some public transit data publishers only provide a data dump – which only few reusers can afford to integrate within their applications – others provide a use case limiting origin-destination route planning API. The Linked Connections framework instead introduces a hypermedia API, over which the extendable base route planning algorithm “Connections Scan Algorithm” can be implemented. We compare the CPU usage and query execution time of a traditional server-side route planner with the CPU time and query execution time of a Linked Connections interface by evaluating query mixes with increasing load. We found that, at the expense of a higher bandwidth consumption, more queries can be answered using the same hardware with the Linked Connections server interface than with an origin-destination API, thanks to an average cache hit rate of 78%. The findings from this research show a cost-efficient way of publishing transport data that can bring federated public transit route planning to the fingertips of anyone.
When providing directions to a place, web and mobile mapping services are all able to suggest the shortest route. The goal of this work is to automatically suggest routes that are not only short but also emotionally pleasant. To quantify the extent to which urban locations are pleasant, we use data from a crowd-sourcing platform that shows two street scenes in London (out of hundreds), and a user votes on which one looks more beautiful, quiet, and happy. We consider votes from more than 3.3K individuals and translate them into quantitative measures of location perceptions. We arrange those locations into a graph upon which we learn pleasant routes. Based on a quantitative validation, we find that, compared to the shortest routes, the recommended ones add just a few extra walking minutes and are indeed perceived to be more beautiful, quiet, and happy. To test the generality of our approach, we consider Flickr metadata of more than 3.7M pictures in London and 1.3M in Boston, compute proxies for the crowdsourced beauty dimension (the one for which we have collected the most votes), and evaluate those proxies with 30 participants in London and 54 in Boston. These participants have not only rated our recommendations but have also carefully motivated their choices, providing insights for future work.
Conference Paper
As the Web of Data is growing at an ever increasing speed, the lack of reliable query solutions for live public data becomes apparent. SPARQL implementations have matured and deliver impressive performance for public SPARQL endpoints, but poor availability—especially under high loads—prevents their use in real-world applications. We propose to tackle this availability problem with basic Linked Data Fragments, a concept and related techniques to publish and consume queryable data by moving intelligence from the server to the client. This paper formalizes the concept, introduces a client-side query processing algorithm using a dynamic iterator pipeline, and verifies its availability under load. The results indicate that, at the cost of lower performance, query techniques with basic Linked Data Fragments lead to high availability, thereby allowing for reliable applications on top of public, queryable Linked Data.
Conference Paper
For publishers of Linked Open Data, providing queryable access to their dataset is costly. Those that offer a public SPARQL endpoint often have to sacrifice high availability; others merely provide non-queryable means of access such as data dumps. We have developed a client-side query execution approach for which servers only need to provide a lightweight triple-pattern-based interface, enabling queryable access at low cost. This paper describes the implementation of a client that can evaluate SPARQL queries over such triple pattern fragments of a Linked Data dataset. Graph patterns of SPARQL queries can be solved efficiently by using metadata in server responses. The demonstration consists of a SPARQL client for triple pattern fragments that can run as a standalone application, browser application, or library.
Conference Paper
This paper studies the problem of computing optimal journeys in dynamic public transit networks. We introduce a novel algorithmic framework, called Connection Scan Algorithm (CSA), to compute journeys. It organizes data as a single array of connections, which it scans once per query. Despite its simplicity, our algorithm is very versatile. We use it to solve earliest arrival and multi-criteria profile queries. Moreover, we extend it to handle the minimum expected arrival time (MEAT) problem, which incorporates stochastic delays on the vehicles and asks for a set of (alternative) journeys that in its entirety minimizes the user’s expected arrival time at the destination. Our experiments on the dense metropolitan network of London show that CSA computes MEAT queries, our most complex scenario, in 272 ms on average.
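The core idea described here, a single array of connections scanned once per query, can be sketched for the earliest arrival case. This is a simplified illustration and not the authors' implementation: it assumes the connections are already sorted by departure time, and it ignores minimum transfer times, footpaths, and the profile and MEAT extensions. The `Connection` type, the function name, and the sample timetable are hypothetical.

```python
import math
from typing import Dict, List, NamedTuple, Optional


class Connection(NamedTuple):
    """One hop between two stops; times in minutes since midnight (hypothetical)."""
    departure_stop: str
    arrival_stop: str
    departure_time: int
    arrival_time: int


def earliest_arrival(
    connections: List[Connection], source: str, target: str, departure_time: int
) -> Optional[int]:
    """Scan connections (sorted by departure time) once; return arrival time or None."""
    earliest: Dict[str, float] = {source: departure_time}
    for c in connections:
        # Later connections cannot improve on an arrival that already precedes them.
        if c.departure_time >= earliest.get(target, math.inf):
            break
        # A connection is usable if we can be at its departure stop in time.
        reachable = earliest.get(c.departure_stop, math.inf) <= c.departure_time
        if reachable and c.arrival_time < earliest.get(c.arrival_stop, math.inf):
            earliest[c.arrival_stop] = c.arrival_time
    result = earliest.get(target)
    return None if result is None else int(result)


# Sample timetable, sorted by departure time (illustrative data).
TIMETABLE = [
    Connection("A", "B", 480, 490),
    Connection("A", "C", 485, 520),
    Connection("B", "C", 495, 510),
    Connection("C", "D", 515, 530),
]

arrival_at_d = earliest_arrival(TIMETABLE, "A", "D", 475)
```

In the sample timetable, the transfer at B reaches C at 510 instead of 520, which is what makes the 515 departure towards D catchable. The single pass works because a connection can only be reached through connections that depart before it, and those have already been scanned.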
We study the problem of computing all Pareto-optimal journeys in a dynamic public transit network for two criteria: arrival time and number of transfers. Existing algorithms consider this as a graph problem, and solve it using variants of Dijkstra's algorithm. Unfortunately, this leads to either high query times or suboptimal solutions. We take a different approach. We introduce RAPTOR, our novel round-based public transit router. Unlike previous algorithms, it is not Dijkstra-based, looks at each route (such as a bus line) in the network at most once per round, and can be made even faster with simple pruning rules and parallelization using multiple cores. Because it does not rely on preprocessing, RAPTOR works in fully dynamic scenarios. Moreover, it can be easily extended to handle flexible departure times or arbitrary additional criteria, such as fare zones. When run on London's complex public transportation network, RAPTOR computes all Pareto-optimal journeys between two random locations an order of magnitude faster than previous approaches, which easily enables interactive applications.
D. Delling, T. Pajor, and R. F. F. Werneck. Round-based public transit routing. In Proceedings of the 14th Meeting on Algorithm Engineering and Experiments (ALENEX'12), 2012.
D. Quercia, R. Schifanella, and L. M. Aiello. The shortest path to happiness: Recommending beautiful, quiet, and happy routes in the city. In Proceedings of the 25th ACM conference on Hypertext and social media, pages 116-125. ACM, 2014.