Christian S. Jensen

Aalborg University, Ålborg, North Denmark, Denmark


Publications (467) · 77.05 Total impact

  • Quan Z. Sheng, Jing He, Guoren Wang, Christian S. Jensen
    World Wide Web 07/2014; 17(4). · 1.20 Impact Factor
  • Qiang Qu, Siyuan Liu, Bin Yang, Christian S. Jensen
    ABSTRACT: Increasing volumes of geo-referenced data are becoming available. This data includes so-called points of interest that describe businesses, tourist attractions, etc. by means of a geo-location and properties such as a textual description or ratings. We propose and study the efficient implementation of a new kind of query on points of interest that takes into account both the locations and properties of the points of interest. The query takes a result cardinality, a spatial range, and property-related preferences as parameters, and it returns a compact set of points of interest with the given cardinality and in the given range that satisfies the preferences. Specifically, the points of interest in the result set cover so-called allying preferences and are located far from points of interest that possess so-called alienating preferences. A unified result rating function integrates the two kinds of preferences with spatial distance to achieve this functionality. We provide efficient exact algorithms for this kind of query. To enable queries on large datasets, we also provide an approximate algorithm that utilizes a nearest-neighbor property to achieve scalable performance. We develop and apply lower and upper bounds that enable search-space pruning and thus improve performance. Finally, we provide a generalization of the above query and also extend the algorithms to support the generalization. We report on an experimental evaluation of the proposed algorithms using real point of interest data from Google Places for Business that offers insight into the performance of the proposed solutions.
  • Yu Ma, Bin Yang, Christian S. Jensen
    ABSTRACT: With the increasing availability of moving-object tracking data, trajectory search and matching is increasingly important. We propose and investigate a novel problem called personalized trajectory matching (PTM). In contrast to conventional trajectory similarity search by spatial distance only, PTM takes into account the significance of each sample point in a query trajectory. A PTM query takes a trajectory with user-specified weights for each sample point in the trajectory as its argument. It returns the trajectory in an argument data set with the highest similarity to the query trajectory. We believe that this type of query may bring significant benefits to users in many popular applications such as route planning, carpooling, friend recommendation, traffic analysis, urban computing, and location-based services in general. PTM query processing faces two challenges: how to prune the search space during the query processing and how to schedule multiple so-called expansion centers effectively. To address these challenges, a novel two-phase search algorithm is proposed that carefully selects a set of expansion centers from the query trajectory and exploits upper and lower bounds to prune the search space in the spatial and temporal domains. An efficiency study reveals that the algorithm explores the minimum search space in both domains. Second, a heuristic search strategy based on priority ranking is developed to schedule the multiple expansion centers, which can further prune the search space and enhance the query efficiency. The performance of the PTM query is studied in extensive experiments based on real and synthetic trajectory data sets.
    The VLDB Journal 06/2014 · 1.40 Impact Factor
  • Bin Yang, Chenjuan Guo, Christian S. Jensen, Manohar Kaul, Shuo Shang
    ABSTRACT: Different uses of a road network call for the consideration of different travel costs: in route planning, travel time and distance are typically considered, and greenhouse gas (GHG) emissions are increasingly being considered. Further, travel costs such as travel time and GHG emissions are time-dependent and uncertain. To support such uses, we propose techniques that enable the construction of a multi-cost, time-dependent, uncertain graph (MTUG) model of a road network based on GPS data from vehicles that traversed the road network. Based on the MTUG, we define stochastic skyline routes that consider multiple costs and time-dependent uncertainty, and we propose efficient algorithms to retrieve stochastic skyline routes for a given source-destination pair and a start time. Empirical studies with three road networks in Denmark and a substantial GPS data set offer insight into the design properties of the MTUG and the efficiency of the stochastic skyline routing algorithms.
    2014 IEEE 30th International Conference on Data Engineering (ICDE); 03/2014
  • Anders Skovsgaard, Darius Sidlauskas, Christian S. Jensen
    ABSTRACT: With the rapidly increasing deployment of Internet-connected, location-aware mobile devices, very large and increasing amounts of geo-tagged and timestamped user-generated content, such as microblog posts, are being generated. We present indexing, update, and query processing techniques that are capable of providing the top-k terms seen in posts in a user-specified spatio-temporal range. The techniques enable interactive response times in the millisecond range in a realistic setting where the arrival rate of posts exceeds today's average tweet arrival rate by a factor of 4-10. The techniques adaptively maintain the most frequent items at various spatial and temporal granularities. They extend existing frequent item counting techniques to maintain exact counts rather than approximations. An extensive empirical study with a large collection of geo-tagged tweets shows that the proposed techniques enable online aggregation and query processing at scale in realistic settings.
    2014 IEEE 30th International Conference on Data Engineering (ICDE); 03/2014
  • Laura Radaelli, Christian S. Jensen
    ABSTRACT: Indoor positioning systems based on fingerprinting techniques generally require costly initialization and maintenance by trained surveyors. Organic positioning systems aim to eliminate these deficiencies by managing their own accuracy and obtaining input from users and other sources. Such systems introduce new challenges, e.g., detection and filtering of erroneous user input, estimation of the positioning accuracy, and means of obtaining user input when necessary. We envision a fully organic indoor positioning system, where all available sources of information are exploited in order to provide room-level accuracy with no active intervention of users. For example, such systems can exploit pre-installed cameras to associate a user's location with a Wi-Fi fingerprint from the user's phone; and it can use a calendar to determine whether a user is in the room reported by the positioning system. Numerous possibilities for integration exist that may provide better indoor positioning.
    Proceedings of the Fifth ACM SIGSPATIAL International Workshop on Indoor Spatial Awareness; 11/2013
  • ABSTRACT: This paper introduces a new class of temporal expression – named temporal expressions – and methods for recognising and interpreting its members. The commonest temporal expressions typically contain date and time words, like April or hours. Research into recognising and interpreting these typical expressions is mature in many languages. However, there is a class of expressions that are less typical, very varied, and difficult to automatically interpret. These indicate dates and times, but are harder to detect because they often do not contain time words and are not used frequently enough to appear in conventional temporally-annotated corpora – for example Michaelmas or Vasant Panchami. Using Wikipedia and linked data, we automatically construct a resource of English named temporal expressions, and use it to extract training examples from a large corpus. These examples are then used to train and evaluate a named temporal expression recogniser. We also introduce and evaluate rules for automatically interpreting these expressions, and we observe that use of the rules improves temporal annotation performance over existing corpora.
    Recent Advances in Natural Language Processing, RANLP 2013, Hissar, Bulgaria; 09/2013
  • Christian S. Jensen
    ABSTRACT: The web is increasingly being used by mobile users, and it is increasingly possible to accurately geo-position mobile users. In addition, increasing volumes of geo-tagged web content are becoming available. Further, indications are that a substantial fraction of web keyword queries target local content. When combined, these observations suggest that spatial keyword querying is important and indeed gaining in importance. A prototypical spatial keyword query takes a user location and user-supplied keywords as parameters and returns web content that is spatially and textually relevant to these parameters. The paper reviews key concepts related to spatial keyword querying and reviews recent proposals by the author and his colleagues for spatial keyword querying functionality that is easy to use, relevant to users, and can be supported efficiently.
    Proceedings of the 7th International Workshop on Ranking in Databases; 08/2013
  • Kenneth S. Bøgh, Anders Skovsgaard, Christian S. Jensen
    ABSTRACT: The notion of point-of-interest (PoI) has existed since paper road maps began to include markings of useful places such as gas stations, hotels, and tourist attractions. With the introduction of geopositioned mobile devices such as smartphones and mapping services such as Google Maps, the retrieval of PoIs relevant to a user's intent has become a problem of automated spatio-textual information retrieval. Over the last several years, substantial research has gone into the invention of functionality and efficient implementations for retrieving nearby PoIs. However, with a couple of exceptions, existing proposals retrieve results at single-PoI granularity. We assume that a mobile device user issues queries consisting of keywords and an automatically supplied geo-position, and we target the common case where the user wishes to find nearby groups of PoIs that are relevant to the keywords. Such groups are relevant to users who wish to conveniently explore several options before making a decision such as to purchase a specific product. Specifically, we demonstrate a practical proposal for finding top-k PoI groups in response to a query. We show how problem parameter settings can be mapped to options that are meaningful to users. Further, although this kind of functionality is prone to combinatorial explosion, we will demonstrate that the functionality can be supported efficiently in practical settings.
    Proceedings of the VLDB Endowment. 08/2013; 6(12):1226-1229.
  • Bin Yang, Chenjuan Guo, Christian S. Jensen
    ABSTRACT: The monitoring of a system can yield a set of measurements that can be modeled as a collection of time series. These time series are often sparse, due to missing measurements, and spatiotemporally correlated, meaning that spatially close time series exhibit temporal correlation. The analysis of such time series offers insight into the underlying system and enables prediction of system behavior. While the techniques presented in the paper apply more generally, we consider the case of transportation systems and aim to predict travel cost from GPS tracking data from probe vehicles. Specifically, each road segment has an associated travel-cost time series, which is derived from GPS data. We use spatio-temporal hidden Markov models (STHMM) to model correlations among different traffic time series. We provide algorithms that are able to learn the parameters of an STHMM while contending with the sparsity, spatio-temporal correlation, and heterogeneity of the time series. Using the resulting STHMM, near future travel costs in the transportation network, e.g., travel time or greenhouse gas emissions, can be inferred, enabling a variety of routing services, e.g., eco-routing. Empirical studies with a substantial GPS data set offer insight into the design properties of the proposed framework and algorithms, demonstrating the effectiveness and efficiency of travel cost inferencing.
    Proceedings of the VLDB Endowment. 07/2013; 6(9):769-780.
  • Vaida Ceikute, Christian S. Jensen
    ABSTRACT: Mobile location-based services is a very successful class of services that are being used frequently by users with GPS-enabled mobile devices such as smartphones. This paper presents a study of how to exploit GPS trajectory data, which is available in increasing volumes, for the assessment of the quality of one kind of location-based service, namely routing services. Specifically, the paper presents a framework that enables the comparison of the routes provided by routing services with the actual driving behaviors of local drivers. Comparisons include route length, travel time, and also route popularity, which are enabled by common driving behaviors found in available trajectory data. The ability to evaluate the quality of routing services enables service providers to improve the quality of their services and enables users to identify the services that best serve their needs. The paper covers experiments with real vehicle trajectory data and an existing online navigation service. It is found that the availability of information about previous trips enables better prediction of route travel time and makes it possible to provide the users with more popular routes than does a conventional navigation service.
    Proceedings of the 2013 IEEE 14th International Conference on Mobile Data Management - Volume 01; 06/2013
  • Artur Baniukevic, Christian S. Jensen, Hua Lu
    ABSTRACT: Reliable indoor positioning is an important foundation for emerging indoor location based services. Most existing indoor positioning proposals rely on a single wireless technology, e.g., Wi-Fi, Bluetooth, or RFID. A hybrid positioning system combines such technologies and achieves better positioning accuracy by exploiting the different capabilities of the different technologies. In a hybrid system based on Wi-Fi and Bluetooth, the former works as the main infrastructure to enable fingerprint based positioning, while the latter (via hotspot devices) partitions the indoor space as well as a large Wi-Fi radio map. As a result, the Wi-Fi based online position estimation is improved in a divide-and-conquer manner. We study three aspects of such a hybrid indoor positioning system. First, to avoid large positioning errors caused by similar reference positions that are hard to distinguish, we design a deployment algorithm that identifies and separates such positions into different smaller radio maps by deploying Bluetooth hotspots at particular positions. Second, we design methods that improve the partition switching that occurs when a user leaves the detection range of a Bluetooth hotspot. Third, we propose three architectural options for placement of the computation workload. We evaluate all proposals using both simulation and walkthrough experiments in two indoor environments of different sizes. The results show that our proposals are effective and efficient in achieving very good indoor positioning performance.
    Proceedings of the 2013 IEEE 14th International Conference on Mobile Data Management - Volume 01; 06/2013
  • Laura Radaelli, Dovydas Sabonis, Hua Lu, Christian S. Jensen
    ABSTRACT: With the proliferation of mobile computing, positioning systems are becoming available that enable indoor location-based services. As a result, indoor tracking data is also becoming available. This paper puts focus on one use of such data, namely the identification of typical movement patterns among indoor moving objects. Specifically, the paper presents a method for the identification of movement patterns. Leveraging concepts from sequential pattern mining, the method takes into account the specifics of spatial movement and, in particular, the specifics of tracking data that captures indoor movement. For example, the paper's proposal supports spatial aggregation and utilizes the topology of indoor spaces to achieve better performance. The paper reports on empirical studies with real and synthetic data that offer insights into the functional and computational aspects of its proposal.
    Proceedings of the 2013 IEEE 14th International Conference on Mobile Data Management - Volume 01; 06/2013
  • Ove Andersen, Christian S. Jensen, Kristian Torp, Bin Yang
    ABSTRACT: Reduction in greenhouse gas emissions from transportation is essential in combating global warming and climate change. Eco-routing enables drivers to use the most eco-friendly routes and is effective in reducing vehicle emissions. The EcoTour system assigns eco-weights to a road network based on GPS and fuel consumption data collected from vehicles to enable eco-routing. Given an arbitrary source-destination pair in Denmark, EcoTour returns the shortest route, the fastest route, and the eco-route, along with statistics for the three routes. EcoTour also serves as a testbed for exploring advanced solutions to a range of challenges related to eco-routing.
    2013 14th IEEE International Conference on Mobile Data Management (MDM); 06/2013
  • Dingming Wu, Man Lung Yiu, Christian S. Jensen
    ABSTRACT: Web users and content are increasingly being geo-positioned. This development gives prominence to spatial keyword queries, which involve both the locations and textual descriptions of content. We study the efficient processing of continuously moving top-k spatial keyword (MkSK) queries over spatial text data. State-of-the-art solutions for moving queries employ safe zones that guarantee the validity of reported results as long as the user remains within the safe zone associated with a result. However, existing safe-zone methods focus solely on spatial locations and ignore text relevancy. We propose two algorithms for computing safe zones that guarantee correct results at any time and that aim to optimize the server-side computation as well as the communication between the server and the client. We exploit tight and conservative approximations of safe zones and aggressive computational space pruning. We present techniques that aim to compute the next safe zone efficiently, and we present two types of conservative safe zones that aim to reduce the communication cost. Empirical studies with real data suggest that the proposals are efficient. To understand the effectiveness of the proposed safe zones, we study analytically the expected area of a safe zone, which indicates on average for how long a safe zone remains valid, and we study the expected number of influence objects needed to define a safe zone, which gives an estimate of the average communication cost. The analytical modeling is validated through empirical studies.
    ACM Transactions on Database Systems (TODS). 04/2013; 38(1).
  • Bin Yang, Nicolas Fantini, Christian S. Jensen
    ABSTRACT: A wide variety of desktop and mobile Web applications involve geo-tagged content, e.g., photos and (micro-) blog postings. Such content, often called User Generated Geo-Content (UGGC), plays an increasingly important role in many applications. However, a great demand also exists for "core" UGGC where the geo-spatial aspect is not just a tag on other content, but is the primary content, e.g., a city street map with up-to-date road construction data. Along these lines, the iPark system aims to turn volumes of GPS data obtained from vehicles into information about the locations of parking spaces, thus enabling effective parking search applications. In particular, we demonstrate how iPark helps ordinary users annotate an existing digital map with two types of parking, on-street parking and parking zones, based on vehicular tracking data.
    Proceedings of the 16th International Conference on Extending Database Technology; 03/2013
  • Xiaohui Li, Vaida Ceikute, Christian S. Jensen, Kian-Lee Tan
    ABSTRACT: Finding a location for a new facility such that the facility attracts the maximal number of customers is a challenging problem. Existing studies either model customers as static sites and thus do not consider customer movement, or they focus on theoretical aspects and do not provide solutions that are shown empirically to be scalable. Given a road network, a set of existing facilities, and a collection of customer route traversals, an optimal segment query returns the optimal road network segment(s) for a new facility. We propose a practical framework for computing this query, where each route traversal is assigned a score that is distributed among the road segments covered by the route according to a score distribution model. The query returns the road segment(s) with the highest score. To achieve low latency, it is essential to prune the very large search space. We propose two algorithms that adopt different approaches to computing the query. Algorithm AUG uses graph augmentation, and ITE uses iterative road-network partitioning. Empirical studies with real data sets demonstrate that the algorithms are capable of offering high performance in realistic settings.
  • ABSTRACT: With the increasing availability of terrain data, e.g., from aerial laser scans, the management of such data is attracting increasing attention in both industry and academia. In particular, spatial queries, e.g., k-nearest neighbor and reverse nearest neighbor queries, in Euclidean and spatial network spaces are being extended to terrains. Such queries all rely on an important operation, that of finding shortest surface distances. However, shortest surface distance computation is very time consuming. We propose techniques that enable efficient computation of lower and upper bounds of the shortest surface distance, which enable faster query processing by eliminating expensive distance computations. Empirical studies show that our bounds are much tighter than the best-known bounds in many cases and that they enable speedups of up to 43 times for some well-known spatial queries.
    PVLDB. 03/2013
  • Kostas Tzoumas, Amol Deshpande, Christian S. Jensen
    ABSTRACT: Query optimizers rely on statistical models that succinctly describe the underlying data. Models are used to derive cardinality estimates for intermediate relations, which in turn guide the optimizer to choose the best query execution plan. The quality of the resulting plan is highly dependent on the accuracy of the statistical model that represents the data. It is well known that small errors in the model estimates propagate exponentially through joins, and may result in the choice of a highly sub-optimal query execution plan. Most commercial query optimizers make the attribute value independence assumption: all attributes are assumed to be statistically independent. This reduces the statistical model of the data to a collection of one-dimensional synopses (typically in the form of histograms), and it permits the optimizer to estimate the selectivity of a predicate conjunction as the product of the selectivities of the constituent predicates. However, this independence assumption is more often than not wrong, and is considered to be the most common cause of sub-optimal query execution plans chosen by modern query optimizers. We take a step towards a principled and practical approach to performing cardinality estimation without making the independence assumption. By carefully using concepts from the field of graphical models, we are able to factor the joint probability distribution over all the attributes in the database into small, usually two-dimensional distributions, without a significant loss in estimation accuracy. We show how to efficiently construct such a graphical model from the database using only two-way join queries, and we show how to perform selectivity estimation in a highly efficient manner. We integrate our algorithms into the PostgreSQL DBMS. Experimental results indicate that estimation errors can be greatly reduced, leading to orders of magnitude more efficient query execution plans in many cases. 
    Optimization time is kept in the range of tens of milliseconds, making this a practical approach for industrial-strength query optimizers.
    The VLDB Journal 02/2013; 22(1). · 1.40 Impact Factor

Publication Stats

9k Citations
77.05 Total Impact Points


  • 1970–2014
    • Aalborg University
      • Department of Computer Science
      • Department of Mathematical Sciences
      Ålborg, North Denmark, Denmark
  • 2010–2013
    • Aarhus University
      • Department of Computer Science
      Aarhus, Central Jutland, Denmark
  • 2012
    • Nanyang Technological University
      • School of Computer Engineering
      Tumasik, Singapore
  • 2006
    • National University of Singapore
      • Department of Computer Science
      Singapore, Singapore
    • Libera Università di Bozen-Bolzano
      • Faculty of Computer Science
      Bozen, Trentino-Alto Adige, Italy
  • 2005
    • Microsoft
      Washington, United States
  • 1992–2004
    • University of Maryland, College Park
      • Department of Computer Science
      Maryland, United States
    • The University of Arizona
      • Department of Computer Science
      Tucson, AZ, United States
  • 1999
    • National Technical University of Athens
      • School of Electrical and Computer Engineering
      Athens, Attiki, Greece