Christian S. Jensen

Aarhus University, Aarhus, Central Jutland, Denmark

Publications (436) · 63.98 Total impact

  • Xiaohui Li, Vaida Ceikute, Christian S. Jensen, Kian-Lee Tan
    ABSTRACT: Finding a location for a new facility such that the facility attracts the maximal number of customers is a challenging problem. Existing studies either model customers as static sites and thus do not consider customer movement, or they focus on theoretical aspects and do not provide solutions that are shown empirically to be scalable. Given a road network, a set of existing facilities, and a collection of customer route traversals, an optimal segment query returns the optimal road network segment(s) for a new facility. We propose a practical framework for computing this query, where each route traversal is assigned a score that is distributed among the road segments covered by the route according to a score distribution model. The query returns the road segment(s) with the highest score. To achieve low latency, it is essential to prune the very large search space. We propose two algorithms that adopt different approaches to computing the query. Algorithm AUG uses graph augmentation, and ITE uses iterative road-network partitioning. Empirical studies with real data sets demonstrate that the algorithms are capable of offering high performance in realistic settings.
    03/2013;
  • ABSTRACT: GPS-enabled devices are pervasive nowadays. Finding movement patterns in trajectory data streams is gaining in importance. We propose a group discovery framework that aims to efficiently support the online discovery of moving objects that travel together. The framework adopts a sampling-independent approach that makes no assumptions about when positions are sampled, gives no special importance to sampling points, and naturally supports the use of approximate trajectories. The framework's algorithms exploit state-of-the-art, density-based clustering (DBSCAN) to identify groups. The groups are scored based on their cardinality and duration, and the top-k groups are returned. To avoid returning similar subgroups in a result, notions of domination and similarity are introduced that enable the pruning of low-interest groups. Empirical studies on real and synthetic data sets offer insight into the effectiveness and efficiency of the proposed framework.
    IEEE Transactions on Knowledge and Data Engineering 01/2013; 25(12):2752-2766. · 1.89 Impact Factor
  • ABSTRACT: This paper considers a cloud computing setting in which similarity querying of metric data is outsourced to a service provider. The data is to be revealed only to trusted users, not to the service provider or anyone else. Users query the server for the most similar data objects to a query example. Outsourcing offers the data owner scalability and a low initial investment. The need for privacy may be due to the data being sensitive (e.g., in medicine), valuable (e.g., in astronomy), or otherwise confidential. Given this setting, the paper presents techniques that transform the data prior to supplying it to the service provider for similarity queries on the transformed data. Our techniques provide interesting trade-offs between query cost and accuracy. They are then further extended to offer an intuitive privacy guarantee. Empirical studies with real data demonstrate that the techniques are capable of offering privacy while enabling efficient and accurate processing of similarity queries.
    IEEE Transactions on Knowledge and Data Engineering 03/2012; · 1.89 Impact Factor
  • Darius Šidlauskas, Simonas Šaltenis, Christian S. Jensen
    ABSTRACT: We are witnessing a proliferation of Internet-worked, geo-positioned mobile devices such as smartphones and personal navigation devices. Likewise, location-related services that target the users of such devices are proliferating. Consequently, server-side infrastructures are needed that are capable of supporting the location-related query and update workloads generated by very large populations of such moving objects. This paper presents a main-memory indexing technique that aims to support such workloads. The technique, called PGrid, uses a grid structure that is capable of exploiting the parallelism offered by modern processors. Unlike earlier proposals that maintain separate structures for updates and queries, PGrid allows both long-running queries and rapid updates to operate on a single data structure and thus offers up-to-date query results. Because PGrid does not rely on creating snapshots, it avoids the stop-the-world problem that occurs when workload processing is interrupted to perform such snapshotting. Its concurrency control mechanism relies instead on hardware-assisted atomic updates as well as object-level copying, and it treats updates as non-divisible operations rather than as combinations of deletions and insertions; thus, the query semantics guarantee that no objects are missed in query results. Empirical studies demonstrate that PGrid scales near-linearly with the number of hardware threads on four modern multi-core processors. Since both updates and queries are processed on the same current data-store state, PGrid outperforms snapshot-based techniques in terms of both query freshness and CPU cycle-wise efficiency.
    01/2012;
  • Xin Cao, Gao Cong, Bin Cui, Christian S. Jensen, Quan Yuan
    ABSTRACT: Community Question Answering (CQA) is a popular type of service where users ask questions and where answers are obtained from other users or from historical question-answer pairs. CQA archives contain large volumes of questions organized into a hierarchy of categories. As an essential function of CQA services, question retrieval in a CQA archive aims to retrieve historical question-answer pairs that are relevant to a query question. This article presents several new approaches to exploiting the category information of questions for improving the performance of question retrieval, and it applies these approaches to existing question retrieval models, including a state-of-the-art question retrieval model. Experiments conducted on real CQA data demonstrate that the proposed techniques are effective and efficient and are capable of outperforming a variety of baseline methods significantly.
    ACM Transactions on Information Systems - TOIS. 01/2012;
  • Jeppe Rishede Thomsen, Man Lung Yiu, Christian S. Jensen
    ABSTRACT: Web search is ubiquitous in our daily lives. Caching has been extensively used to reduce the computation time of the search engine and reduce the network traffic beyond a proxy server. Another form of web search, known as online shortest path search, is popular due to advances in geo-positioning. However, existing caching techniques are ineffective for shortest path queries. This is due to several crucial differences between web search results and shortest path results, in relation to query matching, cache item overlapping, and query cost variation. Motivated by this, we identify several properties that are essential to the success of effective caching for shortest path search. Our cache exploits the optimal subpath property, which allows a cached shortest path to answer any query with source and target nodes on the path. We utilize statistics from query logs to estimate the benefit of caching a specific shortest path, and we employ a greedy algorithm for placing beneficial paths in the cache. Also, we design a compact cache structure that supports efficient query matching at runtime. Empirical results on real datasets confirm the effectiveness of our proposed techniques.
    01/2012;
  • Hua Lu, Xin Cao, Christian S. Jensen
    ABSTRACT: Indoor spaces accommodate large numbers of spatial objects, e.g., points of interest (POIs), and moving populations. A variety of services, e.g., location-based services and security control, are relevant to indoor spaces. Such services can be improved substantially if they are capable of utilizing indoor distances. However, existing indoor space models do not account well for indoor distances. To address this shortcoming, we propose a data management infrastructure that captures indoor distance and facilitates distance-aware query processing. In particular, we propose a distance-aware indoor space model that integrates indoor distance seamlessly. To enable the use of the model as a foundation for query processing, we develop accompanying, efficient algorithms that compute indoor distances for different indoor entities like doors as well as locations. We also propose an indexing framework that accommodates indoor distances that are pre-computed using the proposed algorithms. On top of this foundation, we develop efficient algorithms for typical indoor, distance-aware queries. The results of an extensive experimental evaluation demonstrate the efficacy of the proposals.
    01/2012;
  • Hua Lu, Christian S. Jensen
    ABSTRACT: The skyline of a multidimensional point set consists of the points that are not dominated by other points. In a scenario where product features are represented by multidimensional points, the skyline points may be viewed as representing competitive products. A product provider may wish to upgrade uncompetitive products to become competitive, but wants to take into account the upgrading cost. We study the top-k product upgrading problem. Given a set P of competitor products, a set T of products that are candidates for upgrade, and an upgrading cost function f that applies to T, the problem is to return the k products in T that can be upgraded to not be dominated by any products in P at the lowest cost. This problem is non-trivial due not only to the large data set sizes, but also to the many possibilities for upgrading a product. We identify and provide solutions for the different options for upgrading an uncompetitive product, and combine the solutions into a single solution. We also propose a spatial join-based solution that assumes P and T are indexed by an R-tree. Given a set of products in the same R-tree node, we derive three lower bounds on their upgrading costs. These bounds are employed by the join approach to prune upgrade candidates with uncompetitive upgrade costs. Empirical studies with synthetic and real data show that the join approach is efficient and scalable.
    01/2012;
  • Darius Šidlauskas, Christian S. Jensen, Simonas Šaltenis
    ABSTRACT: Deployments of networked sensors fuel online applications that feed on real-time sensor data. This scenario calls for techniques that support the management of workloads that contain queries as well as very frequent updates. This paper compares two well-chosen approaches to exploiting the parallelism offered by modern processors for supporting such workloads. A general approach to avoiding contention among parallel hardware threads and thus exploiting the parallelism available in processors is to maintain two copies, or snapshots, of the data: one for the relatively long-duration queries and one for the frequent and very localized updates. The snapshot that receives the updates is frequently made available to queries, so that queries see up-to-date data. The snapshots may be physical or virtual. Physical snapshots are created using the C library memcpy function. Virtual snapshots are created by the fork system function that creates a new process that initially has the same data snapshot as the process it was forked from. When the new process carries out updates, this triggers the actual memory copying in a copy-on-write manner at memory page granularity. This paper characterizes the circumstances under which each technique is preferable. The use of physical snapshots is surprisingly efficient.
    01/2012;
  • ABSTRACT: A range of applications call for a mobile client to continuously monitor others in close proximity. Past research on such problems has covered two extremes: It has offered totally centralized solutions, where a server takes care of all queries, and totally distributed solutions, in which there is no central authority at all. Unfortunately, neither of these two solutions scales to intensive moving object tracking applications, where each client poses a query. In this paper, we formulate the moving continuous query (MCQ) problem and propose a balanced model where servers cooperatively take care of the global view and handle the majority of the workload. Meanwhile, moving clients, having basic memory and computation resources, handle small portions of the workload. This model is further enhanced by dynamic region allocation and grid size adjustment mechanisms that reduce the communication and computation cost for both servers and clients. An experimental study demonstrates that our approaches offer better scalability than competitors.
    Mobile Data Management (MDM), 2012 IEEE 13th International Conference on; 01/2012
  • Dan Lin, Christian S. Jensen, Rui Zhang, Lu Xiao, Jiaheng Lu
    ABSTRACT: With the growing use of location-based services, location privacy attracts increasing attention from users, industry, and the research community. While considerable effort has been devoted to inventing techniques that prevent service providers from knowing a user's exact location, relatively little attention has been paid to enabling so-called peer-wise privacy--the protection of a user's location from unauthorized peer users. This paper identifies an important efficiency problem in existing peer-privacy approaches that simply apply a filtering step to identify users that are located in a query range, but that do not want to disclose their location to the querying peer. To solve this problem, we propose a novel, privacy-policy enabled index called the PEB-tree that seamlessly integrates location proximity and policy compatibility. We propose efficient algorithms that use the PEB-tree for processing privacy-aware range and kNN queries. Extensive experiments suggest that the PEB-tree enables efficient query processing.
    09/2011;
  • ABSTRACT: Modern processors consist of multiple cores that each support parallel processing by multiple physical threads, and they offer ample main-memory storage. This paper studies the use of such processors for the processing of update-intensive moving-object workloads that contain very frequent updates as well as queries. The non-trivial challenge addressed is that of avoiding contention between long-running queries and frequent updates. Specifically, the paper proposes a grid-based indexing technique. A static grid indexes a near up-to-date snapshot of the data to support queries, while a live grid supports updates. An efficient cloning technique that exploits the memcpy library function is used to maintain the static grid. An empirical study conducted with three modern processors finds that very frequent cloning, on the order of tens of milliseconds, is feasible, that the proposal scales linearly with the number of hardware threads, and that it significantly outperforms the previous state-of-the-art approach in terms of update throughput and query freshness.
    08/2011: pages 186-204;
  • C.R. Vicente, I. Assent, C.S. Jensen
    ABSTRACT: An online Route Planning Service (RPS) computes a route from one location to another. Current RPSs such as Google Maps require the use of precise locations. However, some users may not want to disclose their source and destination locations due to privacy concerns. An approach that supplies fake locations to an existing service incurs a substantial loss of quality of service, and the service may well return a result that is not helpful to the user. We propose a solution that is able to return accurate route planning results when source and destination regions are used in order to achieve privacy. The solution re-uses a standard online RPS rather than replicate this functionality, and it needs no trusted third party. The solution is able to compute the exact results without leaking the exact locations to the RPS or untrusted parties. In addition, we provide heuristics that reduce the number of times that the RPS needs to be queried, and we also describe how the accuracy and privacy requirements can be relaxed to achieve better performance. An empirical study offers insight into key properties of the approach.
    Mobile Data Management (MDM), 2011 12th IEEE International Conference on; 07/2011
  • ABSTRACT: Location-Based Services (LBSs) constitute one of the most popular classes of mobile services. However, while current LBSs typically target outdoor settings, we lead large parts of our lives indoors. The availability of easy-to-use and low-cost indoor positioning services is essential in also enabling indoor LBSs. Existing indoor positioning services typically use a single technology such as Wi-Fi, RFID or Bluetooth. Wi-Fi based indoor positioning is relatively easy to deploy, but often does not offer good positioning accuracy. In contrast, the use of RFID or Bluetooth for positioning requires considerable investments in equipment in order to ensure good positioning accuracy. Motivated by these observations, we propose a hybrid approach to indoor positioning. In particular, we introduce Bluetooth hotspots into an indoor space with an existing Wi-Fi infrastructure such that better positioning is achieved than what can be achieved by each technology in isolation. We design a flexible and extensible system architecture with an effective online position estimation algorithm for the hybrid system. The system is evaluated empirically in the building of our department. The results show that the hybrid approach improves positioning accuracy markedly.
    Mobile Data Management (MDM), 2011 12th IEEE International Conference on; 07/2011
  • ABSTRACT: Geo-social networks (GeoSNs) provide context-aware services that help associate location with users and content. The proliferation of GeoSNs indicates that they're rapidly attracting users. GeoSNs currently offer different types of services, including photo sharing, friend tracking, and "check-ins." However, this ability to reveal users' locations causes new privacy threats, which in turn call for new privacy-protection methods. The authors study four privacy aspects central to these social networks - location, absence, co-location, and identity privacy - and describe possible means of protecting privacy in these circumstances.
    IEEE Internet Computing 07/2011; · 2.04 Impact Factor
  • Hua Lu, Bin Yang, C.S. Jensen
    ABSTRACT: To facilitate a variety of applications, positioning systems are deployed in indoor settings. For example, Bluetooth and RFID positioning are deployed in airports to support real-time monitoring of delays as well as off-line flow and space usage analyses. Such deployments generate large collections of tracking data. As in other data management applications, joins are indispensable in this setting. However, joins on indoor tracking data call for novel techniques that take into account the limited capabilities of the positioning systems as well as the specifics of indoor spaces. This paper proposes and studies probabilistic, spatio-temporal joins on historical indoor tracking data. Two meaningful types of join are defined. They return object pairs that satisfy spatial join predicates either at a time point or during a time interval. The predicates considered include “same X,” where X is a semantic region such as a room or hallway. Based on an analysis of the uncertainty inherent to indoor tracking data, effective join probabilities are formalized and evaluated for object pairs. Efficient two-phase hash-based algorithms are proposed for the point and interval joins. In a filter-and-refine framework, an R-tree variant is proposed that facilitates the retrieval of join candidates, and pruning rules are supplied that eliminate candidate pairs that do not qualify. An empirical study on both synthetic and real data shows that the proposed techniques are efficient and scalable.
    Data Engineering (ICDE), 2011 IEEE 27th International Conference on; 05/2011
  • Man Lung Yiu, Christian S. Jensen, Jesper Møller, Hua Lu
    ABSTRACT: Users of mobile services wish to retrieve nearby points of interest without disclosing their locations to the services. This article addresses the challenge of optimizing the query performance while satisfying given location privacy and query accuracy requirements. The article's proposal, SpaceTwist, aims to offer location privacy for k nearest neighbor (kNN) queries at low communication cost without requiring a trusted anonymizer. The solution can be used with a conventional DBMS as well as with a server optimized for location-based services. In particular, we believe that this is the first solution that expresses the server-side functionality in a single SQL statement. In its basic form, SpaceTwist utilizes well-known incremental NN query processing on the server. When augmented with a server-side granular search technique, SpaceTwist is capable of exploiting relaxed query accuracy guarantees for obtaining better performance. We extend SpaceTwist with so-called ring ranking, which improves the communication cost, delayed termination, which improves the privacy afforded the user, and the ability to function in spatial networks in addition to Euclidean space. We report on analytical and empirical studies that offer insight into the properties of SpaceTwist and suggest that our proposal is indeed capable of offering privacy with very good performance in realistic settings.
    ACM Trans. Database Syst. 01/2011; 36:10.
  • Hua Lu, Christian S. Jensen, Zhenjie Zhang
    ABSTRACT: Given a set of multidimensional points, a skyline query returns the interesting points that are not dominated by other points. It has been observed that the actual cardinality (s) of a skyline query result may differ substantially from the desired result cardinality (k), which has prompted studies on how to reduce s for the case where k < s. Based on these observations, the paper proposes a new approach, called skyline ordering, that forms a skyline-based partitioning of a given data set such that an order exists among the partitions. Then, set-wide maximization techniques may be applied within each partition. Efficient algorithms are developed for skyline ordering and for resolving size constraints using the skyline order. The results of extensive experiments show that skyline ordering yields a flexible framework for the efficient and scalable resolution of arbitrary size constraints on skyline queries. Index Terms: Skyline queries, query processing, database management.
    IEEE Transactions on Knowledge and Data Engineering 01/2011; 23:991-1005. · 1.89 Impact Factor
  • PVLDB. 01/2011; 4:290-301.
  • ABSTRACT: Equal access to public information and services for all is an essential part of the United Nations (UN) Declaration of Human Rights. Today, the Web plays an important role in providing information and services to citizens. Unfortunately, many government Web sites are poorly designed and have accessibility barriers that prevent people with disabilities from using them. This article combines current Web accessibility benchmarking methodologies with a sound strategy for comparing Web accessibility among countries and continents. Furthermore, the article presents the first global analysis of the Web accessibility of 192 United Nations Member States made publicly available. The article also identifies common properties of Member States that have accessible and inaccessible Web sites and shows that implementing antidisability discrimination laws is highly beneficial for the accessibility of Web sites, while signing the UN Rights and Dignity of Persons with Disabilities has had no such effect yet. The article demonstrates that, despite the commonly held assumption to the contrary, mature, high-quality Web sites are more accessible than lower quality ones. Moreover, Web accessibility conformance claims by Web site owners are generally exaggerated.
    Journal of Information Technology & Politics 01/2011; 8(1):41-67.
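
Two of the abstracts above rely on the notion of a skyline: the points of a multidimensional set that are not dominated by any other point. A minimal sketch of that definition follows. It is not taken from any of the listed papers; it assumes that smaller values are better in every dimension, and the names `dominates` and `skyline` are hypothetical. The quadratic scan is for illustration only and is not one of the index-based algorithms the abstracts describe.

```python
from typing import List, Sequence


def dominates(p: Sequence[float], q: Sequence[float]) -> bool:
    """Return True if p dominates q (assumption: smaller is better).

    p dominates q when p is no worse than q in every dimension
    and strictly better in at least one dimension.
    """
    no_worse = all(a <= b for a, b in zip(p, q))
    strictly_better = any(a < b for a, b in zip(p, q))
    return no_worse and strictly_better


def skyline(points: List[Sequence[float]]) -> List[Sequence[float]]:
    """Return the points that are not dominated by any other point.

    Naive O(n^2) comparison of every pair; input order is preserved.
    """
    return [p for p in points if not any(dominates(q, p) for q in points)]


if __name__ == "__main__":
    # Hypothetical products described by (price, delivery_days).
    products = [(1, 5), (2, 2), (5, 1), (3, 3), (4, 4)]
    print(skyline(products))
```

In the hypothetical product data, (3, 3) and (4, 4) are both dominated by (2, 2), so they would be the candidates for an upgrade in the sense of the top-k product upgrading abstract.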

Publication Stats

7k Citations
63.98 Total Impact Points

Institutions

  • 2010–2011
    • Aarhus University
      • Department of Computer Science
      Aarhus, Central Jutland, Denmark
  • 1970–2011
    • Aalborg University
      • Department of Computer Science
      • Department of Mathematical Sciences
      Aalborg, Region North Jutland, Denmark
  • 2006
    • National University of Singapore
      • Department of Computer Science
      Singapore, Singapore
    • Libera Università di Bozen-Bolzano
      • Faculty of Computer Science
      Bozen, Trentino-Alto Adige, Italy
  • 2005
    • Microsoft
      Washington, West Virginia, United States
  • 1992–2004
    • University of Maryland, College Park
      • Department of Computer Science
      Maryland, United States
    • The University of Arizona
      • Department of Computer Science
      Tucson, AZ, United States
  • 1999
    • National Technical University of Athens
      • School of Electrical and Computer Engineering
      Athens, Attiki, Greece