Christian S. Jensen

Aalborg University, Ålborg, North Denmark, Denmark

Publications (493) · 111.36 Total Impact

  • Dario Fe · Morten Greve-Pedersen · Christian Sig Jensen ·
    ABSTRACT: In the joint project “FORAGESELECT”, we aim to implement Genome Wide Selection (GWS) in breeding of perennial ryegrass (Lolium perenne L.), in order to increase genetic response in important agronomic traits such as yield, seed production, stress tolerance and disease resistance, while decreasing greenhouse gas emissions and nitrogen loss. GWS model building includes 1) development of a robust quantitative genotyping method for an outcrossing species, 2) tailoring of multi-locational, multi-annual phenotype data, 3) association analysis and development of prediction models. As part of (2), the aim of this study was to estimate the genetic and environmental variance in the training set composed of F2 families selected from a ten-year breeding period. Variance components were estimated on 1193 of those families, sown in 2001, 2003 and 2005 in five locations around Europe. Families were tested together with commercial varieties used as checks. The first analyses focused on yield (green and dry matter) and the data were analyzed using a mixed model including a fixed effect of experiment, an effect of check variety, and random effects within experiments to recover interblock information, effects of pedigree (parents), a repeated effect of the same family, and residual error. Results showed the presence of significant genetic variance among the random factors, indicating the existence of considerable variance in the commercial population. This will provide good opportunities for future improvement programs based on GWS. Future work will focus on developing association models based on tailored phenotype data and genotype-by-sequencing-derived allele frequencies.
    International Plant and Animal Genome Conference XXI 2013; 11/2015
  • Jian Dai · Bin Yang · Chenjuan Guo · Christian S. Jensen ·
    ABSTRACT: Using the growing volumes of vehicle trajectory data, it becomes increasingly possible to capture time-varying and uncertain travel costs in a road network, including travel time and fuel consumption. The current paradigm represents a road network as a graph, assigns weights to the graph's edges by fragmenting trajectories into small pieces that fit the underlying edges, and then applies a routing algorithm to the resulting graph. We propose a new paradigm that targets more accurate and more efficient estimation of the costs of paths by associating weights with sub-paths in the road network. The paper provides a solution to a foundational problem in this paradigm, namely that of computing the time-varying cost distribution of a path. The solution consists of several steps. We first learn a set of random variables that capture the joint distributions of sub-paths that are covered by sufficient trajectories. Then, given a departure time and a path, we select an optimal subset of learned random variables such that the random variables' corresponding paths together cover the path. This enables accurate joint distribution estimation of the path, and by transferring the joint distribution into a marginal distribution, the travel cost distribution of the path is obtained. The use of multiple learned random variables contends with data sparseness, the use of multi-dimensional histograms enables compact representation of arbitrary joint distributions that fully capture the travel cost dependencies among the edges in paths. Empirical studies with substantial trajectory data from two different cities offer insight into the design properties of the proposed solution and suggest that the solution is effective in real-world settings.
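The path-cost idea above can be illustrated with a toy sketch: per-edge travel-time histograms combined by convolution to obtain a path's cost distribution. Note this sketch assumes independence between edge costs, which is exactly the simplification the paper avoids by learning joint distributions of sub-paths; all names and the histogram format (cost value mapped to probability) are hypothetical.

```python
from collections import defaultdict

def convolve_histograms(h1, h2):
    """Combine two cost histograms (value -> probability), assuming the
    two costs are independent. The paper's approach instead captures
    dependencies via joint distributions over sub-paths."""
    out = defaultdict(float)
    for t1, p1 in h1.items():
        for t2, p2 in h2.items():
            out[t1 + t2] += p1 * p2
    return dict(out)

def path_cost_distribution(edge_histograms):
    """Estimate a path's travel-cost distribution from per-edge histograms."""
    dist = {0: 1.0}  # zero cost with certainty before traversing any edge
    for h in edge_histograms:
        dist = convolve_histograms(dist, h)
    return dist

# Two edges, each taking 10 or 20 time units with the given probabilities.
edges = [{10: 0.7, 20: 0.3}, {10: 0.5, 20: 0.5}]
print(path_cost_distribution(edges))  # {20: 0.35, 30: 0.5, 40: 0.15}
```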
  • Christian S. Jensen · Christopher Jermaine · Xiaofang Zhou ·
    ABSTRACT: The papers in this special section were presented at the 29th International Conference on Data Engineering, held in Brisbane, QLD, Australia, on April 8-11, 2013.
    IEEE Transactions on Knowledge and Data Engineering 07/2015; 27(7):1739-1740. DOI:10.1109/TKDE.2015.2419315 · 2.07 Impact Factor
  • Chenjuan Guo · Bin Yang · Ove Andersen · Christian S. Jensen · Kristian Torp ·
    ABSTRACT: Eco-routing is a simple yet effective approach to substantially reducing the environmental impact, e.g., fuel consumption and greenhouse gas (GHG) emissions, of vehicular transportation. Eco-routing relies on the ability to reliably quantify the environmental impact of vehicles as they travel in a spatial network. The procedure of quantifying such vehicular impact for road segments of a spatial network is called eco-weight assignment. EcoMark 2.0 proposes a general framework for eco-weight assignment to enable eco-routing. It studies the abilities of six instantaneous and five aggregated models to estimate vehicular environmental impact. In doing so, it utilizes travel information derived from GPS trajectories (i.e., velocities and accelerations) and actual fuel consumption data obtained from vehicles. The framework covers analyses of actual fuel consumption, impact model calibration, and experiments for assessing the utility of the impact models in assigning eco-weights. The application of EcoMark 2.0 indicates that the instantaneous model EMIT and the aggregated model SIDRA-Running are suitable for assigning eco-weights under varying circumstances. In contrast, other instantaneous models should not be used for assigning eco-weights, and other aggregated models can be used for assigning eco-weights under certain circumstances.
    GeoInformatica 07/2015; 19(3). DOI:10.1007/s10707-014-0221-7 · 0.75 Impact Factor
  • Xin Cao · Gao Cong · Tao Guo · Christian S. Jensen · Beng Chin Ooi ·
    ABSTRACT: With the proliferation of geo-positioning and geo-tagging techniques, spatio-textual objects that possess both a geographical location and a textual description are gaining in prevalence, and spatial keyword queries that exploit both location and textual description are gaining in prominence. However, the queries studied so far generally focus on finding individual objects that each satisfy a query rather than finding groups of objects where the objects in a group together satisfy a query. We define the problem of retrieving a group of spatio-textual objects such that the group's keywords cover the query's keywords and such that the objects are nearest to the query location and have the smallest inter-object distances. Specifically, we study three instantiations of this problem, all of which are NP-hard. We devise exact solutions as well as approximate solutions with provable approximation bounds to the problems. In addition, we solve the problem of retrieving top-k groups for the three instantiations, and study a weighted version of the problem that incorporates object weights. We present empirical studies that offer insight into the efficiency of the solutions, as well as the accuracy of the approximate solutions.
    ACM Transactions on Database Systems 06/2015; 40(2). DOI:10.1145/2772600 · 0.68 Impact Factor
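The group retrieval problem above can be sketched with a simple greedy keyword cover: repeatedly pick the object covering the most uncovered query keywords, breaking ties by distance to the query location. This is only an illustration of the problem shape, not the paper's exact or approximation-bounded algorithms (which also minimize inter-object distances); the `(location, keyword_set)` object format and all names are assumptions.

```python
import math

def greedy_keyword_cover(query_loc, query_keywords, objects):
    """Greedy sketch: grow a group of objects whose keywords together
    cover the query keywords, preferring high keyword gain, then
    proximity to the query location. `objects` is a list of
    (location, keyword_set) pairs -- a hypothetical format."""
    uncovered = set(query_keywords)
    group = []
    while uncovered:
        best = None  # (sort_key, object)
        for loc, kws in objects:
            gain = len(uncovered & kws)
            if gain == 0:
                continue
            key = (-gain, math.dist(query_loc, loc))
            if best is None or key < best[0]:
                best = (key, (loc, kws))
        if best is None:
            return None  # the query keywords cannot be covered
        loc, kws = best[1]
        group.append((loc, kws))
        uncovered -= kws
    return group
```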
  • Dingming Wu · Byron Choi · Jianliang Xu · Christian S. Jensen ·
    ABSTRACT: A moving top-k spatial keyword (MkSK) query, which takes into account a continuously moving query location, enables a mobile client to be continuously aware of the top-k spatial web objects that best match a query with respect to location and text relevance. The increasing mobile use of the web and the proliferation of geo-positioning render it of interest to consider a scenario where spatial keyword search is outsourced to a separate service provider capable of handling the voluminous spatial web objects available from various sources. A key challenge is that the service provider may return inaccurate or incorrect query results (intentionally or not), e.g., due to cost considerations or attacks by hackers. Therefore, it is attractive to be able to authenticate the query results at the client side. Existing authentication techniques are either inefficient or inapplicable for the kind of query we consider. We propose new authentication data structures, the MIR-tree and MIR*-tree, that enable the authentication of MkSK queries at low computation and communication costs. We design a verification object for authenticating MkSK queries, and we provide algorithms for constructing verification objects and using these for verifying query results. A thorough experimental study on real data shows that the proposed techniques are capable of outperforming two baseline algorithms by orders of magnitude.
    IEEE Transactions on Knowledge and Data Engineering 04/2015; 27(4):922-935. DOI:10.1109/TKDE.2014.2350252 · 2.07 Impact Factor
  • Bin Yang · Chenjuan Guo · Yu Ma · Christian S. Jensen ·
    ABSTRACT: A driver’s choice of a route to a destination may depend on the route’s length and travel time, but a multitude of other, possibly hard-to-formalize aspects, may also factor into the driver’s decision. There is evidence that a driver’s choice of route is context dependent, e.g., varies across time, and that route choice also varies from driver to driver. In contrast, conventional routing services support little in the way of context dependence, and they deliver the same routes to all drivers. We study how to identify context-aware driving preferences for individual drivers from historical trajectories, and thus how to provide foundations for personalized navigation, but also professional driver education and traffic planning. We provide techniques that are able to capture time-dependent and uncertain properties of dynamic travel costs, such as travel time and fuel consumption, from trajectories, and we provide techniques capable of capturing the driving behaviors of different drivers in terms of multiple dynamic travel costs. Further, we propose techniques that are able to identify a driver’s contexts and then to identify driving preferences for each context using historical trajectories from the driver. Empirical studies with a large trajectory data set offer insight into the design properties of the proposed techniques and suggest that they are effective.
    The VLDB Journal 02/2015; 24(2). DOI:10.1007/s00778-015-0378-1 · 1.57 Impact Factor
  • Source
    Francesco Lettich · Salvatore Orlando · Claudio Silvestri · Christian S. Jensen ·
    ABSTRACT: The ability to process significant amounts of continuously updated spatial data in a timely manner is mandatory for an increasing number of applications. Parallelism enables such applications to face this data-intensive challenge and allows the devised systems to feature low latency and high scalability. In this paper we focus on a specific data-intensive problem, concerning the repeated processing of huge amounts of range queries over massive sets of moving objects, where the spatial extents of queries and objects are continuously modified over time. To tackle this problem and significantly accelerate query processing, we devise a hybrid CPU/GPU pipeline that compresses data output and saves query processing work. The devised system relies on an ad-hoc spatial index leading to a problem decomposition that results in a set of independent data-parallel tasks. The index is based on a point-region quadtree space decomposition and makes it possible to effectively tackle a broad range of spatial object distributions, even very skewed ones. Also, to deal with the architectural peculiarities and limitations of GPUs, we adopt non-trivial GPU data structures that avoid the need for locked memory accesses and favour coalesced memory accesses, thus enhancing the overall memory throughput. To the best of our knowledge, this is the first work that exploits GPUs to efficiently solve repeated range queries over massive sets of continuously moving objects characterized by highly skewed spatial distributions. In comparison with state-of-the-art CPU-based implementations, our method achieves significant speedups on the order of 14x-20x, depending on the dataset, even when considering very cheap GPUs.
  • Source
    Darius Šidlauskas · Simonas Šaltenis · Christian S. Jensen ·
    ABSTRACT: The efficient processing of workloads that interleave moving-object updates and queries is challenging. In addition to the conflicting needs for update-efficient versus query-efficient data structures, the increasing parallel capabilities of multi-core processors yield challenges. To prevent concurrency anomalies and to ensure correct system behavior, conflicting update and query operations must be serialized. In this setting, it is a key concern to avoid that operations are blocked, which leaves processing cores idle. To enable efficient processing, we first examine concurrency degrees from traditional transaction processing in the context of our target domain and propose new semantics that enable a high degree of parallelism and ensure up-to-date query results. We define the new semantics for range and k-nearest neighbor queries. Then, we present a main-memory indexing technique called parallel grid that implements the proposed semantics as well as two other variants supporting different semantics. This enables us to quantify the effects that different degrees of consistency have on performance. We also present an alternative time-partitioning approach. Empirical studies with the above and three existing proposals conducted on modern processors show that our proposals scale near-linearly with the number of hardware threads and thus are able to benefit from increasing on-chip parallelism.
    The VLDB Journal 10/2014; 23(5):817-841. DOI:10.1007/s00778-014-0353-2 · 1.57 Impact Factor
  • Darius Šidlauskas · Christian S. Jensen ·

  • Source
    Quan Z. Sheng · Jing He · Guoren Wang · Christian S. Jensen ·

    World Wide Web 07/2014; 17(4). DOI:10.1007/s11280-014-0280-6 · 1.47 Impact Factor
  • Qiang Qu · Siyuan Liu · Bin Yang · Christian S. Jensen ·
    ABSTRACT: Increasing volumes of geo-referenced data are becoming available. This data includes so-called points of interest that describe businesses, tourist attractions, etc. by means of a geo-location and properties such as a textual description or ratings. We propose and study the efficient implementation of a new kind of query on points of interest that takes into account both the locations and properties of the points of interest. The query takes a result cardinality, a spatial range, and property-related preferences as parameters, and it returns a compact set of points of interest with the given cardinality and in the given range that satisfies the preferences. Specifically, the points of interest in the result set cover so-called allying preferences and are located far from points of interest that possess so-called alienating preferences. A unified result rating function integrates the two kinds of preferences with spatial distance to achieve this functionality. We provide efficient exact algorithms for this kind of query. To enable queries on large datasets, we also provide an approximate algorithm that utilizes a nearest-neighbor property to achieve scalable performance. We develop and apply lower and upper bounds that enable search-space pruning and thus improve performance. Finally, we provide a generalization of the above query and also extend the algorithms to support the generalization. We report on an experimental evaluation of the proposed algorithms using real point of interest data from Google Places for Business that offers insight into the performance of the proposed solutions.
  • Yu Ma · Bin Yang · Christian S. Jensen ·

  • Source
    Shuo Shang · Ruogu Ding · Kai Zheng · Christian S. Jensen · Panos Kalnis · Xiaofang Zhou ·
    ABSTRACT: With the increasing availability of moving-object tracking data, trajectory search and matching is increasingly important. We propose and investigate a novel problem called personalized trajectory matching (PTM). In contrast to conventional trajectory similarity search by spatial distance only, PTM takes into account the significance of each sample point in a query trajectory. A PTM query takes a trajectory with user-specified weights for each sample point in the trajectory as its argument. It returns the trajectory in an argument data set with the highest similarity to the query trajectory. We believe that this type of query may bring significant benefits to users in many popular applications such as route planning, carpooling, friend recommendation, traffic analysis, urban computing, and location-based services in general. PTM query processing faces two challenges: how to prune the search space during the query processing and how to schedule multiple so-called expansion centers effectively. To address these challenges, a novel two-phase search algorithm is proposed that carefully selects a set of expansion centers from the query trajectory and exploits upper and lower bounds to prune the search space in the spatial and temporal domains. An efficiency study reveals that the algorithm explores the minimum search space in both domains. Second, a heuristic search strategy based on priority ranking is developed to schedule the multiple expansion centers, which can further prune the search space and enhance the query efficiency. The performance of the PTM query is studied in extensive experiments based on real and synthetic trajectory data sets.
    The VLDB Journal 06/2014; 23(3). DOI:10.1007/s00778-013-0331-0 · 1.57 Impact Factor
  • Xin Cao · Gao Cong · Christian S. Jensen · Man Lung Yiu ·
    ABSTRACT: We consider an application scenario where points of interest (PoIs) each have a web presence and where a web user wants to identify a region that contains PoIs relevant to a set of keywords, e.g., in preparation for deciding where to go to conveniently explore the PoIs. Motivated by this, we propose the length-constrained maximum-sum region (LCMSR) query that returns a spatial-network region that is located within a general region of interest, that does not exceed a given size constraint, and that best matches the query keywords. Such a query maximizes the total weight of the PoIs in it w.r.t. the query keywords. We show that it is NP-hard to answer this query. We develop an approximation algorithm with a (5 + ε) approximation ratio utilizing a technique that scales node weights into integers. We also propose a more efficient heuristic algorithm and a greedy algorithm. Empirical studies on real data offer detailed insight into the accuracy of the proposed algorithms and show that the proposed algorithms are capable of computing results efficiently and effectively.
    Proceedings of the VLDB Endowment 05/2014; 7(9):733-744. DOI:10.14778/2732939.2732946
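The greedy end of the LCMSR algorithm spectrum can be sketched as budget-constrained region growth: starting from a seed node, absorb the adjacent node with the best weight-to-length ratio until the length budget is exhausted. This only illustrates the problem shape; the paper's (5 + ε) approximation algorithm is more involved (it scales node weights into integers), and all identifiers and the adjacency-dict graph format here are hypothetical.

```python
def greedy_region(graph, weights, lengths, seed, budget):
    """Grow a connected region from `seed`, greedily adding the neighbor
    with the highest node-weight to edge-length ratio while the total
    edge length stays within `budget`. `graph` maps node -> neighbor
    list; `lengths` maps (u, v) edge tuples to edge lengths."""
    region = {seed}
    used = 0.0
    while True:
        candidates = []
        for u in region:
            for v in graph.get(u, []):
                if v not in region and used + lengths[(u, v)] <= budget:
                    candidates.append((weights[v] / lengths[(u, v)], v, (u, v)))
        if not candidates:
            return region  # budget exhausted or no expandable neighbor
        _, v, edge = max(candidates)
        region.add(v)
        used += lengths[edge]
```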
  • Anders Skovsgaard · Darius Sidlauskas · Christian S. Jensen ·
    ABSTRACT: With the rapidly increasing deployment of Internet-connected, location-aware mobile devices, very large and increasing amounts of geo-tagged and timestamped user-generated content, such as microblog posts, are being generated. We present indexing, update, and query processing techniques that are capable of providing the top-k terms seen in posts in a user-specified spatio-temporal range. The techniques enable interactive response times in the millisecond range in a realistic setting where the arrival rate of posts exceeds today's average tweet arrival rate by a factor of 4-10. The techniques adaptively maintain the most frequent items at various spatial and temporal granularities. They extend existing frequent item counting techniques to maintain exact counts rather than approximations. An extensive empirical study with a large collection of geo-tagged tweets shows that the proposed techniques enable online aggregation and query processing at scale in realistic settings.
    2014 IEEE 30th International Conference on Data Engineering (ICDE); 03/2014
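The frequent-item counting that the work above builds on can be sketched with the classic Space-Saving algorithm, which keeps a bounded number of counters and evicts the minimum counter when a new item arrives. This is the standard approximate variant only; the paper extends such techniques to maintain exact counts per spatio-temporal granularity, which is not shown here.

```python
def space_saving(stream, k):
    """Classic Space-Saving sketch: track at most k item counters.
    When a new item arrives and all counters are taken, evict the item
    with the minimum count and let the newcomer inherit count + 1,
    which overestimates but never underestimates true frequencies."""
    counters = {}
    for item in stream:
        if item in counters:
            counters[item] += 1
        elif len(counters) < k:
            counters[item] = 1
        else:
            victim = min(counters, key=counters.get)
            counters[item] = counters.pop(victim) + 1
    return counters

# Heavy hitters survive even with only 3 counters for 7 distinct items.
counts = space_saving(["a"] * 50 + ["b"] * 30 + list("cdefg"), k=3)
print(counts)  # {'a': 50, 'b': 30, 'g': 5}
```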
  • Bin Yang · Chenjuan Guo · Christian S. Jensen · Manohar Kaul · Shuo Shang ·
    ABSTRACT: Different uses of a road network call for the consideration of different travel costs: in route planning, travel time and distance are typically considered, and greenhouse gas (GHG) emissions are increasingly being considered. Further, travel costs such as travel time and GHG emissions are time-dependent and uncertain. To support such uses, we propose techniques that enable the construction of a multi-cost, time-dependent, uncertain graph (MTUG) model of a road network based on GPS data from vehicles that traversed the road network. Based on the MTUG, we define stochastic skyline routes that consider multiple costs and time-dependent uncertainty, and we propose efficient algorithms to retrieve stochastic skyline routes for a given source-destination pair and a start time. Empirical studies with three road networks in Denmark and a substantial GPS data set offer insight into the design properties of the MTUG and the efficiency of the stochastic skyline routing algorithms.
    2014 IEEE 30th International Conference on Data Engineering (ICDE); 03/2014
  • Claudio Silvestri · Francesco Lettich · Salvatore Orlando · Christian S. Jensen ·
    ABSTRACT: In this paper we investigate the use of GPUs to solve a data-intensive problem involving huge numbers of moving objects. The scenario we focus on concerns objects that continuously move in a 2D space, where a large percentage of them also issue range queries. Processing these queries entails returning the large numbers of objects that fall within the query ranges. To solve this problem while maintaining a suitable throughput, we partition time into ticks and defer the parallel processing of all object events (location updates and range queries) occurring in a given tick to the next tick, thus slightly delaying the overall computation. We process all the events of each tick in parallel by adopting a hybrid approach, based on the combined use of the CPU and GPU, and show the suitability of the method by discussing performance results. The exploitation of a GPU allows us to achieve a speedup of more than 20× on several datasets with respect to the best sequential algorithm solving the same problem. More importantly, we show that the adoption of the new bitmap-based intermediate data structure we propose to avoid memory access contention yields a 10× speedup with respect to naive GPU-based solutions.
    Proceedings of the 2014 22nd Euromicro International Conference on Parallel, Distributed, and Network-Based Processing; 02/2014
  • Darius Šidlauskas · Christian S. Jensen ·
    ABSTRACT: A recent PVLDB paper reports on experimental analyses of ten spatial join techniques in main memory. We build on this comprehensive study to raise awareness of the fact that empirical running time performance findings in main-memory settings are results of not only the algorithms and data structures employed, but also their implementation, which complicates the interpretation of the results. In particular, we re-implement the worst performing technique without changing the underlying high-level algorithm, and we then offer evidence that the resulting re-implementation is capable of outperforming all the other techniques. This study demonstrates that in main memory, where no time-consuming I/O can mask variations in implementation, implementation details are very important; and it offers a concrete illustration of how difficult it is to draw conclusions about data structures and algorithms from empirical running time performance findings in main-memory settings.
  • Shuo Shang · Kai Zheng · Christian S. Jensen · Bin Yang · Panos Kalnis · Guohe Li · Ji-Rong Wen ·
    ABSTRACT: The discovery of regions of interest in large cities is an important challenge. We propose and investigate a novel query called the path nearby cluster (PNC) query that finds regions of potential interest (e.g., sightseeing places and commercial districts) with respect to a user-specified travel route. Given a set of spatial objects (e.g., POIs, geo-tagged photos, or geo-tagged tweets) and a query route, if a cluster has high spatial-object density and is spatially close to the route, it is returned by the query (a cluster is a circular region defined by a center and a radius). This query aims to bring important benefits to users in popular applications such as trip planning and location recommendation. Efficient computation of the PNC query faces two challenges: how to prune the search space during query processing, and how to identify clusters with high density effectively. To address these challenges, a novel collective search algorithm is developed. Conceptually, the search process is conducted in the spatial and density domains concurrently. In the spatial domain, network expansion is adopted, and a set of vertices are selected from the query route as expansion centers. In the density domain, clusters are sorted according to their density distributions and are scanned from the maximum to the minimum. A pair of upper and lower bounds are defined to prune the search space in the two domains globally. The performance of the PNC query is studied in extensive experiments based on real and synthetic spatial data.
    IEEE Transactions on Knowledge and Data Engineering 01/2014; 27(6):1-1. DOI:10.1109/TKDE.2014.2382583 · 2.07 Impact Factor

Publication Stats

11k Citations
111.36 Total Impact Points


  • 1970-2015
    • Aalborg University
      • • Department of Computer Science
      • • Department of Mathematical Sciences
      Ålborg, North Denmark, Denmark
  • 2010-2014
    • Aarhus University
      • Department of Computer Science
      Aarhus, Central Jutland, Denmark
  • 1992-2004
    • University of Maryland, College Park
      • Department of Computer Science
      Maryland, United States
  • 2000
    • The University of Arizona
      • Department of Computer Science
      Tucson, AZ, United States
  • 1999
    • National Technical University of Athens
      • School of Electrical and Computer Engineering
      Athens, Attiki, Greece