Qing Li

Lands Department of The Government of the Hong Kong Special Administrative Region, Hong Kong, Hong Kong

Publications (258) · 99.46 Total impact

  • ABSTRACT: In this paper, for the first time, we identify and solve the problem of efficient reverse k-skyband (RkSB) query processing. Given a set P of multi-dimensional points and a query point q, an RkSB query returns all the points in P whose dynamic k-skyband contains q. We formalize RkSB retrieval, and then propose five algorithms for computing the RkSB of an arbitrary query point efficiently. Our methods utilize a conventional data-partitioning index (e.g., R-tree) on the dataset, and employ pre-computation, reuse and pruning techniques to boost the query efficiency. In addition, we extend our solutions to tackle an interesting variant of reverse skyline queries, namely, ranked reverse skyline (RRS) query where, given a data set P, a parameter K, and a preference function f, the goal is to find the K reverse skyline points that have the minimal score according to the user-specified function f. Extensive experiments using both real and synthetic data sets demonstrate the effectiveness of our proposed pruning heuristics and the performance of our proposed algorithms under a variety of experimental settings.
    Information Sciences 02/2015; 293:11–34. · 3.89 Impact Factor
  • ABSTRACT: The skyline operator has been extensively explored in the literature, and most of the existing approaches assume that all dimensions are available for all data items. However, many practical applications, such as sensor networks, decision making, and location-based services, may involve incomplete data items, i.e., some dimensional values are missing due to device failure or privacy preservation. To our knowledge, this paper is the first study of k-skyband (kSB) query processing on incomplete data, where multi-dimensional data items are missing some values of their dimensions. We formalize the problem, and then present two efficient algorithms for processing it. Our methods introduce some novel concepts, including expired skyline, shadow skyline, and thickness warehouse, in order to boost the search performance. As a second step, we extend our techniques to tackle constrained skyline (CS) and group-by skyline (GBS) queries over incomplete data. Extensive experiments with both real and synthetic data sets demonstrate the effectiveness and efficiency of our proposed algorithms under various experimental settings.
    Expert Systems with Applications 08/2014; 41(10):4959–4974. · 1.97 Impact Factor
  • ABSTRACT: Rare category discovery aims at identifying unlabeled data examples of rare categories in a given data set. The existing approaches to rare category discovery often need a certain number of labeled data examples as the training set, which are usually difficult and expensive to acquire in practice. However, if these methods use only a small training set to save cost, their accuracy may not be satisfactory for real applications. In this paper, for the first time, we propose the concept of rare category exploration, aiming to discover all data examples of a rare category from a seed (which is a labeled data example of this rare category) instead of from a training set. To this end, we present an approach known as the FRANK algorithm, which transforms rare category exploration into local community detection from a seed in a kNN (k-nearest neighbors) graph with an automatically selected k value. Extensive experimental results on real data sets verify the effectiveness and efficiency of our FRANK algorithm.
    Expert Systems with Applications 07/2014; 41(9):4197–4210. · 1.97 Impact Factor
  • ABSTRACT: With the increase in resource-sharing websites such as YouTube and Flickr, many shared resources have arisen on the Web. Personalized searches have become more important and challenging since users demand higher retrieval quality. To achieve this goal, personalized searches need to take users' personalized profiles and information needs into consideration. Collaborative tagging (also known as folksonomy) systems allow users to annotate resources with their own tags, which provides a simple but powerful way for organizing, retrieving and sharing different types of social resources. In this article, we examine the limitations of previous tag-based personalized searches. To handle these limitations, we propose a new method to model user profiles and resource profiles in collaborative tagging systems. We use a normalized term frequency to indicate the preference degree of a user on a tag. A novel search method using such profiles of users and resources is proposed to facilitate the desired personalization in resource searches. In our framework, instead of the keyword matching or similarity measurement used in previous works, the relevance measurement between a resource and a user query (termed the query relevance) is treated as a fuzzy satisfaction problem of a user's query requirements. We implement a prototype system called the Folksonomy-based Multimedia Retrieval System (FMRS). Experiments using the FMRS data set and the MovieLens data set show that our proposed method outperforms baseline methods.
    Neural networks: the official journal of the International Neural Network Society 06/2014; · 1.88 Impact Factor
  • ABSTRACT: One of the most challenging problems in aspect-based opinion mining is aspect extraction, which aims to identify expressions that describe aspects of products (called aspect expressions) and categorize domain-specific synonymous expressions. Although a number of methods of aspect extraction have been proposed before, very few of them are designed to improve the interpretability of generated aspects. Existing methods either generate multiple fine-grained aspects without proper categorization or categorize semantically unrelated product aspects (e.g., by unsupervised topic modeling). In this paper, we first examine previous studies on product aspect extraction. To overcome the limitations of existing methods, two novel semi-supervised models for product aspect extraction are then proposed. More specifically, the proposed methodology first extracts seeding aspects and related terms from detailed product descriptions readily available on E-commerce websites. Next, product reviews are regrouped according to these seeding aspects so that more effective textual contexts for topic modeling are built. Finally, two novel semi-supervised topic models are developed to extract human-comprehensible product aspects. For the first proposed topic model, the Fine-grained Labeled LDA (FL-LDA), seeding aspects are applied to guide the model to discover words that are related to these seeding aspects. For the second model, the Unified Fine-grained Labeled LDA (UFL-LDA), we incorporate unlabeled documents to extend the FL-LDA model so that words related to the seeding aspects or other high-frequency words in customer reviews are extracted. Our experimental results demonstrate that the proposed methods outperform state-of-the-art methods.
    Knowledge-Based Systems 06/2014; · 3.06 Impact Factor
  • ABSTRACT: This paper presents an efficient and robust content-based retrieval method for large medical images in a mobile cloud computing environment, called Mirc. The whole query process of Mirc is composed of three steps. First, when a clinical user submits a query image Iq, a parallel image set reduction process is conducted at a master node. Then the candidate images are transferred to the slave nodes for a refinement process to obtain the answer set. The answer set is finally transferred to the query node. The proposed method, which includes a priority-based robust image block transmission scheme, is specifically designed to address the instability and heterogeneity of the mobile cloud environment, and an index-support image set reduction algorithm is introduced to reduce the data transfer cost involved. We also propose a content-aware and bandwidth-conscious multi-resolution-based image data replica selection method and a correlated data caching algorithm to further improve the query performance. The experimental results show that our approach is both efficient and effective, minimizing the response time by decreasing the network transfer cost while increasing the parallelism of I/O and CPU.
    Information Sciences 04/2014; 263:60–86. · 3.89 Impact Factor
  • ABSTRACT: In this paper, we study a new skyline operator, namely, mutual skyline query (MSQ), which retrieves all the data objects that are contained in both the dynamic skyline and the reverse skyline of a specified query object q. MSQ has many applications such as marketing analysis, task allocation, and personalized matching. Motivated by this, we first formalize MSQ in both monochromatic and bichromatic cases, and then propose several algorithms for processing MSQ. Our methods utilize a conventional data-partitioning index on the dataset, take advantage of a reuse technique, and exploit effective pruning heuristics to improve the query processing. Extensive experiments using both real and synthetic datasets demonstrate the effectiveness and efficiency of our proposed algorithms under various experimental settings.
    Expert Systems with Applications 03/2014; 41(4):1885-1900. · 1.97 Impact Factor
  • ICWL Workshops; 01/2014
  • ABSTRACT: A location-aware news feed system enables mobile users to share geo-tagged user-generated messages, e.g., a user can receive nearby messages that are the most relevant to her. In this paper, we present MobiFeed, a framework designed for scheduling news feeds for mobile users. MobiFeed consists of three key functions: location prediction, relevance measure, and news feed scheduler. The location prediction function is designed to predict a mobile user's locations based on an existing path prediction algorithm. The relevance measure function is implemented by combining the vector space model with non-spatial and spatial factors to determine the relevance of a message to a user. The news feed scheduler works with the other two functions to generate news feeds for a mobile user at her current and predicted locations with the best overall quality. To ensure that MobiFeed can scale up to a larger number of messages, we design a heuristic news feed scheduler.
    Proceedings of the 20th International Conference on Advances in Geographic Information Systems; 11/2012
  • ABSTRACT: The k-nearest-neighbor (k-NN) query is one of the most popular spatial query types for location-based services (LBS). In this paper, we focus on k-NN queries in time-dependent road networks, where the travel time between two locations may vary significantly at different times of the day. In practice, it is costly for an LBS provider to collect real-time traffic data from vehicles or roadside sensors to compute the best route from a user to a spatial object of interest in terms of the travel time. Thus, we design SMashQ, a server-side spatial mashup framework that enables a database server to efficiently evaluate k-NN queries using the route information and travel time accessed from an external Web mapping service, e.g., Microsoft Bing Maps. Due to the expensive cost and limitations of retrieving such external information, we propose three shared execution optimizations for SMashQ, namely, object grouping, direction sharing, and user grouping, to reduce the number of external Web mapping requests and provide highly accurate query answers. We evaluate SMashQ using Microsoft Bing Maps, a real road network, real data sets, and a synthetic data set. Experimental results show that SMashQ is efficient and capable of producing highly accurate query answers.
    Distributed and Parallel Databases 09/2012; 31(2):1-29. · 1.00 Impact Factor
  • ABSTRACT: This paper, for the first time, addresses the problem of efficient reverse k-skyband (RkSB) query processing. Given a set P of multi-dimensional points and a query point q, an RkSB query returns all the points in P whose dynamic k-skyband contains q. We formalize the RkSB query, and then propose three algorithms for computing the RkSB of an arbitrary query point efficiently. Our methods utilize a conventional data-partitioning index (e.g., R-tree) on the dataset, and employ pre-computation and pruning techniques to improve the query performance. Extensive experiments using both real and synthetic datasets demonstrate the effectiveness of our proposed pruning heuristics and the performance of our proposed algorithms.
    Proceedings of the 17th international conference on Database Systems for Advanced Applications - Volume Part I; 04/2012
  • ABSTRACT: Recommender systems are becoming increasingly important and challenging as users demand higher recommendation quality. Collaborative tagging systems allow users to annotate resources with their own tags, which can reflect users' attitudes toward these resources as well as some attributes of the resources. We observe a co-occurrence effect among features, which may change a user's preference for resources. Current recommendation methods do not take this effect into consideration. In this paper, we propose an auxiliary enhancement method that improves the performance of other methods by incorporating the co-occurrence effect of features in a collaborative tagging environment.
    Proceedings of the 14th Asia-Pacific international conference on Web Technologies and Applications; 04/2012
  • 01/2012;
  • ABSTRACT: Despite the ubiquity of physical obstacles (e.g., buildings, hills, and blindages) in the real world, most spatial queries ignore the obstacles. In this article, we study a novel form of continuous nearest-neighbor queries in the presence of obstacles, namely continuous obstructed nearest-neighbor (CONN) search, which considers the impact of obstacles on the distance between objects. Given a data set P, an obstacle set O, and a query line segment q in a two-dimensional space, a CONN query retrieves the nearest neighbor p ∈ P of each point p′ on q according to the obstructed distance, i.e., the length of the shortest path between p and p′ that does not cross any obstacle in O. We formalize CONN search, analyze its unique properties, and develop algorithms for exact CONN query processing, assuming that both P and O are indexed by conventional data-partitioning indices (e.g., R-trees). Our methods tackle CONN retrieval by performing a single query for the entire query line segment, and only process the data points and obstacles relevant to the final query result via a novel concept of control points and an efficient quadratic-based split point computation approach. Then, we extend our techniques to handle variations of CONN queries, including (1) continuous obstructed k nearest-neighbor (COkNN) search, which, based on obstructed distances, finds the k (≥ 1) nearest neighbors (NNs) to every point along q; and (2) trajectory obstructed k nearest-neighbor (TOkNN) search, which, according to obstructed distances, returns the k NNs for each point on an arbitrary trajectory (consisting of several consecutive line segments). Finally, we explore approximate COkNN (ACOkNN) retrieval. Extensive experiments with both real and synthetic datasets demonstrate the efficiency and effectiveness of our proposed algorithms under various experimental settings.
    ACM Transactions on Database Systems 05/2011; 36:9. · 0.75 Impact Factor
  • Journal of Multimedia 04/2011; 6:115-121.
  • ABSTRACT: Recommender systems have gained great popularity in Internet applications in recent years because they greatly help users retrieve information despite explosive data growth. As in other popular domains such as movie, music, and book recommendation, cooking recipe selection is a daily activity in which user experiences can be greatly improved by adopting appropriate recommendation strategies. Based on content-based and collaborative filtering approaches, we present in this paper a comprehensive recipe recommendation framework encompassing the modeling of recipe cooking procedures and the adoption of folksonomy to boost the recommendations. Empirical studies are conducted on a real data set to show that our method outperforms baselines in the recipe domain.
    Web Technologies and Applications - 13th Asia-Pacific Web Conference, APWeb 2011, Beijing, China, April 18-20, 2011. Proceedings; 01/2011
  • Fan Ye, Qing Li, Enhong Chen
    ABSTRACT: Mobile Peer-to-Peer (MP2P) networks offer decentralization, self-organization, and scalability, but suffer from high latency and link breakage. In this paper, we study the cache/replication placement and cache update problems arising in such networks. Researchers have proposed various replication placement algorithms for placing data across the network, but the problem has been proven NP-hard. As a result, many heuristic algorithms have been brought forward for solving it. In our paper, we propose an effective and low-cost cache placement strategy combined with an update scheme, which can be easily implemented in a decentralized way. Extensive experiments are conducted to demonstrate the efficiency of the cache placement and update scheme.
    World Wide Web 01/2011; 14:243-259. · 1.62 Impact Factor
  • ABSTRACT: Service-Oriented Computing (SOC) has recently gained attention both within industry and academia; however, its requirements cannot be easily met using existing distributed computing technologies. Composition and interaction issues have been the central concerns, because SOC applications are composed of heterogeneous and distributed processes. To tackle the complexity of inter-organizational service integration, the authors propose a methodology to decompose complex process requirements into different types of flows, such as control, data, exception, and security. The subset of each type of flow necessary for the interactions with each partner can be determined in each service. These subsets collectively constitute a process view, based on which interactions can be systematically designed and managed for system integration through service composition. The authors illustrate how the proposed SOC middleware, named FlowEngine, implements and manages these flows with contemporary Web services technologies. An experimental case study in an e-governmental environment further demonstrates how the methodology can facilitate the design of complex inter-organizational processes.
    Journal of Database Management 01/2011; 22:32-63. · 0.90 Impact Factor
  • ABSTRACT: Many metrics such as degree, closeness, and PageRank have been introduced to determine the relative importance of a node within a network. The desired function of a network, however, is domain-specific. For example, robustness can be crucial for a communication network, while efficiency is preferred for fast spreading of advertisements in viral marketing. The information provided by some widely used measures is often conflicting under such varying demands. In this paper, we present a novel framework for evaluating network metrics regarding typical functional requirements. We also analyze five well-established measures to compare their performance in ranking nodes by functional importance in a real-life network.
    Proceedings of the 20th International Conference on World Wide Web, WWW 2011, Hyderabad, India, March 28 - April 1, 2011 (Companion Volume); 01/2011
  • ABSTRACT: K-nearest-neighbor (k-NN) queries have been widely studied in time-independent and time-dependent spatial networks. In this paper, we focus on k-NN queries in time-dependent spatial networks where the driving time between two locations may vary significantly at different times of the day. In practice, it is costly for a database server to collect real-time traffic data from vehicles or roadside sensors to compute the best route from a user to an object of interest in terms of the driving time. Thus, we design a new spatial query processing paradigm that uses a spatial mashup to enable the database server to efficiently evaluate k-NN queries based on the route information accessed from an external Web mapping service, e.g., Google Maps, Yahoo! Maps and Microsoft Bing Maps. Due to the expensive cost and limitations of retrieving such external information, we propose a new spatial query processing algorithm that uses shared execution, grouping objects and users based on the road network topology, together with pruning techniques to reduce the number of external requests to the Web mapping service while providing highly accurate query answers. We implement our algorithm using Google Maps and compare it with the basic algorithm. The results show that our algorithm effectively reduces the number of external requests by 90% on average with high accuracy, i.e., the accuracy of estimated driving time and query answers is over 92% and 87%, respectively.
    Advances in Spatial and Temporal Databases - 12th International Symposium, SSTD 2011, Minneapolis, MN, USA, August 24-26, 2011, Proceedings; 01/2011

Publication Stats

1k Citations
99.46 Total Impact Points

Institutions

  • 2014
    • Lands Department of The Government of the Hong Kong Special Administrative Region
      Hong Kong, Hong Kong
  • 1999–2014
    • City University of Hong Kong
      • Department of Computer Science
      Chiu-lung, Kowloon City, Hong Kong
  • 1970–2014
    • The University of Hong Kong
      • Department of Computer Science
      • Department of Information Technology & Engineering
      Hong Kong, Hong Kong
  • 2007–2011
    • USTC-CityU Joint Advanced Research Center
      Hong Kong, Hong Kong
    • Carnegie Mellon University
      • Computer Science Department
      Pittsburgh, Pennsylvania, United States
  • 2010
    • Shanghai University
      • School of Computer Engineering and Sciences
      Shanghai, Shanghai Shi, China
  • 2009
    • Zhejiang University
      • College of Computer Science and Technology
      Hangzhou, Zhejiang Sheng, China
  • 2007–2009
    • Arizona State University
      Phoenix, Arizona, United States
  • 2008
    • University of Science and Technology of China
      • School of Computer Science and Technology
      Hefei, Anhui Sheng, China
    • Southwestern University of Finance and Economics
      Hua-yang, Sichuan, China
    • Wuhan University
      Wu-han-shih, Hubei, China
  • 2007–2008
    • Zhejiang Normal University
      Jinhua, Zhejiang Sheng, China
  • 2006
    • Renmin University of China
      • School of Information
      Beijing, Beijing Shi, China
  • 1998
    • The Hong Kong Polytechnic University
      • Department of Computing
      Hong Kong, Hong Kong
  • 1997
    • University of New South Wales
      • School of Computer Science and Engineering
      Kensington, New South Wales, Australia
  • 1994–1995
    • The Hong Kong University of Science and Technology
      • Department of Computer Science and Engineering
      Kowloon, Hong Kong
  • 1988
    • University of Southern California
      Los Angeles, California, United States