Conference Paper

An autonomous agent approach to query optimization in stream grids.

DOI: 10.4018/joci.2010100102
Conference: MEDES '09: International ACM Conference on Management of Emergent Digital EcoSystems, Lyon, France, October 27-30, 2009
Source: DBLP


Stream grids are wide-area grid computing environments that are fed by a set of stream data sources. Queries arrive at the grid from users and applications external to the system. The queries considered in this work are long-running continuous (LRC) queries, which are neither short-lived nor infinitely long-lived. The queries are "open" from the grid's perspective: the grid cannot control or predict when or where a query will arrive, what data it will require, or when it will be revoked. Query optimization in such an environment faces two major challenges: optimizing in a multi-query environment, and optimizing continuously as new queries arrive and existing ones are revoked. As generating a globally optimal query plan is an intractable problem, this work explores the idea of emergent optimization, where globally optimal query plans emerge as a result of local autonomous decisions taken by the grid nodes. Drawing on concepts from evolutionary game theory, grid nodes are modeled as autonomous agents that seek to maximize a self-interest function using one of a set of different strategies. Each grid node autonomously changes strategy in response to variations in query arrival and revocation patterns.
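The abstract does not specify the strategies or the self-interest function, so the following is only a toy sketch of payoff-driven strategy revision by a single node, not the method of the paper. The strategy names, the `payoff` function, its constants, and the `overlap` parameter (the fraction of active queries that share sub-expressions with a neighbor) are all assumptions introduced for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical strategy names; the abstract does not list the actual strategies.
STRATEGIES = ("reuse_neighbor_stream", "compute_locally")


def payoff(strategy: str, overlap: float) -> float:
    """Assumed self-interest function (illustrative constants only).

    Reusing a neighbor's stream pays off in proportion to query overlap,
    minus a fixed coordination cost; computing locally has a flat payoff.
    """
    if strategy == "reuse_neighbor_stream":
        return overlap - 0.3
    return 0.4


@dataclass
class GridNode:
    """Toy autonomous agent that keeps a smoothed payoff estimate per strategy."""
    alpha: float = 0.5  # smoothing factor for the moving average (assumption)
    estimate: dict = field(default_factory=lambda: {s: 0.0 for s in STRATEGIES})
    current: str = STRATEGIES[0]

    def observe(self, strategy: str, value: float) -> None:
        # Exponential moving average, so recent query patterns dominate.
        old = self.estimate[strategy]
        self.estimate[strategy] = (1 - self.alpha) * old + self.alpha * value

    def revise(self) -> str:
        # Local decision: adopt the strategy with the best current estimate.
        self.current = max(STRATEGIES, key=lambda s: self.estimate[s])
        return self.current


def simulate(overlaps):
    """Run one node through a sequence of overlap levels, one per round.

    Simplification: the node probes the payoff of every strategy each round.
    """
    node = GridNode()
    history = []
    for overlap in overlaps:
        for strategy in STRATEGIES:
            node.observe(strategy, payoff(strategy, overlap))
        history.append(node.revise())
    return history
```

Under these assumed payoffs, a node fed rounds of high overlap followed by rounds of low overlap switches from reusing a neighbor's stream to computing locally without any central coordinator, which is the kind of local, pattern-driven strategy change the abstract describes.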



Available from: Srinath Srinivasa, Mar 11, 2014
  • Source
    ABSTRACT: Recent research and development efforts show the increasing importance of processing data streams, not only in the context of sensor networks, but also in information retrieval networks. With the advent of various mobile devices being able to participate in ubiquitous (wireless) networks, a major challenge is to develop data stream management systems (DSMS) for information retrieval in such networks. In this paper, we present the architecture of our StreamGlobe system, which is focused on meeting the challenges of efficiently querying data streams in an ad-hoc network environment. StreamGlobe is based on a federation of heterogeneous peers ranging from small, possibly mobile devices to stationary servers. On this foundation, self-organizing network optimization and expressive in-network query processing capabilities enable powerful information processing and retrieval. Data streams in StreamGlobe are represented in XML and queried using XQuery. We report on our ongoing implementation effort and briefly show our research agenda.
    Proceedings of the 1st Workshop on Data Management for Sensor Networks, in conjunction with VLDB, DMSN 2004, Toronto, Canada, August 30, 2004; 01/2004
  • Source
    ABSTRACT: Complex queries are becoming commonplace, with the growing use of decision support systems. Decision support queries often have a lot of common sub-expressions within each query, and queries are often run as a batch. Multi query optimization aims at exploiting common sub-expressions, to reduce the evaluation cost of queries, by computing them once and then caching them for future use, both within individual queries and across queries in a batch. In case cache space is limited, the total size of sub-expressions that are worth caching may exceed available cache space. Prior work in multi query optimization involves choosing a set of common sub-expressions that fit in available cache space, and once computed, retaining their results across the execution of all queries in a batch. Such optimization algorithms do not consider the possibility of dynamically changing the cache contents. This may lead to sub-expressions occupying cache space even if they are not used by subsequent queries. The available cache space can be best utilized by evaluating the queries in an appropriate order and changing the cache contents as queries are executed. We present several algorithms that consider these factors, in order to reduce the cost of query evaluation.
    Database Engineering & Applications, 2001 International Symposium on.; 02/2001
  • Source
    ABSTRACT: Data stream processing is currently gaining importance due to the developments in novel application areas like e-science, e-health, and e-business (considering RFID, for example). Focusing on e-science, it can be observed that scientific experiments and observations in many fields, e.g., in physics and astronomy, create huge volumes of data which have to be interchanged and processed. With experimental and observational data coming in particular from sensors, online simulations, etc., the data has an inherently streaming nature. Furthermore, continuing advances will result in even higher data volumes, rendering storing all of the delivered data prior to processing increasingly impractical. Hence, in such e-science scenarios, processing and sharing of data streams will play a decisive role. It will enable new possibilities for researchers, since they will be able to subscribe to interesting data streams of other scientists without having to set up their own devices or experiments. This results in much better utilization of expensive equipment such as telescopes, satellites, etc. Further, processing and sharing data streams on-the-fly in the network helps to reduce network traffic and to avoid network congestion. Thus, even huge streams of data can be handled efficiently by removing unnecessary parts early on, e.g., by early filtering and aggregation, and by sharing previously generated data streams and processing results.
    Proceedings of the 31st International Conference on Very Large Data Bases, Trondheim, Norway, August 30 - September 2, 2005; 01/2005