Conference Proceeding

Query Optimization over Distributed Data Stream

Software Coll., Northeastern Univ., Shenyang, China
09/2009; DOI: 10.1109/HIS.2009.198; pp. 415-418. In proceedings of: Ninth International Conference on Hybrid Intelligent Systems (HIS '09), 2009, Volume 2
Source: IEEE Xplore

ABSTRACT Recent research in the field of data stream processing shows the increasing importance of processing data streams, e.g., in the e-science domain. Together with the advent of peer-to-peer (P2P) networks and grid computing, this makes it necessary to develop new techniques for distributing and processing continuous queries over data streams in such networks. These systems often have to process multiple similar but different continuous aggregation queries simultaneously. Since executing each query separately can lead to significant scalability and performance problems, it is vital to share resources by exploiting similarities in the queries. The challenge is to identify overlapping computations that may not be obvious in the queries themselves. In this paper, we propose a novel algorithmic solution to the problem of finding the minimum number of queries in such a distributed-streams setting, in order to optimize the communication cost across the network. The experimental results show that our approach gives as much as an order of magnitude performance improvement over the no-share settings.

Keywords

advent
communicate cost
data streams
different continuous aggregation queries
distributed-streams
e-science domain
experiment result
exploiting similarities
increasing importance
magnitude performance improvement
new techniques
no-share settings
overlapping computations
performance problems
process multiple similar
processing continuous queries
processing data streams
share resources
significant scalability
systems

Shuang Wang