-
ABSTRACT: Nowadays, the World Wide Web is full of rich information, including text data, XML data, multimedia data, time series data, etc.
The web is usually represented as a large graph and PageRank is computed to rank the importance of web pages. In this paper,
we study the problem of ranking evolving time series and discovering leaders from them by analyzing lead-lag relations. A
time series is considered to be one of the leaders if its rise or fall impacts the behavior of many other time series. At
each time point, we compute the lagged correlation between each pair of time series and model them in a graph. Then, the leadership
rank is computed from the graph, which brings order to time series. Based on the leadership ranking, the leaders of time series
are extracted. However, the problem poses great challenges since the dynamic nature of time series results in a highly evolving
graph, in which the relationships between time series are modeled. We propose an efficient algorithm which is able to track
the lagged correlation and compute the leaders incrementally, while still achieving good accuracy. Our experiments on real
weather science data and stock data show that our algorithm is able to compute time series leaders efficiently in real time
and that the detected leaders demonstrate high predictive power on the events of general time series entities, which can
inform both weather monitoring and financial risk control.
Keywords: PageRank, time series, lagged correlation, leadership rank, incremental correlation update
World Wide Web 04/2012; 14(1):1-25. · 0.51 Impact Factor
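The pipeline outlined in the abstract above (pairwise lagged correlation, a graph of lead-lag relations, then a rank over that graph) can be illustrated with a minimal sketch. This is not the paper's algorithm: it recomputes everything from scratch rather than incrementally, and the function names, the correlation threshold, the maximum lag, and the PageRank-style damping factor are all assumptions chosen for illustration.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)

def lagged_corr(leader, follower, lag):
    """Correlation between leader[t] and follower[t + lag]."""
    return pearson(leader[:len(leader) - lag], follower[lag:])

def best_lead(x, y, max_lag):
    """(corr, lag) with the largest |corr| for x leading y over lags 1..max_lag."""
    candidates = ((lagged_corr(x, y, lag), lag) for lag in range(1, max_lag + 1))
    return max(candidates, key=lambda pair: abs(pair[0]))

def leadership_rank(series, max_lag=3, threshold=0.8, damping=0.85, iters=50):
    """PageRank-style score: each follower passes its score to the series that lead it."""
    names = list(series)
    n = len(names)
    # follows[f][l] = |corr| when l leads f strongly enough (assumed threshold).
    follows = {name: {} for name in names}
    for lead in names:
        for foll in names:
            if lead == foll:
                continue
            corr, _ = best_lead(series[lead], series[foll], max_lag)
            if abs(corr) >= threshold:
                follows[foll][lead] = abs(corr)
    rank = {name: 1.0 / n for name in names}
    for _ in range(iters):
        new = {name: (1.0 - damping) / n for name in names}
        for foll in names:
            total = sum(follows[foll].values())
            if total == 0:
                # A series that follows nobody spreads its score uniformly.
                for name in names:
                    new[name] += damping * rank[foll] / n
            else:
                for lead, weight in follows[foll].items():
                    new[lead] += damping * rank[foll] * weight / total
        rank = new
    return rank

# Toy data (made up): b repeats a one step later, c repeats a two steps later.
a = [0, 3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8, 9, 7, 9, 3, 2, 3, 8]
b = [0] + a[:-1]
c = [0, 0] + a[:-2]
rank = leadership_rank({"a": a, "b": b, "c": c})
print(max(rank, key=rank.get))
```

In this sketch an edge points from a follower to its leader, so score accumulates at series whose movements are echoed by many others. A streaming version would have to update the correlations as new points arrive, which is the part the abstract identifies as the main challenge.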
-
ABSTRACT: We study the problem of detecting priming events based on a time series index
and an evolving document stream. We define a priming event as an event which
triggers abnormal movements of the time series index, e.g., the Iraq war with
respect to the approval index of President Bush. Existing solutions
either focus on organizing coherent keywords from a document stream into events
or identifying correlated movements between keyword frequency trajectories and
the time series index. In this paper, we tackle the problem in two major steps.
(1) We identify the elements that form a priming event. Each such element,
called an influential topic, consists of a set of coherent keywords. We
extract these topics by examining the correlation between keyword trajectories and
the time series index of interest at a global level. (2) We extract priming
events by detecting and organizing the bursty influential topics at a micro
level. We evaluate our algorithms on a real-world dataset, and the results
confirm that our method is able to discover the priming events effectively.
01/2012;
-
Database Systems for Advanced Applications, 15th International Conference, DASFAA 2010, Tsukuba, Japan, April 1-4, 2010, Proceedings, Part I; 01/2010
-
Database Technologies 2010, Twenty-First Australasian Database Conference (ADC 2010), Brisbane, Australia, 18-22 January, 2010, Proceedings; 01/2010
-
Frontiers of Computer Science in China. 01/2009; 3:145-157.
-
ABSTRACT: In many real world applications, decisions are usually made by collecting and judging information from multiple different
data sources. Let us take the stock market as an example. We never make our decision based on just one single piece of advice,
but always rely on a collection of information, such as the stock price movements, exchange volumes, market index, as well
as the information from the news articles, expert comments and special announcements (e.g., the increase of stamp duty). Yet,
modeling the stock market is difficult because: (1) The process related to market states (up and down) is a stochastic process,
which is hard to capture with a deterministic approach; and (2) The market state is invisible but is influenced
by the visible market information, such as stock prices and news articles. In this paper, we try to model the stock market
process by using a Non-homogeneous Hidden Markov Model (NHMM) which takes multiple sources of information into account when
making a future prediction. Our model contains three major elements: (1) External event, which denotes the events happening
within the stock market (e.g., the drop of US interest rate); (2) Observed market state, which denotes the current market
status (e.g., the rise in the stock price); and (3) Hidden market state, which conceptually exists but is invisible to the
market participants. Specifically, we model the external events by using the information contained in the news articles, and
model the observed market state by using the historical stock prices. Based on these two pieces of observable information and
the previous hidden market state, we aim to identify the current hidden market state, so as to predict the immediate market
movement. Extensive experiments were conducted to evaluate our work. The encouraging results indicate that our proposed approach
is practically sound and effective.
08/2008: pages 77-89;
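The inference step described in the abstract above, estimating the current hidden market state from the previous hidden state, an external event, and an observed market state, can be sketched as one forward-filtering step of a hidden Markov model whose transition matrix depends on the external event. This is a generic illustration, not the paper's model: the two hidden states, the event names, and every probability below are invented for the example.

```python
def normalize(weights):
    """Scale non-negative weights so they sum to 1."""
    total = sum(weights)
    return [w / total for w in weights]

def filter_step(belief, transition, emission, observation):
    """One forward-filtering step.

    belief[i]         : P(hidden state i at t-1 | evidence so far)
    transition[i][j]  : P(hidden state j at t | hidden state i at t-1, event at t)
    emission[j][o]    : P(observation o | hidden state j)
    Returns P(hidden state at t | evidence including the new observation).
    """
    n = len(belief)
    predicted = [sum(belief[i] * transition[i][j] for i in range(n)) for j in range(n)]
    return normalize([predicted[j] * emission[j][observation] for j in range(n)])

# Hypothetical setup: hidden states 0 = "bull", 1 = "bear";
# observations 0 = price rose, 1 = price fell.
# The transition matrix is selected by the external event, which is what
# makes the chain non-homogeneous.
TRANSITIONS = {
    "good_news": [[0.9, 0.1], [0.6, 0.4]],
    "bad_news": [[0.4, 0.6], [0.1, 0.9]],
}
EMISSION = [[0.7, 0.3], [0.3, 0.7]]

belief = [0.5, 0.5]
belief = filter_step(belief, TRANSITIONS["bad_news"], EMISSION, observation=1)
print(belief)
```

Repeating `filter_step` over a sequence of (event, observation) pairs keeps a running estimate of the hidden state. In the paper's setting the events would come from news articles and the observations from historical prices.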
-
ABSTRACT: In this paper, we study the problem of detecting shape anomalies. Our shape anomaly detection algorithm operates on the one-dimensional representation (time series) of shapes, whose similarity is modeled by a generalized segmental hidden Markov model (HMM) in a scaling-, translation-, and rotation-invariant manner. Experimental results show that our proposed approach can find shape anomalies in a large collection of shapes effectively and efficiently.
Data Engineering, 2008. ICDE 2008. IEEE 24th International Conference on; 05/2008
-
Proceedings of the 24th International Conference on Data Engineering, ICDE 2008, April 7-12, 2008, Cancún, México; 01/2008
-
Progress in WWW Research and Development, 10th Asia-Pacific Web Conference, APWeb 2008, Shenyang, China, April 26-28, 2008. Proceedings; 01/2008