Di Wu

The Chinese University of Hong Kong, Hong Kong, Hong Kong

Publications (9) · 0.51 Total impact

  • Article: Leadership discovery when data correlatively evolve
    ABSTRACT: Nowadays, the World Wide Web is full of rich information, including text data, XML data, multimedia data, time series data, etc. The web is usually represented as a large graph and PageRank is computed to rank the importance of web pages. In this paper, we study the problem of ranking evolving time series and discovering leaders from them by analyzing lead-lag relations. A time series is considered to be one of the leaders if its rise or fall impacts the behavior of many other time series. At each time point, we compute the lagged correlation between each pair of time series and model these correlations as a graph. Then, the leadership rank is computed from the graph, which brings order to the time series. Based on the leadership ranking, the leaders of the time series are extracted. However, the problem poses great challenges, since the dynamic nature of time series results in a highly evolving graph, in which the relationships between time series are modeled. We propose an efficient algorithm which is able to track the lagged correlation and compute the leaders incrementally, while still achieving good accuracy. Our experiments on real weather science data and stock data show that our algorithm is able to compute time series leaders efficiently in a real-time manner, and that the detected leaders demonstrate high predictive power on the events of general time series entities, which can inform both weather monitoring and financial risk control.
    Keywords: PageRank, time series, lagged correlation, leadership rank, incremental correlation update
    World Wide Web 04/2012; 14(1):1-25. · 0.51 Impact Factor
  • Article: Detecting Priming News Events
    ABSTRACT: We study the problem of detecting priming events based on a time series index and an evolving document stream. We define a priming event as an event which triggers abnormal movements of the time series index, e.g., the Iraq war with respect to the approval index of President Bush. Existing solutions focus either on organizing coherent keywords from a document stream into events or on identifying correlated movements between keyword frequency trajectories and the time series index. In this paper, we tackle the problem in two major steps. (1) We identify the elements that form a priming event. Each such element is called an influential topic, which consists of a set of coherent keywords. We extract these topics by examining the correlation between keyword trajectories and the time series index of interest at a global level. (2) We extract priming events by detecting and organizing the bursty influential topics at a micro level. We evaluate our algorithms on a real-world dataset, and the results confirm that our method is able to discover the priming events effectively.
    01/2012;
  • Conference Proceeding: Detecting Leaders from Correlated Time Series.
    Database Systems for Advanced Applications, 15th International Conference, DASFAA 2010, Tsukuba, Japan, April 1-4, 2010, Proceedings, Part I; 01/2010
  • Conference Proceeding: Stock risk mining by news.
    Database Technologies 2010, Twenty-First Australasian Database Conference (ADC 2010), Brisbane, Australia, 18-22 January, 2010, Proceedings; 01/2010
  • Article: Stock prediction: an event-driven approach based on bursty keywords.
    Frontiers of Computer Science in China. 01/2009; 3:145-157.
  • Chapter: Integrating Multiple Data Sources for Stock Prediction
    ABSTRACT: In many real world applications, decisions are usually made by collecting and judging information from multiple different data sources. Let us take the stock market as an example. We never make our decision based on just one single piece of advice, but always rely on a collection of information, such as the stock price movements, exchange volumes, market index, as well as the information from the news articles, expert comments and special announcements (e.g., the increase of stamp duty). Yet, modeling the stock market is difficult because: (1) The process related to market states (up and down) is a stochastic process, which is hard to capture by using a deterministic approach; and (2) The market state is invisible but will be influenced by the visible market information, such as stock prices and news articles. In this paper, we try to model the stock market process by using a Non-homogeneous Hidden Markov Model (NHMM) which takes multiple sources of information into account when making a future prediction. Our model contains three major elements: (1) External event, which denotes the events happening within the stock market (e.g., the drop of US interest rate); (2) Observed market state, which denotes the current market status (e.g., the rise in the stock price); and (3) Hidden market state, which conceptually exists but is invisible to the market participants. Specifically, we model the external events by using the information contained in the news articles, and model the observed market state by using the historical stock prices. Based on these two pieces of observable information and the previous hidden market state, we aim to identify the current hidden market state, so as to predict the immediate market movement. Extensive experiments were conducted to evaluate our work. The encouraging results indicate that our proposed approach is practically sound and effective.
    08/2008: pages 77-89;
  • Conference Proceeding: Detection of Shape Anomalies: A Probabilistic Approach Using Hidden Markov Models
    ABSTRACT: In this paper, we study the problem of detecting shape anomalies. Our shape anomaly detection algorithm operates on the one-dimensional representation (time series) of shapes, whose similarity is modeled by a generalized segmental hidden Markov model (HMM) in a scaling-, translation-, and rotation-invariant manner. Experimental results show that our proposed approach can find shape anomalies in a large collection of shapes effectively and efficiently.
    Data Engineering, 2008. ICDE 2008. IEEE 24th International Conference on; 05/2008
  • Conference Proceeding: Detection of Shape Anomalies: A Probabilistic Approach Using Hidden Markov Models.
    Proceedings of the 24th International Conference on Data Engineering, ICDE 2008, April 7-12, 2008, Cancún, México; 01/2008
  • Conference Proceeding: Mining Multiple Time Series Co-movements.
    Progress in WWW Research and Development, 10th Asia-Pacific Web Conference, APWeb 2008, Shenyang, China, April 26-28, 2008. Proceedings; 01/2008