TIDES - a new descriptor for time series oscillation behavior.
ABSTRACT: Sensor networks have increased the amount and variety of temporal data available, requiring the definition of new techniques
for data mining. Related research typically addresses the problems of indexing, clustering, classification, summarization,
and anomaly detection. There is a wide range of techniques to describe and compare time series, but they focus on series’
values. This paper concentrates on a new aspect, that of describing oscillation patterns. It presents a technique for time
series similarity search at multiple temporal scales, defining a descriptor that uses the angular coefficients (slopes) from a linear
segmentation of the curve that represents the evolution of the analyzed series. This technique is generalized to handle co-evolution,
in which several phenomena vary at the same time. Preliminary experiments with real datasets showed that our approach correctly
characterizes the oscillation of single time series, for multiple time scales, and is able to compute the similarity among
sets of co-evolving series.
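
As a rough illustration of the idea in the abstract above, the sketch below describes a series by the slopes of straight-line fits over consecutive windows, at more than one window width, and compares two series by those slopes rather than by their values. It is not the TIDES algorithm itself: the fixed-width windows, the least-squares fit, the mean-absolute-difference distance, and the names `segment_slopes`, `oscillation_descriptor`, and `descriptor_distance` are all assumptions made for this example.

```python
def segment_slopes(series, width):
    """Least-squares slope of each consecutive, non-overlapping window.

    Assumption: fixed-width windows (width >= 2) stand in for a real
    linear segmentation, which would choose breakpoints adaptively.
    """
    slopes = []
    for start in range(0, len(series) - width + 1, width):
        window = series[start:start + width]
        mean_x = (width - 1) / 2
        mean_y = sum(window) / width
        num = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(window))
        den = sum((i - mean_x) ** 2 for i in range(width))
        slopes.append(num / den)
    return slopes


def oscillation_descriptor(series, widths=(2, 4)):
    """Map each window width (temporal scale) to its list of slopes."""
    return {w: segment_slopes(series, w) for w in widths}


def descriptor_distance(desc_a, desc_b):
    """Mean absolute slope difference over all scales.

    Assumption: both descriptors were built with the same widths from
    series of equal length.
    """
    diffs = [abs(x - y)
             for w in desc_a
             for x, y in zip(desc_a[w], desc_b[w])]
    return sum(diffs) / len(diffs)


low = [0, 1, 2, 3, 2, 1, 0, 1]
high = [v + 10 for v in low]  # same oscillation, shifted values

print(oscillation_descriptor(low))
print(descriptor_distance(oscillation_descriptor(low),
                          oscillation_descriptor(high)))
```

Because only slopes are compared, the two series in the usage lines have distance zero even though their values differ by a constant offset, which is the kind of behavior a value-based measure would not capture.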
- Available from: cs.ucr.edu
ABSTRACT: The problem of indexing time series has attracted much interest. Most algorithms used to index time series utilize the Euclidean distance or some variation thereof. However, it has been forcefully shown that the Euclidean distance is a very brittle distance measure. Dynamic time warping (DTW) is a much more robust distance measure for time series, allowing similar shapes to match even if they are out of phase in the time axis. Because of this flexibility, DTW is widely used in science, medicine, industry and finance. Unfortunately, however, DTW does not obey the triangle inequality and thus has resisted attempts at exact indexing. Instead, many researchers have introduced approximate indexing techniques or abandoned the idea of indexing and concentrated on speeding up sequential searches. In this work, we introduce a novel technique for the exact indexing of DTW. We prove that our method guarantees no false dismissals and we demonstrate its vast superiority over all competing approaches in the largest and most comprehensive set of time series indexing experiments ever undertaken.
Knowledge and Information Systems 01/2005; 7(3):358-386. · 2.23 Impact Factor
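
To make the DTW measure mentioned in this abstract concrete, here is a minimal sketch of the textbook dynamic-programming formulation, using absolute difference as the point-wise cost. It shows only the distance itself, not the exact-indexing technique the abstract introduces; the function name `dtw_distance` and the choice of cost are assumptions for this example.

```python
def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) dynamic time warping distance.

    Assumption: absolute difference as the local cost, and no
    warping-window constraint.
    """
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best cumulative cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j],      # repeat b[j-1]
                                     cost[i][j - 1],      # repeat a[i-1]
                                     cost[i - 1][j - 1])  # advance both
    return cost[n][m]


a = [0, 1, 2, 1, 0, 0]
b = [0, 0, 1, 2, 1, 0]  # same shape, shifted one step in time

pointwise = sum(abs(x - y) for x, y in zip(a, b))
print(pointwise)           # point-by-point comparison penalizes the shift
print(dtw_distance(a, b))  # warping absorbs the shift
```

The two series have the same shape but are out of phase, so a point-by-point comparison reports a nonzero difference while the warped alignment matches them exactly.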
- Conference Proceeding: An online algorithm for segmenting time series
ABSTRACT: In recent years, there has been an explosion of interest in mining time-series databases. As with most computer science problems, representation of the data is the key to efficient and effective solutions. One of the most commonly used representations is piecewise linear approximation. This representation has been used by various researchers to support clustering, classification, indexing and association rule mining of time-series data. A variety of algorithms have been proposed to obtain this representation, with several algorithms having been independently rediscovered several times. In this paper, we undertake the first extensive review and empirical comparison of all proposed techniques. We show that all these algorithms have fatal flaws from a data-mining perspective. We introduce a novel algorithm that we empirically show to be superior to all others in the literature.
Data Mining, 2001. ICDM 2001, Proceedings IEEE International Conference on; 02/2001
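
For readers unfamiliar with piecewise linear approximation, the following sketch shows one simple way to obtain it: a sliding-window pass that extends a segment until the straight line between its endpoints deviates from the data by more than a tolerance. This is a generic baseline written for illustration, not the novel algorithm the abstract refers to; the names `interpolation_error` and `sliding_window_segments`, the endpoint-interpolation fit, and the maximum-deviation error measure are assumptions.

```python
def interpolation_error(series, start, end):
    """Largest absolute deviation from the line joining the endpoints."""
    if end - start < 2:
        return 0.0  # a two-point segment fits exactly
    slope = (series[end] - series[start]) / (end - start)
    return max(abs(series[start] + slope * (i - start) - series[i])
               for i in range(start + 1, end))


def sliding_window_segments(series, max_error):
    """Return (start, end) index pairs of consecutive linear segments.

    Each segment is grown one point at a time and closed just before
    the error would exceed `max_error`; the next segment starts at the
    previous segment's end point.
    """
    segments = []
    anchor = 0
    while anchor < len(series) - 1:
        end = anchor + 1
        while (end + 1 < len(series)
               and interpolation_error(series, anchor, end + 1) <= max_error):
            end += 1
        segments.append((anchor, end))
        anchor = end
    return segments


series = [0, 1, 2, 3, 2, 1, 0]
print(sliding_window_segments(series, max_error=0.1))
```

On the small triangular series in the usage line, the pass yields one rising and one falling segment that meet at the peak.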
- ABSTRACT: We consider a scenario where nodes in a sensor network hold numeric items, and the task is to evaluate simple functions of the distributed data. In this note we present distributed protocols for computing the median with sublinear space and communication complexity per node. Specifically, we give a deterministic protocol for computing the median with polylog complexity and a randomized protocol that computes an approximate median with polyloglog communication complexity per node. On the negative side, we observe that any deterministic protocol that counts the number of distinct data items must have linear complexity in the worst case.
Theor. Comput. Sci. 01/2007; 370:254-264.