Eamonn Keogh

University of California, Riverside, Riverside, CA, USA

Publications (2) · 1.54 Total impact

  • Article: iSAX: disk-aware mining and indexing of massive time series datasets
    Jin Shieh, Eamonn Keogh
    ABSTRACT: Current research in indexing and mining time series data has produced many interesting algorithms and representations. However, the algorithms and the size of data considered have generally not been representative of the increasingly massive datasets encountered in science, engineering, and business domains. In this work, we introduce a novel multi-resolution symbolic representation which can be used to index datasets which are several orders of magnitude larger than anything else considered in the literature. To demonstrate the utility of this representation, we constructed a simple tree-based index structure which facilitates fast exact search and approximate search that is orders of magnitude faster. For example, with a database of one hundred million time series, the approximate search can retrieve high-quality nearest neighbors in slightly over a second, whereas a sequential scan would take tens of minutes. Our experimental evaluation demonstrates that our representation allows index performance to scale well with increasing dataset sizes. Additionally, we provide analysis concerning parameter sensitivity, approximate search effectiveness, and lower bound comparisons between time series representations in a bit-constrained environment. We further show how to exploit the combination of both exact and approximate search as sub-routines in data mining algorithms, allowing for the exact mining of truly massive real-world datasets containing tens of millions of time series.
    Data Mining and Knowledge Discovery 04/2012; 19(1):24-57. · 1.54 Impact Factor
  • Article: iSAX: indexing and mining terabyte sized time series
    Jin Shieh, Eamonn Keogh
    ABSTRACT: Current research in indexing and mining time series data has produced many interesting algorithms and representations. However, the algorithms and the size of data considered have generally not been representative of the increasingly massive datasets encountered in science, engineering, and business domains. In this work, we show how a novel multi-resolution symbolic representation can be used to index datasets which are several orders of magnitude larger than anything else considered in the literature. Our approach allows both fast exact search and ultra-fast approximate search. We show how to exploit the combination of both types of search as sub-routines in data mining algorithms, allowing for the exact mining of truly massive real-world datasets containing millions of time series.

Institutions

  • 2012
    • University of California, Riverside
      • Department of Computer Science and Engineering
      Riverside, CA, USA