Shruti Kasetty’s research while affiliated with University of California, Riverside and other places

Publications (2)


Kasetty, S.: On the Need for Time Series Data Mining Benchmarks: A Survey and Empirical Demonstration. Data Mining and Knowledge Discovery 7(4), 349-371
  • Article

January 2003 · 418 Reads · 1,235 Citations

Data Mining and Knowledge Discovery

Eamonn Keogh · Shruti Kasetty

In the last decade there has been an explosion of interest in mining time series data. Literally hundreds of papers have introduced new algorithms to index, classify, cluster and segment time series. In this work we make the following claim. Much of this work has very little utility because the contributions made (speed in the case of indexing, accuracy in the case of classification and clustering, model accuracy in the case of segmentation) offer an amount of improvement that would have been completely dwarfed by the variance that would have been observed by testing on many real-world datasets, or the variance that would have been observed by changing minor (unstated) implementation details. To illustrate our point, we have undertaken the most exhaustive set of time series experiments ever attempted, re-implementing the contributions of more than two dozen papers, and testing them on 50 real-world, highly diverse datasets. Our empirical results strongly support our assertion, and suggest the need for a set of time series benchmarks and more careful empirical evaluation in the data mining community.


On the need for time series data mining benchmarks

July 2002 · 79 Reads · 217 Citations

In the last decade there has been an explosion of interest in mining time series data. Literally hundreds of papers have introduced new algorithms to index, classify, cluster and segment time series. In this work we make the following claim. Much of this work has very little utility because the contributions made (speed in the case of indexing, accuracy in the case of classification and clustering, model accuracy in the case of segmentation) offer an amount of "improvement" that would have been completely dwarfed by the variance that would have been observed by testing on many real-world datasets, or the variance that would have been observed by changing minor (unstated) implementation details. To illustrate our point, we have undertaken the most exhaustive set of time series experiments ever attempted, re-implementing the contributions of more than two dozen papers, and testing them on 50 real-world, highly diverse datasets. Our empirical results strongly support our assertion, and suggest the need for a set of time series benchmarks and more careful empirical evaluation in the data mining community.

Citations (2)


... With the advancement of techniques in sensing, networking, storage, and data processing, it has become feasible to collect, store, and process massive collections of measurements over time [106,134,150,157,170], referred to as time series. Time series analysis, with its ability to capture temporal relationships between data points, has attracted significant interest across various academic and industrial domains [78,97,129,143,161,162,212,222] such as electrical engineering [104,170], astronomy [5,199], finance [37,74], energy [10,12], environment [77,90,147,206], bioinformatics [16,17,67], medicine [49,165,171], and psychology [109]. In various domains and Internet-of-Things applications, the increasing volume of time series data has prompted the need for efficient techniques in data processing and analysis [61,98,99,115,133,137]. ...

Reference:

A Survey on Time-Series Distance Measures
On the need for time series data mining benchmarks
  • Citing Conference Paper
  • July 2002

... However, for system analysis, the most critical data often pertains to specific channels over limited time periods. Therefore, techniques for efficiently searching necessary data are essential [1], [2]. ...

Kasetty, S.: On the Need for Time Series Data Mining Benchmarks: A Survey and Empirical Demonstration. Data Mining and Knowledge Discovery 7(4), 349-371
  • Citing Article
  • January 2003

Data Mining and Knowledge Discovery