September 2023
Lecture Notes in Computer Science
Data changing, or drifting, over time is a major problem when using classical machine learning on data streams. One approach to dealing with this is to detect changes and react accordingly, for example by retraining the model. Most existing drift detection methods only report that a drift has happened between two time windows, but not exactly when. In this paper, we present extensions for three popular methods, MMDDDM, HDDDM, and D3, to determine precisely when the drift happened, i.e., between which samples. One major advantage of our extensions is that no additional hyperparameters are required. In experiments, with an emphasis on high-dimensional, real-world datasets, we show that they successfully identify when the drifts happen, and in some cases even lead to fewer false positives and false negatives (undetected drifts), while making the methods only negligibly slower. In general, our extensions may enable a faster, more robust adaptation to changes in data streams.
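To make the notion of locating a drift "between which samples" concrete, the following is a minimal, generic sketch and not the extensions proposed in the paper. It assumes a detector has already flagged a window of samples as containing a drift, and it then scans every candidate split point, scoring each by a size-weighted squared distance between the mean vectors on either side. The function names `mean_vector` and `locate_change`, the mean-shift score, and the `min_size` parameter are all illustrative assumptions; note in particular that `min_size` is an extra setting, whereas the paper's extensions are stated to need no additional hyperparameters.

```python
import random


def mean_vector(rows):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    dim = len(rows[0])
    return [sum(row[j] for row in rows) / len(rows) for j in range(dim)]


def locate_change(window, min_size=5):
    """Return (t, score): the split index t that best separates
    window[:t] from window[t:] under a simple mean-shift score.

    Illustrative only: this is a generic change-point scan, not the
    MMDDDM, HDDDM, or D3 extensions described in the abstract.
    """
    n = len(window)
    if n < 2 * min_size:
        raise ValueError("window too short for the requested min_size")

    best_t, best_score = None, float("-inf")
    for t in range(min_size, n - min_size + 1):
        mu_left = mean_vector(window[:t])
        mu_right = mean_vector(window[t:])
        sq_dist = sum((a - b) ** 2 for a, b in zip(mu_left, mu_right))
        # Weight by t * (n - t) / n so that very unbalanced splits
        # are not favoured just because one side has a noisy mean.
        score = t * (n - t) / n * sq_dist
        if score > best_score:
            best_t, best_score = t, score
    return best_t, best_score


# Synthetic stream: 60 samples before the drift, 40 after, in 5 dimensions.
random.seed(0)
before = [[random.gauss(0.0, 1.0) for _ in range(5)] for _ in range(60)]
after = [[random.gauss(3.0, 1.0) for _ in range(5)] for _ in range(40)]

t_hat, score = locate_change(before + after)
print("estimated change index:", t_hat)
```

On this synthetic stream the true change lies between samples 59 and 60 (zero-based), so the estimate should fall close to index 60. A mean-shift score only reacts to changes in the mean; the methods named in the abstract compare the distributions in other ways, so this sketch should be read as an illustration of the localization idea only.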