Employing chunk size adaptation to overcome concept drift
Jędrzej Kozal1, Filip Guzy1, and Michał Woźniak1
1Wrocław University of Science and Technology
Abstract
Modern analytical systems must be ready to process streaming data and correctly respond to data distribution changes. The phenomenon of changes in data distributions is called concept drift, and it may degrade the quality of the models used. Additionally, the possibility of concept drift requires that the algorithms used be ready for continuous adaptation of the model to changing data distributions. This work focuses on non-stationary data stream classiﬁcation,
where a classiﬁer ensemble is used. To keep the ensemble model up to date, the new
base classiﬁers are trained on the incoming data blocks and added to the ensemble
while, at the same time, outdated models are removed from the ensemble. One
of the problems with this type of model is the fast reaction to changes in data dis-
tributions. We propose a new Chunk Adaptive Restoration framework that can be
adapted to any block-based data stream classiﬁcation algorithm. The proposed algo-
rithm adjusts the data chunk size in the case of concept drift detection to minimize
the impact of the change on the predictive performance of the used model. The
conducted experimental research, backed up with the statistical tests, has proven
that Chunk Adaptive Restoration signiﬁcantly reduces the model’s restoration time.
1 Introduction
Data stream mining focuses on the knowledge extraction from streaming data,
mainly for the predictive model construction aimed at assigning arriving instances
to one of the predeﬁned categories. This process is characterized by additional
diﬃculties that arise when the data distribution evolves over time. This is visible in many practical tasks, such as spam detection, where spammers constantly change the message format to fool anti-spam systems. Another example is medical diagnostics, where new SARS-CoV-2 mutations may cause diﬀerent symptoms, which forces doctors to adapt and improve diagnostic methods.
The phenomenon mentioned above is called concept drift, and its nature can vary in both character and rapidity. It forces classiﬁcation models to adapt to new data characteristics and to forget old, useless concepts. An important characteristic of such systems is their reaction to the concept drift phenomenon, i.e., how much predictive performance deteriorates when drift occurs and how quickly the classiﬁcation system regains acceptable predictive quality for
the new concept. We should also consider another limitation: the classiﬁcation system should be ready to classify incoming objects immediately, and the dedicated computing and memory resources are limited.

arXiv:2110.12881v1 [cs.LG] 25 Oct 2021
Data processing models used by stream data classiﬁcation systems can be roughly divided into two categories: online (object by object) processing (online learners) and block-based (chunk by chunk) data processing (block-based learners). Online learners require model parameters to be updated when a new object
appears, while the block-based method requires updates once per batch. The
advantage of online learners is their fast adaptation to concept drift. However,
in many practical applications, the eﬀort of necessary computation (related to
updating models after processing each object) is unacceptable. The model update
can require many operations that involve changing data statistics, updating
the model’s internal structure, or learning a new model from scratch. These
requirements can become prohibitive for high-velocity streams. Hence, more
popular is block-based data processing, which requires less computational eﬀort.
However, it limits the model’s potential for quick adaptation to changes in
data distribution and fast restoration of performance after concept drift. In
consequence, a signiﬁcant problem is the proper selection of the chunk size.
Smaller data block size results in faster adaptation. However, it increases the
overall computing load. On the other hand, larger data chunks require less
computation but result in a lower adaptive capacity of the classiﬁcation model.
Another valid consideration is the impact of chunk size on prediction stability.
Models trained on smaller chunks typically have larger prediction variance, while
models trained with larger chunks tend to have more stable predictions when
the data stream is stationary. If concept drift occurs, a larger chunk increases the probability that data from diﬀerent concepts will be placed in the same batch. Hence, selecting the chunk size is a trade-oﬀ encompassing computational power, adaptation speed, and prediction variance.
The trade-oﬀ described above involves features that are equally desired in many applications. In particular, computational cost and adaptation speed are both important when processing large data streams. We propose a new method that alleviates the drawbacks of choosing between small and large chunk sizes by dynamically changing the current batch size. More precisely, our
work introduces the Chunk-Adaptive Restoration (CAR), a framework based on
combined drift and stabilization detection techniques that adjusts the chunk sizes
during the concept drift. This approach slightly redeﬁnes the previous problem
based on the observation that for many practical classiﬁcation tasks, a period of
changes in data distributions is followed by stabilization. Hence, we propose that
when the concept drift occurs, the model should be quickly upgraded, i.e., the
data should be processed in small chunks, and during the stabilization period,
the data block size may be extended. The advantage of the proposed method
is its universality and the possibility of using it with various chunk-based data stream classiﬁcation algorithms.
This work oﬀers the following contributions:
•Proposing the Chunk-Adaptive Restoration framework to empower ﬂuent restoration after concept drift appearance.
•Formulating the Variance-based Stabilization Detection Method, a technique complementary to all concept drift detectors that simpliﬁes chunk size adaptation and metrics calculation.
•Employing Chunk-Adaptive Restoration for the adaptive data chunk size setting for selected state-of-the-art algorithms.
•Introducing a new stream evaluation metric, Sample Restoration, to show the gains of the proposed methods.
•Experimental evaluation of the proposed approach based on various synthetic and real data streams and a detailed evaluation of its usefulness for the selected state-of-the-art methods.
2 Related works
This section provides a review of the related works. First, we discuss challenges speciﬁc to learning from non-stationary data streams. Next, we discuss diﬀerent methods of processing data streams. Then, we describe existing drift detection algorithms and ensemble methods. We continue by reviewing existing evaluation protocols as well as computational and memory requirements. We conclude this section by providing examples of other data stream learning methods that employ a variable chunk size.
2.1 Challenges related to data stream mining
A data stream is a sequence of objects described by their attributes. In the
case of a classiﬁcation task, each learning object should be labeled. The number
of items may be vast, potentially inﬁnite. Observations in the stream may
arrive at diﬀerent times, and the time intervals between their arrival could vary
considerably. The main diﬀerences between analyzing data streams and static datasets include:
•No one can control the order of incoming objects.
•The computational resources are limited, but the analyzer should be ready to process each incoming item in a reasonable time.
•The memory resources are also limited, while the data stream may be huge or even inﬁnite, which makes memorizing all the items impossible.
•Data streams are susceptible to change, i.e., data distributions may change over time.
•The labels of arriving items are not free; in some cases they are impossible to obtain or are available only with a delay (e.g., in banking, for the credit approval task, after a few years).
The canonical classiﬁers usually do not consider that the probabilistic characteristics of the classiﬁcation task may evolve. Such a phenomenon is known as concept drift, and a few concept drift taxonomies have been proposed. The most popular considers how rapid the drift is: we can distinguish sudden drift and incremental drift. An additional diﬃculty arises when, during the transition between two concepts, objects from the two diﬀerent concepts appear simultaneously for some time (gradual drift). We can also take into consideration the inﬂuence of the probabilistic characteristics on the classiﬁcation task:
•virtual concept drift does not impact the decision boundaries but aﬀects the probability density functions; Widmer and Kubat attributed it rather to incomplete data representation than to true changes in the concept,
•real concept drift aﬀects the posterior probabilities and may impact the unconditional probability density function.
2.2 Methods for processing data streams
The data stream can be divided into small portions of the data called data chunks. This method is known as batch-based or chunk-based learning. Choosing the proper size of the chunk is crucial because it may signiﬁcantly aﬀect the predictive performance. Unfortunately, the unpredictable appearance of the concept drift makes it diﬃcult. Several approaches may help overcome this problem, e.g., using diﬀerent windows for processing data or adjusting the chunk size. Unfortunately, most chunk-based classiﬁcation methods assume that the size of the data chunk is set a priori and remains unchanged during processing.
Instead of chunk-based learning, the algorithm can learn incrementally (online)
as well. Training examples arrive one by one at a given time, and they are not
kept in memory. The advantage of this solution is the need for small memory
resources. However, the eﬀort of the computation necessary to update the model after processing each individual object may be unacceptable, especially for high-velocity data streams, e.g., in Internet of Things (IoT) applications.
When processing a non-stationary data stream, we can rely on a drift detector
to point moments when data distribution has changed and take appropriate
actions. The alternative is to use inherent adaptation properties of models
(update & forget). In the following subsections, we will discuss both of these approaches.
2.3 Drift detection methods
A drift detector is an algorithm that can inform about any changes taking place within data stream distributions. The data labels or a classiﬁer’s performance (measured using any metric, such as accuracy) are required to detect a real concept drift. We have to realize that drift detection is a non-trivial task. The detection should be done as quickly as possible to replace an outdated model and minimize restoration time. On the other hand, false alarms are unacceptable, as they will lead to an incorrect model adaptation and resource spending where there is no need for it. DDM (Drift Detection Method) is one of the most popular detectors; it incrementally estimates the error of a classiﬁer. Because we assume the convergence of the classiﬁer training method, the error should decrease with the appearance of subsequent learning objects. If the reverse behavior is observed, then we may suspect a change of probability distributions. DDM uses the three-sigma rule to detect a drift. EDDM (Early Drift Detection Method) is an extension of DDM, where the window size selection procedure is based on the same heuristics. Additionally, the distance error rate is used instead of the classiﬁer’s error rate. Blanco et al. proposed very interesting drift detectors that use the non-parametric estimation of the classiﬁer error employing Hoeﬀding’s and McDiarmid’s inequalities.
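The three-sigma rule behind DDM can be illustrated with a short sketch. The class below is a simpliﬁed illustration under our own naming (`SimpleDDM`, `drift_sigma`), not the original authors’ implementation: it tracks the running error rate and its standard deviation, remembers their best (lowest) levels, and signals drift when the current level exceeds the best one by three sigmas.

```python
import math

class SimpleDDM:
    """Simplified DDM-style detector: tracks the classifier's error rate
    example by example and applies the three-sigma rule."""

    def __init__(self, drift_sigma=3.0):
        self.drift_sigma = drift_sigma
        self.reset()

    def reset(self):
        self.n = 0
        self.p = 1.0                 # running error-rate estimate
        self.s = 0.0                 # its standard deviation
        self.p_min = float("inf")    # best (lowest) levels seen so far
        self.s_min = float("inf")

    def update(self, error):
        """error: 1 if the last example was misclassified, else 0.
        Returns True when a drift is signalled (and resets the detector)."""
        self.n += 1
        self.p += (error - self.p) / self.n
        self.s = math.sqrt(self.p * (1.0 - self.p) / self.n)
        if self.p + self.s < self.p_min + self.s_min:
            self.p_min, self.s_min = self.p, self.s
        # drift: error rate grew three sigmas beyond its best level
        if self.n > 30 and self.p + self.s > self.p_min + self.drift_sigma * self.s_min:
            self.reset()
            return True
        return False
```

Feeding such a detector a stream of per-example 0/1 errors, a sudden jump in the error rate triggers the drift signal.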
2.4 Ensemble methods
One of the most promising data stream classiﬁcation research directions, which usually employs chunk-based data processing, is the classiﬁer ensemble approach. Its advantage is that the classiﬁer ensemble can easily adapt to the concept drift using diﬀerent updating strategies:
•Dynamic combiners – individual classiﬁers are trained in advance and are not updated anymore. The ensemble classiﬁer adapts to the changing data distribution by changing the combination rule parameters.
•Updating training data – incoming examples are used to retrain component classiﬁers (e.g., online bagging).
•Updating ensemble members [64, 67].
•Changing ensemble lineup – replacing outdated classiﬁers in the ensemble, e.g., new individual models are trained on the most recent data and added to the ensemble. An ensemble pruning procedure is applied, which chooses the most valuable set of individual classiﬁers.
A comprehensive overview of techniques using classiﬁer ensembles was presented by Krawczyk et al. Let us shortly characterize some popular strategies used during the experiments. Streaming Ensemble Algorithm (SEA) is a simple classiﬁer ensemble with a changing lineup, where the individual classiﬁers are trained on successive data chunks. To keep the model up-to-date, the base classiﬁers with the lowest accuracy are removed from the ensemble. Wang et al. proposed Accuracy Weighted Ensemble (AWE), employing weighted voting rules, where the weights depend on the accuracy obtained on the testing data. Brzezinski and Stefanowski proposed Accuracy Updated Ensemble (AUE), which extends AWE by using online classiﬁers and updating them according to the current distribution. Wozniak et al. developed Weighted Aging Ensemble (WAE), which trains base classiﬁers on successive data chunks, and the ﬁnal decision is made by weighted voting, where the weights depend on accuracy and ensemble diversity. This algorithm additionally employs the decoy function to decrease the weights of outdated individuals.
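The “changing lineup” strategy can be sketched as follows. This is a minimal SEA-like illustration under a hypothetical name (`SimpleChunkEnsemble`), not the exact SEA algorithm: one new base classiﬁer is trained per chunk, and the member scoring worst on the newest chunk is pruned.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

class SimpleChunkEnsemble:
    """SEA-style 'changing lineup' sketch: train one new base classifier
    per chunk and prune the member that scores worst on the newest chunk."""

    def __init__(self, max_size=10):
        self.max_size = max_size
        self.members = []

    def partial_fit(self, X_chunk, y_chunk):
        self.members.append(GaussianNB().fit(X_chunk, y_chunk))
        if len(self.members) > self.max_size:
            scores = [m.score(X_chunk, y_chunk) for m in self.members]
            self.members.pop(int(np.argmin(scores)))  # drop the weakest

    def predict(self, X):
        votes = np.stack([m.predict(X) for m in self.members])
        # majority vote over the current lineup
        return np.apply_along_axis(lambda col: np.bincount(col).argmax(), 0, votes)
```

The weighted variants (AWE, WAE) follow the same lineup mechanics but replace the majority vote with accuracy- or diversity-based weights.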
2.5 Existing evaluation methodology
Because this work mainly focuses on improving classiﬁer behavior after the concept drift appearance, apart from the classiﬁer’s predictive performance, we should also consider memory consumption, the time required to update the model, and the time needed to make a decision. However, it should also be possible to evaluate how the model reacts to changes in the data distribution. Shaker and Hüllermeier presented a complete framework for evaluating the recovery rate, including the proposition of two metrics: restoration time and maximum performance loss. In this framework, the notion of pure streams was introduced, i.e., streams containing only one concept. Two pure streams are mixed into a third stream, starting with concepts only from the ﬁrst stream and gradually increasing the percentage of concepts from the second stream. Restoration time was deﬁned as the length of the time interval between two events: ﬁrst, the performance measured on the mixed stream drops below 95% of the ﬁrst pure stream’s performance, and then the performance on the mixed stream rises above 95% of the second pure stream’s performance. The maximum performance loss is the diﬀerence between the maximum performance and the lowest performance on either pure stream. Zliobaite et al. proposed that evaluating the proﬁt from the model update should consider the memory and computing resources involved in its update.
2.6 Computational and memory requirements
While designing a data stream classiﬁer, we should also consider the computation
power and memory limitations and that we usually have limited access to data
labels. These data stream characteristics pose the need for other algorithms than
ones previously developed for batch learning, where data are stored inﬁnitely and
persistently. Such learning algorithms cannot fulﬁll all data stream requirements,
such as memory usage constraints, limited processing time, and one scan of
incoming examples. However, simple incremental learning is usually insuﬃcient, as it does not meet tight computational demands and does not tackle the evolving nature of data sources.
Constraints on memory and time have resulted in diﬀerent windowing tech-
niques, sampling (e.g., reservoir sampling), and other summarization approaches.
Also, we have to realize that when the concept drift appears, data from the past
may become irrelevant or even harmful for the current models, deteriorating the
predictive performance of the classiﬁers. Thus an appropriate implementation of
a forgetting mechanism (where old data instances are discarded) is crucial.
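The reservoir sampling mentioned above (Vitter’s Algorithm R) keeps a uniform sample of ﬁxed size in bounded memory; a minimal sketch:

```python
import random

def reservoir_sample(stream, k, seed=None):
    """Vitter's Algorithm R: maintain a uniform random sample of k items
    from a stream of unknown length using O(k) memory."""
    rng = random.Random(seed)
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            reservoir.append(item)      # fill the reservoir first
        else:
            j = rng.randint(0, i)       # item i survives with prob k/(i+1)
            if j < k:
                reservoir[j] = item
    return reservoir
```

Combined with a forgetting mechanism, such summaries let a stream learner respect its memory budget regardless of stream length.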
2.7 Other approaches that modify chunk size
Dynamic chunk size adaptation was proposed in some works earlier [
Liu et al. [
] utilize information about the occurrence of drift from drift detector.
If drift occurs in the middle of the chunk, data is divided into two chunks, hence
dynamic chunk size. If there is no drift inside the chunk, the whole batch is used.
In the prepared chunk, the majority class is undersampled. A new classiﬁer is
trained and added to the ensemble, and older classiﬁers are updated. Lu et al.
] also utilize an ensemble framework for imbalanced stream learning. In this
approach, chunk size grows incrementally. Two chunks are compared based on
ensembles predictions variance. An algorithm for calculating prediction variance
called subunderbagging is introduced. Computed variance is compared using
F-test. Chunk size increases if the p-value is less than a predeﬁned threshold;
otherwise, the whole ensemble is updated with the selected chunk size. The
whole process repeats as long as the p-value is lower than the threshold. In both
of these works, dynamic chunk size was used as means of handling imbalanced
data streams. In contrast, we show that changing chunk size can be beneﬁcial
when handling concept drifts in general. Therefore, we do not focus primarily on
Bifet et al. introduced a method for handling concept drift with varying chunk sizes. Each incoming chunk is divided into two parts: older and newer. The empirical means of the data in each subchunk are compared using the Hoeﬀding bound. If the diﬀerence between the two means exceeds the threshold deﬁned by the conﬁdence value, then the data in the older window is qualiﬁed as out of date and dropped. Later, the window with data from the current concept grows until the next drift is detected and the data is split again. This approach allows for detecting drift inside the chunk.
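The window comparison described above can be sketched as follows. This is a simpliﬁed single-split illustration of the idea (the actual method checks many split points); `hoeffding_split` and `delta` are our own names, and the scores are assumed to lie in [0, 1].

```python
import math

def hoeffding_split(window, delta=0.002):
    """Compare the means of an older and a newer sub-window (single midpoint
    split); drop the older part when the difference exceeds the Hoeffding
    bound. Values are assumed to lie in [0, 1]."""
    n = len(window)
    if n < 4:
        return window, False
    old, new = window[:n // 2], window[n // 2:]
    mean_old = sum(old) / len(old)
    mean_new = sum(new) / len(new)
    m = 1.0 / (1.0 / len(old) + 1.0 / len(new))   # harmonic mean of sizes
    eps = math.sqrt(math.log(4.0 / delta) / (2.0 * m))
    if abs(mean_old - mean_new) > eps:
        return new, True    # older data is out of date: keep only the new part
    return window, False
```

On a stable window the means agree and nothing is dropped; after an abrupt change the older half is discarded.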
3 Methods

This paper presents a general framework that can be used for training any chunk-based classiﬁer ensemble. This approach aims to reduce the restoration time, i.e.,
a period needed to stabilize the classiﬁcation model performance after concept
drift occurs. As we mentioned, most methods assume a ﬁxed data chunk size,
which is a parameter of these algorithms. Our proposal does not modify the
core of a learning algorithm itself. Still, based on the predictive performance
estimated on a given data chunk, it only indicates what data chunk size is to be
taken by a given algorithm in the next step. We provide a schema of our method
in Fig. 1. The intuition tells us that after the occurrence of the concept drift, the
size of the chunk should be small to quickly train new models that will replace
the models learned on the data from the previous concept in the ensemble. When
the stabilization is reached, the ensemble contains base models trained on data from the new concept. At this moment, we can extend the chunk size so that classiﬁers in the ensemble can achieve better performance and even greater stability by learning on larger portions of data from the stream, because the analyzed concept is already stable.
Figure 1: Chunk-Adaptive Restoration visualization. The red line marks the concept drift, the green line marks the stabilization.
Let us present the proposed framework in detail.
3.1 Chunk-Adaptive Restoration
Starting the learning process, we sample the data from the stream with a constant chunk size c and monitor the classiﬁer performance using a concept drift detector to detect changes in data distribution. When the drift occurs, we decrease the chunk size to a smaller value c_d, which is the predeﬁned size of a batch for the concept drift. The size of subsequent chunks after the drift at a given time t is computed using the following equation:

c_t = min(⌊α · c_{t−1}⌋, c) (1)

where α > 1 is a constant determining how fast the chunk size grows. The chunk size grows continuously with each step to reach the base size c, unless the stabilization is detected. Then the chunk size is set to c immediately. Let us introduce the Variance-based Stabilization Detection Method (VSDM) to detect the predictive performance stabilization. First, we deﬁne the ﬁxed-size sliding window W containing the last k predictive performance metric values obtained for the most recent chunks. We also introduce the stabilization threshold s. The stabilization is detected when the following condition is met:

Var(W) < s (2)

where Var(W) is the variance of the scores obtained for the last k chunks. An exemplary data stream with detected drift and stabilization is presented in Fig. 2. The primary assumption of the proposed method is a faster model adaptation caused by the increased number of updates after a concept drift. This strategy allows for using the larger chunk sizes when the data is not changing. It also reduces the computational costs of retraining models. Alg. 1 presents the whole procedure.
Our method works with existing models for online learning. For this reason, we
argue that the approach proposed in this paper is easier to deploy in practice.
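The condition Var(W) < s can be sketched as a small helper; the function names and defaults here are illustrative, not the paper’s implementation:

```python
from collections import deque

def make_vsdm(window_size=30, threshold=0.001):
    """Variance-based Stabilization Detection sketch: returns a callable
    that signals stabilization once the variance of the last `window_size`
    scores drops below `threshold`."""
    window = deque(maxlen=window_size)

    def update(score):
        window.append(score)
        if len(window) < window_size:
            return False                     # not enough scores yet
        mean = sum(window) / window_size
        var = sum((x - mean) ** 2 for x in window) / window_size
        return var < threshold

    return update
```

While the scores oscillate, the variance stays above the threshold; once they settle, stabilization is signalled.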
Figure 2: Exemplary accuracy for a data stream with an abrupt concept drift. The red line denotes drift detection, the green line stabilization detection, and the blue line the beginning of the real drift.
3.2 Memory and time complexity
Our method only impacts the size of the chunk. All other factors, like the number of features or the number of classiﬁers in the ensemble, are the same as in the basic approach. For this reason, we will focus here only on the impact of the chunk size on memory and time complexity. Regarding memory complexity, our method could impact only the size of the buﬀers for storing samples from a stream. When no drift is detected, the standard chunk size is used. This dictates the required size of the buﬀers for storing samples. For this reason, the memory complexity of storing samples is the same as in the base approach, as it is determined by the base chunk size c.
CAR works the same way as the base method when no drift is detected and the data stream is stable. Therefore, in this case, the time complexity is the same as in the base method. When drift is detected, the sizes of subsequent chunks are changed.

Algorithm 1 Chunk-Adaptive Restoration algorithm
Input: m - model
S - data stream
dd - drift detector
sd - stabilization detector
n - number of chunks
t - chunk index
c - base chunk size
c_d - base drift chunk size
c_t - t-th chunk size
test() - procedure that tests the model with a chunk and returns the predictive performance metric (ppm)
train() - procedure that trains the model with a chunk
change_detected() - procedure that informs about drift occurrence with the drift detector and the last score
stabilization_detected() - procedure that detects stabilization with the stabilization detector and the stabilization window
1: for t = 1 to n do
2:   ppm ← test(m, S(t))
3:   if stabilization_detected(sd, ppm) then
4:     c_t ← c
5:   else
6:     c_t ← min(⌊α · c_{t−1}⌋, c)
7:   end if
8:   if change_detected(dd, ppm) then
9:     c_t ← c_d
10:  end if
11:  train(m, S(t))
12: end for

Time complexity depends on the model complexity g(·) and the number of learning examples provided to the model to train on. For simplicity, we assume that g(·) represents both the ensemble and the base model complexity. With these assumptions, the time complexity of the base model (when CAR is not enabled) is O(n · g(c)). When CAR is enabled and concept drift is detected, the chunk size is changed to c_d. Each consecutive chunk at time t after the drift has size α^t · c_d, with t = 0 directly after the drift was detected. The chunk size grows until stabilization is detected or the current chunk size is restored to the original size c. For simplicity, we skip the case when stabilization is detected. With this assumption, we write the condition for restoring the original chunk size:

α^{t_s} · c_d = c (3)

where t_s is the time when the chunk size is restored to its original value. From this equation we obtain t_s directly:

t_s = log_α(c / c_d) (4)

The number of operations required by CAR after concept drift was detected is:

Σ_{t=0}^{t_s} g(α^t · c_d) (5)

Using big-O notation:

O(Σ_{t=0}^{t_s} g(α^t · c_d)) = O(g(α^{t_s} · c_d)) = O(g((c / c_d) · c_d)) = O(g(c)) (6)

Therefore, the CAR time complexity depends only on the chunk size and the computational complexity of the used models.
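Putting the pieces together, Algorithm 1 can be sketched in Python as below. The model, the detectors, and the `next_chunk` callable are placeholder interfaces of our own naming, and the loop follows a test-then-train order:

```python
def car_stream_loop(model, next_chunk, drift_detected, stabilization_detected,
                    c=1000, c_drift=30, alpha=1.1):
    """Sketch of Algorithm 1. `next_chunk(size)` is a hypothetical callable
    returning the next (X, y) chunk of `size` samples, or None at stream end;
    `model` exposes score() and partial_fit() as in scikit-learn."""
    chunk_size = c
    history = []
    while True:
        chunk = next_chunk(chunk_size)
        if chunk is None:
            break
        X, y = chunk
        ppm = model.score(X, y)                           # test first ...
        history.append(ppm)
        if stabilization_detected(ppm):
            chunk_size = c                                # restore base size
        else:
            chunk_size = min(int(alpha * chunk_size), c)  # grow towards c
        if drift_detected(ppm):
            chunk_size = c_drift                          # shrink on drift
        model.partial_fit(X, y)                           # ... then train
    return history
```

After a drift the requested chunk sizes shrink to `c_drift` and then grow geometrically back towards the base size `c`.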
3.3 Sample Restoration
Restoration time cannot be directly utilized in this work, as we do not have access to pure streams with separate concepts. For this reason, we introduce a new Sample Restoration (SR) metric to evaluate the Chunk-Adaptive Restoration performance compared to standard methods used for learning models on data streams with concept drift. We assume that there is a sequence of chunks between two stabilization points. Each element of such a sequence is determined by the chunk size c_t and the achieved model’s accuracy acc_t. Let us deﬁne the index of the minimum accuracy as:

t_min = argmin_t acc_t (7)

and the restoration threshold as:

thr = p · max_{t > t_min} acc_t (8)

where p ∈ (0, 1) is the percentage of the performance that has to be restored, and the multiplier is the maximum accuracy score of our model after the point when it achieved its minimum score. Finally, we look for the lowest index t_r at which the model exceeds the assumed restoration threshold:

t_r = min{t > t_min : acc_t ≥ thr} (9)

Sample Restoration is computed as the sum of the chunk sizes from the concept drift’s beginning to t_r:

SR(p) = Σ_{t=0}^{t_r} c_t (10)

In general, SR is the number of samples needed to obtain p percent of the maximum performance achieved on the subsequent task.
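Under the deﬁnitions above, Sample Restoration can be computed in a few lines; this sketch assumes the accuracy and chunk-size sequences start at the drift:

```python
def sample_restoration(chunk_sizes, accuracies, p=0.9):
    """Sample Restoration sketch: samples consumed until accuracy regains
    p times the best accuracy reached after its post-drift minimum.
    Both sequences describe the chunks between two stabilization points,
    starting at the drift."""
    t_min = min(range(len(accuracies)), key=accuracies.__getitem__)
    threshold = p * max(accuracies[t_min:])
    for t in range(t_min, len(accuracies)):
        if accuracies[t] >= threshold:
            return sum(chunk_sizes[:t + 1])
    return sum(chunk_sizes)   # performance never restored in this sequence
```

For example, with equal chunk sizes of 100 and accuracies dipping from 0.9 to 0.5 before recovering to 0.85, the metric counts the samples consumed up to the ﬁrst chunk that crosses the threshold.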
4 Experiment

Chunk-Adaptive Restoration is a method designed to reduce the number of samples
used to restore the model’s performance during the concept drift. We expect to
signiﬁcantly reduce the Sample Restoration for each trained model depending
on the chunk size adaptation level. The experimental study was formulated to
answer the following research questions:
RQ1: How do diﬀerent chunk sizes impact predictive performance?
RQ2: How does the Chunk-Adaptive Restoration inﬂuence the learning process?
RQ3: How many samples can be saved during the restoration phase?
RQ4: How do diﬀerent classiﬁer ensemble models behave with Chunk-Adaptive Restoration applied?
RQ5: How robust to noise is Chunk-Adaptive Restoration?
4.1 Experiment setup
Experiments were carried out using both synthetic and real datasets. The stream-learn library was employed to generate the synthetic data containing three types of concept drift: abrupt, gradual, and incremental, all generated with recurring or unique concepts. We tested parameters such as chunk sizes and the stream length for each type of concept drift. All streams were generated with 5 concept drifts, 2 classes, and 20 input features, of which 2 were informative and 2 were redundant. In the case of incremental and gradual concept drifts, sigmoid spacing was set to 5. Apart from the synthetic ones, we employed the Usenet and Insects data streams. Unfortunately, the original Usenet dataset contains a small number of samples, so two selected concepts were repeated to create a recurring-drifted data stream. Each chunk of the Insects data stream was randomly oversampled because of the signiﬁcant imbalance ratio. Tab. 1 contains a detailed description of all utilized data streams.
# Source Drift type Recurrence Chunk size Stream length
1 stream-learn abrupt recurring 500 300000
2 stream-learn abrupt recurring 1000 150000
3 stream-learn abrupt recurring 10000 60000
4 stream-learn abrupt recurring 500 250000
5 stream-learn abrupt nonrecurring 500 300000
6 stream-learn abrupt nonrecurring 1000 150000
7 stream-learn abrupt nonrecurring 10000 60000
8 stream-learn abrupt nonrecurring 500 250000
9 stream-learn gradual recurring 500 300000
10 stream-learn gradual recurring 1000 150000
11 stream-learn gradual recurring 10000 60000
12 stream-learn gradual recurring 500 250000
13 stream-learn gradual nonrecurring 500 300000
14 stream-learn gradual nonrecurring 1000 150000
15 stream-learn gradual nonrecurring 10000 60000
16 stream-learn gradual nonrecurring 500 250000
17 stream-learn incremental recurring 500 300000
18 stream-learn incremental recurring 1000 150000
19 stream-learn incremental recurring 10000 60000
20 stream-learn incremental recurring 500 250000
21 stream-learn incremental nonrecurring 500 300000
22 stream-learn incremental nonrecurring 1000 150000
23 stream-learn incremental nonrecurring 10000 60000
24 stream-learn incremental nonrecurring 500 250000
25 usenet abrupt recurring 1000 120000
26 insects-abrupt-imbalanced abrupt nonrecurring 1000 355275
27 insects-gradual-imbalanced gradual nonrecurring 1000 143323
Table 1: Data streams used for experiments.
The Fast Hoeﬀding Drift Detection Method (FHDDM) was employed as a concept drift detector. We used a publicly available implementation. The size of the window in FHDDM was equal to 1000, and the allowed error probability was δ = 0.000001.
Three classiﬁer ensembles dedicated to data stream classiﬁcation were chosen for comparison:
•Weighted Aging Ensemble (WAE),
•Accuracy Weighted Ensemble (AWE),
•Streaming Ensemble Algorithm (SEA).
All ensembles contained 10 base classiﬁers.
In our experiments, we apply the models mentioned
above to selected data streams with concept drift. We measure Sample Restora-
tion. These results are reported as a baseline. Next, we apply Chunk-Adaptive
Restoration and repeat experiments to establish the proposed model’s inﬂuence
on the ability to handle concept drift quickly. As the experiments were conducted
with the balanced data, the accuracy was used as the only indicator of the model’s
performance. As the experimental protocol Test-Then-Train was employed [
Because Sample Restoration can be computed for each
drift and concept drift can occur multiple times, we report average Sample
Restoration for each stream with standard deviation. To assess the statistical
signiﬁcance of the results, we used a one-sided Wilcoxon signed-rank test in a
direct comparison between the models with the 95% conﬁdence level.
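Such a comparison can be carried out with SciPy’s `wilcoxon`; the Sample Restoration values below are purely illustrative numbers, not our experimental results:

```python
from scipy.stats import wilcoxon

# Hypothetical Sample Restoration values per stream for a baseline ensemble
# and the same ensemble with CAR (illustrative numbers only).
sr_baseline = [59210, 58489, 55328, 52846, 60010, 57400]
sr_with_car = [41000, 39500, 42100, 38000, 45000, 40200]

# One-sided test: is SR with CAR significantly LOWER than the baseline?
stat, p_value = wilcoxon(sr_with_car, sr_baseline, alternative="less")
significant = p_value < 0.05   # 95% confidence level
```

The `alternative="less"` setting makes the test one-sided, matching the directional claim that CAR reduces Sample Restoration.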
To enable independent reproduction of our experiments, we provide a GitHub repository with the code. This repository also contains detailed results of all experiments. The stream-learn implementation of the ensemble models was utilized, with Gaussian Naïve Bayes and CART from sklearn as base classiﬁers. Detailed information about the used packages is provided in the yml ﬁle with the speciﬁcation of the conda environment.
4.2 Impact of chunk size on performance
In our ﬁrst experiment, we examine the impact of the chunk size on the model
performance and general capability for handling data with concept drift. We
train the AWE model on a synthetic data stream with diﬀerent chunk sizes to
evaluate these properties. The stream consists of 20 features, 2 classes, and it
contains only 1 abrupt drift. Results are presented in Fig. 3. As expected, chunk
size has an impact on the maximal accuracy that the model can achieve. It is
especially visible before drift, where models with larger chunks obtain the best
accuracy. Also, with larger chunks, the variance of accuracy is lower. In ensemble-based approaches, a base classiﬁer is trained on a single chunk. A larger chunk means that more data is available to the underlying model, therefore allowing the training of a more accurate model. Interestingly, we can see that for all chunk sizes, performance is restored at roughly the same time. Regardless of the chunk size, a similar number of updates is required to bring back the model performance. Please keep in mind that the x-axis in Fig. 3 is the number of chunks. It means that models trained on larger chunks require a larger number of learning examples to restore accuracy.
Figure 3: Impact of chunk size on obtained accuracy.
These results give the rationale behind our method. When drift is detected,
we change chunk size to decrease the consumption of learning examples required
for restoring accuracy. Next, we gradually increase chunk size to improve the
maximum possible performance when the model recovers from drift. It allows for
a quick reaction to drift and does not limit the model’s maximum performance.
In principle, not all models are compatible with changing chunk size. Also, batch
size cannot be decreased indefinitely. The minimal chunk size should be determined
case by case, depending on the base learner used in the ensemble or, more generally,
on the used model. Later in our experiments, we use chunk sizes of 500, 1000, and 10000
to obtain a reliable estimate of how our method performs in different settings.
4.3 Hyperparameter tuning
After the chunk size was selected, we fine-tuned the other hyperparameters and then
proceeded to further experiments. First, we set two values manually, based on our
observations: the growth constant (i.e., the constant that determines how fast the
chunk size grows after a drift was detected), equal to 1.1, and the drift chunk size,
equal to 30, as this is a typical window length in drift detectors.
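The resulting adaptation schedule can be sketched as a small helper (an illustrative sketch; the function name is ours, and the defaults follow the values stated above):

```python
def chunk_size_schedule(base_size, drift_size=30, growth=1.1):
    """After a drift is detected, start from the small drift chunk size and
    grow it geometrically until the base chunk size is restored."""
    size = drift_size
    while size < base_size:
        yield size
        size = min(int(size * growth), base_size)
    yield base_size

# grows from 30 back up to the base chunk size of 1000
sizes = list(chunk_size_schedule(base_size=1000))
```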
Next, we find the best values for the stabilization window size and the stabilization
threshold. We conduct a grid search with window size values 30, 50, 100 and
stabilization thresholds 0.1, 0.01, 0.001, 0.0001. For the experiments, we use synthetic
data streams 1-24 from Tab. 1. The used data streams have different random number
generator seeds in this and later experiments. Results were collected for the WAE,
AWE, and SEA ensembles with the Naïve Bayes base model. We use Sample Restoration
0.8 as the performance indicator. For each set of parameters, Sample Restoration
was averaged over all used streams to obtain a single value. Results are provided in
Tab. 2.
                         drift chunk size
stabilization threshold      30        50       100
0.1                      59210.11  59210.11  59210.11
0.01                     58489.47  58675.99  58709.98
0.001                    55328.20  55363.95  57669.70
0.0001                   52846.04  55962.58  62398.56

Table 2: Sample Restoration 0.8 for various hyperparameter settings. Lower is better.
From the provided data, we can conclude that the smaller the drift chunk size,
the lower the SR. This observation is in line with the intuition behind our method.
Smaller drift chunk size provides a larger beneﬁt during drift compared to normal
chunk size. The same dependency can be observed for the stabilization threshold.
Intuitively, a lower threshold means that stabilization is harder to reach. We
argue that this can be beneﬁcial in some cases when working with gradual or
incremental drift. In this scenario, if stabilization is reached too fast, then chunk
size is immediately brought back to the standard size, and there is no beneﬁt
from a smaller chunk size at all. Lowering the stabilization threshold could help
in these cases. In later experiments, we use the stabilization window size equal
to 30 and the variance stabilization threshold equal to 0.0001.
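Our reading of the stabilization criterion can be sketched as follows (a hedged sketch, not the paper's exact detector; `window` and `threshold` correspond to the stabilization window size and variance threshold chosen above):

```python
def is_stabilized(accuracies, window=30, threshold=1e-4):
    """Declare stabilization when the variance of the last `window`
    chunk accuracies falls below `threshold`."""
    if len(accuracies) < window:
        return False
    recent = accuracies[-window:]
    mean = sum(recent) / window
    variance = sum((a - mean) ** 2 for a in recent) / window
    return variance < threshold
```

A flat accuracy curve is declared stable, while a still-recovering (steadily rising) curve is not, which is exactly the behavior needed to decide when the chunk size may return to its base value.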
4.4 Impact on concept drift handling capability
In this part of the experiments, we compare the performance of the proposed
method to the baseline. Results were collected following the experimental protocol
described in the previous sections. To save space, we do not provide results for
all models and streams. Instead, we plot accuracy achieved by models on selected
data streams. These results are presented in Figs. 4, 5, 6, and 7. All learning
curves were smoothed using a 1D Gaussian filter with σ = 1.
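The smoothing step can be reproduced with a plain-numpy stand-in for `scipy.ndimage.gaussian_filter1d` (an illustrative sketch; the truncation radius is our choice):

```python
import numpy as np

def gaussian_smooth(curve, sigma=1.0, radius=4):
    """1D Gaussian smoothing of a learning curve with reflected boundaries."""
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-0.5 * (x / sigma) ** 2)
    kernel /= kernel.sum()  # normalize so the curve's level is preserved
    padded = np.pad(np.asarray(curve, float), radius, mode="reflect")
    return np.convolve(padded, kernel, mode="valid")

noisy = [0.9, 0.5] * 4           # oscillating raw accuracy values
smooth = gaussian_smooth(noisy)  # same length, reduced oscillation
```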
From the provided plots, we can deduce that the largest gains from employing the
CAR method are observed for abrupt data streams. In streams with gradual
and incremental drifts, there are few or no sudden drops of accuracy that the
model can quickly react to. For this reason, the CAR method does not provide a
large benefit with these kinds of concept drift. During a more detailed analysis of
the obtained results, we observed that stabilization is hard to detect for gradual
and incremental drifts. Many false positives usually cause an early return to
the original chunk size, influencing the performance achieved on those two types
of drift. FHDDM caused another problem regarding the early detection of
gradual and incremental concept drifts. Usually, early detection is a desired feature,
but in our method it initiates the chunk size change while the two
data concepts still overlap during stream processing. As the transition
between two concepts takes much time, when one concept starts to dominate, the
chunk size may be restored to its original value too early, affecting the achieved
performance.
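For context, a simplified reconstruction of the FHDDM rule mentioned above (window size and δ here are illustrative defaults, not the configuration used in the paper): the detector tracks accuracy over a sliding window of prediction outcomes and signals drift when it falls far enough below its historical maximum, with the margin given by Hoeffding's bound.

```python
from collections import deque
from math import log, sqrt

class FHDDM:
    """Minimal sketch of the Fast Hoeffding Drift Detection Method."""
    def __init__(self, n=100, delta=1e-7):
        self.window = deque(maxlen=n)
        self.eps = sqrt(log(1 / delta) / (2 * n))  # Hoeffding margin
        self.mu_max = 0.0

    def add(self, correct):
        """Feed one prediction outcome; return True when drift is signaled."""
        self.window.append(1 if correct else 0)
        if len(self.window) < self.window.maxlen:
            return False
        mu = sum(self.window) / len(self.window)
        self.mu_max = max(self.mu_max, mu)
        return self.mu_max - mu > self.eps

detector = FHDDM()
# perfect predictions, then an abrupt failure: drift is flagged shortly after
signals = [detector.add(t < 150) for t in range(300)]
```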
We also observe larger gains from applying CAR on streams with a bigger chunk
size; to illustrate, compare the results from Fig. 4 with Fig. 5. One possible
explanation behind this trend is that the gains obtained from employing CAR are
proportional to the difference between the base and the drift chunk size. In
our experiments, the drift chunk size was equal to 30 for all streams and models.
This explanation is also in line with the results of the hyperparameter experiments
provided in Tab. 2.
We conclude this section by providing a statistical analysis of our results.
Tab. 3 shows the results of the Wilcoxon test for Naïve Bayes and CART base
models. We found significant differences in Sample Restoration between the
baseline and the CAR method for all models.
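For reference, the Wilcoxon signed-rank statistic reported in Tab. 3 can be computed as follows (a minimal sketch without zero-handling refinements or the p-value computation; in practice a library routine such as `scipy.stats.wilcoxon` would be used):

```python
def wilcoxon_statistic(baseline, car):
    """Wilcoxon signed-rank statistic for paired samples, e.g. Sample
    Restoration values of a baseline model and its CAR variant."""
    diffs = [b - c for b, c in zip(baseline, car) if b != c]
    ranked = sorted(diffs, key=abs)
    ranks = {}
    i = 0
    while i < len(ranked):  # assign average ranks to tied absolute differences
        j = i
        while j < len(ranked) and abs(ranked[j]) == abs(ranked[i]):
            j += 1
        for k in range(i, j):
            ranks[k] = (i + j + 1) / 2  # 1-based average rank
        i = j
    w_plus = sum(r for k, r in ranks.items() if ranked[k] > 0)
    w_minus = sum(r for k, r in ranks.items() if ranked[k] < 0)
    return min(w_plus, w_minus)
```

A statistic of 0 (as for SEA with Naïve Bayes at SR(0.9)) means every paired difference has the same sign, i.e., one method dominated on every stream.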
Figure 4: Accuracy for stream-learn data stream (1).
Figure 5: Accuracy for the Usenet dataset.
Figure 6: Accuracy for the abrupt Insects dataset.
Figure 7: Accuracy for the gradual Insects dataset.
         SR(0.9)              SR(0.8)              SR(0.7)
     Statistic  p-value   Statistic  p-value   Statistic  p-value
WAE    40.0     0.0006      30.0     0.0002      45.0     0.0009
AWE    22.0     9.675e-05   26.0     0.0001      36.0     0.0004
SEA     0.0     1.821e-05   23.0     0.0001       1.0     1.389e-05

         SR(0.9)              SR(0.8)              SR(0.7)
     Statistic  p-value   Statistic  p-value   Statistic  p-value
WAE    14.0     6.450e-05   54.0     0.003       55.0     0.003
AWE     0.0     1.229e-05    6.0     2.543e-05   21.0     0.0001
SEA    23.0     0.0001      43.0     0.001       42.0     0.001

Table 3: Wilcoxon test results for the Naïve Bayes and CART base models.
4.5 Impact of noise on the CAR eﬀectiveness
Real-world data often contain noise in labeling. For this reason, we evaluate
if the proposed method can be used for data with varying amounts of noise in
labels. We generate a synthetic data stream with two classes, base chunk size
1000, drift chunk size 100, and single, abrupt concept drift. We randomly select a
predeﬁned fraction of samples in each chunk and ﬂip labels for selected learning
examples. Next, we measure the accuracy of the AUE model with Gaussian
Naïve Bayes base model on a generated dataset with noise levels 0, 0.1, 0.2, 0.3,
and 0.4. Results are presented in Fig. 8. We note that for low levels of noise, i.e., up
to 0.3, restoration time is shorter. With a larger amount of noise, there is no
sudden drop in accuracy; therefore, CAR has no impact on the speed of reaction to drift.
It should be noted that the results for CAR with noise levels 0.2, 0.3, and 0.4 were
generated with the stabilization detector turned off. With a higher amount of
noise, stabilization was detected very quickly, and the chunk size was therefore rapidly
reset to the base value; in this case, there was no benefit from applying CAR. This
indicates that the stabilization method should be refined to handle noisy data well.
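The label-flipping procedure described above can be sketched as follows (the helper name and seeding are ours):

```python
import random

def flip_labels(labels, noise_level, n_classes=2, seed=0):
    """Flip a predefined fraction of labels in a chunk to a different class."""
    rng = random.Random(seed)
    noisy = list(labels)
    n_flip = int(noise_level * len(noisy))
    for i in rng.sample(range(len(noisy)), n_flip):
        # shift by 1..n_classes-1 so the new label always differs
        noisy[i] = (noisy[i] + rng.randrange(1, n_classes)) % n_classes
    return noisy

labels = [0, 1] * 500
noisy = flip_labels(labels, noise_level=0.2)  # exactly 20% of labels flipped
```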
4.6 Lessons learned
First, we evaluated the impact of chunk size on the process of learning on a
data stream with a single concept drift. We learned that models with a larger chunk
size can obtain a higher maximum accuracy, but the number of updates required
to restore accuracy is similar regardless of chunk size (RQ1 answered). The main
goal of introducing Chunk-Adaptive Restoration was to prove its advantages
in controlling the number of samples during the restoration period while dealing
with abrupt concept drift. The statistical tests have shown a significant benefit of
employing it in different stream learning scenarios (RQ2 answered). The highest
gains from employing the method were observed when a large original chunk size
was used. With a bigger chunk size, there are fewer model updates, resulting in
a delayed reaction to concept drift.
Figure 8: Impact of noise in labels on the proposed method's effectiveness. (Upper)
baseline accuracy for a synthetic data stream with different noise levels added to
the labels. (Lower) CAR accuracy for the same synthetic data stream. For noise
levels 0.2, 0.3, and 0.4, the stabilization detector was turned off.

The number of samples that can be saved depends on the drift type and the
original chunk size. When dealing with abrupt drift, the sample restoration time
can be around 50% better than the baseline (RQ3 answered). We noticed that
for each of the analyzed classiﬁer ensemble methods, CAR minimized restoration
time and achieved better average predictive performance. It is worth noting
that the simpler the algorithm, the greater the profit from using CAR. The
most considerable profit was observed for SEA and AWE, while in the case of
WAE, the native version sometimes outperformed CAR on the Average Sample
Restoration metric (RQ4 answered). When a small amount of noise is present
in the labels, CAR can still be useful; however, in some cases the stabilization
detector should not be used. With a larger amount of noise, there is no gain from
using the proposed method (RQ5 answered).
This work focused on the Chunk-Adaptive Restoration framework, which is dedicated
to chunk-based data stream classifiers and enables better recovery from concept
drifts. To achieve this goal, we proposed new methods for stabilization detection
and chunk size adaptation. Their usefulness was evaluated in computer
experiments conducted on real and synthetic data streams. The obtained results
show a significant difference between the predictive performance of the baseline
models and the models employing CAR. Chunk-Adaptive Restoration is strongly
recommended for abrupt concept drift scenarios because it can significantly
reduce model downtime. The performance gain is not visible for other types of
concept drift, but it still achieves acceptable results. Future works may focus on:

• Improving the Chunk-Adaptive Restoration behavior for gradual and incremental
concept drifts.
• Adapting Chunk-Adaptive Restoration to the case of limited access to labels
using semi-supervised and active learning approaches.
• Proposing a more flexible method of changing the data chunk size, e.g., based
on the model stability assessment.
• Adapting the proposed method to the imbalanced data stream classification
task, where changing the data chunk size may be correlated with the intensity
of data preprocessing (e.g., the intensity of data oversampling).
• Improving the stabilization method to better handle data streams with noise.
This work is supported by the CEUS-UNISONO programme, which has received
funding from the National Science Centre, Poland under grant agreement No.
Ksieniewicz, P. & Zyblewski, P. stream-learn – open-source Python library for difficult
data stream batch analysis. ArXiv Preprint ArXiv:2001.11077. (2020)
Katakis, I., Tsoumakas, G. & Vlahavas, I. Tracking recurring contexts using ensemble
classiﬁers: An application to email ﬁltering. Knowledge And Information Systems.
22 pp. 371-391 (2010,3)
Wang, H., Fan, W., Yu, P. & Han, J. Mining Concept-Drifting Data Streams
Using Ensemble Classiﬁers. Proceedings Of The Ninth ACM SIGKDD Interna-
tional Conference On Knowledge Discovery And Data Mining. pp. 226-235 (2003),
Brzeziński, D. & Stefanowski, J. Accuracy Updated Ensemble for Data Streams
with Concept Drift. Hybrid Artiﬁcial Intelligent Systems. pp. 155-163 (2011)
Brzezinski, D. & Stefanowski, J. Reacting to Diﬀerent Types of Concept Drift: The
Accuracy Updated Ensemble Algorithm. IEEE Transactions On Neural Networks
And Learning Systems. 25, 81-94 (2014)
Street, N. & Kim, Y. A Streaming Ensemble Algorithm (SEA) for Large-Scale
Classification. Proceedings Of The Seventh ACM SIGKDD International Conference
On Knowledge Discovery And Data Mining. pp. 377-382 (2001)
Oza, N. Online bagging and boosting. 2005 IEEE International Conference On
Systems, Man And Cybernetics. 3 pp. 2340-2345 (2005)
Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O.,
Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A.,
Cournapeau, D., Brucher, M., Perrot, M. & Duchesnay, E. Scikit-learn: Machine
Learning in Python. Journal Of Machine Learning Research. 12 pp. 2825-2830 (2011)
Muhlbaier, M., Topalis, A. & Polikar, R. Learn++.NC: Combining
Ensemble of Classiﬁers With Dynamically Weighted Consult-and-Vote for Eﬃcient
Incremental Learning of New Classes. IEEE Transactions On Neural Networks.
Souza, V., Reis, D., Maletzke, A. & Batista, G. Challenges in Benchmarking
Stream Learning Algorithms with Real-world Data. Data Mining And Knowledge
Discovery. pp. 1-54 (2020)
Sahoo, D., Pham, Q., Lu, J. & Hoi, S. Online Deep Learning: Learning Deep
Neural Networks on the Fly. Proceedings Of The Twenty-Seventh International Joint
Conference On Artiﬁcial Intelligence, IJCAI-18. pp. 2660-2666 (2018,7)
Hinton, G. Connectionist learning procedures. Artificial Intelligence. 40, 185-234 (1989)
Parisi, G., Kemker, R., Part, J., Kanan, C. & Wermter, S. Continual lifelong
learning with neural networks: A review. Neural Networks. 113 pp. 54 - 71 (2019)
Li, X., Zhou, Y., Wu, T., Socher, R. & Xiong, C. Learn to Grow: A Continual
Structure Learning Framework for Overcoming Catastrophic Forgetting. Proceedings
Of The 36th International Conference On Machine Learning.
Kemker, R., Abitino, A., McClure, M. & Kanan, C. Measuring Catastrophic
Forgetting in Neural Networks. ArXiv. abs/1708.02072 (2018)
Zhou, Z., Shin, J., Zhang, L., Gurudu, S., Gotway, M. & Liang, J. Fine-Tuning
Convolutional Neural Networks for Biomedical Image Analysis: Actively and In-
crementally. 2017 IEEE Conference On Computer Vision And Pattern Recognition
(CVPR). pp. 4761-4772 (2017)
Penna, A., Mohammadi, S., Jojic, N. & Murino, V. Summarization and Classiﬁca-
tion of Wearable Camera Streams by Learning the Distributions over Deep Features
of Out-of-Sample Image Sequences. IEEE International Conference On Computer
Vision, ICCV 2017, Venice, Italy, October 22-29, 2017. pp. 4336-4344 (2017)
Pasricha, R., Gujral, E. & Papalexakis, E. Identifying and Alleviating Concept Drift
in Streaming Tensor Decomposition. Machine Learning And Knowledge Discovery In
Databases - European Conference, ECML PKDD 2018, Dublin, Ireland, September
10-14, 2018, Proceedings, Part II. 11052 pp. 327-343 (2018)
Khan, Z., Lehtomäki, J., Shahid, A. & Moerman, I. DEMO: Real-time Edge
Analytics and Concept Drift Computation for Eﬃcient Deep Learning From Spectrum
Data. 39th IEEE Conference On Computer Communications, INFOCOM Workshops
2020, Toronto, ON, Canada, July 6-9, 2020. pp. 1290-1291 (2020)
Yu, L., Twardowski, B., Liu, X., Herranz, L., Wang, K., Cheng, Y., Jui, S. & Weijer,
J. Semantic Drift Compensation for Class-Incremental Learning. 2020 IEEE/CVF
Conference On Computer Vision And Pattern Recognition, CVPR 2020, Seattle,
WA, USA, June 13-19, 2020. pp. 6980-6989 (2020)
Korycki, L. & Krawczyk, B. Adversarial Concept Drift Detection under Poi-
soning Attacks for Robust Data Stream Mining. CoRR.
Lo, Y., Liao, W., Chang, C. & Lee, Y. Temporal Matrix Factorization for Tracking
Concept Drift in Individual User Preferences. IEEE Trans. Comput. Soc. Syst..
Sun, Y., Xue, B., Zhang, M. & Yen, G. Evolving Deep Convolutional Neural
Networks for Image Classiﬁcation. IEEE Trans. Evol. Comput.. 24, 394-407 (2020)
Croce, F. & Hein, M. Minimally distorted Adversarial Examples with a Fast
Adaptive Boundary Attack. Proceedings Of The 37th International Conference On
Machine Learning, ICML 2020, 13-18 July 2020, Virtual Event.
Wang, S. & Zhang, L. Self-adaptive Re-weighted Adversarial Domain Adaptation.
Proceedings Of The Twenty-Ninth International Joint Conference On Artiﬁcial
Intelligence, IJCAI 2020. pp. 3181-3187 (2020)
Gomes, H., Read, J., Bifet, A., Barddal, J. & Gama, J. Machine Learning for
Streaming Data: State of the Art, Challenges, and Opportunities. SIGKDD Explo-
rations Newsletter. 21, 6-22 (2019,11)
Krawczyk, B. & Others Ensemble learning for data stream analysis: A survey. Inf.
Fusion. 37 pp. 132 - 156 (2017)
Schmidhuber, J. Deep learning in neural networks: An overview.. Neural Networks.
61 pp. 85-117 (2015)
Tsymbal, A. The Problem of Concept Drift: Definitions and Related Work. (Trinity
College Dublin, 2004)
Widmer, G. & Kubat, M. Learning in the Presence of Concept Drift and Hidden
Context. Machine Learning. 23 pp. 69-101 (1996)
Liang, K., Li, C., Wang, G. & Carin, L. Generative Adversarial Network Training
is a Continual Learning Problem. ArXiv. abs/1811.11083 (2018)
Zhai, M., Chen, L., Tung, F., He, J., Nawhal, M. & Mori, G. Lifelong GAN:
Continual Learning for Conditional Image Generation. 2019 IEEE/CVF International
Conference On Computer Vision (ICCV). pp. 2759-2768 (2019)
Gama, J., Žliobaite, I., Bifet, A., Pechenizkiy, M. & Bouchachia, A. A
Survey on Concept Drift Adaptation. ACM Comput. Surv..
Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S.,
Courville, A. & Bengio, Y. Generative Adversarial Nets. Proceedings Of The 27th
International Conference On Neural Information Processing Systems - Volume 2. pp.
Liu, Z., Luo, P., Wang, X. & Tang, X. Deep Learning Face Attributes in the Wild.
Proceedings Of International Conference On Computer Vision (ICCV). (2015,12)
Krizhevsky, A., Nair, V. & Hinton, G. CIFAR-10 (Canadian Institute for Advanced
Research). http://www.cs.toronto.edu/~kriz/cifar.html
Xiao, H., Rasul, K. & Vollgraf, R. Fashion-MNIST: a Novel Image Dataset for
Benchmarking Machine Learning Algorithms. (2017,8,28)
Karras, T., Aila, T., Laine, S. & Lehtinen, J. Progressive Growing of GANs
for Improved Quality, Stability, and Variation. CoRR.
Srivastava, A., Valkov, L., Russell, C., Gutmann, M. & Sutton, C. VEEGAN:
Reducing Mode Collapse in GANs using Implicit Variational Learning. (2017)
Che, T., Li, Y., Jacob, A., Bengio, Y. & Li, W. Mode Regularized Generative Adver-
sarial Networks. CoRR. abs/1612.02136 (2016), http://arxiv.org/abs/1612.02136
Shaker, A. & Hüllermeier, E. Recovery analysis for adaptive learning from non-
stationary data streams: Experimental design and case study. Neurocomputing.
pp. 250-264 (2015)
Arjovsky, M., Chintala, S. & Bottou, L. Wasserstein Generative Adversarial
Networks. Proceedings Of The 34th International Conference On Machine Learning.
70 pp. 214-223 (2017,8,6)
Radford, A., Metz, L. & Chintala, S. Unsupervised Representation
Learning with Deep Convolutional Generative Adversarial Networks. (2015),
http://arxiv.org/abs/1511.06434, cite arxiv:1511.06434Comment: Under review as a
conference paper at ICLR 2016
Montavon, G., Binder, A., Lapuschkin, S., Samek, W. & Müller, K. Layer-Wise
Relevance Propagation: An Overview. Explainable AI: Interpreting, Explaining And
Visualizing Deep Learning. pp. 193-209 (2019)
He, K., Zhang, X., Ren, S. & Sun, J. Deep Residual Learning for Image Recognition.
CoRR. abs/1512.03385 (2015), http://arxiv.org/abs/1512.03385
Lu, J., Liu, A., Dong, F., Gu, F., Gama, J. & Zhang, G. Learning under Concept
Drift: A Review. IEEE Trans. Knowl. Data Eng.. 31, 2346-2363 (2019)
Fawcett, T. An introduction to ROC analysis. Pattern Recognition Letters. 27, 861-
874 (2006), https://www.sciencedirect.com/science/article/pii/S016786550500303X
Li, J., Qu, S., Li, X., Szurley, J., Kolter, J. & Metze, F. Adversarial Music: Real
world Audio Adversary against Wake-word Detection System. Advances In Neural
Information Processing Systems 32: Annual Conference On Neural Information
Processing Systems 2019, NeurIPS 2019, December 8-14, 2019, Vancouver, BC,
Canada. pp. 11908-11918 (2019)
Li, J. & Xue, Y. Scribble-to-Painting Transformation with Multi-Task Generative
Adversarial Networks. Proceedings Of The Twenty-Eighth International Joint Con-
ference On Artiﬁcial Intelligence, IJCAI 2019, Macao, China, August 10-16, 2019.
pp. 5916-5922 (2019)
Borji, A. Pros and cons of GAN evaluation measures. Comput. Vis. Image Underst..
179 pp. 41-65 (2019)
Rostami, M., Kolouri, S., Pilly, P. & McClelland, J. Generative Continual Concept
Learning. The Thirty-Fourth AAAI Conference On Artiﬁcial Intelligence, AAAI 2020,
The Thirty-Second Innovative Applications Of Artiﬁcial Intelligence Conference,
IAAI 2020, The Tenth AAAI Symposium On Educational Advances In Artiﬁcial
Intelligence, EAAI 2020, New York, NY, USA, February 7-12, 2020. pp. 5545-5552
Pan, M., Huang, W., Li, Y., Zhou, X. & Luo, J. xGAIL: Explainable Generative
Adversarial Imitation Learning for Explainable Human Decision Analysis. KDD ’20:
The 26th ACM SIGKDD Conference On Knowledge Discovery And Data Mining,
Virtual Event, CA, USA, August 23-27, 2020. pp. 1334-1343 (2020)
Junsawang, P., Phimoltares, S. & Lursinsap, C. Streaming chunk incremental
learning for class-wise data stream classiﬁcation with fast learning speed and low
structural complexity. PloS One. 14, e0220624 (2019)
Bifet, A., Gavald, R., Holmes, G. & Pfahringer, B. Machine Learning for Data
Streams: With Practical Examples in MOA. (The MIT Press,2018)
Bahri, M., Bifet, A., Gama, J., Gomes, H. & Maniu, S. Data stream analysis:
Foundations, major tasks and tools. Wiley Interdiscip. Rev. Data Min. Knowl. Discov..
11 (2021), https://doi.org/10.1002/widm.1405
Krempl, G., Žliobaite, I., Brzeziński, D., Hüllermeier, E., Last, M., Lemaire, V.,
Noack, T., Shaker, A., Sievi, S., Spiliopoulou, M. & Stefanowski, J. Open Challenges
for Data Stream Mining Research. SIGKDD Explor. Newsl.. 16, 1-10 (2014,9)
Ramirez-Gallego, S., Krawczyk, B., Garcia, S., Wozniak, M. & Herrera, F. A survey
on data preprocessing for data stream mining: Current status and future directions.
Neurocomputing. pp. 39-57 (2017)
Kuncheva, L. Classiﬁer Ensembles for Changing Environments. Multiple Classiﬁer
Systems, 5th International Workshop, MCS 2004, Cagliari, Italy, June 9-11, 2004,
Proceedings. 3077 pp. 1-15 (2004)
Jackowski, K. Fixed-size ensemble classiﬁer system evolutionarily adapted to
a recurring context with an unlimited pool of classiﬁers. Pattern Analysis And
Applications. 17, 709-724 (2014,11), https://doi.org/10.1007/s10044-013-0318-x
Oza, N. & Tumer, K. Classiﬁer ensembles: Select real-world applications. Inf.
Fusion. 9, 4-20 (2008,1)
Bifet, A., Holmes, G., Pfahringer, B., Read, J., Kranen, P., Kremer, H., Jansen,
T. & Seidl, T. MOA: a Real-time Analytics Open Source Framework. Proc. Euro-
pean Conference On Machine Learning And Principles And Practice Of Knowledge
Discovery In Databases (ECML PKDD 2011), Athens, Greece. pp. 617-620 (2011)
Bifet, A., Holmes, G., Pfahringer, B., Kirkby, R. & Gavaldà, R. New ensemble
methods for evolving data streams. Proceedings Of The 15th ACM SIGKDD In-
ternational Conference On Knowledge Discovery And Data Mining. pp. 139-148 (2009)
Duda, R., Hart, P. & Stork, D. Pattern Classification. (Wiley, 2001)
Widmer, G. & Kubat, M. Eﬀective learning in dynamic environments by explicit
context tracking. Machine Learning: ECML-93. 667 pp. 227-243 (1993)
Rodriguez, J. & Kuncheva, L. Combining Online Classiﬁcation Approaches for
Changing Environments. Proceedings Of The 2008 Joint IAPR International Work-
shop On Structural, Syntactic, And Statistical Pattern Recognition. pp. 520-529 (2008)
Lazarescu, M., Venkatesh, S. & Bui, H. Using multiple windows to track concept
drift. Intell. Data Anal.. 8, 29-59 (2004,1)
Sobolewski, P. & Wozniak, M. Concept Drift Detection and Model Selection with
Simulated Recurrence and Ensembles of Statistical Detectors. Journal Of Universal
Computer Science. 19, 462-483 (2013,2,28)
Gustafsson, F. Adaptive Filtering and Change Detection. Adaptive Filtering And
Change Detection. pp. 510 (2000,10)
Gama, J., Medas, P., Castillo, G. & Rodrigues, P. Learning with drift detection.
In SBIA Brazilian Symposium On Artiﬁcial Intelligence. pp. 286-295 (2004)
Raudys, S. Statistical and Neural Classiﬁers: An Integrated Approach to Design.
(Springer Publishing Company, Incorporated,2014)
Baena-Garcıa, M., Campo-Ávila, J., Fidalgo, R., Bifet, A., Gavalda, R. & Morales-
Bueno, R. Early drift detection method. Fourth International Workshop On Knowl-
edge Discovery From Data Streams. 6pp. 77-86 (2006)
Blanco, I., Campo-Avila, J., Ramos-Jimenez, G., Bueno, R., Diaz, A. & Mota, Y.
Online and Non-Parametric Drift Detection Methods Based on Hoeﬀding’s Bounds.
IEEE Trans. Knowl. Data Eng.. 27, 810-823 (2015)
Zliobaite, I., Budka, M. & Stahl, F. Towards cost-sensitive adaptation: When is it
worth updating your predictive model?. Neurocomputing. 150 pp. 240-249 (2015)
Woźniak, M., Kasprzak, A. & Cal, P. Weighted Aging Classiﬁer Ensemble for the
Incremental Drifted Data Streams. Flexible Query Answering Systems. pp. 579-588 (2013)
Bifet, A., Holmes, G., Kirkby, R. & Pfahringer, B. MOA: Massive Online Analysis.
J. Mach. Learn. Res.. 11 pp. 1601-1604 (2010,8)
Anonymous. Chunk Adaptive Restoration. GitHub Repository. (2021)
Liu, N., Zhu, W., Liao, B. & Ren, S. Weighted Ensemble with Dynamical Chunk
Size for Imbalanced Data Streams in Nonstationary Environment. (2017,1)
Lu, Y., Cheung, Y. & Yan Tang, Y. Adaptive Chunk-Based Dynamic Weighted
Majority for Imbalanced Data Streams With Concept Drift. IEEE Transactions On
Neural Networks And Learning Systems. 31, 2764-2778 (2020)
Bifet, A. & Gavaldà, R. Learning from Time-Changing Data with Adaptive
Windowing. Proceedings Of The 7th SIAM International Conference On Data Mining. (2007)
Harvey, W., Carabelli, A., Jackson, B., Gupta, R., Thomson, E., Harrison, E.,
Ludden, C., Reeve, R., Rambaut, A., Peacock, S., Robertson, D. & Consortium,
C. SARS-CoV-2 variants, spike mutations and immune escape. Nature Reviews
Microbiology. 19, 409-424 (2021,7), https://doi.org/10.1038/s41579-021-00573-0