Pawel Zyblewski

Wroclaw University of Science and Technology | WUT · Department of Systems and Computer Networks

PhD


Publications: 19
Reads: 2,090
Citations: 133


Publications (19)
Article
Full-text available
Among the difficulties considered in data stream processing, a particularly interesting one is the phenomenon of concept drift. Concept drift detection methods are frequently used to eliminate its negative impact on classification quality in environments with evolving concepts. This article proposes Statistical Drift Detection Ense...
Preprint
Full-text available
The abundance of information in digital media, which in today's world is the main source of knowledge about current events for the masses, makes it possible to spread disinformation on a larger scale than ever before. Consequently, there is a need to develop novel fake news detection approaches capable of adapting to changing factual contexts and g...
Article
stream-learn is a Python package compatible with scikit-learn and developed for the analysis of drifting and imbalanced data streams. Its main component is a stream generator, which allows producing a synthetic data stream that may incorporate each of the three main concept drift types (i.e., sudden, gradual and incremental drift) in their recurring or...
Chapter
With the advancement of internet technologies, network traffic monitoring and cyber-attack detection are becoming more and more important for critical infrastructure. Unfortunately, there are still relatively few works in the literature that interpret the available benchmark data as data streams and take into account the dynamic characteristics of...
Preprint
Full-text available
One of the significant problems in streaming data classification is the occurrence of concept drift, that is, a change in the probabilistic characteristics of the classification task. This phenomenon destabilizes the performance of the classification model and seriously degrades its quality. An appropriate strategy counteracting this phenomenon...
Chapter
In addition to possible concept drift, real data streams often display a high imbalance ratio. Another important problem with real classification tasks, often overlooked in the literature, is the cost of obtaining labels. This work aims to connect three rarely combined research directions, i.e., data stream classification,...
Article
Ensembles of classifiers deserve attention because their stability and accuracy are usually superior to those of a single classifier. One aspect of constructing multiple classifier systems is the fusion of the base model outputs. State-of-the-art approaches to fusing base classifiers use class labels, a rank array, or a...
Conference Paper
Although real-life data streams are often characterized by dynamic changes in the prior class probabilities, few articles try to clearly describe and classify this problem or suggest new methods dedicated to resolving it. This paper aims to fill this gap by proposing a novel data st...
Chapter
Ensemble methods combined with data preprocessing techniques are among the most widely used approaches to the problem of imbalanced data classification. At the same time, the literature indicates the potential capability of classifier selection/ensemble pruning methods to deal with imbalance without the use of preprocessing, due to the...
Chapter
This work proposes a new method of constructing an ensemble of classifiers diversified by the appropriate selection of the problem subspace. The experiments were performed on a numerical dataset in which three groups are present: healthy controls, glaucoma suspects, and glaucoma patients. Overall, it consists of medical records from...
Chapter
A significant problem when building classifiers for data streams is access to the correct labels. Most algorithms assume unrestricted access to this information. Unfortunately, this is not possible in practice because objects can arrive very quickly and labeling all of them is impossible, or we have to pay for providing the...
Chapter
Using fake news as a political or economic tool is not new, but the scale of its use is currently alarming, especially on social media. The authors of misinformation try to influence users' decisions in both the economic and political spheres. The use of disinformation during elections is well known. Currently, two fake news detectio...
Article
Free access until the end of October 2020: https://authors.elsevier.com/a/1blq25a7-GjBOl
This work aims to connect two rarely combined research directions, i.e., non-stationary data stream classification and data analysis with skewed class distributions. We propose a novel framework employing stratified bagging for training base classif...
Article
Full-text available
One of the crucial problems of designing a classifier ensemble is the proper choice of the base classifier line-up. Such an ensemble is typically formed from individual classifiers that are either trained in a way that ensures high diversity or chosen by pruning, which reduces the number of predictive models in o...
Chapter
Imbalanced data analysis remains one of the critical challenges in machine learning. This work aims to adapt the concept of Dynamic Classifier Selection (DCS) to pattern classification tasks with skewed class distributions. Two methods, using the similarity (distance) to the reference instances and class imbalance ratio to select the most con...
Chapter
Learning from non-stationary imbalanced data streams is a serious challenge for the machine learning community. Many works address the classification of non-stationary data streams, but most of them do not take into consideration that real-life data streams may exhibit a high and changing class imbalance ratio,...
Preprint
Full-text available
stream-learn is a Python package compatible with scikit-learn and developed for the analysis of drifting and imbalanced data streams. Its main component is a stream generator, which allows producing a synthetic data stream that may incorporate each of the three main concept drift types (i.e., sudden, gradual and incremental drift) in their recurring or...
Chapter
The purpose of ensemble pruning is to reduce the number of predictive models in order to improve the efficiency and predictive performance of the ensemble. In the clustering-based approach, we look for groups of similar models and then prune each of them separately to increase the overall diversity of the ensemble. In this paper we propose...
Chapter
Full-text available
The nature of the analysed data may make many practical data mining tasks difficult. This work focuses on two important research topics associated with data analysis, i.e., data stream classification and data analysis with imbalanced class distributions. We propose a novel classification method employing a classifier s...
