Zakarya Farou
Eötvös Loránd University · Faculty of Informatics
Ph.D. Candidate in Data Science and Engineering
Assistant Lecturer and Research Fellow
About
Publications: 19
Reads: 8,624
Citations: 28
Introduction
Zakarya Farou is an assistant lecturer and mentor teacher at Eötvös Loránd University (ELTE). He obtained his bachelor's degree with honors in computer science and mathematics in 2017 from Guelma University, Algeria. He then received a master's degree in computer science, specializing in software and systems architectures, with honors from Eötvös Loránd University (ELTE) in 2019.
Research interests include applied data science, generative models, and class imbalance.
Additional affiliations
September 2018 - present
April 2018 - July 2018
Lingua inter
Position
- French teacher
Education
September 2019 - August 2024
September 2017 - July 2019
September 2014 - July 2017
Publications (19)
1-nearest neighbor (1NN) with Dynamic Time Warping (DTW) distance is a popular time series classification technique. In the last decades, research on DTW has aimed to improve its classification accuracy, memory usage, and efficiency. According to a recent study, the appropriate selection of the Warping Window Size (WWS) is crucial for the accuracy of...
Generative adversarial networks (GANs) can be used efficiently for image and video generation when labeled training data are available in bulk. In general, building a good machine learning model requires a reasonable amount of labeled training data. However, there are areas such as the biomedical field where the creation of such a data set is...
Recently, keystroke dynamics has gained popularity as one of the main sources of behavioral biometrics for providing continuous user authentication. Keystroke dynamics is appealing for many reasons: it is less obtrusive, since users will be typing on the computer keyboard anyway, and it does not require extra hardware. Analyzing how the data is typed in...
Over the last few years, skin segmentation has been widely applied in diverse aspects of computer vision and biometric applications, including face detection, face tracking, and face/hand-gesture recognition systems. Due to its importance, we observed a reawakened interest in developing skin segmentation approaches. In this paper, we of...
Data is essential for training cutting-edge machine learning models, particularly when addressing the widespread challenge of learning from imbalanced datasets. This dissertation proposes innovative methodologies to boost classification efficacy and model resilience by utilizing advanced resampling techniques tailored for imbalanc...
Several machine learning algorithms address the class imbalance problem, and they require standardized metrics for adequate performance evaluation. This paper reviews several metrics for imbalanced learning in binary and multi-class problems. We emphasize considering class separability, imbalance ratio, and noise when choosing suitable metrics. A...
Multi-class imbalance problems are non-standard derivative data science problems. These problems are associated with the skewness in the underlying data distribution, which, in turn, raises numerous issues for conventional machine learning techniques. To address the lack of data in imbalance problems, we can either collect new data or oversample th...
Spam filtering is a non-standard derivative data science problem aiming to catch unsolicited and undesirable messages and prevent those messages from reaching a user's inbox. To solve the abovementioned problem, we propose a text augmentation approach using the most similar synonyms called TAMS. We used Random Forest and Bidirectional LSTM classifi...
Bird species are recognized as essential biodiversity indicators: they are responsive to changes in sensitive ecosystems, while population-level changes in behavior are both visible and quantifiable. Therefore, ecologists monitor them to determine factors causing population fluctuation and help conserve and manage threatened and endangered species....
Nowadays, the problem of class imbalance is relatively prevalent. This problem is associated with the skewness in the underlying data distribution, which, in turn, presents innumerable problems for conventional machine learning techniques. Existing approaches to addressing the growing challenges of multi-class imbalanced learning are classified int...
Accurate particle identification is an ongoing task at the European Organization for Nuclear Research, known as CERN, where the challenge remains that targeted particles/events represent tiny minorities amid the overwhelming presence of common particles such as protons. This paper presents a directed undersampling using an active learning met...
Many practical classification problems are difficult because of the unbounded size and imbalanced nature of the data. The class imbalance problem has become a significant issue in data mining. An imbalance problem occurs when one of the classes has more samples than the other classes. When a disproportionate ratio of samples per class is present, most machine...
Over the last few years, skin segmentation has been widely applied in diverse aspects of computer vision and biometric applications including face detection, face tracking, and face/hand-gesture recognition systems. Due to its importance, we observed a reawakened interest in developing skin segmentation approaches. In this paper, we offer a compari...
Generative adversarial networks (GANs) could be used efficiently for image and video generation when labeled training data is available in bulk. In general, building a good machine learning model requires a reasonable amount of labeled training data. However, there are areas such as the biomedical field where the creation of such a dataset is time-...
This work presents a user verification system for web applications based on keystroke dynamics. The proposed methodology is user-friendly, low-cost, non-intrusive, and easy to integrate into existing systems. The algorithm works by monitoring users' typing in real time, capturing the times at which each key is pressed and released. Three c...
This thesis presents the realization, the context, and the prospects of a home automation control interface. Home automation or smart home (also known as domotics) is building automation for the home, which uses advanced automatic systems to ensure comfort functions (tele-control of temperature, windows, and doors ...), the security of goods and peo...