Zakarya Farou

Eötvös Loránd University · Faculty of Informatics

Ph.D. Candidate in Data Science and Engineering
Assistant Lecturer and Research Fellow

About

19
Publications
8,624
Reads
28
Citations
Introduction
Zakarya Farou is an assistant lecturer and mentor teacher at Eötvös Loránd University (ELTE). He obtained his bachelor's degree with honors in computer science and mathematics in 2017 from Guelma University, Algeria. He then received a master's degree with honors in computer science, specializing in software and systems architectures, from Eötvös Loránd University (ELTE) in 2019. His research interests include applied data science, generative models, and class imbalance.
Additional affiliations
September 2018 - present
Telekom Innovation Laboratories
Position
  • PhD Student
Description
  • Research fellow
April 2018 - July 2018
Lingua inter
Position
  • French teacher
Education
September 2019 - August 2024
Eötvös Loránd University
Field of study
  • Data Science and Engineering
September 2017 - July 2019
Eötvös Loránd University
Field of study
  • Computer science
September 2014 - July 2017
University of Guelma
Field of study
  • Computer science

Publications (19)
Conference Paper
Full-text available
1-nearest neighbor (1NN) with Dynamic Time Warping (DTW) distance is a popular time series classification technique. In recent decades, research on DTW has aimed to improve its classification accuracy, memory usage, and efficiency. According to a recent study, the appropriate selection of the Warping Window Size (WWS) is crucial for the accuracy of...
Preprint
Full-text available
Generative adversarial networks (GANs) could be used efficiently for image and video generation, where labeled training data are available in bulk. In general, building a good machine learning model requires a reasonable amount of labeled training data. However, there are areas such as the biomedical field where the creation of such a data set is...
Thesis
Full-text available
Recently, keystroke dynamics has gained popularity as one of the main sources of behavioral biometrics for providing continuous user authentication. Keystroke dynamics is appealing for many reasons: It is less obtrusive since users will be typing on the computer keyboard anyway. It does not require extra hardware. Analyzing how the data is typed in...
Preprint
Full-text available
Over the last few years, skin segmentation has been widely applied in diverse aspects of computer vision applications and biometric applications including face detection, face tracking, and face/hand-gesture recognition systems. Due to its importance, we observed a reawakened interest in developing skin segmentation approaches. In this paper, we of...
Thesis
Full-text available
The significance of data in training cutting-edge machine learning models is essential, particularly in addressing the widespread challenge of learning from imbalanced datasets. This dissertation proposes innovative methodologies to boost classification efficacy and model resilience by utilizing advanced resampling techniques tailored for imbalanc...
Chapter
Several machine learning algorithms address the class imbalance problem, and they require standardized metrics for adequate performance evaluation. This paper reviews several metrics for imbalanced learning in binary and multi-class problems. We emphasize considering class separability, imbalance ratio, and noise when choosing suitable metrics. A...
Chapter
Full-text available
Multi-class imbalance problems are non-standard derivative data science problems. These problems are associated with skewness in the underlying data distribution, which, in turn, raises numerous issues for conventional machine learning techniques. To address the lack of data in imbalance problems, we can either collect new data or oversample th...
Conference Paper
Full-text available
Spam filtering is a non-standard derivative data science problem aiming to catch unsolicited and undesirable messages and prevent those messages from reaching a user's inbox. To solve the abovementioned problem, we propose a text augmentation approach using the most similar synonyms called TAMS. We used Random Forest and Bidirectional LSTM classifi...
Thesis
Full-text available
Bird species are recognized as essential biodiversity indicators: they are responsive to changes in sensitive ecosystems, while population-level changes in behavior are both visible and quantifiable. Therefore, ecologists monitor them to determine factors causing population fluctuation and help conserve and manage threatened and endangered species....
Thesis
Full-text available
Nowadays, the problem of class imbalance is relatively prevalent. This problem is associated with skewness in the underlying data distribution, which, in turn, presents innumerable problems for conventional machine learning techniques. Existing approaches to addressing the growing challenges of multi-class imbalanced learning are classified int...
Chapter
Accurate particle identification is an ongoing task at the European Organization for Nuclear Research, known as CERN, where the challenge remains that targeted particles/events represent tiny minorities compared with the overwhelming presence of common particles such as protons. This paper presents a directed undersampling using an active learning met...
Thesis
Full-text available
Many practical classification problems are difficult because of the unbounded size and imbalanced nature of the data. The class imbalance problem has become a significant issue in data mining. An imbalance problem occurs where one of the classes has more samples than the other classes. When a disproportionate ratio of samples per class is present, most machine...
Preprint
Full-text available
Accurate particle identification is an ongoing task at the European Organization for Nuclear Research, known as CERN, where the challenge remains that targeted particles/events represent tiny minorities compared with the overwhelming presence of common particles such as protons. This paper presents a directed undersampling using an active learning met...
Article
Full-text available
Over the last few years, skin segmentation has been widely applied in diverse aspects of computer vision and biometric applications including face detection, face tracking, and face/hand-gesture recognition systems. Due to its importance, we observed a reawakened interest in developing skin segmentation approaches. In this paper, we offer a compari...
Article
Generative adversarial networks (GANs) could be used efficiently for image and video generation when labeled training data is available in bulk. In general, building a good machine learning model requires a reasonable amount of labeled training data. However, there are areas such as the biomedical field where the creation of such a dataset is time-...
Poster
Full-text available
This work presents a user verification system for web applications based on keystroke dynamics. The proposed methodology is user-friendly, low-cost, non-intrusive, and easy to integrate into existing systems. The algorithm works by monitoring users' typing in real time, capturing the split times at which each key was pressed and released. Three c...
Thesis
Full-text available
This thesis presents the realization, the context, and the prospects of a home automation control interface. Home automation or smart home (also known as domotics) is building automation for the home, which has advanced automatic systems to ensure comfort functions (tele-control of temperature, windows, and doors ...), the security of goods and peo...
