Younès Bennani

Université Sorbonne Paris Nord (USPN)

Full Professor, IEEE Senior member

About

254
Publications
34,334
Reads
1,558
Citations
Introduction
Prof. Dr. Younès Bennani's research interests are in Machine Learning and Data Science. His areas of expertise are unsupervised learning, transfer learning, cluster analysis, dimensionality reduction, feature selection, feature construction, and large-scale data mining. He has published 3 books and approximately 350 papers in refereed conference proceedings and journals or as contributions to books. He is a senior member of the IEEE.

Publications (254)
Chapter
This paper deals with the problem of unsupervised domain adaptation that aims to learn a classifier with a slight target risk while labeled samples are only available in the source domain. The proposed approach, called DA-SSL (Domain Adaptation meets Semi-Supervised Learning) attempts to find a joint subspace of the source and target domains using...
Chapter
The Vertex Separator Problem of a directed graph consists in finding all combinations of vertices which can disconnect the source and the terminal of the graph; these combinations are minimal if they contain only the minimal number of vertices. In this paper, we introduce a new quantum algorithm based on a movement strategy to find these separators...
Preprint
Full-text available
Finding the failure scenarios of a system is a very complex problem in the field of Probabilistic Safety Assessment (PSA). In order to solve this problem we will use the Hidden Quantum Markov Models (HQMMs) to create a generative model. Therefore, in this paper, we will study and compare the results of HQMMs and classical Hidden Markov Models HMM o...
Preprint
Full-text available
In this paper we provide the quantum version of the Convex Non-negative Matrix Factorization algorithm (Convex-NMF) by using the D-Wave quantum annealer. More precisely, we use the D-Wave 2000Q to find the low-rank approximation of a fixed real-valued matrix X by the product of two non-negative matrix factors W and G such that the Frobenius norm of t...
Article
Non-negative matrix factorization (NMF) is an unsupervised algorithm for clustering where a non-negative data matrix is factorized into (usually) two matrices with the property that all the matrices have no negative elements. This factorization raises the problem of instability, which means whenever we run NMF for the same dataset, we get different...
Chapter
A growing interest in data privacy protection is mostly motivated by countries and organizations’ desire to demonstrate democracy and practice guidelines by opening their data. Anonymization of data refers to the process of removing sensitive data’s identifiers while keeping their structure and also the information type [35]. A fundamental challeng...
Preprint
Full-text available
In this paper, we tackle the inductive semi-supervised learning problem that aims to obtain label predictions for out-of-sample data. The proposed approach, called Optimal Transport Induction (OTI), efficiently extends an optimal-transport-based transductive algorithm (OTP) to inductive tasks for both binary and multi-class settings. A series of ex...
Preprint
Full-text available
In this paper, we propose a novel approach for unsupervised domain adaptation, that relates notions of optimal transport, learning probability measures and unsupervised learning. The proposed approach, HOT-DA, is based on a hierarchical formulation of optimal transport, that leverages beyond the geometrical information captured by the ground metric...
Chapter
Full-text available
Self-Organizing Map is an algorithm that computes a set of artificial neurons to model the distribution of a data-set. This model is composed of a graph of neurons connected by neighborhood links. The main advantage of a SOM model is the conservation of a low-dimensional topology, which allows a visual representation of the data distribution. It is...
Chapter
In this paper, we tackle the inductive semi-supervised learning problem that aims to obtain label predictions for out-of-sample data. The proposed approach, called Optimal Transport Induction (OTI), efficiently extends an optimal-transport-based transductive algorithm (OTP) to inductive tasks for both binary and multi-class settings. A series of ex...
Preprint
Full-text available
In this paper, we tackle the transductive semi-supervised learning problem that aims to obtain label predictions for the given unlabeled data points according to Vapnik's principle. Our proposed approach is based on optimal transport, a mathematical theory that has been successfully used to address various machine learning problems, and is starting...
Preprint
Full-text available
This paper deals with a clustering algorithm for histogram data based on a Self-Organizing Map (SOM) learning. It combines a dimension reduction by SOM and the clustering of the data in a reduced space. Related to the kind of data, a suitable dissimilarity measure between distributions is introduced: the $L_2$ Wasserstein distance. Moreover, the nu...
Article
Full-text available
Collaborative learning has recently achieved very significant results. It still suffers, however, from several issues, including the type of information that needs to be exchanged, the criteria for stopping and how to choose the right collaborators. We aim in this paper to improve the quality of the collaboration and to resolve these issues via a n...
Chapter
Semi Non-negative Matrix Factorization (SNMF) is a machine learning algorithm that is used to decompose large data matrices where the data matrix is unconstrained (i.e., it may have mixed signs). We develop the quantum version of SNMF using quantum gradient descent, and we show that the quantum version of SNMF provides an exponential speedup compar...
Chapter
In the collaborative clustering framework, the hope is that by combining several clustering solutions, each one with its own bias and imperfections, one will get a better overall solution. The goal is that each local computation, quite possibly applied to distinct data sets, benefits from the work done by the other collaborators. This article is de...
Chapter
Collaborative clustering is a promising approach in the learning-from-other-learners research area. Although extensive research has been done to improve collaborative approaches, they still suffer from several issues, including the mechanism of exchanging the information and how to measure the quality of this information. In this paper we intr...
Preprint
Full-text available
In the collaborative clustering framework, the hope is that by combining several clustering solutions, each one with its own bias and imperfections, one will get a better overall solution. The goal is that each local computation, quite possibly applied to distinct data sets, benefits from the work done by the other collaborators. This article is de...
Preprint
Full-text available
Semi-supervised learning provides an effective paradigm for leveraging unlabeled data to improve a model's performance. Among the many strategies proposed, graph-based methods have shown excellent properties, in particular since they allow transductive tasks to be solved directly according to Vapnik's principle and they can be extended efficiently...
Preprint
Full-text available
Collaborative learning has recently achieved very significant results. It still suffers, however, from several issues, including the type of information that needs to be exchanged, the criteria for stopping and how to choose the right collaborators. We aim in this paper to improve the quality of the collaboration and to resolve these issues via a n...
Chapter
Full-text available
According to Microsoft, by 2025, 100% of new cars will be connected and by 2030, 15% of new cars will be autonomous, and will take care of sending, receiving and analyzing “large amounts of data”. Automobiles are becoming data centers on wheels. All these pieces of information can be used by many stakeholders (road authorities, leasing companies, m...
Article
Numerous parameters impact apatite (U-Th-Sm)/He (AHe) thermochronological dates, such as radiation damage, chemical content, crystal size and geometry, and their knowledge is essential for better geological interpretations. The present study investigates a new method based on advanced data mining techniques, to unravel the parameters that could pla...
Chapter
Significant results have been achieved recently by exchanging information between multiple learners for clustering tasks. However, these approaches still suffer from a few issues regarding the choice of the information to trade, the stopping criteria and the trade-off between the information extracted from the data and the information exchanged by th...
Article
Full-text available
Interest in data anonymization is growing exponentially, motivated by governments' desire to open their data. The main challenge of data anonymization is to find a balance between data utility and the amount of disclosure risk. One of the best-known frameworks for data anonymization is k-anonymity; this method assumes that a dataset is...
Conference Paper
Full-text available
Recommender Systems are software tools used to generate and provide suggestions for items and other entities to the users by exploiting various strategies. They are widely used and influence the daily life of almost everyone in different domains like e-commerce, social media, entertainment, or transportation in the mobility industry. The efficient...
Research
Full-text available
All famous machine learning algorithms that comprise both supervised and semi-supervised learning work well only under a common assumption: the training and test data follow the same distribution. When the distribution changes, most statistical models must be reconstructed from newly collected data, which for some applications can be costly or impo...
Chapter
The number of devices gathering and using personal data without the person’s approval is growing exponentially. The European General Data Protection Regulation (GDPR) came following the requests of individuals who felt at risk of personal privacy breaches. Consequently, privacy-preserving machine learning algorithms were designed based on...
Preprint
Full-text available
All famous machine learning algorithms that correspond to both supervised and semi-supervised learning work well only under a common assumption: training and test data follow the same distribution. When the distribution changes, most statistical models must be reconstructed from newly collected data that, for some applications, may be costly or impos...
Chapter
This paper presents a new generative unsupervised learning algorithm based on a representation of the clusters distribution by histograms. The main idea is to reduce the model complexity through cluster-defined projections of the data on independent axes. The results show that the proposed approach performs efficiently compared with other algorithm...
Chapter
Full-text available
Quantum machine learning is a new area of research with the recent work on quantum versions of supervised and unsupervised algorithms. In recent years, many quantum machine learning algorithms have been proposed providing a speed-up over the classical algorithms. In this paper, we propose an analysis and a comparison of three quantum distances for...
Chapter
Full-text available
Non-negative matrix factorization is a machine learning technique that is used to decompose large data matrices imposing the non-negativity constraints on the factors. This technique has received a significant amount of attention as an important problem with many applications in different areas such as language modeling, text mining, clustering, mu...
Article
Collaborative Clustering is a data mining task the aim of which is to use several clustering algorithms to analyze different aspects of the same data. The aim of collaborative clustering is to reveal the common underlying structure of data spread across multiple data sites by applying clustering techniques. The idea of collaborative clustering is t...
Article
Among the variety of algorithms that have been developed for clustering, prototype-based approaches are very popular due to their low computational complexity, allowing real-life applications. In such algorithms, the data set is summarized by a small set of prototypes. Each prototype usually represents a cluster of objects. However, the definition...
Book
All famous machine learning algorithms that correspond to both supervised and semi-supervised learning work well only under a common assumption: training and test data follow the same distribution. When the distribution changes, most statistical models must be reconstructed from newly collected data that may be costly or even impossible to get for so...
Conference Paper
Graph clustering techniques are very useful for detecting densely connected groups in large graphs. Many existing graph clustering methods mainly focus on the topological structure, but ignore the vertex properties. Existing graph clustering methods have been recently extended to deal with node attributes. In this paper we propose a new method whic...
Chapter
Data anonymization is the process of de-identifying sensitive data while preserving its format and data type. The masked data can be a realistic or a random sequence of data, depending on the technique used for anonymization. Individual privacy can be at risk if a published data set is not properly de-identified. The best-known approach of anonymiz...
Conference Paper
Full-text available
Collaborative clustering is a recent learning paradigm concerned with the unsupervised analysis of complex multi-view data using several algorithms working together. Well known applications of collaborative clustering include multi-view clustering and distributed data clustering, where several algorithms exchange information in order to mutually i...
Conference Paper
Full-text available
Graph clustering techniques are very useful for detecting densely connected groups in large graphs. Many existing graph clustering methods mainly focus on the topological structure, but ignore the vertex properties. Existing graph clustering methods have been recently extended to deal with node attributes. First we motivate the interest in the stud...
Article
Full-text available
Collaborative filtering is a well-known technique for recommender systems. Collaborative filtering models use the available preferences of a group of users to make recommendations or predictions of the unknown preferences for other users. Collaborative filtering suffers from the data sparsity problem when users only rate a small set of items which...
Article
Full-text available
The number of websites is increasing rapidly, and clients purchase their website from the enterprise that suggests the best domain name at a good price. To offer relevant domain names, an enterprise is always eager to have a good suggestion system that suits the client's request. Recommender systems have been an effective key solution...
Conference Paper
Full-text available
To try to make sense of the masses of data available in growing quantities, it is necessary to have efficient tools that limit the often time-consuming involvement of the expert. Unsupervised data exploration methods such as clustering methods are a response to this need. However, their implementation...
Conference Paper
Full-text available
In this paper, we present a novel method for co-clustering, an unsupervised learning approach that aims at discovering homogeneous groups of data instances and features by grouping them simultaneously. The proposed method uses the entropy-regularized optimal transport between empirical measures defined on data instances and features in order to ob...
Article
Multi-label classification is becoming increasingly widespread as a data mining technique. Its objective is to categorize models in several non-exclusive groups, and is applied in such areas as news categorization, image labeling and music classification, among others. Our contribution is to use the paradigm of active learning with the topological...