
Elias Egho
Orange Labs · Orange Labs Research
PhD
About
Publications: 18 · Reads: 1,955
Citations: 199
Publications
Publications (18)
Sequential data are generated in many domains of science and technology. Although many studies have been carried out for sequence classification in the past decade, the problem is still a challenge, particularly for pattern-based methods. We identify two important issues related to pattern-based sequence classification, which motivate the present w...
Computing the similarity between sequences is a very important challenge for many different data mining tasks. There is a plethora of similarity measures for sequences in the literature, most of them being designed for sequences of items. In this work, we study the problem of measuring the similarity between sequences of itemsets. We focus on the n...
All domains of science and technology produce large and heterogeneous data. Although much work has been done in this area, mining such data is still a challenge. No previous research targets the mining of heterogeneous multidimensional sequential data. In this work, we present a new approach to extract heterogeneous multidimensional sequential patt...
Nowadays data sets are available in very complex and heterogeneous ways. Mining of such data collections is essential to support many real-world applications ranging from healthcare to marketing. In this work, we focus on the analysis of "complex" sequential data by means of interesting sequential patterns. We approach the problem using the elegant...
Sequence classification has become a fundamental problem in data mining and machine learning. Feature based classification is one of the techniques that has been used widely for sequence classification. Mining sequential classification rules plays an important role in feature based classification. Despite the abundant literature in this area, minin...
All domains of science and technology produce large and heterogeneous data. Although a lot of work was done in this area, mining such data is still a challenge. No previous research work targets the mining of heterogeneous multidimensional sequential data. This thesis proposes a contribution to knowledge discovery in heterogeneous sequential data....
Sequential pattern mining is aimed at extracting correlations among temporal data. Many different methods were proposed to either enumerate sequences of set-valued data (i.e., itemsets) or sequences containing dimensional items. However, in real-world scenarios, data sequences are described as a combination of both multidimensional items and itemsets...
In this paper, we are interested in the analysis of sequential data and we propose an original framework based on FCA. For that, we introduce sequential pattern structures, an original specification of pattern structures for dealing with sequential data. Sequential pattern structures are given by a subsumption operation between sets of sequences, ba...
In this paper, we are interested in the analysis of sequential data and we propose an original framework based on Formal Concept Analysis (FCA). For that, we introduce sequential pattern structures, an original specification of pattern structures for dealing with sequential data. Pattern structures are used in FCA for dealing with complex data such...
With the increasing burden of chronic illnesses, administrative health care databases hold valuable information that could be used to monitor and assess the processes shaping the trajectory of care of chronic patients. In this context, temporal data mining methods are promising tools, though lacking flexibility in addressing the complex nature of m...
Computing the similarity between sequences is a very important challenge for many different data mining tasks. There is a plethora of similarity measures for sequences in the literature, most of them being designed for sequences of items. In this work, we study the problem of measuring the similarity ratio between sequences of itemsets. We present...
Sequential pattern mining is aimed at extracting correlations among temporal data. Many different methods were proposed to either enumerate sequences of set-valued data (i.e., itemsets) or sequences containing multidimensional items. However, in real-world scenarios, data sequences are described as events of both multidimensional items and set valu...
This paper presents research in the domains of sequential pattern mining and formal concept analysis. Using a combined method, we show how concept lattices and interestingness measures such as stability can improve the task of discovering knowledge in symbolic sequential data. We give an example of a real medical application to illustrate how t...
Questions
Question (1)
I am currently working on mining sequential classification rules with Hadoop. I am looking for a large labeled sequential dataset in which each record is a sequence of items, such as <a,b,a,b,a>, with a class label attached to each sequence, i.e., (<a,b,a,b,a> : class).
Does anyone have access to this kind of data?
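To make the requested record layout concrete, here is a minimal Python sketch of reading one labeled sequence. It assumes each record is stored on one line in the literal form (<a,b,a,b,a> : class); the function name `parse_labeled_sequence` and the type alias are hypothetical and are not part of any existing dataset or of Hadoop.

```python
from typing import List, Tuple

# Hypothetical alias: a list of items plus the class label of the sequence.
LabeledSequence = Tuple[List[str], str]


def parse_labeled_sequence(line: str) -> LabeledSequence:
    """Parse one record written as '(<a,b,a,b,a> : class)'.

    Assumption: items are comma-separated and contain neither ',' nor '>'.
    """
    body = line.strip()
    # Drop the optional outer parentheses.
    if body.startswith("(") and body.endswith(")"):
        body = body[1:-1]
    # Split on the last ':' so the label is whatever follows it.
    seq_part, sep, label = body.rpartition(":")
    if not sep:
        raise ValueError(f"missing ':' separator in {line!r}")
    seq_part = seq_part.strip()
    if not (seq_part.startswith("<") and seq_part.endswith(">")):
        raise ValueError(f"sequence must be wrapped in '<...>' in {line!r}")
    items = [item.strip() for item in seq_part[1:-1].split(",") if item.strip()]
    return items, label.strip()


if __name__ == "__main__":
    print(parse_labeled_sequence("(<a,b,a,b,a> : class)"))
```

A line-oriented layout like this is only one possible choice; it is shown because one record per line is easy to split across workers.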
Projects
Project (1)
coming soon...
A PhD will start on this topic (October 2017).
The general context is security and cybercrime (fraud detection). Given these concerns, systems that detect fraud, abnormal behavior, or attack signatures are among the major components of current protection mechanisms. Attempted frauds, intrusions, and attacks are protean and evolving, and they are made stealthy by the sheer Big Data volume of traffic and/or transactions.
Supervising a network or a stream of transactions requires taking into account, on the one hand, the specific nature of fraudulent acts and, on the other hand, the constraints of the data, which are varied, voluminous, and high-velocity.
Detection by comparison against a database of so-called normal behavior, for example by means of rules, has limits. Establishing rules is complex work, because they must constantly follow the evolution of known frauds. Moreover, this method inherently cannot detect fraudulent acts that have not already been catalogued.
Given these limitations, AI methods based on machine learning widen the potential for detecting fraud, whether known or unknown, and can be combined with Big Data technologies or mechanisms.