Amedeo Napoli

Laboratoire Lorrain de Recherche en Informatique et ses Applications (LORIA)

About

Publications: 368
Reads: 31,120
Citations: 4,008
Citations since 2017: 1253 (46 research items)

Publications (368)
Preprint
Full-text available
Visual counterfactual explanations identify modifications to an image that would change the prediction of a classifier. We propose a set of techniques based on generative models (VAE) and a classifier ensemble trained directly in the latent space, which together improve the quality of the gradient required to compute visual counterfactuals. Th...
Preprint
In this paper, we revisit pattern mining and study the distribution underlying a binary dataset thanks to the closure structure which is based on passkeys, i.e., minimum generators in equivalence classes robust to noise. We introduce $\Delta$-closedness, a generalization of the closure operator, where $\Delta$ measures how a closed set differs from...
Preprint
In this paper we are interested in studying concise representations of concepts and dependencies, i.e., implications and association rules. Such representations are based on equivalence classes and their elements, i.e., minimal generators, minimum generators including keys and passkeys, proper premises, and pseudo-intents. All these sets of attribu...
Article
Pattern mining is one of the most studied fields in data mining. Being mostly motivated by practitioners, pattern mining algorithms are often based on heuristics and are lacking suitable formalization. In this paper, we are revisiting pattern mining, and especially itemset mining, which allows one to analyze binary datasets in searching for interes...
Article
Full-text available
Pattern mining is well established in data mining research, especially for mining binary datasets. Surprisingly, there is much less work about numerical pattern mining and this research area remains under-explored. In this paper we propose Mint, an efficient MDL-based algorithm for mining numerical datasets. The MDL principle is a robust and reliab...
Article
Full-text available
Efficiently discovering causal relations from data and representing them in a way that facilitates their use is an important problem in science that has received much attention. In this paper, we propose an adaptation of the Formal Concept Analysis formalism to the problem of discovering and representing causal relations. We show that Formal Concep...
Article
Full-text available
In this paper, we study structures such as distributive lattices, distributive semilattices, and median graphs from an algorithmic point of view. Such structures are very useful in classification and phylogeny for representing lineage relationships for example. A distributive lattice can be considered as a median graph while a distributive ∨-semila...
Chapter
It is known that a distributive lattice is a median graph, and that a distributive ∨-semilattice can be thought of as a median graph iff every triple of elements such that the infimum of each couple of its elements exists, has an infimum. Since a lattice without its bottom element is obviously a ∨-semilattice, using the FCA formalism, we investigat...
Chapter
Constraints, in a broad sense, are restrictions that exist (or should exist) in a dataset. There are many different kinds of constraints, which differ not only in their semantics but also in the domains in which they are present: database design, knowledge discovery, and data analysis, to name a few. Formal Concept Analysis and Pattern Structures has...
Article
Full-text available
Knowledge graphs are freely aggregated, published, and edited in the Web of data, and thus may overlap. Hence, a key task resides in aligning (or matching) their content. This task encompasses the identification, within an aggregated knowledge graph, of nodes that are equivalent, more specific, or weakly related. In this article, we propose to matc...
Chapter
Cosmic shear estimation is an essential scientific goal for large galaxy surveys. It refers to the coherent distortion of distant galaxy images due to weak gravitational lensing along the line of sight. It can be used as a tracer of the matter distribution in the Universe. The unbiased estimation of the local value of the cosmic shear can be obtain...
Preprint
Full-text available
Unintended biases in machine learning (ML) models are among the major concerns that must be addressed to maintain public trust in ML. In this paper, we address process fairness of ML models that consists in reducing the dependence of models on sensitive features, without compromising their performance. We revisit the framework FixOut that is inspir...
Chapter
Algorithmic decisions are now being used on a daily basis, and are based on Machine Learning (ML) processes that may be complex and biased. This raises several concerns given the critical impact that biased decisions may have on individuals or on society as a whole. Not only do unfair outcomes affect human rights, they also undermine public trust in ML an...
Preprint
Pattern mining is well established in data mining research, especially for mining binary datasets. Surprisingly, there is much less work about numerical pattern mining and this research area remains under-explored. In this paper, we propose Mint, an efficient MDL-based algorithm for mining numerical datasets. The MDL principle is a robust and relia...
Preprint
Knowledge graphs are concurrently published and edited in the Web of data. Hence they may overlap, which makes matching their content a key task. This task encompasses the identification, within and across knowledge graphs, of nodes that are equivalent, more specific, or weakly related. In this article, we propose to match nodes...
Preprint
Full-text available
Algorithmic decisions are now being used on a daily basis, and are based on Machine Learning (ML) processes that may be complex and biased. This raises several concerns given the critical impact that biased decisions may have on individuals or on society as a whole. Not only do unfair outcomes affect human rights, they also undermine public trust in ML an...
Preprint
In this paper, we are revisiting pattern mining and especially itemset mining, which allows one to analyze binary datasets in searching for interesting and meaningful association rules and respective itemsets in an unsupervised way. While a summarization of a dataset based on a set of patterns does not provide a general and satisfying view over a d...
Chapter
An increasing number of data and knowledge sources are accessible by human and software agents in the expanding Semantic Web. Sources may differ in granularity or completeness, and thus be complementary. Consequently, they should be reconciled in order to unlock the full potential of their conjoint knowledge. In particular, units should be matched...
Preprint
Features mined from knowledge graphs are widely used within multiple knowledge discovery tasks such as classification or fact-checking. Here, we consider a given set of vertices, called seed vertices, and focus on mining their associated neighboring vertices, paths, and, more generally, path patterns that involve classes of ontologies linked with k...
Chapter
This chapter presents several types of reasoning based on analogy and similarity. Case-based reasoning, presented in Sect. 2, consists in searching a case (where a case represents a problem-solving episode) similar to the problem to be solved and to adapt it to solve this problem. Section 3 is devoted to analogical reasoning and to recent developme...
Chapter
In this chapter, we introduce Formal Concept Analysis (FCA) and some of its extensions. FCA is a formalism based on lattice theory aimed at data analysis and knowledge processing. FCA allows the design of so-called concept lattices from binary and complex data. These concept lattice...
Preprint
In the expanding Semantic Web, an increasing number of sources of data and knowledge are accessible by human and software agents. Sources may differ in granularity or completeness, and thus be complementary. Consequently, unlocking the full potential of the available knowledge requires combining them. To this aim, we define the task of knowledge re...
Chapter
Artificial Intelligence and Machine Learning are becoming increasingly present in several aspects of human life, especially those dealing with decision making. Many of these algorithmic decisions are taken without human supervision and through decision-making processes that are not transparent. This raises concerns regarding the potential bias of...
Chapter
In this article, we present an original use of Redescription Mining (RM) for discovering definitions of classes and incompatibility (disjointness) axioms between classes of individuals in the web of data. RM is aimed at mining alternate descriptions from two datasets related to the same set of individuals. We reuse this process for providing defini...
Chapter
In this article, we compare the use of Redescription Mining (RM) and Association Rule Mining (ARM) for discovering class definitions in Linked Open Data (LOD). RM is aimed at mining alternate descriptions from two datasets related to the same set of individuals. We reuse RM for providing category definitions in DBpedia in terms of necessary and suf...
Chapter
Pattern Mining is a well-studied field in Data Mining and Machine Learning. The modern methods are based on dynamically updating models, among which MDL-based ones ensure high-quality pattern sets. Formal concepts also characterize patterns in a condensed form. In this paper, we study an MDL-based algorithm called Krimp in FCA settings and propose a mo...
Chapter
We tackle the problem of constructing the representation context of a pattern structure. First, we present a naive algorithm that computes the representation context of a pattern structure. Then, we add a sampling technique in order to reduce the size of the output. We show that these techniques significantly reduce the size of the representation cont...
Book
This book constitutes the refereed proceedings of the 21st International Conference on Knowledge Engineering and Knowledge Management, EKAW 2018, held in Nancy, France, in November 2018. The 36 full papers presented were carefully reviewed and selected from 104 submissions. The papers cover all aspects of eliciting, acquiring, modeling, and managin...
Preprint
Full-text available
We present in this article a lightweight ontology named PGxO and a set of rules for its instantiation, which we developed as a frame for reconciling and tracing pharmacogenomics (PGx) knowledge. PGx studies how genomic variations impact variations in drug response phenotypes. Knowledge in PGx is typically composed of units that have the form of ter...
Conference Paper
We define a pattern structure whose objects are elements of a supporting ontology. In this framework, descriptions constitute trees, made of triples subject-predicate-object, and for which we provide a meaningful similarity operator. The specificity of the descriptions depends on a hyperparameter corresponding to their depth. This formalism is comp...
Conference Paper
Full-text available
This paper focuses on a framework based on Formal Concept Analysis and Pattern Structures for classifying sets of RDF triples. First, this paper proposes a method to construct a pattern structure for the classification of RDF triples w.r.t. domain knowledge. More precisely, the poset of classes representing subjects and objects and the poset...
Conference Paper
Full-text available
In this article we present a novel approach to rare sequence mining using pattern structures. Particularly, we are interested in mining closed sequences, a type of maximal sub-element which allows providing a succinct description of the patterns in a sequence database. We present and describe a sequence pattern structure model in which rare closed...
Article
Full-text available
The exponential explosion of the set of patterns is one of the main challenges in pattern mining. This challenge is approached by introducing a constraint for pattern selection. One of the first constraints proposed in pattern mining is support (frequency) of a pattern in a dataset. Frequency is an anti-monotonic function, i.e., given an infrequent...
Article
Full-text available
Functional dependencies (FDs) provide valuable knowledge on the relations between attributes of a data table. A functional dependency holds when the values of an attribute can be determined by another. It has been shown that FDs can be expressed in terms of partitions of tuples that are in agreement w.r.t. the values taken by some subsets of attrib...
Article
Full-text available
With an increased interest in machine processable data and with the progress of semantic technologies, many datasets are now published in the form of RDF triples for constituting the so-called Web of Data. Data can be queried using SPARQL, but there is still a need for integrating, classifying and exploring the data for data analysis and knowledge d...
Conference Paper
Full-text available
This article aims at presenting recent advances in Formal Concept Analysis (2010-2015), especially when the question is dealing with complex data (numbers, graphs, sequences, etc.) in domains such as databases (functional dependencies), data-mining (local pattern discovery), information retrieval and information fusion. As these advances are mainly...
Article
Full-text available
In this paper, we revisit an original proposition on pattern structures for structured sets of attributes. There are several reasons for carrying out this kind of research work. The original proposition does not give many details on the whole framework, and especially on the possible ways of implementing the similarity operation. There exists an al...
Conference Paper
Full-text available
Over the last years, computer networks have evolved into highly dynamic and interconnected environments, involving multiple heterogeneous devices and providing a myriad of services on top of them. This complex landscape has made it extremely difficult for security administrators to stay accurate and effective in protecting their systems against...
Article
Full-text available
The popularization and quick growth of Linked Open Data (LOD) have led to challenges regarding quality assessment and data exploration of the RDF triples that shape the LOD cloud. In particular, we are interested in the completeness of data and its potential to provide concept definitions in terms of necessary and sufficient conditions. In t...
Conference Paper
Full-text available
Formal concept analysis (FCA) is a well-founded method for data analysis and has many applications in data mining. Pattern structures is an extension of FCA for dealing with complex data such as sequences or graphs. However the computational complexity of computing with pattern structures is high and projections of pattern structures were introduce...
Article
Full-text available
In this paper we explore the possibility of defining an original pattern structure for managing syntactic trees. More precisely, we are interested in the extraction of relations such as drug-drug interactions (DDIs) in medical texts where sentences are represented as syntactic trees. In this specific pattern structure, called STPS, the similarity o...
Article
One of the first models to be proposed as a document index for retrieval purposes was a lattice structure, decades before the introduction of Formal Concept Analysis. Nevertheless, the main notions that we consider so familiar within the community ("extension", "intension", "closure operators", "order") were already an important part of it. In the...
Article
With an increased interest in machine processable data, more and more data is now published in RDF (Resource Description Framework) format. This RDF data is present in independent and distributed resources, which need to be centralized, navigated and searched for domain-specific applications. This paper proposes a new approach based on Formal Conce...
Conference Paper
Full-text available
In pattern mining, the main challenge is the exponential explosion of the set of patterns. Typically, to solve this problem, a constraint for pattern selection is introduced. One of the first constraints proposed in pattern mining is support (frequency) of a pattern in a dataset. Frequency is an anti-monotonic function, i.e., given an infrequent pa...
Article
Full-text available
This study introduces a novel method that automatically extracts potential structural alerts from a dataset of molecules. These triggering structures can be further used for knowledge discovery and for classification purposes. Computation of the structural alerts results from an implementation of a sophisticated workflow whi...
Article
Full-text available
Nowadays, datasets are available in very complex and heterogeneous forms. Mining of such data collections is essential to support many real-world applications ranging from healthcare to marketing. In this work, we focus on the analysis of "complex" sequential data by means of interesting sequential patterns. We approach the problem using the elegant...
Article
Full-text available
Computing functional dependencies from a relation is an important database topic, with many applications in database management, reverse engineering and query optimization. Whereas it has been deeply investigated in those fields, strong links exist with the mathematical framework of Formal Concept Analysis. Considering the discovery of functional d...
Article
Full-text available
Given a relation 𝓡 ⊆ 𝓞 × 𝓐 on a set 𝓞 of objects and a set 𝓐 of attributes, the AOC-poset (Attribute/Object Concept poset), is the partial order defined on the “introducers” of objects and attributes in the corresponding concept lattice. In this paper, we present Hermes, a simple and efficient algorithm for building an AOC-poset which runs in O(m i...
Article
Case-based reasoning relies on four main steps: retrieval, adaptation, revision and retention. This article focuses on the adaptation step; we propose differential adaptation as an operational formalization of adaptation for numerical problems. The solution to a target problem is designed on the basis of relations existing between a source case (pr...
Article
Full-text available
Data mining aims at finding interesting patterns from datasets, where "interesting" means reflecting intrinsic dependencies in the domain of interest rather than just in the dataset. Concept stability is a popular relevancy measure in FCA but its behaviour has never been studied on various datasets. In this paper we propose an approach to study th...
Article
Full-text available
This volume includes the proceedings of the fourth edition of the FCA4AI ("What can FCA do for Artificial Intelligence?") Workshop co-located with the IJCAI 2015 Conference in Buenos Aires (Argentina). Formal Concept Analysis (FCA) is a mathematically well-founded theory aimed at data analysis and classification. FCA allows one to build a concept l...
Article
In this work we present a novel technique for exhaustive bicluster enumeration using formal concept analysis (FCA). Particularly, we use pattern structures (an extension of FCA dealing with complex data) to mine similar row/column biclusters, a specialization of biclustering when attribute values have coherent variations. We show how biclustering c...
Article
Full-text available
All domains of science and technology produce large and heterogeneous data. Although much work has been done in this area, mining such data is still a challenge. No previous research targets the mining of heterogeneous multidimensional sequential data. In this work, we present a new approach to extract heterogeneous multidimensional sequential patt...
Article
In this work we introduce a novel technique to enumerate constant row/column value biclusters using formal concept analysis. To achieve this, a numerical data-table (standard input for biclustering algorithms) is modelled as a many-valued context where rows represent objects and columns represent attributes. Using equivalence relations defined for...
Article
Full-text available
Computing the similarity between sequences is a very important challenge for many different data mining tasks. There is a plethora of similarity measures for sequences in the literature, most of them being designed for sequences of items. In this work, we study the problem of measuring the similarity between sequences of itemsets. We focus on the n...
Book
This chapter presents several modes of reasoning based on analogy and similarity. Case-based reasoning consists in searching for cases (problem-solving episodes) similar to the problem to be solved and adapting them to solve this problem. Recent developments have focused on analogy and analogical proportion...
Conference Paper
In this review paper, we present some recent results on the characterization of Functional Dependencies and variations with the formalism of Pattern Structures and Formal Concept Analysis. Although these dependencies have been paramount in database theory, they have been used in different fields: artificial intelligence and knowledge discovery, amo...
Article
Sequential pattern mining is aimed at extracting correlations among temporal data. Many different methods were proposed to either enumerate sequences of set valued data (i.e., itemsets) or sequences containing dimensional items. However, in real-world scenarios, data sequences are described as a combination of both multidimensional items and itemsets...
Article
Full-text available
Biclustering numerical data became a popular data-mining task in the early 2000s, especially for gene expression data analysis and recommender systems. A bicluster reflects a strong association between a subset of objects and a subset of attributes in a numerical object/attribute data-table. So-called biclusters of similar values can be tho...