Fabio Fassetti

  • Ph. D.
  • University of Calabria

About

85
Publications
7,355
Reads
989
Citations
Current institution
University of Calabria
Additional affiliations
January 2007 - present
University of Calabria
Education
May 2004 - May 2004
University of Calabria
Field of study
  • Computer Engineering

Publications (85)
Article
Full-text available
Explainable AI refers to techniques by which the reasons underlying decisions taken by intelligent artifacts are singled out and provided to users. Outlier detection is the task of individuating anomalous objects within a given data population they belong to. In this paper we propose a new technique to explain why a given data object has been single...
Article
Full-text available
\({{\textrm{Latent}}Out}\) is a recently introduced algorithm for unsupervised anomaly detection which enhances latent space-based neural methods, namely (Variational) Autoencoders, GANomaly and ANOGan architectures. The main idea behind it is to exploit both the latent space and the baseline score of these architectures in order...
Chapter
Explainable AI refers to techniques by which the reasons underlying decisions taken by intelligent artifacts are singled out and provided to users. Outlier detection is the task of individuating anomalous objects within a given data population they belong to. In this paper we propose a new technique to explain why a given data object has been single...
Preprint
Reconstruction error-based neural architectures constitute a classical deep learning approach to anomaly detection which has shown great performances. It consists in training an Autoencoder to reconstruct a set of examples deemed to represent the normality and then to point out as anomalies those data that show a sufficiently large reconstruction e...
Article
Full-text available
Motivation An interesting problem is to study how gene co-expression varies in two different populations, associated with healthy and unhealthy individuals, respectively. To this aim, two important aspects should be taken into account: (i) in some cases, pairs/groups of genes show collaborative attitudes, emerging in the study of disorders and dise...
Chapter
In recent years, deep learning approaches to anomaly detection have become very popular. In most of the first methods the paradigm is to train neural networks initially designed for compression (Auto Encoders) or data generation (GANs) and to detect anomalies as a collateral result. Recently new architectures have been introduced in which the express...
Chapter
Question Answering (QA) is a critical NLP task mainly based on deep learning models that allow users to answer questions in natural language and get a response. Since available general-purpose datasets are often not effective enough to suitably train a QA model, one of the main problems in this context is related to the availability of datasets whi...
Chapter
\({{\textrm{Latent}}Out}\) is a recently introduced algorithm for unsupervised anomaly detection which enhances latent space-based neural methods, namely (Variational) Autoencoders, GANomaly and ANOGan architectures. The main idea behind it is to exploit both the latent space and the baseline score of these architectures in order to provide a refin...
Chapter
The scientific impact of researchers is often evaluated based on the citations they receive from others; thus the definition of citation metrics has long been analysed to find criteria able to jointly consider the quantity and the quality of the exchanged citations. Here, we propose a network based approach aimed at estimating the researchers i...
Chapter
Given a database and one single anomalous data point, the Outlying Aspect Mining problem consists in explaining the abnormality of that data point w.r.t. the data population stored in the input database. Thus, the problem requires the discovery of the sets of attributes and associated values that account for the abnormality of a data point within a...
Article
Full-text available
Tachistoscopes are devices that display a word for several seconds and ask the user to write down the word. They have been widely employed to increase recognition speed, to increase reading comprehension and, especially, to individuate reading difficulties and disabilities. Once the therapist is provided with the answers of the patients, a challeng...
Article
Reasoning with minimal models is at the heart of many knowledge representation systems. Yet, it turns out that this task is formidable even when very simple theories are considered. It is, therefore, crucial to devise methods that attain good performances in most cases. To this end, a path to follow is to find ways to break the task at hand into se...
Article
Full-text available
Anomaly detection methods exploiting autoencoders (AE) have shown good performances. Unfortunately, deep non-linear architectures are able to perform high dimensionality reduction while keeping reconstruction error low, thus worsening outlier detecting performances of AEs. To alleviate the above problem, recently some authors have proposed to explo...
Article
Full-text available
In this work we deal with the problem of detecting and explaining anomalous values in categorical datasets. We take the perspective of perceiving an attribute value as anomalous if its frequency is exceptional within the overall distribution of frequencies. As a first main contribution, we provide the notion of frequency occurrence. This measure c...
Chapter
Active Learning is a machine learning scenario in which methods are trained by iteratively submitting a query to a human expert and then taking into account his feedback for the following computations. The application of such paradigm to the anomaly detection task takes the name of Active Anomaly Detection (AAD). Reinforcement Learning describes a...
Chapter
Among the XAI (eXplainable Artificial Intelligence) techniques, local explanations are witnessing increasing interest due to the user need to trust specific black-box decisions. In this work we explore a novel local explanation approach applicable to any kind of classifier based on generating masking models. The idea underlying the method is to lear...
Article
In the last few years, the interactions among competing endogenous RNAs (ceRNAs) have been recognized as a key post-transcriptional regulatory mechanism in cell differentiation, tissue development, and disease. Notably, such sponge phenomena subtracting active microRNAs from their silencing targets have been recognized as having a potential oncosu...
Chapter
Explaining predictions of classifiers is a fundamental problem in eXplainable Artificial Intelligence (XAI). LIME (for Local Interpretable Model-agnostic Explanations) is a recently proposed XAI technique able to explain any classifier by providing an interpretable model which approximates the black-box locally to the instance under consideration....
Chapter
Datasets from different domains usually contain data defined over a wide set of attributes or features linked through correlation relationship. Moreover, there are some applications in which not all the attributes should be treated in the same fashion as some of them can be perceived like independent variables that are responsible for the definitio...
Chapter
Finding outliers in networks is a central task in different application domains. Here, we exploit the stochastic block model framework to study the network from a generative point of view and design a score able to highlight those nodes whose connection with the rest of the network violates in some way the law according to which the rest of the nod...
Article
Full-text available
Enabling information systems to face anomalies in the presence of uncertainty is a compelling and challenging task. In this work the problem of unsupervised outlier detection in large collections of data objects modeled by means of arbitrary multidimensional probability density functions is considered. We present a novel definition of uncertain dis...
Chapter
Anomaly detection methods exploiting autoencoders (AE) have shown good performances. Unfortunately, deep non-linear architectures are able to perform high dimensionality reduction while keeping reconstruction error low, thus worsening outlier detecting performances of AEs. To alleviate the above problem, recently some authors have proposed to explo...
Chapter
This work addresses the problem of helping speech therapists in interpreting results of tachistoscopes. These are instruments widely employed to diagnose speech and reading disorders. Roughly speaking, they work as follows. During a session, some strings of letters, which may or not correspond to existing words, are displayed to the patient for an...
Chapter
In this work we deal with the problem of detecting and explaining exceptional behaving values in categorical datasets. As a first main contribution we provide the notion of frequency occurrence which can be thought as a form of Kernel Density Estimation applied to the domain of frequency values. As a second contribution, we define an outlierness me...
Chapter
Stuttering is a widespread speech disorder involving about the of the population and the of children under the age of 5. Much work in literature studies causes, mechanisms and epidemiology and much work is devoted to illustrate treatments, prognosis and how to diagnose stutter. Relevantly, a stuttering evaluation requires the skills of a multi-dime...
Article
Full-text available
Background RNA editing is an important mechanism for gene expression in plant organelles. It alters the direct transfer of genetic information from DNA to proteins, due to the introduction of differences between RNAs and the corresponding coding DNA sequences. Software tools successful for the search of genes in other organisms not always are able...
Chapter
The ADBIS conferences provide an international forum for the presentation of research on database theory, development of advanced DBMS technologies, and their applications. The 22nd edition of ADBIS, held on September 2–5, 2018, in Budapest, Hungary, includes six thematic workshops collecting contributions from various domains representing new tren...
Chapter
This paper proposes a platform for achieving accountability across distributed business processes involving heterogeneous entities that need to establish various types of agreements in a standard way. The devised solution integrates blockchain and digital identity technologies in order to exploit the guarantees about the authenticity of the involve...
Chapter
The enormous growth of information available in database systems has led to a significant development of techniques for knowledge discovery. At the heart of the knowledge discovery process is the application of data mining algorithms in charge of extracting hidden relationships among pieces of stored information. Information thus extracted from dat...
Conference Paper
Tachistoscopes are devices that display a word for several seconds and ask the user to write down the word. They have been widely employed to increase recognition speed, to increase reading comprehension and, especially, to individuate reading difficulties and disabilities. Once the therapist is provided with the answers of the patients, a challengi...
Preprint
Full-text available
In plant mitochondria an essential mechanism for gene expression is RNA editing, often influencing the synthesis of functional proteins. RNA editing alters the linearity of genetic information transfer, introducing differences between RNAs and their coding DNA sequences that hinder both experimental and computational research of genes. Thus common...
Preprint
Full-text available
In plant mitochondria an essential mechanism for gene expression is RNA editing, often influencing the synthesis of functional proteins. RNA editing alters the linearity of genetic information transfer, introducing differences between RNAs and their coding DNA sequences that hinder both experimental and computational research of genes. Thus common...
Chapter
Here we consider the problem of mining gene expression data in order to single out interesting features characterizing healthy/ unhealthy samples of an input dataset. The presented approach is based on a network model of the input gene expression data, where there is a labeled graph for each sample. This is the first attempt to build a different gr...
Chapter
Biological networks rely on the storage and retrieval of data associated to the physical interactions and/or functional relationships among different actors. In particular, the attention may be on the interactions among cellular components, such as proteins, genes, RNA, or for example on phenotype–genotype associations. Data from which biological n...
Chapter
When biological networks are considered, the extraction of interesting knowledge often involves subgraphs isomorphism check that is known to be NP-complete. For this reason, many approaches try to simplify the problem under consideration by considering structures simpler than graphs, such as trees or paths. Furthermore, the number of existing appro...
Chapter
This chapter is devoted to a discussion on exceptional pattern discovery, namely on scenarios, contexts, and techniques concerning the mining of patterns which are so rare or so frequent to be considered as exceptional and, then, of interest for an expert to shed lights on the domain. Frequent patterns have found broad applications in areas like as...
Conference Paper
We show that minimal models of positive propositional theories can be decomposed based on the structure of the dependency graph of the theories. This observation can be useful for many applications involving computation with minimal models. As an example of such benefits, we introduce new algorithms for minimal model finding and checking that are b...
Article
Full-text available
The outlying property detection problem is the problem of discovering the properties distinguishing a given object, known in advance to be an outlier in a database, from the other database objects. In this paper, we analyze the problem within a context where numerical attributes are taken into account, which represents a relevant case left open in...
Book
This work provides a review of biological networks as a model for analysis, presenting and discussing a number of illuminating analyses. Biological networks are an effective model for providing insights about biological mechanisms. Networks with different characteristics are employed for representing different scenarios. This powerful model allows...
Article
We consider the problem of mining gene expression data in order to single out interesting features that characterize healthy/unhealthy samples of an input dataset. We present an approach based on a network model of the input gene expression data, where there is a labelled graph for each sample. To the best of our knowledge, this is the first attem...
Conference Paper
We present a technique for node anomaly detection in networks where arcs are annotated with time of creation. The technique aims at singling out anomalies by taking simultaneously into account information concerning both the structure of the network and the order in which connections have been established. The latter information is obtained by time...
Conference Paper
We consider the problem of mining gene expression data in order to single out interesting features characterizing healthy/unhealthy samples of an input dataset. We present an approach based on a network model of the input gene expression data, where there is a labelled graph for each sample. To the best of our knowledge, this is the first attempt t...
Article
In this work, we introduce a novel definition of outlier, namely the Gradient Outlier Factor (or GOF), with the aim to provide a definition that unifies with the statistical one on some standard distributions but has a different behavior in the presence of mixture distributions. Intuitively, the GOF score measures the probability to stay in the nei...
Conference Paper
In plant mitochondria an essential mechanism for gene expression is RNA editing, often influencing the synthesis of functional proteins. RNA editing alters the linearity of genetic information transfer. Indeed it causes differences between RNAs and their coding DNA sequences that hinder both experimental and computational research of genes. Therefo...
Article
We present a novel definition of outlier whose aim is to embed an available domain knowledge in the process of discovering outliers. Specifically, given a background knowledge, encoded by means of a set of first-order rules, and a set of positive and negative examples, our approach aims at singling out the examples showing abnormal behavior. The te...
Conference Paper
We consider the problem of unsupervised outlier detection in large collections of data objects when objects are modeled by means of arbitrary multidimensional probability density functions. Specifically, we present a novel definition of outlier in the context of uncertain data under the attribute level uncertainty model, according to which an uncer...
Article
Designing algorithms capable of efficiently constructing minimal models of CNFs is an important task in AI. This paper provides new results along this research line and presents new algorithms for performing minimal model finding and checking over positive propositional CNFs and model minimization over propositional CNFs. An algorithmic schema, cal...
Conference Paper
Determining a good set of pivots is a challenging task for metric space indexing. Several techniques to select pivots from the data to be indexed have been introduced in the literature. In this paper, we propose a pivot placement strategy which exploits the natural data orientation in order to select space points which achieve a good alignment wit...
Article
We consider the problem of discovering attributes, or properties, accounting for the a priori stated abnormality of a group of anomalous individuals (the outliers) with respect to an overall given population (the inliers). To this aim, we introduce the notion of exceptional property and define the concept of exceptionality score, which measures the...
Article
This work deals with the problem of classifying uncertain data. With this aim we introduce the Uncertain Nearest Neighbor (UNN) rule, which represents the generalization of the deterministic nearest neighbor rule to the case in which uncertain objects are available. The UNN rule relies on the concept of nearest neighbor class, rather than on that o...
Article
In this study, we deal with the problem of efficiently answering range queries over uncertain objects in a general metric space. In this study, an uncertain object is an object that always exists but its actual value is uncertain and modeled by a multivariate probability density function. As a major contribution, this is the first work providing an...
Conference Paper
We present L-SME, a system to efficiently identify loosely structured motifs in genome-wide applications. L-SME is innovative in three aspects. Firstly, it handles wider classes of motifs than earlier motif discovery systems, by supporting boxes swaps and skips in the motifs structure as well as various kinds of similarity functions. Secondly, in a...
Article
Full-text available
This work deals with the problem of classifying uncertain data. With this aim the Uncertain Nearest Neighbor (UNN) rule is here introduced, which represents the generalization of the deterministic nearest neighbor rule to the case in which uncertain objects are available. The UNN rule relies on the concept of nearest neighbor class, rather than on...
Article
Full-text available
Plants have played a special role in inositol polyphosphate (IP) research since the first IP, the fully phosphorylated inositol ring of phytic acid (IP6), was discovered in plant seeds. It is now known that phytic acid is further metabolized by the IP6 Kinases (IP6Ks) to generate IP containing pyro-phosphate moiety. The IP6K are evolutionary conserv...
Data
Full-text available
Accession numbers of the genes referred in the figures.
Conference Paper
Full-text available
IP6 Kinases (IP6Ks) are important mammalian enzymes involved in inositol phosphates metabolism. Although IP6Ks have not yet been identified in plant chromosomes, there are many clues suggesting that the corresponding gene might be found in plant mtDNA, encrypted and hidden by virtue of editing and/or trans-splicing processes. In this paper, we prop...
Conference Paper
Full-text available
A new technique, SNIPER, is proposed for learning a model that deals with continuous values of exceptionality. Specifically, given some training objects associated with a continuous attribute F, SNIPER induces a rule-based model for the identification of those objects likely to score the maximum values for F. The purpose of SNIPER differs from the...
Article
This work proposes a method for detecting distance-based outliers in data streams under the sliding window model. The novel notion of one-time outlier query is introduced in order to detect anomalies in the current window at arbitrary points-in-time. Three algorithms are presented. The first algorithm exactly answers to outlier queries, but has lar...
Conference Paper
Assume a population partitioned in two subpopulations, e.g. a set of normal individuals and a set of abnormal individuals, is given. Assume, moreover, that we look for a characterization of the reasons discriminating one subpopulation from the other. In this paper, we provide a technique by which such an evidence can be mined, by introducing the no...
Article
Full-text available
Head-elementary-set-free (HEF) programs were proposed in (Gebser et al. 2007) and shown to generalize over head-cycle-free programs while retaining their nice properties. It was left as an open problem in (Gebser et al. 2007) to establish the complexity of identifying HEF programs. This note solves the open problem by showing that the problem is co...
Conference Paper
Full-text available
In this paper we describe an experience resulting from the collaboration among data mining researchers, domain experts of the Italian revenue agency, and IT professionals, aimed at detecting fraudulent VAT credit claims. The outcome is an auditing methodology based on a rule-based system, which is capable of trading among conflicting issues, such a...
Conference Paper
We present a novel definition of outlier in the context of inductive logic programming. Given a set of positive and negative examples, the definition aims at singling out the examples showing anomalous behavior. We note that the task here pursued is different from noise removal, and, in fact, the anomalous observations we discover are different in...
Article
Assume you are given a data population characterized by a certain number of attributes. Assume, moreover, you are provided with the information that one of the individuals in this data population is abnormal, but no reason whatsoever is given to you as to why this particular individual is to be considered abnormal. In several cases, you will be ind...
Article
In this work a novel distance-based outlier detection algorithm, named DOLPHIN, working on disk-resident datasets and whose I/O cost corresponds to the cost of sequentially reading the input dataset file twice, is presented. It is both theoretically and empirically shown that the main memory usage of DOLPHIN amounts to a small fraction of the datas...
Article
Full-text available
In this work a novel distance-based outlier detection algorithm, named DOLPHIN, working on disk-resident datasets and whose I/O cost corresponds to the cost of sequentially reading the input dataset file twice, is presented. It is both theoretically and empirically shown that the main memory usage of DOLPHIN amounts to a small fraction of the datas...
Article
The discovery of information encoded in biological sequences is assuming a prominent role in identifying genetic diseases and in deciphering biological mechanisms. This information is usually encoded in patterns frequently occurring in the sequences, also called motifs. In fact, motif discovery has received much attention in the literature, and sev...
Conference Paper
In this work we propose an unsupervised data cleaning method whose goal is to single out possibly erroneous entries in a database with textual fields. The method is particularly useful when no domain information is available about the correctness of the individual entries. With this aim, an unsupervised outlier detection like technique is proposed...
Conference Paper
Full-text available
In this work a method for detecting distance-based outliers in data streams is presented. We deal with the sliding window model, where outlier queries are performed in order to detect anomalies in the current window. Two algorithms are presented. The first one exactly answers outlier queries, but has larger space requirements. The second...
Conference Paper
In this work a novel algorithm, named DOLPHIN, for detecting distance-based outliers is presented. The proposed algorithm performs only two sequential scans of the dataset. It needs to store into main memory a portion of the dataset, to efficiently search for neighbors and early prune inliers. The strategy pursued by the algorithm allows to keep th...
Conference Paper
Functional dependencies (FDs) are an integral part of relational database theory since they are used in integrity enforcement and in database design. Despite their importance FDs are often not specified or some of them are not expected by database designers, but they occur in the data and the need of inferring them from data arises. Furthermore, in...
Conference Paper
Full-text available
Functional dependencies (FDs) are an integral part of database theory since they are used in integrity enforcement and in database design. Recently, functional dependencies satisfied by XML data (XFDs) have been introduced. In this work approximate functional dependencies that are XFDs approximately satisfied by a considerable part of the XML d...
Conference Paper
Full-text available
In the last few years, the completion of the human genome sequencing showed up a wide range of new challenging issues involving raw data analysis. In particular, the discovery of information implicitly encoded in biological sequences is assuming a prominent role in identifying genetic diseases and in deciphering biological mechanisms. This informat...
