About
Publications: 100
Reads: 8,832
Citations: 575
Introduction
My research work mainly focuses on two fields: machine learning, with special emphasis on learning algorithms dealing with data quality, and computing education for pupils in the pre-university system. I work at the University of Milan, where I teach Operating Systems, Big Data Analytics, Simulation, Computer Programming 3, and Computing Didactics. I am also in charge of the vocational orientation activities of the Science and Technology division of the University.
Current institution
Additional affiliations
February 2011 - present
October 2002 - January 2011
June 2001 - October 2002
Education
January 1997 - February 2000
October 1990 - February 1996
Publications (100)
We propose an algorithm for inferring membership functions of fuzzy sets by exploiting a procedure originated in the realm of support vector clustering. The available data set consists of points associated with a quantitative evaluation of their membership degree to a fuzzy set. The data are clustered in order to form a core gathering all points de...
We cope with the key step of bootstrap methods of generating a possibly infinite sequence of random data preserving properties of the distribution law, starting from a primary sample actually drawn from this distribution. We solve this task in a cooperative way within a community of generators where each improves its performance from the analysis o...
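The cooperative community of generators is specific to the paper, but the elementary bootstrap step it builds on — drawing a new sample with replacement from a primary sample so that distributional properties are preserved — can be sketched in a few lines (a minimal illustration, not the paper's algorithm; all data below are made up):

```python
import numpy as np

def bootstrap_resample(sample, n_draws, rng=None):
    """Draw n_draws points with replacement from a primary sample.

    Only the elementary bootstrap step; the cooperative community
    of generators described in the abstract is not reproduced here.
    """
    rng = np.random.default_rng(rng)
    sample = np.asarray(sample)
    idx = rng.integers(0, len(sample), size=n_draws)
    return sample[idx]

primary = np.array([2.1, 3.5, 1.8, 4.0, 2.9])
replica = bootstrap_resample(primary, n_draws=1000, rng=0)
# The replica only reuses primary values, yet statistics such as the
# mean track those of the underlying distribution.
```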
We propose a variant of two SVM regression algorithms expressly tailored in order to exploit additional information summarizing the relevance of each data item, as a measure of its relative importance w.r.t. the remaining examples. These variants, enclosing the original formulations when all data items have the same relevance, are preliminary teste...
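The relevance-weighted SVM variants themselves are not reproduced here; as a much simpler stand-in, the same idea can be illustrated with weighted least squares, which likewise reduces to the unweighted formulation when all relevance values coincide (data and weights below are illustrative only):

```python
import numpy as np

def weighted_least_squares(X, y, w):
    """Fit y ~ X @ beta, weighting each example by its relevance w.

    When all weights are equal this reduces to ordinary least
    squares, mirroring how the SVM variants enclose the original
    formulation in the equal-relevance case.
    """
    W = np.diag(w)
    return np.linalg.solve(X.T @ W @ X, X.T @ W @ y)

# Column of ones for the intercept, then the regressor value.
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([0.1, 1.0, 2.1, 9.0])          # the last point is an outlier
uniform = weighted_least_squares(X, y, np.ones(4))
down    = weighted_least_squares(X, y, np.array([1.0, 1.0, 1.0, 0.05]))
# Down-weighting the outlier pulls the slope back toward 1.
```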
We discuss a procedure which extracts statistical and entropic information from data in order to discover Boolean rules underlying them. We work within a granular computing framework where logical implications between statistics on the observed sample and properties on the whole data population are stressed in terms of both probabilistic and possib...
We formulate a new family of bootstrap algorithms suitable for learning non-Boolean functions from data. Within the Algorithmic Inference framework, the key idea is to consider a population of functions that are compatible with the observed sample. We generate items of this population from standard random seeds and reverse seed probabilities on the...
This paper presents a teaching methodology mixing elements from the domains of music and informatics as a key enabling to expose primary school pupils to basic aspects of computational thinking. This methodology is organized in two phases exploiting LEGO® bricks respectively as a physical tool and as a metaphor in order to let participants discover...
We present a new classification method for Bebras tasks based on the ISTE/CSTA operational definition of computational thinking. The classification can be appreciated by teachers without a formal education in informatics and it helps in detecting the cognitive skills involved by tasks, and makes their educational potential more explicit.
A major challenge in bio-medicine is finding the genetic causes of human diseases, and researchers are often faced with a large number of candidate genes. Gene prioritization methods provide a valuable support in guiding researchers to detect reliable candidate causative-genes for a disease under study. Indeed, such methods rank genes according to...
Negative examples in automated protein function prediction (AFP), that is proteins known not to possess a given protein function, are usually not directly stored in public proteome and genome databases, such as the Gene Ontology database. Nevertheless, most computational methods need negative examples to infer new predictions. A variety of algorith...
Recently, several actions aimed at introducing informatics concepts to young students have been proposed. Among these, the “Hour of Code” initiative addresses a wide audience in several countries worldwide, with the goal of giving everyone the opportunity to learn computer science. This paper compares Hour of Code with an alternative, yet similar,...
This paper analyses the results of the 2014 edition of the Italian Bebras/Kangourou contest, exploiting the Item Response Theory statistical methodology in order to infer the difficulty of each of the proposed tasks starting from the scores attained by the participants. Such kind of analysis, enabling the organizers of the contest to check whether...
In order to introduce informatics concepts to students of Italian secondary schools, we devised a number of interactive workshops conceived for pupils aged 10--17. Each workshop is intended to give pupils the opportunity to explore a computer science topic: investigate it firsthand, make hypotheses that can then be tested in a guided context during...
In this paper, after discussing the state of computer science education in Italy, we report our experiments in introducing informatics concepts to students of Italian secondary schools through a mix of tangible and abstract object manipulations: a strategy which we call \emph{algomotricity}. The goal we set ourselves was to let pupils discover the f...
We propose a Support Vector-based methodology for learning classifiers from partially labeled data. Its novelty lies in a formulation not based on the cluster hypothesis, which states that learning algorithms should search among classifiers whose decision surface is far from the unlabeled points. On the contrary, we assume such points as specimens of...
This paper describes how the classification of imbalanced datasets through support vector machines using the boundary movement method can be easily explained in terms of a cost-sensitive learning algorithm characterized by giving each example a cost as a function of its class. Moreover, it is shown that under this interpretation the boundary movement...
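The cost/boundary equivalence can be illustrated in a simpler probabilistic setting (a hedged sketch, not the paper's SVM formulation): under class-dependent misclassification costs, the Bayes rule predicts the positive class when p(x) >= c_fp / (c_fp + c_fn), so changing the cost ratio is exactly a movement of the decision boundary.

```python
import numpy as np

def decide(p, c_fp, c_fn):
    """Cost-sensitive Bayes decision: predict positive when the
    positive-class probability exceeds c_fp / (c_fp + c_fn).
    c_fp = cost of a false positive, c_fn = cost of a false negative."""
    return (p >= c_fp / (c_fp + c_fn)).astype(int)

p = np.array([0.2, 0.4, 0.6, 0.8])
print(decide(p, c_fp=1, c_fn=1))   # symmetric costs, threshold 0.5 -> [0 0 1 1]
print(decide(p, c_fp=1, c_fn=3))   # costly false negatives, threshold 0.25 -> [0 1 1 1]
```

Raising the false-negative cost moves the boundary toward the minority class, which is the usual remedy for imbalanced data.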
In this paper we report on our experiments in teaching computer science concepts with a mix of tangible and abstract object manipulations. The goal we set ourselves was to let pupils discover the challenges one has to meet to automatically manipulate formatted text. We worked with a group of 25 secondary school pupils (9-10th grade), and they were...
We describe a teaching activity about word-processors we proposed to a group of 25 pupils in 9th/10th grades of an Italian secondary school. While the pupils had some familiarity with word-processor operations, they had had no formal instruction about the automatic elaboration of formatted texts. The proposed kinesthetic/tactile activities turned o...
We deal with a special class of games against nature which correspond to subsymbolic learning problems where we know a local descent direction in the error landscape but not the amount gained at each step of the learning procedure. Namely, Alice and Bob play a game where the probability of victory grows monotonically by unknown amounts with the res...
We discuss a bridge way of inference between Agnostic Learning and Prior Knowledge based on an inference goal represented not by the attainment of truth but simply by a suitable organization of the knowledge we have accumulated on the observed data. In a framework where this knowledge is not definite, we smear it across a series of possible models...
The aim of this paper is to analyse the phenomenon of accuracy degradation in the samples given as input to SVM classification algorithms. In particular, the effect of accuracy degradation on the performance of the learnt classifiers is investigated and compared, if possible, with theoretical results. The study shows how a family of SVM classificat...
We introduce a regression method that fully exploits both global and local information about a set of points in search of a suitable function explaining their mutual relationships. The points are assumed to form a repository of information granules. At a global level, statistical methods discriminate between regular points and outliers. Then the lo...
Learning algorithms consider a sample consisting of pairs (pattern, label) and output a decision rule, possibly: (i) associating each pattern with the corresponding label, and (ii) generalizing to new patterns drawn from the same distribution of the original sample. This work proposes a set of methodologies to be applied to existing learning strate...
We formulate a new family of bootstrap algorithms suitable for learning non-Boolean functions from data. Within the Algorithmic Inference framework, the key idea is to consider a population of functions that are compatible with the observed sample. We generate items of this population from standard random seeds and reverse seed probabilities on the...
We propose the use of clustering methods in order to discover the quality of each element in a training set to be subsequently fed to a regression algorithm. The paper shows that these methods, used in combination with regression algorithms taking into account the additional information conveyed by this kind of quality, allow the attainment of high...
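A minimal sketch of such a pipeline, with a simple stand-in for each stage (two-means on absolute residuals as the clustering step, weighted least squares as the quality-aware regressor; the paper's actual algorithms may differ):

```python
import numpy as np

def kmeans2_1d(v, iters=50):
    """Two-means on a 1-D array; returns a boolean mask of the low cluster."""
    c = np.array([v.min(), v.max()], dtype=float)
    for _ in range(iters):
        low = np.abs(v - c[0]) <= np.abs(v - c[1])
        c = np.array([v[low].mean(), v[~low].mean()])
    return low

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 40)
y = 2 * x + rng.normal(0, 0.05, 40)
y[::10] += 3.0                                     # a few corrupted points

A = np.c_[np.ones_like(x), x]
beta0 = np.linalg.lstsq(A, y, rcond=None)[0]       # rough unweighted fit
resid = np.abs(y - A @ beta0)
good = kmeans2_1d(resid)                           # quality found by clustering
w = np.where(good, 1.0, 1e-3)                      # down-weight low-quality points
W = np.diag(w)
beta = np.linalg.solve(A.T @ W @ A, A.T @ W @ y)   # quality-weighted refit
# beta recovers a slope close to the true value 2.
```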
A traditional way of introducing the inference facility in operational contexts is through the match box metaphor. You buy your box and wonder how many matches will fire and how many will not. You cannot check them all: otherwise you would be satisfied with your knowledge but unable to use the obtained information on the current box, because it would be empty. Thu...
In Chapters 6 and 8 we discussed separately symbolic rules for identifying membership functions to cluster and subsymbolic rules to learn functions that suitably translate a set of inputs into an output variable. Here we look for complete procedures performing both tasks, having the final aim of computing suitable functions from input to output. The...
Consider a highly nonlinear function like the one in Fig. 10.1 as the linear fit of experimental points describing the relationship between variables x and y. Wishing to understand this relationship, we may decide to express it by a 5th-degree polynomial, whose fit looks like Fig. 10.1(a). A better way is to identify three fuzzy sets centere...
We focus on the development of fuzzy sets by presenting various ways of designing fuzzy sets and determining their membership functions.
In its primary form of connectionist paradigms, social computation arose in the eighties of the last century with the role of providing an answer to the failure of parallel computation as a general way of overcoming the speed limits of sequential computers. At that time these limits were mainly theoretical, encountered when people did try to s...
Learning, as a single-value inference facility, is unavoidably connected with an optimization problem. You fix a cost function compliant with your probability model, if any, and look for a solution in the parameter space that minimizes the function up to a given approximation. You search in the parameter space since you want to assess a function to...
An obvious strategy you may follow in grouping objects is that objects located on the same shelf are very similar to each other, whilst objects belonging to different shelves are very dissimilar. These groups of objects will be named clusters, understanding that they become classes once we acknowledge their utility and reward them for this by givin...
Wishing to use the fuzzy set models described in the previous chapter, we face a problem similar to the one typically encountered in the probabilistic framework: adapting models to the operational problem at hand. This passes through the identification of the model's free parameters from a set of experimental data, a task that we call estimation by analo...
Hansel was a very smart kid. Unable to bring with him a ton of pebbles to mark the track from the house into the woods, he used them as granules of information suggesting the way. Even better, he exploited their luminescence to give them an order, hence dealing with them as a sample of the road.
We will assume the mechanism shown in Fig. 2.1 as the mother of all samples. You may see the contrivance in its center as a spring generating standard random numbers for free, which are transmuted into numbers following a given distribution law after having passed through the gears of the mechanism. By default, we assume that the standard numbers ar...
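The "spring plus gears" mechanism corresponds to inverse-transform sampling: uniform seeds are pushed through the inverse cumulative distribution function of the target law. A minimal sketch for the exponential distribution (chosen here only as an example of a target law):

```python
import numpy as np

rng = np.random.default_rng(0)
u = rng.random(100_000)            # standard random numbers: "the spring"
lam = 2.0
# The "gears": the inverse CDF of Exponential(lam) is F^{-1}(u) = -ln(1-u)/lam.
x = -np.log1p(-u) / lam            # transmuted sample
print(round(x.mean(), 2))          # close to the theoretical mean 1/lam = 0.5
```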
We may organize the search for the metric d at the basis of the cost function ℓ into two main steps: search for a sound representation of the data and use of a metric appropriate to the representation. The term sound stands for a representation allowing a better understanding of the data, for instance by decoupling original signals, removing noise, d...
While in Part V we have been engaged in assembling computational tools previously assessed to build a complete system from signals to decisions, in this part we close the book with some examples of how to merge conceptual tools. The aim is to render them adequate to face complex data within elementary tasks. The strategy is to drill some either ana...
We devise an SVM for partitioning a sample space affected by random binary labels. In the hypothesis that a smooth, possibly symmetric, conditional label distribution graduates the passage from the all-0-label domain to the all-1-label domain, and under other regularity conditions, the algorithm supplies an estimate of the above probabilities. Withi...
We propose a modified SVM algorithm for the classification of data augmented with an explicit quality quantification for each example in the training set. As the extension to nonlinear decision functions through the use of kernels leads to a non-convex optimization problem, we develop an approximate solution. Finally, the proposed approach is applied...
We propose an aging mechanism which develops in artificial bacterial populations fighting against antibiotic molecules. The mechanism is based on very elementary information gathered by each individual, and on elementary reactions as well. Though we do not interpret the aging process in strictly biological terms, it appears compliant with recent studie...
Regression theory is the hat of various methodologies for approximating a function whose analytical form is typically known up to a finite number of parameters. We use the algorithmic inference statistical framework to find regions where nonlinear functions underlying samples are totally included with a given confidence. The key point is to conside...
We develop a hybrid strategy combining truth-functionality, kernels, support vectors and regression to construct highly informative regression curves. The idea is to use statistical methods to form a confidence region for the line and then exploit the structure of the sample data falling in this region to identify the most fitting curve. The fitn...
We deal with a complex game between Alice and Bob where each contender’s probability of victory grows monotonically by unknown amounts with the resources employed. For a fixed effort on Alice’s part, Bob increases his resources on the basis of the results for each round (victory, tie or defeat) with the aim of reducing the probability of defeat to...
We face a complex game between Alice and Bob where the victory probability of each contender grows monotonically by unknown amounts with the resources s/he employs. For a fixed effort on Alice's part Bob increases his resources on the basis of the results of the individual contests (victory, tie or defeat) with the aim of reducing the defeat probab...
XML-RPC is a protocol used to remotely execute a program independently of the particular hardware and operating system used on both ends of the communication channel, in that all the conveyed information consists of text containing XML encodings coupled with HTTP headers. A sagacious mix of J/Link, XML, and RegularExpression technologies available...
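Python's standard xmlrpc module (used here only to illustrate the protocol itself, not the J/Link-based mix described above) makes the text-only nature of the conveyed information easy to see: a remote call is just an XML document.

```python
import xmlrpc.client

# Encode a hypothetical call multiply(6, 7) as the XML payload that
# would travel inside an HTTP request; any platform can parse it.
payload = xmlrpc.client.dumps((6, 7), methodname="multiply")
print(payload)   # plain text: <methodCall>, <methodName>multiply</methodName>, ...
```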
We introduce a procedure for mapping general data records onto Boolean vectors, in the philosophy of ICA procedures. The task is demanded of a neural network with double duty: i) extracting a compressed version of the data in a tight hidden layer of a self-associative multilayer architecture, and ii) mapping it onto Boolean vectors that optimize an...
We find very tight bounds on the accuracy of a support vector machine classification error within the algorithmic inference framework. The framework is specially suitable for this kind of classifier since (i) we know the number of support vectors really employed, as an ancillary output of the learning procedure, and (ii) we can appreciate confidenc...
We discuss a method for inferring Boolean functions from examples. The method is inherently fuzzy in two respects: i) we work with a pair of formulas representing rough sets respectively included by and including the support of the goal function, and ii) we manage the gap between the sets for simplifying their expressions. Namely, we endow the gap...
The typical way of judging about either the efficacy of a new treatment or, on the contrary, the damage of a pollutant agent is through a test of hypothesis having its ineffectiveness as null hypothesis. This is the typical operational field of Kolmogorov's statistical framework where wastes of data (for instance non significant deaths in a pollute...
We augment a linear regression procedure with a truth-functional method in order to identify a highly informative regression line. The idea is to use statistical methods to identify a confidence region for the line and exploit the structure of the sample data falling in this region to identify the most fitting line. The fitness function is relat...
We propose importing results from monotone game theory to model the evolution of a bacterial population under antibiotic attack. This allows considering bacterial aging as a relevant phenomenon moving the evolution far away from the usual linear predator-prey paradigms. We obtain an almost nonparametric aging mechanism based on a thresholding o...
With the aim of getting understandable symbolic rules to explain a given phenomenon, we split the task of learning these rules from sensory data in two phases: a multilayer perceptron maps features into propositional variables and a set of subsequent layers operated by a PAC-like algorithm learns Boolean expressions on these variables. The special...
We consider the task of monitoring the awareness state of a driver engaged in attention demanding manoeuvres. A Boolean normal form launches a flag when the driver is paying special attention to his guiding. The contrasting analysis of these flags with the physical parameters of the car may alert a decision system whenever the driver awareness is j...
We infer symbolic rules for deciding the awareness state of a driver on the basis of physiological signals traced on his body through noninvasive techniques. We use a standard device for collecting signals and a three-level procedure for: 1) extracting features from them, 2) computing Boolean independent components of the features acting as proposi...
We revisit the linear regression problem in terms of a computational learning problem whose task is to identify a confidence region for a continuous function belonging, in particular, to the family of straight lines. Within the Algorithmic Inference framework this function is deputed to explain a relation between pairs of variables that are observed thr...
We reconsider, in the Algorithmic Inference framework, the accuracy of a Boolean function learnt from examples. This framework is specially suitable when the Boolean function is learnt through a Support Vector Machine, since (i) we know the number of support vectors really employed as an ancillary output of the learning procedure, and (ii) we can app...
We introduce a very complex game based on an approximate solution of an NP-hard problem, so that the probability of victory grows monotonically, but by an unknown amount, with the resources each player employs. We formulate this model in the computational learning framework and focus on the problem of computing a confidence interval for the losing...
We infer symbolic rules for deciding the awareness state of a driver on the basis of physiological signals traced on his body through noninvasive techniques. We use a standard device for collecting signals and a three-level procedure for: 1) extracting features from them, 2) computing Boolean independent components of the features acting as propos...
We infer symbolic rules for deciding the awareness state of a driver on the basis of physiological signals traced on his body through noninvasive techniques. We use a standard device for collecting signals and a three-level procedure for: 1) extracting features from them, 2) computing Boolean independent components of the features acting as propos...
We discuss a Probably Approximate Correct (PAC) learning paradigm for Boolean formulas, which we call PAC meditation, where the class of formulas to be learnt is not known in advance. We split the building of the hypothesis in various levels of increasing description complexity according to additional inductive biases received at run time. In order...
In this paper we present an architecture of attention-based control for artificial agents. The agent is responsible for monitoring adaptively the user in order to detect context switches in his state. Assuming a successful detection appropriate action will be taken. Simulation results based on a simple scenario show that Attention is an appropriate...
We introduce a very complex game based on an approximate solution of an NP-hard problem, so that the probability of victory grows monotonically, but by an unknown amount, with the resources each player employs. We formulate this model in the computational learning framework and focus on the problem of computing a confidence interval for the losing p...
We consider an integrated subsymbolic-symbolic procedure for extracting symbolically explained classification rules from data. A multilayer perceptron maps features into propositional variables and a set of subsequent layers operated by a PAC-like algorithm learns Boolean expressions on these variables. The peculiarities of the whole procedure are:...
Introduction. A very innovative aspect of PAC learning is to assume that probabilities are random variables per se. This is not the point of confidence intervals in classic statistical theory, where the randomness is due to the extremes of the intervals rather than to the value of the probabilistic parameter at hand. For instance, in the inequality...
In order to state a symbolic description of observed data we learn a minimal monotone DNF (Disjunctive Normal Form) formula consistent with them. Then, in the idea that a short formula - i.e. made up of few monomials, each represented by the product of few literals - is better understandable by the user than a longer one, we propose here an algorit...
We provide three steps in the direction of shifting probability from a descriptive tool of unpredictable events to a way of understanding them. At a very elementary level we state an operational definition of probability based solely on symmetry assumptions about observed data. This definition converges, however, to the Kolmogorov one within a spec...
Suppose we are observing data consisting of pairs (x_i, y_i) whose components are linked by a relation. Suppose in addition that the measurement process at the basis of their collection is affected by noise.
We will adopt as our basic paradigm the Probably Approximately Correct (PAC) learning model introduced by Valiant [Valiant, 1984], whereby a learning algorithm is a procedure for generating an indexed family of functions h_m within a class, with probability (P_error)_m of producing wrong outputs converging to 0 in probability. The convergence occurs...
We use the statistical framework known as algorithmic inference explained in Chapter 1 of this book to provide confidence intervals for the cumulative distribution and hazard function of a set of survival data [Boracchi and Biganzoli, 2001].
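The algorithmic-inference intervals themselves are developed in the chapter; purely as a classical point of comparison (an assumption of this sketch, not the chapter's construction), a distribution-free band for the empirical CDF can be obtained from the DKW inequality:

```python
import numpy as np

def ecdf_band(sample, alpha=0.05):
    """Empirical CDF with a DKW confidence band.

    The DKW inequality gives a distribution-free band of half-width
    eps = sqrt(ln(2/alpha) / (2n)) around the empirical CDF.
    """
    x = np.sort(np.asarray(sample))
    n = len(x)
    F = np.arange(1, n + 1) / n                       # empirical CDF at the sorted points
    eps = np.sqrt(np.log(2 / alpha) / (2 * n))
    return x, np.clip(F - eps, 0, 1), np.clip(F + eps, 0, 1)

x, lo, hi = ecdf_band(np.random.default_rng(0).exponential(size=200))
```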
We narrow the width of the confidence interval introduced by Vapnik and Chervonenkis for the risk function in PAC learning of Boolean functions through non-consistent hypotheses. To obtain this improvement for a large class of learning algorithms we introduce both a theoretical framework for statistical inference of functions and a concept class compl...
We provide some theoretical results on sample complexity of PAC learning when the hypotheses are given by subsymbolical devices such as neural networks. In this framework we give new foundations to the notion of degrees of freedom of a statistic and relate it to the complexity of a concept class. Thus, for a given concept class and a given sample s...
We cope with the key step of bootstrap methods of generating a possibly infinite sequence of random data preserving properties of the distribution law, starting from a primary sample actually drawn from this distribution. We deal with two hardware resource constraints: i. absence of a long term memory, requiring an on line estimation of the bootstr...
We introduce a theoretic framework for Probably Approximately Correct learning. This enables us to compute the distribution law of the random variable representing the probability of region where the hypothesis is incorrect. The distinguishing feature in respect to the inference of an analogous probability from Bernoulli variable is the dependence...