
Everton Alvares Cherman
- PhD Candidate
- University of São Paulo
About
37 Publications
11,283 Reads
921 Citations
Publications (37)
Quantification is an expanding research topic in Machine Learning literature. While in classification we are interested in obtaining the class of individual observations, in quantification we want to estimate the total number of instances that belong to each class. This subtle difference allows the development of several algorithms that incur small...
This paper proposes one-class quantification, a new Machine Learning task. Quantification estimates the class distribution of an unlabeled sample of instances. Similarly to one-class classification, we assume that only a sample of examples of a single class is available for learning, and we are interested in counting the cases of such class in a te...
Early approaches to detect concept drifts in data streams without actual class labels aim at minimizing external labeling costs. However, their functionality is dubious when presented with changes in the proportion of the classes over time, as such methods keep reporting concept drifts that would not damage the performance of the current classifica...
Active learning is an iterative supervised learning task where learning algorithms can actively query an oracle, i.e. a human annotator that understands the nature of the problem, to obtain the ground truth. The motivation behind this approach is to allow the learner to interactively choose the data it will learn from, which can lead to significant...
An event is defined as “a particular thing which happens at a specific time and place” and can be extracted from news articles, social networks, forums, as well as any digital documents associated with metadata describing temporal and geographical information. In practice, this knowledge is a digital representation (virtual world) of various phenom...
Active learning is an iterative supervised learning task where learning algorithms can actively query an oracle, i.e. a human annotator that understands the nature of the problem, for labels. As the learner is allowed to interactively choose the data from which it learns, it is expected that the learner will perform better with less training. The...
In supervised learning, simple baseline classifiers can be constructed by only looking at the class, i.e., ignoring any other information from the dataset. The single-label learning community frequently uses as a reference the one which always predicts the majority class. Although a classifier might perform worse than this simple baseline classifie...
Most supervised learning methods consider that each dataset instance is associated with a unique label. However, there are several domains in which the instances are associated with a set of labels (a multi-label). An alternative to investigate properties of multi-label data and their relationship with the learning performance consists in explorato...
Lazy multi-label learning algorithms have become an important research topic within the multi-label community. These algorithms usually consider the set of standard k-Nearest Neighbors of a new instance to predict its labels (multi-label). The prediction is made by following a voting criterion within the multi-labels of the set of k-Nearest Neighbor...
A controlled environment based on known properties of the dataset used by a learning algorithm is useful to empirically evaluate machine learning algorithms. Synthetic (artificial) datasets are used for this purpose. Although there are publicly available frameworks to generate synthetic single-label datasets, this is not the case for multi-label da...
The feature selection process aims to select a subset of relevant features to be used in model construction, reducing data dimensionality by removing irrelevant and redundant features. Although effective feature selection methods to support single-label learning abound, this is not the case for multi-label learning. Furthermore, most of the mul...
Recommending given names is a special case of recommender systems that is little explored but has recently attracted great interest. Examples of name recommendation applications include indicating names related to a user's query and suggesting names to parents choosing a name for their unborn child. In this paper, we present results f...
The growing storage of image-based exams has stimulated the application of Image Analysis and Artificial Intelligence methods to identify patterns. This work presents a methodology to assist the characterization of colic tissues, which was applied to a set of 15 images from colonoscopy exams. These images were described by 11 tex...
Feature selection is an important task in machine learning, which can effectively reduce the dataset dimensionality by removing irrelevant and/or redundant features. Although a large body of research deals with feature selection in single-label data, in which measures have been proposed to filter out irrelevant features, this is not the case for mu...
Machine learning research relies to a large extent on experimental observations. The evaluation of classifiers is often carried out by empirical comparison with classifiers generated by different learning algorithms, allowing the identification of the best algorithm for the problem at hand. Nevertheless, prior to this evaluation, it is importa...
In multi-label learning, each example in the dataset is associated with a set of labels, and the task of the generated classifier is to predict the label set of unseen examples. Feature selection is an important task in machine learning, which aims to find a small number of features that describes the dataset as well as, or even better than, the or...
In multi-label classification, examples can be associated with multiple labels simultaneously. The task of learning from multi-label data can be addressed by methods that transform the multi-label classification problem into several single-label classification problems. The binary relevance approach is one of these methods, where the multi-label le...
Hierarchies are effective data models for organizing textual collections, particularly for automatic document classification into categories and subcategories. However, the majority of existing methods for hierarchical classification require a human-labeled document set. Moreover, humans have good insight to manage the categories of higher levels of...
Defining the attributes in terms of fuzzy sets is an essential part of designing a fuzzy system. The main tasks involved in defining the fuzzy data base include deciding the type of fuzzy set (triangular, trapezoidal, etc.), the number of fuzzy sets for each attribute, and their distribution in each attribute domain. In the absence of an expert, the...
Traditional classification algorithms consider learning problems that contain only one label, i.e., each example is associated with one single nominal target variable characterizing its property. However, the number of practical applications involving data with multiple target variables has increased. To learn from this sort of data, multi-label cl...
In multi-label classification, each example can be associated with multiple labels simultaneously. The task of learning from multi-label data can be addressed by methods that transform the multi-label classification problem into several single-label classification problems. The binary relevance approach is one of these methods, where the multi-labe...
PURPOSE: to evaluate the predictive quality of computational models to differentiate colic tissues, based on the Co-occurrence Matrix (MC) representation of Coloscopic Images (IC). MATERIALS AND METHODS: image analysis and artificial intelligence methods were employed to construct computational models. Sixty-seven IC images, containing polyp, were c...
Data Mining is a process related to analysis, understanding and knowledge extraction from databases. In order to perform this process, it is usually necessary to represent the data in the so-called attribute-value format. This work proposes an extension of a methodology which supports, through a semi-automatic process, the construction of a table in...
The Data Mining process may help specialists in decision making by applying pattern extraction techniques based on attribute-value tables. An automatic method for mapping medical report information is being developed, intended to reduce the time the process requires and to avoid possible subjectivity in manual information mapping. This work...
Keywords: Knowledge Discovery in Databases, Gastroenterology, Client-Server Model. Abstract - Technological advancement allows processing and registering of a growing amount of data. This phenomenon is observed in areas such as medicine, in which hospitals and clinics store a significant quantity of patients' exams, often in...
The growing accumulation of data in multiple fields of knowledge has made complete manual data analysis impracticable. Computational processes, such as Knowledge Discovery in Databases, may help specialists in the decision-making process by using the extracted knowledge. However, in order to perform such a process, this information needs to be organized in a struc...