Rafael Geraldeli Rossi

Federal University of Mato Grosso do Sul, Três Lagoas, Brazil

PhD
Senior Data Scientist at iFood

About

61
Publications
15,229
Reads
597
Citations
Additional affiliations
August 2011 - October 2015
University of São Paulo
Position
  • PhD Student

Publications (61)
Article
The advancement of techniques and computational tools for data mining has been boosting the music market with applications focused on user experience. These techniques explore musical data looking for patterns and trends that can guide business strategies. One of the key steps in these applications is the vector representation of the original text....
Article
Context: Mobile app reviews are a rich source of information for software evolution and maintenance. Several studies have shown the effectiveness of exploring relevant reviews in the software development lifecycle, such as release planning and requirements engineering tasks. Popular apps can receive millions of reviews, thereby making manual extrac...
Article
Full-text available
In this paper, we introduce the concept of learning to sense, which aims to emulate a complex characteristic of human reasoning: the ability to monitor and understand a set of interdependent events for decision-making processes. Event datasets are composed of textual data and spatio-temporal features that determine where and when a given phenomenon...
Preprint
Full-text available
A massive amount of text is currently being produced in the digital universe. This large body of text may contain useful knowledge for several areas, both academic and business. One way to extract knowledge from and manage large volumes of text is automatic classification. One way to make it more...
Conference Paper
Events are phenomena that occur at a specific time and place. Their detection can bring benefits to society since it is possible to extract knowledge from these events. Event detection is a multimodal task since these events have textual, geographical, and temporal components. Most multimodal research in the literature uses the concatenation of the c...
Chapter
The dynamism of fake news evolution and dissemination plays a crucial role in influencing and confirming personal beliefs. To minimize the spread of disinformation, approaches proposed in the literature for automatic fake news detection generally learn models through binary supervised algorithms considering textual and contextual information. However,...
Article
Full-text available
Fake news can rapidly spread among internet users and can deceive a large audience. Due to these characteristics, it can have a direct impact on political and economic events. Machine Learning approaches have been used to assist fake news identification. However, since the spectrum of real news is broad, hard to characterize, and expensive to l...
Article
Positive and Unlabeled Learning (PUL) uses unlabeled documents and a few positive documents for retrieving a set of "interest" documents from a text collection. Usually, PUL approaches are based on the vector space model. However, when dealing with semi-supervised learning for text classification or information retrieval, graph-based approaches hav...
Technical Report
Full-text available
Sentiment Analysis is a process whose main goal is to extract the polarities of the sentiments expressed in opinions about a topic of interest. This research area has been gaining attention, both on the Web and in academia, because institutions, individuals, and companies are interested in knowing the real opinion of a group of people...
Conference Paper
Full-text available
Computational techniques can be used to identify musical trends and patterns, helping people filter and select music according to their preferences. In this scenario, researchers claim that the future of music permeates artificial intelligence, which will play the role of composing music that best fits the tastes of consumers. So, extracting p...
Article
Event analysis from news and social networks is a promising way to understand complex social phenomena. Each event consists of different components, which indicate what happened, when, where, and the people and organizations involved. Heterogeneous networks are useful for modeling large event datasets, where we map different types of objects (e.g....
Conference Paper
Full-text available
Given the massive volume of text being produced today, automatic text classification has become interesting for both academic and business purposes. Traditionally, automatic text classification is performed through multi-class machine learning, which requires the user to provide labeled texts...
Conference Paper
Computational techniques can be used to identify musical trends and patterns, helping people filter and select music according to their preferences. In this scenario, studies claim that the future of music permeates artificial intelligence, which will play the role of composing music that best meets the tastes of consumers...
Technical Report
Full-text available
Abstract. Spanish is today the second most spoken language in the world and is increasingly present in the lives of Brazilians, whether in academic life, travel, or business, or because it is the predominant language in the Americas. In this context, with the advent and evolution of the internet, several online translation tools have emerged in order to provide...
Conference Paper
Full-text available
Text automatic classification (TAC) has become interesting for academic and business purposes due to the massive volume of texts being produced. TAC is usually performed through multi-class learning, in which a user must provide labeled texts for all classes of an application domain. However, in scenarios in which the intent is to verify if a...
Article
Accurate semantic representation models are essential in text mining applications. For a successful application of the text mining process, the text representation adopted must keep the interesting patterns to be discovered. Although competitive results for automatic text classification may be achieved with traditional bag of words, such representa...
Article
Aspect-Based Sentiment Analysis (ABSA) is a promising approach to analyze consumer reviews at a high level of detail, where the opinion about each feature of the product or service is considered. ABSA usually explores supervised inductive learning algorithms, which require intense human effort for the labeling process. In this paper, we investigat...
Technical Report
Full-text available
Due to the popularization of the internet and the emergence of social networks, thousands of pieces of data in text form are produced every day, especially in environments that enable the rapid sharing of messages between users, as observed on Twitter. Since many of these texts are opinionated and express the feelings of the users...
Technical Report
Full-text available
Nowadays there is a large volume of road traffic, which accounts for most of the transport of cargo and people in the country. Monitoring and maintaining these roads is a major challenge for the responsible agencies. In most cases, manual inspection is preferred for monitoring the roads, since it is believed to be the most economical approach...
Technical Report
Full-text available
Given the massive volume of text circulating on the internet and its growth estimates, automatically classifying the texts that circulate on the worldwide computer network has become interesting for academic and business purposes. One way to perform this automatic classification is by using machine learning techniques...
Conference Paper
Events can be defined as "something that occurs at specific place and time associated with some specific actions". In general, events extracted from news articles and social networks are used to map the information from the web to the various phenomena that occur in our physical world. One of the main steps to establish this relationship is the use of ma...
Conference Paper
Full-text available
An event is defined as “a particular thing which happens at a specific time and place” and can be extracted from news articles, social networks, forums, as well as any digital documents associated with metadata describing temporal and geographical information. In practice, this knowledge is a digital representation (virtual world) of various phenom...
Article
Full-text available
Many real-world applications, such as those related to sensors, allow collecting large amounts of inexpensive unlabeled sequential data. However, the use of supervised machine learning methods is frequently hindered by the high costs involved in gathering labels for such data. These methods assume the availability of a considerable amount of labele...
Conference Paper
Full-text available
The quality of the pavement of roads and streets has a significant influence on the final price of goods and services, on the safety of pedestrians, and on the driver's comfort. Thus, the development of tools for continuous monitoring of the pavement, intended to obtain a more precise and adequate maintenance plan, is essential. In order to reduc...
Article
Due to the volume of texts available in digital form, organization, management, and knowledge extraction are laborious and often impossible to handle. To automatically cope with these tasks, classification models are usually generated through supervised learning techniques. Unfortunately, this type of learning usually demands a huge hum...
Article
Full-text available
Aspect-Based Sentiment Analysis (ABSA) makes it possible to analyze the sentiment toward each product aspect, e.g., the camera quality, operating system, and storage capacity of a smartphone. The two main tasks in ABSA are: (i) identifying the terms/words related to the aspects and (ii) performing sentiment analysis for each identified aspect. Several approaches to...
Article
Transductive classification is a useful way to classify a collection of unlabelled textual documents when only a small fraction of this collection can be manually labelled. Graph-based algorithms have attracted considerable interest in recent years for performing transductive classification since the graph-based representation facilitates label propaga...
Article
Transductive classification is a useful way to classify texts when labeled training examples are insufficient. Several algorithms to perform transductive classification considering text collections represented in a vector space model have been proposed. However, the use of these algorithms is unfeasible in practical applications due to the independ...
Conference Paper
Full-text available
Transductive classification is a useful way to classify texts when just a few labeled examples are available. Transductive classification algorithms rely on term frequency to directly classify texts represented in a vector space model or to build networks and perform label propagation. Related terms tend to belong to the same class and this information...
Conference Paper
Full-text available
In Aspect-Based Sentiment Analysis (ABSA), it is possible to analyze the sentiment of each aspect of a product, for example, the camera quality, operating system, and storage capacity of a smartphone. Existing work using machine learning for ABSA requires (i) prior knowledge of the possible aspects or (ii)...
Conference Paper
Full-text available
Causative verbs can assist in the identification of causative relations. Portuguese has a large number of verbs, which makes the manual labelling of causative verbs an expensive task. This paper presents a classification strategy which uses the characteristics of causative verbs co-occurring with common nouns to classify Brazilian Portu...
Conference Paper
Full-text available
The popularization of music distribution in electronic format has increased the amount of music with incomplete metadata. The incompleteness of data can hamper some important tasks, such as music and artist recommendation. In this scenario, transductive classification can be used to classify the whole dataset considering just a few labeled instances....
Conference Paper
Full-text available
A bipartite heterogeneous network is one of the simplest ways to represent a textual document collection. In such a case, the network consists of two types of vertices, representing documents and terms, and links connecting terms to the documents. Transductive algorithms are usually applied to perform classification of networked objects. This type of...
Article
Algorithms for numeric data classification have been applied for text classification. Usually the vector space model is used to represent text collections. The characteristics of this representation such as sparsity and high dimensionality sometimes impair the quality of general-purpose classifiers. Networks can be used to represent text collection...
Article
Full-text available
Incremental clustering is a very useful approach to organize dynamic text collections. Due to the time/space restrictions for incremental clustering, the textual documents must be preprocessed to maintain only their most important information. Domain independent statistical keyword extraction methods are useful in this scenario, since they analyze...
Article
Full-text available
Several text mining techniques have been proposed to deal with the huge number of textual documents that are available and continue to be published. These are mainly classification techniques, which assign pre-defined labels to new documents, and clustering techniques, which separate texts into clusters. The techniques proposed in the literature are us...
Conference Paper
Full-text available
Recommending given names is a special case of recommender systems that is little explored but has gained great interest recently. Indicating names related to a user's query or suggesting names to parents choosing a name for their unborn child are examples of applications of name recommendation. In this paper, we present results f...
Conference Paper
Terms are the basis for general text mining and natural language processing applications. However, manual term extraction is infeasible due to the huge number of words present in a domain corpus and the human effort required to do the extraction. For the term extraction task, machine learning techniques have been used to perform automati...
Conference Paper
Full-text available
Incremental clustering is a very useful approach to organize dynamic text collections. Due to the time/space restrictions for incremental clustering, the textual documents must be preprocessed to maintain only their most important information. Statistical keyword extraction methods from single documents are useful in this scenario. However, differe...
Conference Paper
Full-text available
Usually, algorithms for categorization of numeric data have been applied for text categorization after a preprocessing phase which assigns weights for textual terms deemed as attributes. However, due to characteristics of textual data, some algorithms for data categorization are not efficient for text categorization. Characteristics of textual data...
Conference Paper
Full-text available
A simple and intuitive way to organize a huge document collection is by a topic hierarchy. Generally two steps are carried out to build a topic hierarchy automatically: 1) hierarchical document clustering and 2) cluster labeling. For both steps, a good textual document representation is essential. The bag-of-words is the common way to represent tex...
Conference Paper
Full-text available
The technological progress designing new devices and the scientific growth in the field of Human-Computer Interaction are enabling new interaction modalities to move from research to commercial products. However, developing multimodal interfaces is still a difficult task due to the lack of tools that consider not only code generation, but usability...
Conference Paper
Full-text available
Considering the huge growth of the number of documents in the digital universe and the possibility of obtaining some competitive advantage in processing them, this paper describes some of the difficulties of working with text collections. More specifically, it shows some of the challenges on the step considered one of the most important of the Te...
Article
Full-text available
Abstract. This technical report presents the IEsystem tool, which extracts metadata from collections of scientific articles. The tool can extract metadata even when the scientific articles come from different sources or are written in different languages. The metadata extraction process is based on model...
Article
Full-text available
The amount of textual documents available in digital format is incredibly large. Sometimes, it is impossible for a human being to manage and extract knowledge from a large amount of textual documents. In order to deal with these challenges, automatic techniques to organize, manage and extract knowledge from textual data are becoming very importan...
