João Paulo Papa

São Paulo State University | Unesp · Department of Computing

Associate Professor

About

481
Publications
138,619
Reads
9,112
Citations
Citations since 2017
284 Research Items
7,451 Citations
[Chart: yearly totals, 2017 to 2023; vertical axis 0 to 1,500]
Additional affiliations
May 2016 - present
São Paulo State University
Position
  • Professor (Associate)
October 2015 - October 2015
March 2014 - February 2015
Harvard University
Position
  • PostDoc Position
Education
March 2003 - February 2005
Federal University of São Carlos
Field of study
  • Computer Science
March 1999 - December 2002
São Paulo State University
Field of study
  • Information Systems

Publications

Article
Semantic segmentation is an essential task in medical imaging research. Many powerful deep-learning-based approaches can be employed for this problem, but they are dependent on the availability of an expansive labeled dataset. In this work, we augment such supervised segmentation models to be suitable for learning from unlabeled data. Our semi-supe...
Preprint
Full-text available
The continuous computational power growth in the last decades has made solving several optimization problems significant to humankind a tractable task; however, tackling some of them remains a challenge due to the overwhelming amount of candidate solutions to be evaluated, even by using sophisticated algorithms. In such a context, a set of nature-i...
Preprint
Full-text available
Research on remote sensing image classification significantly impacts essential human routine tasks such as urban planning and agriculture. Nowadays, the rapid advance in technology and the availability of many high-quality remote sensing images create a demand for reliable automation methods. The current paper proposes two novel deep learning-base...
Preprint
Scene change detection is an image processing problem related to partitioning pixels of a digital image into foreground and background regions. Mostly, visual knowledge-based computer intelligent systems, like traffic monitoring, video surveillance, and anomaly detection, need to use change detection techniques. Amongst the most prominent detection...
Preprint
Full-text available
Automatic Text Summarization (ATS) is becoming relevant with the growth of textual data; however, with the popularization of public large-scale datasets, some recent machine learning approaches have focused on dense models and architectures that, despite producing notable results, usually result in models that are difficult to interpret. Given the challen...
Preprint
Video segmentation consists of a frame-by-frame selection process of meaningful areas related to foreground moving objects. Some applications include traffic monitoring, human tracking, action recognition, efficient video surveillance, and anomaly detection. In these applications, it is not rare to face challenges such as abrupt changes in weather...
Preprint
Full-text available
This work presents a thorough review concerning recent studies and text generation advancements using Generative Adversarial Networks. The usage of adversarial learning for text generation is promising as it provides alternatives to generate the so-called "natural" language. Nevertheless, adversarial text generation is not a simple task as its fore...
Article
This paper presents an open-source implementation of PL-kNN, a parameterless version of the k-Nearest Neighbors algorithm. The proposed model, developed in Python 3.6, was designed to avoid the choice of the k parameter required by the standard k-Nearest Neighbors technique. Essentially, the model computes the number of nearest neighbors of a targe...
Preprint
Full-text available
Machine Learning algorithms have been extensively researched throughout the last decade, leading to unprecedented advances in a broad range of applications, such as image classification and reconstruction, object recognition, and text categorization. Nonetheless, most Machine Learning algorithms are trained via derivative-based optimizers, such as...
Conference Paper
Metaheuristic algorithms present elegant solutions to many problems regardless of their domain. The Jellyfish Search (JS) algorithm is inspired by how jellyfish search for food in ocean currents and perform movements within the swarm. In this work, we propose a new version of the JS algorithm called No-Boundary Jellyfish Search (NBJS) to improve...
Preprint
Full-text available
Identifying anomalies has become one of the primary strategies towards security and protection procedures in computer networks. In this context, machine learning-based methods emerge as an elegant solution to identify such scenarios and learn irrelevant information so that a reduction in the identification time and possible gain in accuracy can be...
Article
Despite the recent success of machine learning algorithms, most models still face several drawbacks when considering more complex tasks requiring interaction between different sources, such as multimodal input data and logical time sequence. On the other hand, the biological brain is highly sharpened in this sense, empowered to automatically manage...
Preprint
Full-text available
In the last decade, exponential data growth supplied machine learning-based algorithms' capacity and enabled their usage in daily-life activities. Additionally, such an improvement is partially explained due to the advent of deep learning techniques, i.e., stacks of simple architectures that end up in more complex models. Although both factors prod...
Chapter
Automatic summarization captures the most relevant information and condenses it into an understandable text in natural language. Such a task can be classified as either extractive or abstractive summarization. Research on Brazilian Portuguese-based abstractive summarization is still scarce. This work explores abstractive summarization in Portuguese...
Article
Full-text available
The COVID-19 pandemic has devastated the entire globe since its first appearance at the end of 2019. Although vaccines are now in production, the number of contaminations remains high, thus increasing the number of specialized personnel that can analyze clinical exams and point out the final diagnosis. Computed tomography and X-ray images are the...
Preprint
Full-text available
The global burden of dengue, a mosquito-borne viral infection, has alarmingly increased in recent decades. The rise in disease occurrence is mainly attributed to changes in the climate, human ecology, globalization, and demography. In such a scenario, an accurate prediction of a dengue outbreak is essential to reduce the morbidity rate significantl...
Conference Paper
With the popularization of AVX-512 vectorization technology over the last decade, it has become worthwhile to assess its performance in new applications. This paper presents a study on the use of AVX-512 in a graph-based machine learning algorithm, the Parallel Optimum-Path Forest (POPF). The experiments conducted show a gain...
Article
Significant challenges still remain despite the impressive recent advances in machine learning techniques, particularly in multimedia data understanding. One of the main challenges in real-world scenarios is the nature and relation between training and test datasets. Very often, only small sets of coarse-grained labeled data are available to train...
Preprint
Full-text available
Complex wounds usually face partial or total loss of skin thickness, healing by secondary intention. They can be acute or chronic, figuring infections, ischemia and tissue necrosis, and association with systemic diseases. Research institutes around the globe report countless cases, ending up in a severe public health problem, for they involve human...
Preprint
Full-text available
Demands for minimum parameter setup in machine learning models are desirable to avoid time-consuming optimization processes. The $k$-Nearest Neighbors is one of the most effective and straightforward models employed in numerous problems. Despite its well-known performance, it requires the value of $k$ for specific data distribution, thus demanding...
Conference Paper
Full-text available
Collaborative Filtering stands as an underlying strategy to reasonably deal with large-scale problems like scalability and high sparsity. In the classifier fusion context, one could benefit from adopting such a strategy to learn decision templates effectively for the sake of computation efficiency. This paper introduces a framework that explores co...
Article
Full-text available
In this study, we aimed to develop an artificial intelligence clinical decision support solution to mitigate operator-dependent limitations during complex endoscopic procedures such as endoscopic submucosal dissection and peroral endoscopic myotomy, for example, bleeding and perforation. A DeepLabv3-based model was trained to delineate vessels, tis...
Article
Oral cancer could be prevented. The primary strategy is based on prevention. Most patients with oral cancer present to the hospital network with advanced staging and a low chance of cure. This condition may be related to physicians' difficulty in making an early diagnosis. With the advancement of information technology, artificial intelligence (AI)...
Article
Full-text available
This paper proposes a novel multimodal self-supervised architecture for energy-efficient audio-visual (AV) speech enhancement that integrates Graph Neural Networks with canonical correlation analysis (CCA-GNN). The proposed approach lays its foundations on a state-of-the-art CCA-GNN that learns representative embeddings by maximizing the correlatio...
Conference Paper
Full-text available
Complex wounds usually face partial or total loss of skin thickness, healing by secondary intention. They can be acute or chronic, figuring infections, ischemia and tissue necrosis, and association with systemic diseases. Research institutes around the globe report countless cases, ending up in a severe public health problem, for they involve human...
Conference Paper
Full-text available
Demands for minimum parameter setup in machine learning models are desirable to avoid time-consuming optimization processes. The k-Nearest Neighbors is one of the most effective and straightforward models employed in numerous problems. Despite its well-known performance, it requires the value of k for specific data distribution, thus demanding expe...
Preprint
Regularization helps to improve machine learning techniques by penalizing the models during training. Such approaches act in either the input, internal, or output layers. Regarding the latter, label smoothing is widely used to introduce noise in the label vector, making learning more challenging. This work proposes a new label regularization method...
Preprint
Full-text available
Despite the recent success of machine learning algorithms, most of these models still face several drawbacks when considering more complex tasks requiring interaction between different sources, such as multimodal input data and logical time sequence. On the other hand, the biological brain is highly sharpened in this sense, empowered to automatical...
Article
Background and Objective: Cervical cancer is one of the leading causes of women's death. As with any other disease, early detection of cervical cancer and treatment with the best possible medical advice are the paramount steps that should be taken to minimize the after-effects of contracting this disease. Pap smear images are one of the most e...
Conference Paper
Identifying anomalies has become one of the primary strategies towards security and protection procedures in computer networks. In this context, machine learning-based methods emerge as an elegant solution to identify such scenarios and learn irrelevant information so that a reduction in the identification time and possible gain in accuracy can be...
Article
Full-text available
Biometric recognition provides straightforward methods to deal with the problem of identifying people under certain circumstances. Additionally, a well-calibrated biometric system enhances security policies and prevents malicious attempts, such as fraud or identity theft. Deep learning has arisen to foster the problem by extracting high-level featu...
Preprint
Deep learning architectures have achieved promising results in different areas (e.g., medicine, agriculture, and security). However, using those powerful techniques in many real applications becomes challenging due to the large labeled collections required during training. Several works have pursued solutions to overcome it by proposing strategies...
Preprint
Full-text available
In the past decades, fuzzy logic has played an essential role in many research areas. Alongside it, graph-based pattern recognition has proven to be of great importance due to its flexibility in partitioning the feature space using the background from graph theory. Some years ago, a new framework for supervised, semi-supervised, and unsupervised l...
Conference Paper
Given the software optimization processes enabled by more recent technologies, this study analyzes the advantage brought by hardware-based vectorization implementations, in this case AVX2 and AVX-512, in a matrix multiplication scenario. The results show that vectorization provides quite significant gains, ...
Article
Full-text available
Convolutional Neural Networks have been widely employed in a diverse range of computer vision-based applications, including image classification, object recognition, and object segmentation. Nevertheless, one weakness of such models concerns their hyperparameters' setting, being highly specific for each particular problem. One common approach is to...
Preprint
Full-text available
Deep learning has been proposed for the assessment and classification of medical images. However, many medical image databases with appropriately labeled and annotated images are small and imbalanced, and thus unsuitable to train and validate such models. The option is to generate synthetic images and one successful technique has been patented whic...
Article
Full-text available
The electroencephalogram (EEG) introduced a massive potential for user identification. Several studies have shown that EEG provides unique features in addition to typical strength against spoofing attacks. EEG provides a graphic recording of the brain’s electrical activity that electrodes can capture on the scalp at different places. However, selecting...
Conference Paper
Seam carving is a computational method capable of resizing images for both reduction and expansion based on its content, instead of the image geometry. Although the technique is mostly employed to deal with redundant information, i.e., regions composed of pixels with similar intensity, it can also be used for tampering images by inserting or removi...
Conference Paper
In the last decade, exponential data growth supplied the machine learning-based algorithms' capacity and enabled their usage in daily life activities. Additionally, such an improvement is partially explained due to the advent of deep learning techniques, i.e., stacks of simple architectures that end up in more complex models. Although both factors...
Preprint
Full-text available
In the last decade, exponential data growth supplied the machine learning-based algorithms' capacity and enabled their usage in daily life activities. Additionally, such an improvement is partially explained due to the advent of deep learning techniques, i.e., stacks of simple architectures that end up in more complex models. Although both factors...
Preprint
Full-text available
Seam carving is a computational method capable of resizing images for both reduction and expansion based on its content, instead of the image geometry. Although the technique is mostly employed to deal with redundant information, i.e., regions composed of pixels with similar intensity, it can also be used for tampering images by inserting or removi...
Chapter
Fake news has become a research topic of great importance in Natural Language Processing due to its negative impact on our society. Despite its pertinence, there are few datasets available in Brazilian Portuguese, and most comprise few samples. Therefore, this paper proposes creating a new fake news dataset named FakeRecogna that contains a great...
Chapter
Full-text available
This work proposes the PetroBERT, which is a BERT-based model adapted to the oil and gas exploration domain in Portuguese. PetroBERT was pre-trained using the Petrolês corpus and a private daily drilling report corpus over BERT multilingual and BERTimbau. The proposed model was evaluated in the NER and sentence classification tasks and achieved int...
Preprint
Full-text available
In the last decade, machine learning-based approaches became capable of performing a wide range of complex tasks sometimes better than humans, demanding a fraction of the time. Such an advance is partially due to the exponential growth in the amount of data available, which makes it possible to extract trustworthy real-world information from them....
Preprint
Full-text available
The fast-spreading information over the internet is essential to support the rapid supply of numerous public utility services and entertainment to users. Social networks and online media paved the way for modern, timely communication and convenient access to all types of information. However, they also provide new chances for ill use of the...
Article
In the last decade, machine learning-based approaches became capable of performing a wide range of complex tasks sometimes better than humans, demanding a fraction of the time. Such an advance is partially due to the exponential growth in the amount of data available, which makes it possible to extract trustworthy real-world information from them....
Preprint
Full-text available
This paper proposes a novel multimodal self-supervised architecture for energy-efficient AV speech enhancement by integrating graph neural networks with canonical correlation analysis (CCA-GNN). This builds on a state-of-the-art CCA-GNN that aims to learn representative embeddings by maximizing the correlation between pairs of augmented views of th...
Preprint
Full-text available
Machine Learning has attracted considerable attention throughout the past decade due to its potential to solve far-reaching tasks, such as image classification, object recognition, anomaly detection, and data forecasting. A standard approach to tackle such applications is based on supervised learning, which is assisted by large sets of labeled data...
Article
Full-text available
This paper presents a Fuzzy Cognitive Map model to quantify implicit bias in structured datasets where features can be numeric or discrete. In our proposal, problem features are mapped to neural concepts that are initially activated by experts when running what-if simulations, whereas weights connecting the neural concepts represent absolute correl...
Preprint
Full-text available
In general, biometry-based control systems may not rely on individual expected behavior or cooperation to operate appropriately. Instead, such systems should be aware of malicious procedures for unauthorized access attempts. Some works available in the literature suggest addressing the problem through gait recognition approaches. Such methods aim a...
Preprint
Full-text available
Several image processing tasks, such as image classification and object detection, have been significantly improved using Convolutional Neural Networks (CNN). Many architectures, such as ResNet and EfficientNet, achieved outstanding results in at least one dataset at the time of their creation. A critical factor in training concerns the network's...
Article
Full-text available
Several image processing tasks, such as image classification and object detection, have been significantly improved using Convolutional Neural Networks (CNN). Many architectures, such as ResNet and EfficientNet, achieved outstanding results in...
Chapter
In the past years, we have observed an increasing number of applications that require machine learning techniques to sort out problems that are not straightforward to humans. The reasons range from information that is not clearly visible to the human eye (e.g., microscopic patterns in medical images) to the massive amount of data to analyze. This bo...
Chapter
Recent advances in machine learning algorithms have been aiding humans and improving their decision-making capacities in various applications, such as medical imaging, image classification and reconstruction, object recognition, and text categorization. A graph-based classifier, known as Optimum-Path Forest (OPF), has been extensively researched in...
Chapter
The Optimum-Path Forest (OPF) is a framework for the design of graph-based classifiers, which covers supervised, semisupervised, and unsupervised applications. The OPF is mainly characterized by its low training and classification times as well as competitive results against well-established machine learning techniques, such as Support Vector Machi...
Article
Full-text available
Electroencephalogram signals (EEG) have provided biometric identification systems with great capabilities. Several studies have shown that EEG introduces unique and universal features besides specific strength against spoofing attacks. Essentially, EEG is a graphic recording of the brain’s electrical activity calculated by sensors (electrodes) on t...
Chapter
Seam carving is a computational method capable of resizing images for both reduction and expansion based on its content, instead of the image geometry. Although the technique is mostly employed to deal with redundant information, i.e., regions composed of pixels with similar intensity, it can also be used for tampering images by inserting or removi...
Chapter
In the last decade, exponential data growth supplied the machine learning-based algorithms’ capacity and enabled their usage in daily life activities. Additionally, such an improvement is partially explained due to the advent of deep learning techniques, i.e., stacks of simple architectures that end up in more complex models. Although both factors...
Preprint
Full-text available
This paper presents a Fuzzy Cognitive Map model to quantify implicit bias in structured datasets where features can be numeric or discrete. In our proposal, problem features are mapped to neural concepts that are initially activated by experts when running what-if simulations, whereas weights connecting the neural concepts represent absolute correl...
Conference Paper
Machine Learning algorithms have been extensively researched throughout the last decade, leading to unprecedented advances in a broad range of applications, such as image classification and reconstruction, object recognition, and text categorization. Nonetheless, most Machine Learning algorithms are trained via derivative-based optimizers, such as...
Article
Single-target regression can accurately predict the crop’s performance but fails to generalize to problems with more than one true and cross-validatable solution. An alternative that outputs multiple numeric values for a given input, we think, would be multi-target regression (MTR) with either Random Forest (RF) or k-nearest neighbors (KNN). Therefore, we c...
Article
Full-text available
Plant stomata are essential structures (pores) that control the exchange of gases between plant leaves and the atmosphere, and they also influence plant adaptation to climate through photosynthesis and the transpiration stream. Many works in the literature aim for a better understanding of these structures and their role in the evolution process and the be...