About
Publications: 584
Reads: 187,635
Citations: 10,230
Additional affiliations
December 2011 - February 2012
August 1997 - present
September 1993 - August 1997
Publications (584)
Introduction: Even with the increased use of telehealth from the COVID-19 pandemic onward, there needs to be more knowledge about its usability for patients with non-communicable diseases from the point of view of the health professional, which is the main objective of this study. The secondary objectives will be to describe the user’s profile, dis...
Background:
Cognitive and functional decline are common problems in older adults, especially in those 75+ years old. Currently, there is no specific plasma biomarker able to predict this decline in healthy older people. Machine learning (ML) is a subarea of artificial intelligence (AI), which can be used to predict outcomes. Aim: This study aimed...
Background:
During the COVID-19 pandemic, telehealth was expanded without the opportunity to evaluate the adopted technology's usability extensively.
Objective:
To synthesize evidence on health professionals' perceptions regarding the usability of telehealth systems in the primary care of individuals with non-communicable diseases (NCDs) (hypert...
In this article we study and characterize the phenomenon of the hyperprolific authors, who are the most productive researchers according to a given repository in a specific period of time. Particularly, we are interested in investigating and characterizing a subset of such hyperprolific authors who present a sudden growth in the number of published...
Graph Pattern Mining (GPM) is an important, rapidly evolving, and computation-demanding area. GPM computation relies on subgraph enumeration, which consists of extracting subgraphs that match a given property from an input graph. Graphics Processing Units (GPUs) have been an effective platform to accelerate applications in many areas. However, the...
BACKGROUND
Telehealth has been established as a strategy to provide health care for patients with hypertension and diabetes from the COVID-19 pandemic onward. However, little is known about its usability from a healthcare professional's perspective.
OBJECTIVE
To assess evidence on health professionals’ perceptions of the usability of telehealth sy...
The debate over the COVID-19 pandemic has been constantly trending in online conversations since its beginning in 2019. The discussions on many social media platforms relate not only to health aspects of the disease, but also to public policies and non-pharmacological measures to mitigate the spread of the virus and propose alternative treatments. Di...
This article describes the construction and deployment of the Covid Data Analytics Repository, a source for interdisciplinary studies about the impact of the COVID-19 pandemic in Brazil. We collected different types of data from official (IBGE, DATASUS) and non-official (Brasil.IO) sources, online social networks (Instagram, Twitter), and from a se...
This study seeks to characterize the first year of the COVID-19 pandemic in Brazil as a social phenomenon by analyzing the correlation between the worsening/easing of the pandemic and the vocabulary used on Twitter in the weeks preceding these variations. Among other results, it was observed that terms that are politically motivated and have a ne...
Recent efforts have focused on identifying multidisciplinary teams and detecting co-authorship networks by using topic modeling to identify researchers’ expertise. Though promising, none of these efforts performs a real-life evaluation of the quality of the built topics. This paper proposes a Semantic Academic Profiler (SAP) framework that...
In the context of the COVID-19 pandemic, social networks such as Twitter and YouTube stand out as important sources of information. YouTube, as the largest and most engaging online media consumption platform, has a large influence on the spread of information and misinformation, which makes it important to study how it deals with the problems that aris...
People recovered from COVID-19 may still present complications including respiratory and neurological sequelae. In other viral infections, cognitive impairment occurs due to brain damage or dysfunction caused by vascular lesions and inflammatory processes. Persistent cognitive impairment compromises daily activities and psychosocial adaptation. Som...
Monitoring services such as Shodan are increasingly popular for tracking applications and vulnerabilities on the Internet. In this article we characterize and discuss vulnerabilities found on the Brazilian Internet using monitoring data from Shodan. In addition, we discuss Data Science methods to improve the...
This paper deals with the problem of modeling counterfactual reasoning in scenarios where, apart from the observed endogenous variables, we have a latent variable that affects the outcomes and, consequently, the results of counterfactual queries. This is a common setup in healthcare problems, including mental health. We propose a new framework whe...
This work considers the general task of estimating the sum of a bounded function over the edges of a graph, given neighborhood query access and where access to the entire network is prohibitively expensive. To estimate this sum, prior work proposes Markov chain Monte Carlo (MCMC) methods that use random walks started at some seed vertex and whose e...
The COVID-19 pandemic and the need for social distancing have created a demand for new and innovative solutions in healthcare systems worldwide. One strategy that has been implemented is chatbots, which can be helpful in providing reliable health information and preventing people from seeking assistance in healthcare centers and being un...
This article presents the construction and publication of a repository of data used and developed within the Covid Data Analytics (CDA) project, carried out by the Department of Computer Science at UFMG. The project aimed to monitor aspects of the social, economic, and epidemiological situation of COVID-19 in Brazil through the analysis of...
Background:
Human behavior is crucial in health outcomes. Particularly, individual behavior is a determinant of the success of measures to overcome critical conditions, such as a pandemic. In addition to intrinsic public health challenges associated with COVID-19, in many countries, some individuals decided not to get vaccinated, streets were crow...
The electrocardiogram (ECG) is the most commonly used exam for the evaluation of cardiovascular diseases. Here we propose that the age predicted by artificial intelligence (AI) from the raw ECG (ECG-age) can be a measure of cardiovascular health. A deep neural network is trained to predict a patient’s age from the 12-lead ECG in the CODE study coho...
Phishing campaigns frequently use web pages that imitate legitimate pages to deceive victims. Despite the scientific community's efforts to combat this activity, phishing keeps growing more sophisticated and continues to claim victims. In this article we present a new framework for monitoring phishing pages that combines te...
Automatic classification of diagnoses has been a long-term challenge for Computer Science and related disciplines. Textual clinical reports can be used as a great source of data for such diagnoses. However, building classification models from them is not a trivial task. The problem tackled in this work is the identification of the medical diagnoses...
The quality of health records involves aspects such as data standardization, integrity, and reliability of information. Terminologies are resources that optimize written and spoken language among professionals, normalizing terms to facilitate communication about health between people. In this sense, the training of professionals in he...
Objective:
Rheumatic heart disease (RHD) affects an estimated 39 million people worldwide and is the most common acquired heart disease in children and young adults. Echocardiograms are the gold standard for diagnosis of RHD, but there is a shortage of skilled experts to allow widespread screenings for early detection and prevention of the disease...
In 2020, the activist movement @sleeping_giants_pt (SGB) made a splash in Brazil. Similar to its international counterparts, the movement carried out "campaigns" against media outlets spreading misinformation. In those, SGB targeted companies whose ads were shown in these outlets, publicly asking them to remove the ads. In this work, we present a caref...
The lack of authentication in the Internet’s data plane allows hosts to falsify (spoof) the source IP address in packet headers. IP source spoofing is the basis for amplification denial-of-service (DoS) attacks. Current approaches to locate sources of spoofed traffic lack coverage or are not deployable today. We propose a mechanism that a network w...
The electrocardiogram (ECG) is the most commonly used exam for the screening and evaluation of cardiovascular diseases. Here we propose that the age predicted by artificial intelligence (AI) from the raw ECG tracing (ECG-age) can be a measure of cardiovascular health and provide prognostic information. A deep convolutional neural network was traine...
We propose a new method to generate explanations for end-to-end classification models. The explanations consist of meaningful features to the user, namely contextual features. We instantiate our approach in the scenario of automated electrocardiogram (ECG) diagnosis and analyze the explanations generated in terms of interpretability and robustness....
This work considers the general task of estimating the sum of a bounded function over the edges of a graph that is unknown a priori, where graph vertices and edges are built on-the-fly by an algorithm and the resulting graph is too large to be kept in memory or disk. Prior work proposes Markov Chain Monte Carlo (MCMC) methods that simultaneously sa...
With the rise of tracking data, sports analytics can influence the game's tactical aspects like never before. In football, measuring the quality of the players' positioning to receive a pass in condition to score has much value. The Off-Ball Scoring Opportunity model was built to do just that. With that, players receive credit for being well-positi...
A collection of individuals is represented by point patterns. Each individual is a finite set of geographical locations representing their visiting pattern to places in a region. We present SCPP, an algorithm for clustering these individuals considering the spatial patterns of their visiting locations. We adopted a probabilistic framework based on...
Automatic diagnosis of diseases has been a long-term challenge for Computer Science and related disciplines. Textual clinical reports can be used as a great source of data for such diagnoses. However, building classification models from them is not a trivial task. The problem tackled in this work is the identification of the medical diagnoses that...
Twitter has been one of the main sources of information and discussion during the COVID-19 pandemic. This paper characterizes a set of more than 56 million tweets written in Portuguese and collected over a period of 70 days. Our analysis includes the volume of messages, text of tweets, location of tweets, the main elements of tweets (e.g. hashtags...
The training of health professionals is a fundamental element for the success of technology's transformative action in favor of equity, comprehensiveness, longitudinality, and universality in public health. Recording quality data in information systems aligned with the demands of citizens and their communities enables a better percep...
This study introduces ANA, a chatbot assistant about COVID-19 in Brazilian Portuguese, developed by a multidisciplinary team of Linguists, Computer Scientists and Medical Researchers. ANA aims to assist Brazilian Portuguese-speaking patients seeking information about the disease as well as screen suspect cases, initially in the Brazilian cities of...
The popularization of Online Social Networks has changed the dynamics of content creation and consumption. In this setting, society has witnessed an amplification in phenomena such as misinformation and hate speech. This dissertation studies these issues through the lens of users. In three case studies in social networks, we: (i) provide insight on...
Dense subgraph detection is a well-known problem in graph theory. The hierarchical organization of graphs as dense subgraphs, however, goes beyond simple clustering, as it allows the analysis of the network at different scales. Although there are several hierarchical decomposition methods for unipartite graphs, only a few approaches for the bipartit...
Government purchases are the usual instrument for public acquisition of goods and services. Despite extensive legislation and several control and auditing mechanisms, frauds are still diverse and commonplace at all levels of public administration, wasting public resources. Through the use of frequent patterns, temporal correlation and combined analysi...
An amendment to this paper has been published and can be accessed via a link at the top of the paper.
The role of automatic electrocardiogram (ECG) analysis in clinical practice is limited by the accuracy of existing models. Deep Neural Networks (DNNs) are models composed of stacked transformations that learn tasks by examples. This technology has recently achieved striking success in a variety of tasks and there are great expectations on how it mig...
Medical data processing has found a new dimension with the extensive use of machine-learning techniques to classify and extract features. Machine learning strongly benefits from computing accelerators. However, such accelerators are not easily available at hospital premises, although they can be easily found on public cloud infrastructures or resea...
Cambridge Core - Knowledge Management, Databases and Data Mining - Data Mining and Machine Learning - by Mohammed J. Zaki
Location-aware information is now commonplace, as the ubiquity and pervasiveness of technology enabled its generation and storage at large scale. These data constitute a rich representation of entities’ whereabouts and behavior as they move on the map. Although several studies reported considerable predictability of such mobility patterns, several...
High-performance computing (HPC) and massive data processing (Big Data) are two trends that are beginning to converge. In that process, aspects of hardware architectures, systems support and programming paradigms are being revisited from both perspectives. This paper presents our experience on this path of convergence with the proposal of...
Dense subgraph detection is a well-known problem in Computer Science. Hierarchical organization of graphs as dense subgraphs, however, goes beyond simple clustering as it allows the analysis of the network at different scales. Although there are several works on hierarchical decomposition for unipartite graphs, only a few works for the bip...
Government purchases are the usual instrument for public acquisition of goods and services. Despite extensive legislation and several control and auditing mechanisms, frauds are still diverse and commonplace at all levels of public administration. This work proposes a methodology for detecting anomalies in government purchases. The methodology prom...
Standard spatial cluster detection methods used in public health surveillance assign each disease case to a single location (typically, the patient's home address), aggregate locations to small areas, and monitor the number of cases in each area over time. However, such methods cannot detect clusters of disease resulting from visits to non-resident...
Despite advances in prevention and mitigation mechanisms, phishing remains a threat. One reason for this is that phishers continuously improve their techniques. In this paper we study and characterize one of these improvements: phishers' use of redirection chains to evade identification mechanisms and avoid takedown of the infrastructure hostin...
Path changes caused by events such as traffic engineering, changes in peering relationships, or link failures affect many paths on the Internet. Topology monitoring platforms perform periodic measurements using traceroute to a large number of destinations. This approach, however, is inadequate for identifying prec...
Non-profits and the media claim there is a radicalization pipeline on YouTube. Its content creators would sponsor fringe ideas, and its recommender system would steer users towards edgier content. Yet, the supporting evidence for this claim is mostly anecdotal, and there are no proper measurements of the influence of YouTube's recommender system. I...
Smart urban transportation management can be considered as a multifaceted big data challenge. It strongly relies on the information collected into multiple, widespread, and heterogeneous data sources as well as on the ability to extract actionable insights from them. Besides data, full stack (from platform to services and applications) Information...
In this paper we propose Fractal, a high performance and high productivity system for supporting distributed graph pattern mining (GPM) applications. Fractal employs a dynamic (auto-tuned) load-balancing based on a hierarchical and locality-aware work stealing mechanism, allowing the system to adapt to different workload characteristics. Additional...
Associative classification refers to a class of algorithms that is very efficient in classification problems. Data in such domains are multidimensional, with data instances represented as points of a fixed-length attribute space, and are exploited from two large sets: training and testing datasets. Models, known as classifiers, are mined in the trai...
Objective
We develop new spatial scan models that use individuals' movement data, rather than a single location per individual, in order to identify areas with a high relative risk of infection by dengue disease.
Introduction
Traditionally, surveillance systems for dengue and other infectious diseases locate each individual case by home address, aggr...
We present a Deep Neural Network (DNN) model for predicting electrocardiogram (ECG) abnormalities in short-duration 12-lead ECG recordings. The analysis of the digital ECG obtained in a clinical setting can provide a full evaluation of the cardiac electrical activity but has not been studied in an end-to-end machine learning scenario. Using the da...
Analysis of public transportation data in large cities is a challenging problem. Managing data ingestion, data storage, data quality enhancement, modelling and analysis requires intensive computing and a non-trivial amount of resources. In EUBra-BIGSEA (Europe–Brazil Collaboration of Big Data Scientific Research Through Cloud-Centric Applications)...
Discrimination-aware models in machine learning are a recent topic of study that aim to minimize the adverse impact of machine learning decisions for certain groups of people due to ethical and legal implications. We propose a benchmark framework for assessing discrimination-aware models. Our framework consists of systematically generated biased da...