Marcos Ennes Barreto

The London School of Economics and Political Science | LSE · Department of Statistics

PhD in Computer Science (UFRGS, Brazil, 2010)
Teaching Professor of Data Science (LSE, London, UK)

About

93
Publications
17,333
Reads
937
Citations
Citations since 2017
60 Research Items
755 Citations
[Chart: citations per year, 2017 to 2023; axis range 0 to 200]
Additional affiliations
November 2016 - November 2018
University College London
Position
  • PostDoc Position
Description
  • Post-doctoral researcher - The Royal Society - Newton International Fellow
September 2010 - present
Universidade Federal da Bahia (UFBA)
Position
  • Professor (Associate)
Description
  • Associate professor and researcher at the Computer Science Department. Head of AtyImo Data Science Research Group (www.atyimo.ufba.br).
August 2002 - November 2002
Universitat Ramon Llull
Position
  • Professor
Education
September 2017 - November 2018
University College London
Field of study
  • Data Science, Health Informatics
November 2016 - October 2018
University College London
Field of study
  • Data Science, Health Informatics
April 2000 - May 2010
Universidade Federal do Rio Grande do Sul (UFRGS)
Field of study
  • Parallel and distributed processing

Publications

Publications (93)
Article
Objective: Multimorbidity, or the occurrence of two or more chronic conditions, is a global challenge, with implications for mortality, morbidity, disability, and life quality. Psychiatric disorders are common among the chronic diseases that affect patients with multimorbidity. It is still not well understood whether psychiatric symptoms, especiall...
Article
Full-text available
Objectives: Covid-19 databases have detailed information about each affected person in Brazil, but they have flaws in counting the number of cases, which are underreported. We aimed to construct and correct the cases dataset by linking different sources of data observations to study the pandemic evolution in Brazilian municipalities. Approach: Using the...
Article
Full-text available
Objective: Brazil is one of the most unequal countries in the world. We aimed to create a Social Disparities Index for Covid-19 (SDI-Covid-19) using linked administrative data to understand the role of inequalities in the COVID-19 pandemic to target social and health policies. Approach: Using linked administrative data (2010 Census and National Regist...
Article
Full-text available
Background Data integration and visualisation techniques have been widely used in scientific research to allow the exploitation of large volumes of data and support highly complex or long-lasting research questions. Integration allows data from different sources to be aggregated into a single database comprising variables of interest for different...
Article
Full-text available
Background: Multimorbidity, or the occurrence of two or more chronic conditions, is a global challenge, with implications for mortality, morbidity, disability and life quality. Psychiatric disorders are common among the chronic diseases that affect patients with multimorbidity. It is still not well understood whether psychiatric symptoms, especial...
Conference Paper
Popular participation in public health actions is essential for fighting Covid-19, especially in vulnerable urban communities where the lack of geographical data at fine resolution scale hinders appropriate spatial responses. This work proposes a crowdsourcing-based solution that captures georeferenced data regarding the population's perception of...
Preprint
Full-text available
Background: Data integration and visualization techniques have been widely used in scientific research to allow the exploitation of large volumes of data and support highly complex or long-lasting research questions. Integration allows data from different sources to be aggregated into a single database comprising variables of interest for different...
Article
Full-text available
The Brazilian Early Childhood Friendly Municipal Index (IMAPI) is a population-based approach to monitor the nurturing care environment for early childhood development (ECD) using routine information system data. It is unknown whether IMAPI can be applied to document metropolitan urban territorial differences in nurturing care environments. We used...
Article
Full-text available
Background: In infancy, males are at higher risk of dying than females. Birthweight and gestational age are potential confounders or mediators but are also familial and correlated, posing epidemiological challenges that can be addressed by studying male-female twin pairs. Methods: We studied 28 558 male-female twin pairs born in Brazil between 2...
Article
Full-text available
Focus of Presentation Males and females differ substantially in their exposures and outcomes across the life-course. Previous research into sex differences has been limited by an inability to account for inter-individual differences in genetic factors and in their early-life environment. Studying within male-female twin pair differences offers a un...
Article
Full-text available
Objective: to identify in the literature the state of the art of gamification frameworks and models developed for health contexts. Methods: an integrative literature review of articles indexed in the LILACS, SciELO, PubMed, CINAHL, Scopus and Web of Science databases, in English and published between January 2010 and July 2020. Results: among the 10 s...
Article
Full-text available
Providing an enabling nurturing care environment for early childhood development (ECD) that cuts across the five domains of the Nurturing Care Framework (i.e., good health, adequate nutrition, opportunities for early learning, security and safety and responsive caregiving) has become a global priority. Brazil is home to approximately 18.5 million c...
Article
Full-text available
The Nurturing Care Framework (NCF) calls for establishing global monitoring and accountability systems for early childhood development (ECD). Major gaps remain in building low-cost and large-scale ECD monitoring systems at the local level. In this manuscript, we describe the process of selecting nurturing care indicators at the municipal level from...
Article
Introduction Epidemiological studies of twin pairs provide researchers with the opportunity to better understand the roles of genetics and the environment on human traits and health conditions. Twin births are also of interest for public health, given they are five times more likely to be of low birth weight and preterm compared to singletons. Male...
Article
Full-text available
The disease caused by the new coronavirus (COVID-19) has been plaguing the world for months and growing more rapidly as the days go by. Therefore, finding a way to identify who has the causative virus is impressive, in order to find a way to stop its proliferation. In this paper, a complete and applied study of convolutional support machines will b...
Article
Full-text available
Background: Record linkage is the process of identifying and combining records about the same individual from two or more different datasets. While there are many open source and commercial data linkage tools, the volume and complexity of currently available datasets for linkage pose a huge challenge; hence, designing an efficient linkage tool with...
Poster
Males and females differ substantially in morbidity and mortality since conception, and recent research has found sex differences in DNA methylation a critical risk factor for many cancers. However, the causes of such differences are still largely unknown. More importantly, sex differences detected in studies from unrelated individuals could be bia...
Article
Full-text available
The analysis of massive databases is a key issue for most applications today and the use of parallel computing techniques is one of the suitable approaches for that. Apache Spark is a widely employed tool within this context, aiming at processing large amounts of data in a distributed way. For the Statistics community, R is one of the preferred too...
Conference Paper
Introduction: Twins are a special cohort, as they offer researchers the opportunity to understand the roles of genetics and the environment on human traits and health conditions. Twin births are also of interest for public health, given they are five times more likely to be of low birth weight and preterm than singletons which puts them at much h...
Article
Currently, there are major EU-based projects to better utilise wearables as useful diagnostic aids/tools in clinical settings as well for deployment in the home to capture ageing processes. To date, there has been little investigation of the translation of those tools beyond the geographical regions in which they were developed and implemented. Our...
Conference Paper
Full-text available
Twin pair analysis is a valuable tool for assessing familial risk factors related to several outcomes, including diseases. Machine learning models are standard, powerful tools for prediction, although not fully suitable for twin pair analysis as most models are not able to account for the existing correlation between twin pairs. In this study, we h...
Article
Full-text available
Recent research and developments in Cloud Robotics (CR) require appropriate knowledge representation to ensure interoperable data, information, and knowledge sharing within cloud infrastructures. As an important branch of the Internet of Things (IoT), these demands to advance it forward motivates academic and industrial sectors to invest on it. The IEEE...
Article
Full-text available
Objectives Early inequities in Early Childhood Development (ECD) are linked to inequalities across the five regions and within the 5570 municipalities in Brazil. We aimed to operationalize an index (IMAPI) to assess and monitor the enabling environment for nurturing care at the regional and municipal level in Brazil using existing national database...
Article
Full-text available
Technology is advancing at an extraordinary rate. Continuous flows of novel data are being generated with the potential to revolutionize how we better identify, treat, manage, and prevent disease across therapeutic areas. However, lack of security of confidence in digital health technologies is hampering adoption, particularly for biometric monitor...
Preprint
Currently, there are major EU-based projects to better utilise wearables as useful diagnostic aids/tools in clinical settings as well for deployment in the home to capture ageing processes. To date, there has been little investigation of the translation of those tools beyond the geographical regions in which they were developed and implemented. Our...
Conference Paper
Introduction The Centre for Data and Knowledge Integration for Health (CIDACS-Fiocruz) relies on the linkage of administrative data for research. The data is originated from governmental departments, generated through the administration of government programs (e.g. mortality system data, information on users of social services). The 100 Million Bra...
Conference Paper
The deployment of governmental administrative databases destined for research purposes and designed to support public policymaking holds great potential, especially when datasets are linked, making it possible to elucidate the effects of combined factors that can potentially impact the health of populations. Administrative data has been neglected i...
Article
A review and comparison of ontology-based approaches to robot autonomy – ADDENDUM - Volume 35 - Alberto Olivares-Alarcos, Daniel Beßler, Alaa Khamis, Paulo Goncalves, Maki K. Habib, Julita Bermejo-Alonso, Marcos Barreto, Mohammed Diab, Jan Rosell, João Quintas, Joanna Olszewska, Hirenkumar Nakawala, Edison Pignaton, Amelie Gyrard, Stefano Borgo, Gu...
Conference Paper
Evidence on the Nurturing Care Framework underlies the critical role of governments investing in integrated Early Childhood Development (ECD) systems to ensure that vulnerable children will survive and thrive. Brazil has endemic inequalities, poverty, and food insecurity across the 5,570 municipalities. This study aims to assess the municipality’s...
Article
Full-text available
Within the next decades, robots will need to be able to execute a large variety of tasks autonomously in a large variety of environments. To relax the resulting programming effort, a knowledge-enabled approach to robot programming can be adopted to organize information in re-usable knowledge pieces. However, for the ease of re-use, there needs to b...
Preprint
Full-text available
Technology is advancing at extraordinary rates, with novel data being generated that could potentially revolutionize different therapeutic areas of medicine. However, adoption in medicine is hampered by a lack of trust, particularly for biometric monitoring technologies (BioMeTs) where a key question facing frontline healthcare profess...
Article
Full-text available
The Center for Data and Knowledge Integration for Health (CIDACS) was created in 2016 in Salvador (Bahia, Brazil). This paper aims to present a profile of CIDACS, including its current databases. CIDACS aims to conduct interdisciplinary studies and research, develop new scientific methodology and promote professional training using linked large-sca...
Article
Full-text available
The Centre for Data and Knowledge Integration for Health (CIDACS) was created in 2016 in Salvador, Bahia-Brazil with the objective of integrating data and knowledge aiming to answer scientific questions related to the health of the Brazilian population. This article details our experiences in the establishment and operations of CIDACS, as well as e...
Conference Paper
Ontologies play an important role across several domains as they represent and define categories, properties and relationships among concepts, data and entities existing in those domains. Especially in Robotics, ontologies can be used as a standard way to represent and share knowledge and reasoning among autonomous agents. CORA (core ontology for r...
Conference Paper
Full-text available
The Centre for Data and Knowledge Integration for Health (CIDACS) was created in 2016 in Salvador, Bahia-Brazil, with the objective of integrating data and knowledge in an attempt to answer scientific questions related to populations and public health. This article details our experiences in the establishment and operations of CIDACS, as well as ef...
Article
Full-text available
Health Technology Assessment (HTA) is the systematic evaluation of the properties and impacts of health technologies and interventions. In this article, we presented a discussion of health technology assessment and its evolution in Brazil, as well as a description of secondary data sources available in Brazil with potential applications to generate...
Conference Paper
Malaria is still a worrying disease worldwide, being responsible for around 219 million cases reported in 2017 and around 435,000 deaths a year. The consensus among researchers, governmental bodies and health professionals is that many countries have relapsed their investments and surveillance actions after a few years of apparent disease reduction...
Article
Full-text available
Record linkage is a technique widely used to gather data stored in disparate data sources that presumably pertain to the same real world entity. This integration can be done deterministically or probabilistically, depending on the existence of common key attributes among all data sources involved. The probabilistic approach is very time-consuming d...
Article
Full-text available
Introduction Malaria is an infectious disease that affected nearly 215 million individuals in 2015. In Brazil, there are various information systems targeted to store data from disease notification, including malaria surveillance. However, these databases are identified and difficult to be accessed by researchers due to privacy restrictions. Objec...
Article
Full-text available
Data linkage refers to the process of identifying and linking records that refer to the same entity across multiple heterogeneous data sources. This method has been widely utilized across scientific domains, including public health where records from clinical, administrative and other surveillance databases are aggregated and used for research, dec...
Conference Paper
Full-text available
This work presents the use of reinforcement learning on big.LITTLE architectures. A heuristic based on the Q-learning algorithm is implemented to choose the best resource configuration that simultaneously meets the requirements of latency-sensitive applications and consumes the least amount of energy. Two alg...
Conference Paper
Block-level Storage is widely used to support heavy workloads. It can be directly accessed by the operating system, but it faces some durability issues, hardware limitations and performance degradation in geographically distributed systems. Object-based Storage Device (OSD) is a data storage concept widely used to support write-once-read-many (WORM...
Conference Paper
Creating a standard for knowledge representation and reasoning in autonomous robotics is an urgent task if we consider recent advances in robotics as well as predictions about the insertion of robots in human daily life. Indeed, this will impact the way information is exchanged between multiple robots or between robots and humans and how they can a...
Conference Paper
Full-text available
Record linkage (RL) is the process of identifying and linking data that relates to the same physical entity across multiple heterogeneous data sources. Deterministic linkage methods rely on the presence of a set of common uniquely identifying attributes across all sources while probabilistic approaches use non-unique attributes and calculate simil...
Conference Paper
Full-text available
Record linkage (RL) is the process of identifying and linking data that relates to the same physical entity across multiple heterogeneous data sources. Deterministic linkage methods rely on the presence of common uniquely identifying attributes across all sources while probabilistic approaches use non-unique attributes and calculate similarity ind...
Conference Paper
In this work, we focused on Brazilian Public Health System and on large databases from Ministry of Health. We present our multithreading and multiprocessor architectures approach to data processing and probabilistic record linkage of such databases in order to produce very accurate data marts. These data marts are used by statisticians and epidemio...
Conference Paper
Full-text available
Virtual screening methodologies have been used to help drug researchers to discover new medicine. The goal of these methodologies is to work on the docking phase verifying which molecules interact with a specific protein. Typically, the number of molecules could be very large and, because of this, a great computational power is required to compare...
Conference Paper
Full-text available
The integration of disparate large and heterogeneous socioeconomic and clinical databases is considered essential to capture and model longitudinal and social aspects of diseases. However, such integration is challenging: databases are stored in disparate locations, make use of different identifiers, have variable data quality, record information i...
Conference Paper
Ontologies serve robotics in many ways, particularly in describing and driving autonomous functions. These functions are built around robot tasks. In this paper, we introduce the IEEE Robot Task Representation Study Group, including its work plan, initial development efforts, and proposed use cases. This effort aims to develop a standard that pro...
Article
Full-text available
Background Data integration comprises methods and tools to aggregate data from disparate sources to various purposes. Heterogeneity and uncertainty are technical challenges in this field. The first involves different data representation or meaning, while the second refers to incomplete data or the expectancy that a data item exists in a data source...
Article
Full-text available
Background and aims: A Brazil-UK cooperation was set up in mid-2013 to build a huge cohort of individuals registered in Cadastro Único (CADU), a socioeconomic database used in social programmes of the Brazilian government. Epidemiologists and statisticians wish to assess the impact of Bolsa Família (PBF), a conditional cash transfer...
Article
Full-text available
Background and aims: The Brazilian government has several social protection programmes that select their beneficiaries based on socioeconomic information kept in the Cadastro Único (CADU) database. The CADU will be used to build a population-based cohort of approximately 100 million individuals. Among the social programmes is the Bolsa Família (PBF),...
Conference Paper
Full-text available
This paper presents some current results obtained from our probabilistic record linkage methods applied to the integration of a 100 million cohort composed by socioeconomic data with health databases.
Conference Paper
Full-text available
We present current results from our probabilistic linkage methods applied to the integration of a 100 million cohort composed by socioeconomic data with health databases.
Conference Paper
Full-text available
The increasing need for computing power today justifies the continuous search for techniques that decrease the time to answer usual computational problems. To take advantage of new hybrid parallel architectures composed by multithreading and multiprocessor hardware, our current efforts involve the design and validation of highly parallel algorithms...
Conference Paper
Full-text available
A large number of large-scale network testbeds have been proposed in the last couple of years as a means of better supporting Future Internet research. This paper presents Bambu, a metropolitan innovation testbed for promoting experimental research in the city of Salvador, Bahia.
Conference Paper
Full-text available
This paper describes our effort in designing a tool for probabilistic record linkage applied to Brazilian governmental databases associated to Cadastro Único, “Bolsa Família” programme, and the National Unified Health System (SUS). These data are used by statisticians and epidemiologists to assess the effects of social programmes held by the govern...
Conference Paper
Full-text available
Several areas, such as science, economics, finance, business intelligence, health, and others are exploring big data as a way to produce new information, make better decisions, and move forward their related technologies and systems. Specifically in health, big data represents a challenging problem due to the poor quality of data in some circumstan...
Conference Paper
Full-text available
Many e-science applications can benefit from the elasticity of resources provided by volunteer and distributed platforms. While the former is based on resources assigned voluntarily by its owners, the second is based on resources specially configured for this purpose. In this paper, we present the integration of BOINC and Hadoop, two of the most po...
Article
Auto-tuning techniques have been used in the design of routines in recent years. The goal is to develop routines which automatically adapt to the conditions of the computational system in such a way that efficient executions are obtained independently of the end-user experience. This paper aims to explore programming routines that can be automatica...
Article
In this paper, we present the current results of the newly formed IEEE-RAS Working Group, named Ontologies for Robotics and Automation. In particular, we introduce a core ontology that encompasses a set of terms commonly used in Robotics and Automation along with the methodology we have adopted. Our work uses ISO/FDIS 8373 standard developed by the...
Article
Service robotics is an emerging application area for human-centered technologies. The rise of household and personal assistance robots forecasts a human-robot collaborative society. One of the robotics community's major tasks is to streamline development trends, work on the harmonization of taxonomies and ontologies, along with the standardization o...
Article
Ambient intelligence, ubiquitous and networked robots, and cloud robotics are new research hot topics that have started to gain popularity among the robotics community. They enable robots to acquire richer functionalities and open the way for the composition of a variety of robotic services with three functions: semantic perception, reasoning and a...