Mariana Recamonde-Mendoza

Universidade Federal do Rio Grande do Sul | UFRGS · Institute of Informatics

PhD

About

102
Publications
25,986
Reads
753
Citations
Citations since 2017
74 Research Items
692 Citations
[Chart: citations since 2017, by year (2017 to 2023)]
Introduction
Associate Professor at the Institute of Informatics, Universidade Federal do Rio Grande do Sul (UFRGS), and Professor Researcher at Hospital de Clínicas de Porto Alegre (HCPA), Porto Alegre, Brazil. Head of the Bioinformatics Core at HCPA. Conducts research in the fields of Machine Learning (with a special interest in applications in Health), Bioinformatics, and Computational Biology.
Additional affiliations
June 2017 - present
Hospital de Clínicas de Porto Alegre
Position
  • Professor
Description
  • Leader of the Bioinformatics Core
June 2016 - present
Universidade Federal do Rio Grande do Sul
Position
  • Professor
March 2015 - June 2016
Universidade Federal do Rio Grande do Sul
Position
  • Substitute Professor
Description
  • Taught the courses Introduction to Programming and Object-Oriented Programming at the undergraduate level.
Education
August 2012 - July 2013
Massachusetts Institute of Technology
Field of study
  • Computational Biology
March 2010 - March 2014
Universidade Federal do Rio Grande do Sul
Field of study
  • Computer Science
March 2005 - December 2009
Universidade Federal do Rio Grande (FURG)
Field of study
  • Computer Engineering

Publications (102)
Article
The incidence of mosquito-borne diseases is significant in under-developed regions, mostly due to the lack of resources to implement aggressive control measures against mosquito proliferation. A potential strategy to raise community awareness regarding mosquito proliferation is building a live map of mosquito incidences using smartphone apps an...
Article
Identifying essential genes and proteins is a critical step towards a better understanding of human biology and pathology. Computational approaches helped to mitigate experimental constraints by exploring machine learning (ML) methods and the correlation of essentiality with biological information, especially protein-protein interaction (PPI) netwo...
Article
Background: Prior studies have found increased rates of alcohol consumption among physicians and medical students. The present study aims to build machine learning (ML) models to identify patterns of high-risk drinking (HRD), including alcohol use disorder, within this population. Methods: We analyzed data collected through a web-based survey among...
Preprint
Full-text available
Identifying the genes and mutations that drive the emergence of tumors is a major step to improve understanding of cancer and identify new directions for disease diagnosis and treatment. Despite the large volume of genomics data, the precise detection of driver mutations and their carrying genes, known as cancer driver genes, from the millions of p...
Article
Full-text available
The COVID-19 pandemic has underlined the need to partner with the community in pandemic preparedness and response in order to enable trust-building among stakeholders, which is key in pandemic management. Citizen science, defined here as a practice of public participation and collaboration in all aspects of scientific research to increase knowledge...
Preprint
Full-text available
Alterations in DNA methylation patterns are a frequent finding in cancer. Methylation aberrations can drive tumorigenic pathways and serve as potential biomarkers. The role of epigenetic alterations in thyroid cancer is still poorly understood. Here, we analyzed methylome data of a total of 810 thyroid samples (n=256 for discovery and n=554 for vali...
Chapter
The use of machine learning approaches in studying cancer through omics datasets has been an important research tool since the advent of high-throughput technologies. However, these datasets present an intrinsic data complexity that may hinder model development despite their information richness. This work, therefore, aims to study the characterist...
Chapter
Cross-validation (CV) is a widely used technique in machine learning pipelines. However, some of its drawbacks have been recognized in the last decades. In particular, CV may generate folds unrepresentative of the whole dataset, which led some works to propose methods that attempt to produce more distribution-balanced folds. In this work, we propos...
Article
Fossil plant remains are commonly found in fragments in the sediment, thus complicating the reconstruction and classification of fossil plants into a higher taxonomic group. Particularly for stem anatomy, some described features repeat among the proposed lineages due to environmental pressures that induce anatomical convergence. Other characteristi...
Article
Discovering disease biomarkers from gene expression data has been greatly advanced by feature selection (FS) methods, especially using ensemble FS (EFS) strategies with perturbation at the data (i.e., homogeneous EFS) or method level (i.e., heterogeneous EFS). Here, we proposed a Hybrid EFS design that explores both types of perturbation to disrupt...
Article
Valproic acid (VPA) is a widely used antiepileptic drug not recommended in pregnancy because it is teratogenic. Many assays have assessed the impact of the VPA exposure on the transcriptome of human embryonic stem-cells (hESC), but the molecular perturbations that VPA exerts in neurodevelopment are not completely understood. This study aimed to per...
Article
Fetal Alcohol Spectrum Disorder (FASD) comprises the phenotypes induced by prenatal alcohol exposure. Understanding the molecular mechanisms of FASD is needed since it is a public health problem. This study aimed to evaluate the impact of ethanol in the differential gene expression (DGE) of embryonic cells and fetal tissues by performing a transcri...
Preprint
Motivation: Advances in genomic sequencing of human populations have generated a large amount of genomics data deposited in multiple sources. Programmatic batch searches executed at once are of great scientific interest to ease genomic investigations by retrieving and integrating this massive and decentralized data with little manual intervention....
Conference Paper
Full-text available
The Permian floras of the Parnaíba Basin have attracted great scientific interest in recent years, mainly regarding the petrified woods that are common in the region. Several new species have been erected, including several genera never before identified, revealing the endemism of the study area. However, the fossil woods up to...
Article
Identifying the genes and mutations that drive the emergence of tumors is a critical step to improving our understanding of cancer and identifying new directions for disease diagnosis and treatment. Despite the large volume of genomics data, the precise detection of driver mutations and their carrying genes, known as cancer driver genes, from the m...
Article
This proof-of-concept study aimed to investigate the viability of a predictive model to support posttraumatic stress disorder (PTSD) staging. We performed a naturalistic, cross-sectional study at two Brazilian centers: the Psychological Trauma Research and Treatment (NET-Trauma) Program at Universidade Federal do Rio Grande do Sul, and the Program...
Article
Full-text available
There are still numerous challenges to be overcome in microarray data analysis because advanced, state-of-the-art analyses are restricted to programming users. Here we present the Gene Expression Analysis Platform, a versatile, customizable, optimized, and portable software developed for microarray analysis. GEAP was developed in C# for the graphic...
Conference Paper
Identifying stable and precise biomarkers is a key challenge in precision medicine. A promising approach in this direction is exploring omics data, such as transcriptome generated by microarrays, to discover candidate biomarkers. This, however, involves the fundamental issue of finding the most discriminative features in high-dimensional datasets....
Poster
Full-text available
Women are underrepresented in bioinformatics, despite its foundation by Margaret Dayhoff. Remarkable differences are observed between the gender ratio for bioinformatics' parent fields, with biology being more balanced than computer science. In Brazil, most workshops and scientific events feature only men as speakers. Even when females are represen...
Preprint
Full-text available
The discovery of disease biomarkers from gene expression data has been greatly advanced by feature selection (FS) methods, especially using ensemble FS (EFS) strategies with perturbation at the data level (i.e., homogeneous, Hom-EFS) or method level (i.e., heterogeneous, Het-EFS). Here we proposed a Hybrid EFS (Hyb-EFS) design that explores both ty...
Article
Full-text available
The identification of thalidomide–Cereblon-induced SALL4 degradation has brought new understanding of thalidomide embryopathy (TE) differences across species. However, some questions regarding species variability remain. The aim of this study was to detect sequence divergences between species, affected or not by TE, and to evaluate the reg...
Article
The angiotensin-converting enzyme 2 (ACE2) is the receptor for the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). It is highly expressed in adipose tissue, possibly associated with progression to severe coronavirus disease 2019 (COVID-19) in obese subjects. We searched the Gene Expression Omnibus (GEO) and reanalyzed the GSE59034 con...
Article
Full-text available
Embryofetal development is a critical process that requires strict epigenetic control; perturbations in this balance might lead to congenital anomalies. It is known that anticonvulsants potentially affect epigenetics-related genes; however, it is not understood whether this imbalance could explain the anticonvulsants-ind...
Article
Background: Atrial fibrillation (AF) is a complex disease and affects millions of people around the world. The biological mechanisms that are involved with AF are complex and still need to be fully elucidated. Therefore, we performed a meta-analysis of transcriptome data related to AF to explore these mechanisms aiming at more sensitive and reliable...
Conference Paper
This work addresses the problem of identifying and fusing duplicate features in machine learning datasets. Our goal is to evaluate the hypothesis that fusing duplicate features can improve the predictive power of the data while reducing training time. We propose a simple method for duplicate detection and fusion based on a small set of features. An...
Preprint
Full-text available
The incidence of mosquito-borne diseases is significant in under-developed regions, mostly due to the lack of resources to implement aggressive control measures against mosquito proliferation. A potential strategy to raise community awareness regarding mosquito proliferation is building a live map of mosquito incidences using smartphone apps an...
Article
Full-text available
Thyroid hormones (THs) are critical regulators of cellular processes, while changes in their levels impact all the hallmarks of cancer. Disturbed expression of type 3 deiodinase (DIO3), the main TH-inactivating enzyme, occurs in several human neoplasms and has been associated with adverse outcomes. Here, we investigated the patterns of DIO3 express...
Preprint
Full-text available
The identification of essential genes/proteins is a critical step towards a better understanding of human biology and pathology. Computational approaches helped to mitigate experimental constraints by exploring machine learning (ML) methods and the correlation of essentiality with biological information, especially protein-protein interaction (PPI)...
Article
Full-text available
Pancreatic ductal adenocarcinoma (PDAC) is an aggressive disease with high mortality rates. PDAC initiation and progression are promoted by genetic and epigenetic dysregulation. Here, we aimed to characterize the PDAC DNA methylome in search of novel altered pathways associated with tumor development. We examined the genome-wide DNA methylation pro...
Preprint
Full-text available
Atrial fibrillation (AF) is a complex disease and affects millions of people around the world. The biological mechanisms that are involved with AF are complex and still need to be fully elucidated. Therefore, we performed a meta-analysis of transcriptome data related to AF to explore these mechanisms aiming at more sensitive and reliable results. P...
Article
Applications in which data are naturally generated in a distributed fashion are increasingly common, especially after the emergence of technologies like the Internet of Things (IoT). In sensor networks, in collaborative health or genomic projects, in credit risk analysis, among other domains, distinct features are collected from multiple sources, in...
Article
Full-text available
The aim of this study was to establish a peptidomic profile based on LC-MS/MS and random forest (RF) algorithm to distinguish the urinary peptidomic scenario of type 2 diabetes mellitus (T2DM) patients with different stages of diabetic kidney disease (DKD). Urine from 60 T2DM patients was collected: 22 normal (stage A1), 18 moderately increased (st...
Article
Full-text available
The Cereblon-CRL4 complex has been studied predominantly with regard to thalidomide treatment of multiple myeloma. Nevertheless, the role of Cereblon-CRL4 in Thalidomide Embryopathy (TE) is still not understood. Not all embryos exposed to thalidomide develop TE; hence, here we evaluate the role of the CRL4-Cereblon complex in TE variability and sus...
Conference Paper
Learning from data streams requires efficient algorithms capable of constructing a model according to the arrival of new instances. These data stream learners need a quick and real-time response, but mainly, they must be tailored to adapt to possible changes in the data distribution, a condition known as concept drift. However, recent works have sh...
Conference Paper
Background: Obesity is a risk factor for cardiovascular diseases; however, obese heart failure (HF) patients live longer than lean HF patients, an observation known as the obesity paradox. MicroRNAs (miRs) regulate processes involved in both cardiac remodeling and obesity. Objective: We investigated whether the levels of circulating m...
Article
Full-text available
Predicting infectious disease dynamics is a central challenge in disease ecology. Models that can assess which individuals are most at risk of being exposed to a pathogen not only provide valuable insights into disease transmission and dynamics but can also guide management interventions. Constructing such models for wild animal populations, howeve...
Conference Paper
Background: Breast cancer is a highly heterogeneous disease and the identification of biomarkers that predict tumor biological behavior is warranted in improving patient survival. Thyroid hormones (THs) are critical regulators of cellular processes, and TH status alterations are known to contribute to cancer progression through all the hallmarks of...
Conference Paper
Background: Breast cancer is a highly heterogeneous disease and the identification of biomarkers that predict tumor biological behavior is warranted in improving patient survival. Thyroid hormones (THs) are critical regulators of cellular processes, and TH status alterations are known to contribute to cancer progression through all the hallmarks of...
Article
Two core polyadenylation elements (CPE) located in the 3′ untranslated region of eukaryotic pre-mRNAs play an essential role in their processing: the polyadenylation signal (PAS) AAUAAA and the cleavage site (CS), preferentially a CA dinucleotide. Herein, we characterized PAS and CS sequences in a set of cancer predisposition genes (CPGs) and perfo...
Article
Background: Traditional methods for rejection control in transplanted patients are considered invasive, risky, and prone to sampling errors. Using molecular biomarkers as an alternative to biopsies for monitoring rejection may help to mitigate some of these problems, increasing the survival rates and well-being of patients. Recent advances...
Article
Graph alignment refers to the problem of finding a bijective mapping across vertices of two graphs such that, if two nodes are connected in the first graph, their images are connected in the second graph. Most standard graph alignment methods consider an optimization that maximizes the number of matches between the two graphs, ignoring the effect o...
Article
The prevalence of anxiety disorders in patients with Attention Deficit/Hyperactivity Disorder (ADHD) is around 15–40%, three times higher than in the general population. The dopaminergic system, classically associated with ADHD, interacts directly with the adenosinergic system through adenosine A2A receptors (A2A) and dopamine D2 receptors (D2) for...
Preprint
Full-text available
Predicting infectious disease dynamics is a central challenge in disease ecology. Models that can assess which individuals are most at risk of being exposed to a pathogen not only provide valuable insights into disease transmission and dynamics but can also guide management interventions. Constructing such models for wild animal populations, howeve...
Article
Full-text available
Pathological cardiac hypertrophy (CH) is associated with increased heart failure risk and sudden cardiac death. Several transcription factors (TFs) and miRNAs have been implicated in CH, and their highly combinatorial action in gene expression regulation is becoming clearer. We adopted a systems biology approach to construct a comprehensive TF-miRNA co...
Article
Full-text available
The spread of pathogens in swine populations is in part determined by movements of animals between farms. However, understanding additional characteristics that predict disease outbreaks and uncovering landscape factors related to between-farm spread are crucial steps toward risk mitigation. This study integrates animal movements with environmental...
Article
Full-text available
Colorectal cancer (CRC) is the second most common cancer in women and the third most common cancer in men globally. The identification of differentially expressed genes associated with patients' clinical data may represent a useful approach to find important genes in CRC carcinogenesis. Previously, the TULP3 transcription factor was identified as a p...
Data
Principal component analysis (PCA) plots. (A) GSE21510. (B) GSE24514. (C) COAD-TCGA. (D) READ-TCGA. (PC) Principal component. (NT) Adjacent non-tumoral tissue. (CRC) Colorectal cancer. (COAD) Colon adenocarcinoma. (READ) Rectum adenocarcinoma. (TIFF)
Data
Survival probabilities of pathologic staging (pTNM) in TCGA studies. (A) Comparison of survival among pTNM stages in Colon adenocarcinoma (COAD). (B) Comparison of survival among pTNM stages in Rectum adenocarcinoma (READ). The x-axis corresponds to overall survival in months. (HR) Hazard ratio. In READ study the Cox proportional hazards regression...
Data
Description of the datasets selected to analyze TULP3 gene expression, including the classification and sample size, and the technique employed to quantify the transcripts. (NT) Adjacent non-tumoral tissue. (CRC) Colorectal cancer. (COAD) Colon adenocarcinoma. (READ) Rectum adenocarcinoma. (DOCX)
Data
Normalisation plots. (A) GSE21510. (B) GSE24514. (C) COAD-TCGA. (D) READ-TCGA. Blue boxplots correspond to adjacent non-tumoral samples (NT) and the red ones correspond to colorectal cancer (CRC) samples. (TIFF)
Data
Survival probabilities in TCGA studies. We used the median to dichotomize the groups classified as high and low gene expression (normalised and quantile-filtered data without log-transformation). (A) Comparison of TULP3 gene expression in Colon adenocarcinoma (COAD). (B) Comparison of TULP3 gene expression in Rectum adenocarcinoma (READ). The x-axi...
Data
Survival probabilities of early and advanced stages in TCGA studies. We grouped patients classified in stages I and II as early stage, and patients classified in stages III and IV as advanced stage. (A) Comparison of survival between early versus advanced stages in Colon adenocarcinoma (COAD). (B) Comparison of survival between early versus advance...
Data
Survival probabilities of TULP3 gene expression in early and advanced stages in COAD-TCGA study. We used the median of TULP3 to dichotomize the groups into low and high expression for early and advanced stages. (A) Comparison of survival between early stages (I and II). (B) Comparison of survival between advanced stages (III and IV). The x-axis corre...
Data
Survival probabilities of TULP3 gene expression in early and advanced stages in READ-TCGA study. We used the median of TULP3 to dichotomize the groups into low and high expression for early and advanced stages. (A) Comparison of survival between early stages (I and II). (B) Comparison of survival between advanced stages (III and IV). The x-axis corre...
Data
TULP3 gene expression comparison between groups of GSE21510. TULP3 expression profile from GSE21510 study. (NT) Adjacent non-tumoral tissue. (CRC) Colorectal cancer. Median is represented as a solid line. Equal letters above the boxplots indicate no statistical difference among the groups. We performed Kruskal-Wallis test followed by Benjamini-Hoch...
Article
Full-text available
Aims: The aim of this study was to investigate a miRNA expression profile in type 1 diabetes mellitus (T1DM) patients with DKD (cases) or without this complication (controls). Methods: Expression of 48 miRNAs was screened in plasma of 58 T1DM patients (23 controls, 18 with moderate DKD, and 17 with severe DKD) using TaqMan Low Density Array cards (The...
Conference Paper
Data stream classification poses many challenges for the data mining community when the environment is non-stationary, among which adaptation to the concept drifts, i.e., changes in the underlying concepts, is a major one. Two main ways to develop adaptive approaches are ensemble methods and incremental algorithms. Ensemble methods play an importan...
Article
MicroRNAs (miRNAs) are small non-coding RNAs that regulate gene expression. Emerging evidence has suggested a role for miRNAs in the development of diabetic kidney disease (DKD), indicating that miRNAs may represent potential biomarkers of this disease. However, results are still inconclusive. Therefore, we performed a systematic review of the lite...
Article
Full-text available
Growing evidence indicates that microRNAs (miRNAs) have a key role in processes involved in type 1 diabetes mellitus (T1DM) pathogenesis, including immune system functions and beta-cell metabolism and death. Although dysregulated miRNA profiles have been identified in T1DM patients, results are inconclusive, with only a few miRNAs being consistently...
Conference Paper
Aims: Vitamin E is a common antioxidant, but little is known about its effects on cardiac hypertrophy and microRNA (miR) expression induced by transverse aortic constriction (TAC) in mice. Methods and Results: Male Balb/c mice were randomly divided into four cohorts: SHAM (n = 22), TAC (n = 34), SHAM supplemented with vitamin E (SHAM+VIT, n = 22),...
Article
Although new candidate genes for Autism Spectrum Disorder (ASD), Schizophrenia (SCZ), Attention-Deficit/Hyperactivity Disorder (ADHD), and Bipolar Disorder (BD) emerged from genome-wide association studies (GWAS), their underlying molecular mechanisms remain poorly understood. Evidence of the involvement of intrinsically disordered proteins in dis...
Article
Full-text available
Bovine viral diarrhea virus (BVDV) causes one of the most economically important diseases in cattle, and the virus is found worldwide. A better understanding of the disease-associated factors is a crucial step towards the definition of strategies for control and eradication. In this study, we trained a random forest (RF) prediction model and perform...
Article
Full-text available
In many situations, a centralized, conventional classification task cannot be performed because the data is not available in a central facility. In such cases, we are dealing with distributed data mining problems, in which local models must be individually built and later combined into a consensus, global model. In this paper, we are particularly...