Francisco De Abreu e Lima

IQVIA · Department of Biostatistics

PhD

About

72
Publications
10,340
Reads
203
Citations
Introduction
My main research interests are computer vision, graphical models and sparsity-inducing methods that address statistical and computational efficiency. My tools of choice for data engineering and analysis are Python, R, Bash and SQL. In my spare time I write data science tutorials for my blog, https://poissonisfish.com.
Additional affiliations
September 2013 - July 2014
Institute for Molecular and Cell Biology
Position
  • Research Collaborator
Education
August 2014 - July 2017
October 2011 - September 2013
University of Porto
Field of study
  • Cell and Molecular Biology
September 2008 - July 2011
University of Porto
Field of study
  • Biology


Publications (72)
Article
Full-text available
Mucus hypersecretion contributes to lung function impairment observed in COPD (chronic obstructive pulmonary disease), a tobacco smoking-related disease. A detailed mucus hypersecretion adverse outcome pathway (AOP) has been constructed from literature reviews, experimental and clinical data, mapping key events (KEs) across biological organisationa...
Preprint
Full-text available
Identification of pregnancies at risk of preterm birth (PTB), the leading cause of newborn deaths, remains challenging given the syndromic nature of the disease. We report a longitudinal multi-omics study coupled with a DREAM challenge to develop predictive models of PTB. We found that whole blood gene expression predicts ultrasound-based gestation...
Article
The strawberry fruit is perishable due to its high water content and soft texture, yet exhibits a pleasant organoleptic and nutritional profile. Here we conducted a metabolomics-driven analysis followed by linear modelling to dissect the molecular processes in strawberry postharvest. Fruits from five cultivars were harvested and refrigerated during a...
Presentation
https://poissonisfish.com/2020/04/05/audio-classification-in-r/
Presentation
https://poissonisfish.com/2019/10/09/twitter-data-analysis-in-r/
Presentation
https://poissonisfish.com/2019/05/01/bayesian-models-in-r/
Presentation
https://poissonisfish.com/2018/11/16/the-all-new-caret-interface-in-r/
Article
Full-text available
High-throughput metabolomics technologies can quantify metabolite levels across various biological processes in different tissues, organs and species, allowing the identification of genes underpinning these complex traits. Information about changes of metabolites during strawberry development and ripening processes is key to a...
Presentation
https://poissonisfish.com/2018/07/08/convolutional-neural-networks-in-r/
Chapter
Bridging metabolomics with plant phenotypic responses is challenging. Multivariate analyses account for the existing dependencies among metabolites, and regression models in particular capture such dependencies in search for association with a given trait. However, special care should be undertaken with metabolomics data. Here we propose a modeling...
Article
Full-text available
Maize (Zea mays L.) is a staple food whose production relies on seed stocks that largely comprise hybrid varieties. Therefore, knowledge about the molecular determinants of hybrid performance (HP) in the field can be used to devise better performing hybrids to address the demands for sustainable increase in yield. Here, we propose and test a classi...
Data
Area under the curve (AUC) in predicting ‘good’ and ‘bad’ performers based on subsets of the ranked parental features. Hybrid field performance was encoded into two groups (i.e. ‘bad’ and ‘good’ performers) and predicted using support vector machines (SVM), each trained with different subsets of parental features, based on the median relative impor...
Data
Metabolic (log10-transformed metabolite intensities) and biomass data (dt ha-1) with field trial designation. (XLSX)
Data
Encoded analytes and distribution of mode of inheritance classes. Rows correspond to the 136 analytes, whereas columns correspond to the 332 crosses. Red, orange, yellow, blue and navy blue colors represent positive overdominance, positive dominance, additivity, negative dominance and negative overdominance, respectively. (XLSX)
Data
Model parameterization for prediction of hybrid performance. The table summarizes the parameters tuned for maximizing the cross-validated R2 (resp. accuracy, AUC) with SVR (resp. SVM). (XLSX)
Data
Accuracy in predicting ‘good’ and ‘bad’ performers based on subsets of the ranked parental features. Hybrid field performance was encoded into two groups (i.e. ‘bad’ and ‘good’ performers) and was predicted using support vector machines (SVM), each trained with different subsets of parental features, based on the median relative importance for pred...
Data
Test of robustness of different cutoff values for binning ‘bad’ and ‘good’ hybrids. Nine evenly-spaced threshold values of hybrid performance (HP), including the mean, were used to bin ‘bad’ and ‘good’ hybrids used for classification of HP. Accuracy (top) decreases as values spread away from the average value (i.e. 592 dt ha-1) and increases again...
Data
Effect of miP class balance threshold relaxation on the predictability of hybrid performance based on subsets of the ranked parental features. Hybrid field performance was predicted using support vector regression models (SVR), each trained with different subsets of parental features, based on the median relative importance for predicting the metab...
Data
Coefficient of variation (CV) and average Kappa per model and encoded analyte. Coefficient of variation (CV = standard deviation / mean) and average Kappa across the five independent test repetitions. The 41 encoded analytes are sorted by decreasing order of the average Kappa (AVE KAPPA). (XLSX)
Data
Schematic representation of the classification-driven framework. (a) Selected Dent × Flint hybrid genotypes (D × F) and the corresponding Dent (D) and Flint (F) inbred parental lines were germinated under controlled conditions. (b) The primary roots from the germinated plants were subjected to gas chromatography separation followed by mass spectrom...
Data
Cumulative relative frequencies of the Dent (maternal) and Flint (paternal) analytes along the ranking. The cumulative relative frequency of the Dent (maternal, red) analytes is systematically higher than that of the Flint (paternal, grey) analytes along the ranking (i.e. increasing values in the x-axis). (PDF)
Data
Model parameterization for classification of mIPs. The table summarizes the parameters tuned for maximizing the cross-validated Kappa with each of the seven methods. Except PLS-DA, all methods were tuned using two parameters each. (XLSX)
Article
Full-text available
Maize is the most produced cereal crop worldwide and its oil is a key energy resource. Improving maize oil quantity and quality calls for a better understanding of lipid metabolism. To predict the function of maize genes involved in lipid biosynthesis, we assembled transcriptomic and lipidomic data sets from leaves of B73 and the high-oil line By80...
Article
Full-text available
Primary metabolism plays pivotal roles in normal plant growth, development, and reproduction. As maize is a major crop worldwide, the primary metabolites produced by maize plants are of immense importance from calorific and nutritional perspectives. Here a genome-wide association study (GWAS) of 61 primary metabolites using a maize association pane...
Article
Full-text available
Metabolism is a key determinant of plant growth and modulates plant adaptive responses. Increased metabolic variation due to heterozygosity may be beneficial for highly homozygous plants if their progeny is to respond to sudden changes in the habitat. Here, we investigate the extent to which heterozygosity contributes to the variation in metabolism...
Presentation
https://poissonisfish.com/2017/12/11/linear-mixed-effect-models-in-r/
Presentation
https://poissonisfish.com/2017/10/09/genome-wide-association-studies-in-r/
Article
Full-text available
Heterosis has been extensively exploited for yield gain in maize (Zea mays L.). Here we conducted a comparative metabolomics-based analysis of young roots from in vitro germinating seedlings and from leaves of field-grown plants in a panel of inbred lines from the Dent and Flint heterotic patterns as well as selected F1 hybrids. We found that metab...
Article
Next generation genomics holds great potential in the study of plant phenotypic variation. With several crop reference genomes now available, the affordable costs of de novo genome assembly or target resequencing offer the opportunity to mine the enormous amount of genetic diversity hidden in crop wild relatives. Wide introgressions from these wild...
Presentation
The medicinal plant Catharanthus roseus accumulates in the leaves the anticancer terpenoid indole alkaloids (TIAs) vinblastine and vincristine, universally known as the Vinca alkaloids. These TIAs were the first natural anticancer products to be clinically used, and are still among the most valuable agents used in cancer chemotherapy. Hence, the TI...
Presentation
Vacuoles play an array of key roles in growth, development and environmental interactions in plants, including the accumulation of natural products that can thus exert their defence toxic effect without interfering with current cell physiology. However, many aspects of vacuole multifunctionality are still poorly understood, namely the involvement o...
