Eric F Lock

University of Minnesota Twin Cities | UMN · School of Public Health

PhD

79
Publications
14,572
Reads
1,449
Citations
Citations since 2016
48 Research Items
1278 Citations
[Chart: citations per year, 2016 to 2022; vertical axis 0 to 250]

Publications (79)
Article
Full-text available
Introduction: Drug development for neurodegenerative diseases such as Friedreich’s ataxia (FRDA) is limited by a lack of validated, sensitive biomarkers of pharmacodynamic response in affected tissue and disease progression. Studies employing neuroimaging measures to track FRDA have thus far been limited by their small sample sizes and limited follo...
Article
Full-text available
The effects of early-life iron deficiency anemia (IDA) extend past the blood and include both short- and long-term adverse effects on many tissues including the brain. Prior to IDA, iron deficiency (ID) can cause similar tissue effects, but a sensitive biomarker of iron-dependent brain health is lacking. To determine serum and CSF biomarkers of ID-...
Preprint
Full-text available
We develop a Bayesian approach to predict a continuous or binary outcome from data that are collected from multiple sources with a multi-way (i.e., multidimensional tensor) structure. As a motivating example we consider molecular data from multiple 'omics sources, each measured over multiple developmental time points, as predictors of early-life ir...
Article
Background: HIV is a risk factor for obstructive lung disease (OLD), independent of smoking. We used mass spectrometry (MS) approaches to identify metabolomic biomarkers that inform mechanistic pathogenesis of OLD in persons with HIV (PWH). Methods: We obtained bronchoalveolar lavage fluid (BALF) samples from 52 PWH, in case:control (+OLD/-OLD)...
Article
Modern data often take the form of a multiway array. However, most classification methods are designed for vectors, that is, one-way arrays. Distance weighted discrimination (DWD) is a popular high-dimensional classification method that has been extended to the multiway context, with dramatic improvements in performance when data have multiway stru...
Article
Full-text available
Background: Pan-omics, pan-cancer analysis has advanced our understanding of the molecular heterogeneity of cancer. However, such analyses have been limited in their ability to use information from multiple sources of data (e.g., omics platforms) and multiple sample sets (e.g., cancer types) to predict clinical outcomes. We address the issue of pred...
Article
Full-text available
While gut microbiome and host gene regulation independently contribute to gastrointestinal disorders, it is unclear how the two may interact to influence host pathophysiology. Here we developed a machine learning-based framework to jointly analyse paired host transcriptomic (n = 208) and gut microbiome (n = 208) profiles from colonic mucosal sample...
Article
Analyzing multi-source data, which are multiple views of data on the same subjects, has become increasingly common in molecular biomedical research. Recent methods have sought to uncover underlying structure and relationships within and/or between the data sources, and other methods have sought to build a predictive model for an outcome using all s...
Article
Distance weighted discrimination (DWD) is a linear discrimination method that is particularly well-suited for classification tasks with high-dimensional data. The DWD coefficients minimize an intuitive objective function, which can be solved efficiently using state-of-the-art optimization techniques. However, DWD has not yet been cast into a model-bas...
Article
Background: The effects of iron deficiency (ID) during infancy extend beyond the hematologic compartment and include short- and long-term adverse effects on many tissues including the brain. However, sensitive biomarkers of iron-dependent brain health are lacking in humans. Objective: To determine whether serum and CSF biomarkers of ID-induced m...
Article
Several modern applications require the integration of multiple large data matrices that have shared rows and/or columns. For example, cancer studies that integrate multiple omics platforms across multiple types of cancer, pan-omics pan-cancer analysis, have extended our knowledge of molecular heterogeneity beyond what was observed in single tumor...
Article
A popular method for estimating a causal treatment effect with observational data is the difference-in-differences model. In this work, we consider an extension of the classical difference-in-differences setting to the hierarchical context in which data cannot be matched at the most granular level. Our motivating example is an application to assess...
Preprint
Full-text available
Modern data often take the form of a multiway array. However, most classification methods are designed for vectors, i.e., 1-way arrays. Distance weighted discrimination (DWD) is a popular high-dimensional classification method that has been extended to the multiway context, with dramatic improvements in performance when data have multiway structure...
Preprint
Full-text available
While the gut microbiome and host gene regulation separately contribute to gastrointestinal disorders, it is unclear how the two may interact to influence host pathophysiology. Here, we developed a machine learning-based framework to jointly analyze host transcriptomic and microbiome profiles from 416 colonic mucosal samples of patients with colore...
Article
We introduce a Bayesian nonparametric regression model for data with multiway (tensor) structure, motivated by an application to periodontal disease (PD) data. Our outcome is the number of diseased sites measured over four different tooth types for each subject, with subject‐specific covariates available as predictors. The outcomes are not well‐cha...
Article
Background: The effects of infantile iron deficiency anemia (IDA) extend beyond hematological indices and include short- and long-term adverse effects on multiple cells and tissues. IDA is associated with an abnormal serum metabolomic profile, characterized by altered hepatic metabolism, lowered NAD flux, increased nucleoside levels, and a reducti...
Preprint
Pan-omics, pan-cancer analysis has advanced our understanding of the molecular heterogeneity of cancer, expanding what was known from single-cancer or single-omics studies. However, pan-cancer, pan-omics analyses have been limited in their ability to use information from multiple sources of data (e.g., omics platforms) and multiple sample sets (e.g...
Preprint
Full-text available
Analyzing multi-source data, which are multiple views of data on the same subjects, has become increasingly common in molecular biomedical research. Recent methods have sought to uncover underlying structure and relationships within and/or between the data sources, and other methods have sought to build a predictive model for an outcome using all s...
Article
Background: Pre-anemic iron deficiency (ID) and ID anemia in infancy are associated with long-term effects on brain development. Iron is an essential micronutrient needed for multiple neurodevelopmental processes, including energy production, myelination and neurotransmission. Serum and cerebrospinal fluid (CSF) metabolomic and proteomic analyses ca...
Preprint
Distance weighted discrimination (DWD) is a linear discrimination method that is particularly well-suited for classification tasks with high-dimensional data. The DWD coefficients minimize an intuitive objective function, which can be solved very efficiently using state-of-the-art optimization techniques. However, DWD has not yet been cast into a mode...
Article
Full-text available
Longitudinal processes rarely occur in isolation; often the growth curves of two or more variables are interdependent. Moreover, growth curves rarely exhibit a constant pattern of change. Many educational and psychological phenomena are comprised of different developmental phases (segments). Bivariate piecewise linear mixed-effects models (BPLMEM)...
Article
Full-text available
In Idiopathic Pulmonary Fibrosis (IPF), there is unrelenting scarring of the lung mediated by pathological mesenchymal progenitor cells (MPCs) that manifest autonomous fibrogenicity in xenograft models. To determine where along their differentiation trajectory IPF MPCs acquire fibrogenic properties, we analyzed the transcriptome of 335 MPCs isolate...
Article
Objectives: To determine whether rapid correction of iron deficiency using intramuscular iron dextran normalizes serum metabolomic changes in a nonhuman primate model of iron deficiency anemia (IDA). Methods: Blood was collected from naturally iron-sufficient (IS; n = 10) and IDA (n = 12) male and female infant rhesus monkeys (Macaca mulatta) at 6 m...
Article
Full-text available
We built a novel Bayesian hierarchical survival model based on the somatic mutation profile of patients across 50 genes and 27 cancer types. The pan-cancer quality allows for the model to “borrow” information across cancer types, motivated by the assumption that similar mutation profiles may have similar (but not necessarily identical) effects on s...
Preprint
Several modern applications require the integration of multiple large data matrices that have shared rows and/or columns. For example, cancer studies that integrate multiple omics platforms across multiple types of cancer, pan-omics pan-cancer analysis, have extended our knowledge of molecular heterogeneity beyond what was observed in single tumor a...
Article
Full-text available
Motivation: The flexibility of a Bayesian framework is promising for GWAS, but current approaches can benefit from more informative prior models. We introduce a novel Bayesian approach to GWAS, called Structured and Non-Local Priors (SNLPs) GWAS, that improves over existing methods in two important ways. First, we describe a model that allows for...
Article
Background: Iron deficiency is the most common nutrient deficiency in human infants aged 6 to 24 mo, and negatively affects many cellular metabolic processes, including energy production, electron transport, and oxidative degradation of toxins. There can be persistent influences on long-term metabolic health beyond its acute effects. Objectives:...
Preprint
A popular method for estimating a causal treatment effect with observational data is the difference-in-differences (DiD) model. In this work, we consider an extension of the classical DiD setting to the hierarchical context in which data cannot be matched at the most granular level (e.g., individual-level differences are unobservable). We propose a...
Preprint
Full-text available
We built a novel Bayesian hierarchical survival model based on the somatic mutation profile of patients across 50 genes and 27 cancer types. The pan-cancer quality allows for the model to "borrow" information across cancer types, motivated by the assumption that similar mutation profiles may have similar (but not necessarily identical) effects on s...
Article
Advances in molecular “omics” technologies have motivated new methodology for the integration of multiple sources of high‐content biomedical data. However, most statistical methods for integrating multiple data matrices only consider data shared vertically (one cohort on multiple platforms) or horizontally (different cohorts on a single platform)....
Preprint
Full-text available
Advances in molecular "omics" technologies have motivated new methodology for the integration of multiple sources of high-content biomedical data. However, most statistical methods for integrating multiple data matrices only consider data shared vertically (one cohort on multiple platforms) or horizontally (different cohorts on a single platform)....
Article
Full-text available
Although national measures of the quality of diabetes care delivery demonstrate improvement, progress has been slow. In 2008, the Minnesota legislature endorsed the patient-centered medical home (PCMH) as the preferred model for primary care redesign. In this work, we investigate the effect of PCMH-related clinic redesign and resources on diabetes...
Article
Purpose: Preterm infants <29 weeks of gestation are at risk for severe intraventricular hemorrhage (IVH). Lower gestational age, lower birth weight, and severe illness, as indexed by a higher Score for Neonatal Acute Physiology - Perinatal Extension II (SNAPPE-II), are associated with severe IVH. The role of coagulation abnormalities on the first day after birt...
Preprint
We introduce a Bayesian nonparametric regression model for data with multiway (tensor) structure, motivated by an application to periodontal disease (PD) data. Our outcome is the number of diseased sites measured over four different tooth types for each subject, with subject-specific covariates available as predictors. The outcomes are not well-cha...
Article
High-dimensional multi-source data are encountered in many fields. Despite recent developments on the integrative dimension reduction of such data, most existing methods cannot easily accommodate data of multiple types (e.g. binary or count-valued). Moreover, multi-source data often have block-wise missing structure, i.e. data in one or more source...
Article
Full-text available
Piecewise growth mixture models (PGMM) are a flexible and useful class of methods for analyzing segmented trends in individual growth trajectory over time, where the individuals come from a mixture of two or more latent classes. These models allow each segment of the overall developmental process within each class to have a different functional for...
Preprint
Full-text available
High-dimensional multi-source data are encountered in many fields. Despite recent developments on the integrative dimension reduction of such data, most existing methods cannot easily accommodate data of multiple types (e.g., binary or count-valued). Moreover, multi-source data often have block-wise missing structure, i.e., data in one or more sour...
Article
Full-text available
Introduction: Chronic obstructive pulmonary disease (COPD) is a known risk factor for developing lung cancer but the underlying mechanisms remain unknown. We hypothesize that the COPD stroma contains molecular mechanisms supporting tumorigenesis. Materials/methods: We conducted an unbiased multi-omic analysis to identify gene expression patterns...
Article
Full-text available
We describe a probabilistic PARAFAC/CANDECOMP (CP) factorization for multiway (i.e., tensor) data that incorporates auxiliary covariates, SupCP. SupCP generalizes the supervised singular value decomposition (SupSVD) for vector-valued observations, to allow for observations that have the form of a matrix or higher-order array. Such data are increasi...
Preprint
Full-text available
Sea turtles are a keystone species and are highly sensitive to changes in their environment, making them excellent environmental indicators. In light of environmental and climate changes, species are increasingly threatened by pollution, changes in ocean health, habitat alteration, and plastic ingestion. There may be additional health related threa...
Article
Full-text available
Several recent methods address the dimension reduction and decomposition of linked high-content data matrices. Typically, these methods consider one dimension, rows or columns, that is shared among the matrices. This shared dimension may represent common features measured for different sample sets (horizontal integration) or a common sample set wit...
Article
Full-text available
Predictive modeling from high-dimensional genomic data is often preceded by a dimension reduction step, such as principal components analysis (PCA). However, the application of PCA is not straightforward for multi-source data, wherein multiple sources of 'omics data measure different but related biological components. In this article we utilize rec...
Article
Full-text available
We propose a framework for the linear prediction of a multi-way array (i.e., a tensor) from another multi-way array of arbitrary dimension, using the contracted tensor product. This framework generalizes several existing approaches, including methods to predict a scalar outcome from a tensor, a matrix from a matrix, or a tensor from a scalar. We de...
Article
Objectives: Iron deficiency (ID) anemia leads to long-term neurodevelopmental deficits by altering iron-dependent brain metabolism. The objective of the study was to determine if ID induces metabolomic abnormalities in the cerebrospinal fluid (CSF) in the pre-anemic stage and to ascertain the aspects of abnormal brain metabolism affected. Methods...
Article
Full-text available
High-dimensional linear classifiers, such as the support vector machine (SVM) and distance weighted discrimination (DWD), are commonly used in biomedical research to distinguish groups of subjects based on a large number of features. However, their use is limited to applications where a single vector of features is measured for each subject. In pra...
Article
Full-text available
The integrative analysis of multiple high-throughput data sources that are available for a common sample set is an increasingly common goal in biomedical research. JIVE is a tool for exploratory dimension reduction that decomposes a multi-source dataset into three terms: a low-rank approximation capturing joint variation across sources, low-rank ap...
Article
Full-text available
High-throughput genetic and epigenetic data are often screened for associations with an observed phenotype. For example, one may wish to test hundreds of thousands of genetic variants, or DNA methylation sites, for an association with disease status. These genomic variables can naturally be grouped by the gene they encode, among other criteria. How...
Code
A reproducible, documented R workflow and code for the analysis of Eastern Pacific hawksbill sea turtle hematology and biochemistry reference ranges, and of mean values for whole blood heavy metals.
Chapter
Full-text available
In biomedical research, a growing number of measurement platforms and technologies are being used to assess diverse but related information on a set of common samples. This motivates integrative methods for multisource data, in which multiple data sets are derived from a common set of objects. This chapter addresses exploratory methods for multisou...
Article
Full-text available
This article concerns testing for equality of distribution between groups. We focus on screening variables with shared distributional features such as common support, modes and patterns of skewness. We propose a Bayesian testing method using kernel mixtures, which improves performance by borrowing information across the different variables and grou...
Article
Full-text available
Metabolic profiling is increasingly being used for understanding biological processes but there is no single analytical technique that provides a complete quantitative or qualitative profiling of the metabolome. Data fusion (i.e. joint analysis of data from multiple sources) has the potential to circumvent this issue facilitating knowledge discover...
Article
Full-text available
Background: Expression quantitative trait loci (eQTL) play an important role in the regulation of gene expression. Gene expression levels and eQTLs are expected to vary from tissue to tissue, and therefore multi-tissue analyses are necessary to fully understand complex genetic conditions in humans. Dura mater tissue likely interacts with cranial bon...
Article
Background: HDAC inhibitors (HDACi) are being investigated as treatment for relapsed/refractory non-Hodgkin lymphoma (NHL) and other cancers. However, the mechanisms underlying sensitivity and resistance to HDAC inhibition in lymphomas have not been fully characterized. We probed the cellular and molecular response to HDACi in vitro and in vivo in...
Article
Full-text available
Background: Chiari Type I Malformation (CMI) is characterized by herniation of the cerebellar tonsils through the foramen magnum at the base of the skull, resulting in significant neurologic morbidity. As CMI patients display a high degree of clinical variability and multiple mechanisms have been proposed for tonsillar herniation, it is hypothesized...
Article
Full-text available
In this study, we define the genetic landscape of mantle cell lymphoma (MCL) through exome sequencing of 56 cases of MCL. We identified recurrent mutations in ATM, CCND1, MLL2, and TP53. We further identified a number of novel genes recurrently mutated in patients with MCL including RB1, WHSC1, POT1, and SMARCA4. We noted that MCLs have a distinct...
Conference Paper
Background: HDAC inhibitors (HDACi) are being investigated as treatment for relapsed/refractory non-Hodgkin lymphoma (NHL) and other cancers. However, the mechanisms underlying sensitivity and resistance to HDAC inhibition in lymphomas have not been fully characterized. We probed the cellular and molecular response to HDACi in vitro and in vivo in...
Article
Mantle cell lymphoma is an uncommon form of non-Hodgkin lymphoma that is characterized by poor responsiveness to chemotherapy and a high rate of mortality. While translocation of CCND1 is a defining feature of the disease, the role of collaborating somatic mutations that contribute to mantle cell lymphoma remains to be better defined. In this study...
Article
Full-text available
We propose a nonparametric Bayes test for equality of distribution between groups. The group-specific distributions are characterized as finite mixtures with common mixture components but potentially different weights. Including common components allows borrowing of information across groups, and leads to a simple and efficient approach for calcula...
Article
Full-text available
In biomedical research a growing number of platforms and technologies are used to measure diverse but related information, and the task of clustering a set of objects based on multiple sources of data arises in several applications. Most current approaches to multi-source clustering either independently determine a separate clustering for each data...
Article
Full-text available
Research in several fields now requires the analysis of datasets in which multiple high-dimensional types of data are available for a common set of objects. In particular, The Cancer Genome Atlas (TCGA) includes data from several diverse genomic technologies on the same cancerous tumor samples. In this paper we introduce Joint and Individual Variat...