Eran Elhaik

The University of Sheffield | Sheffield · Department of Animal and Plant Sciences

Ph.D.

About

120
Publications
52,104
Reads
10,638
Citations
Introduction
All my publications are available on my website: http://www.eranelhaiklab.org/. Available positions are listed there under Students.
Additional affiliations
January 2014 - present
The University of Sheffield
Position
  • Lecturer
July 2013 - present
Johns Hopkins University
Position
  • Research Associate
January 2009 - June 2013
Johns Hopkins Bloomberg School of Public Health
Position
  • PostDoc Position

Publications (120)
Article
Full-text available
Amyotrophic lateral sclerosis (ALS) is a complex disease that leads to motor neuron death. Despite heritability estimates of 52%, genome-wide association studies (GWASs) have discovered relatively few loci. We developed a machine learning approach called RefMap, which integrates functional genomics with GWAS summary statistics for gene discovery. W...
Article
Full-text available
Spina bifida (SB) is a debilitating birth defect caused by multiple gene and environment interactions. Though SB shows non-Mendelian inheritance, genetic factors contribute to an estimated 70% of cases. Nevertheless, identifying human mutations conferring SB risk is challenging due to its relative rarity, genetic heterogeneity, incomplete penetranc...
Article
Purpose: Multiple myeloma (MM) is a plasma cell (PC) malignancy with an increasing incidence in the United States. Fluorescence in situ hybridization (FISH) is currently the gold-standard diagnostic assay to detect recurrent genomic abnormalities of prognostic and therapeutic significance in MM. Hyperdiploid MM, a standard risk abnormality, is char...
Preprint
Spina bifida (SB) is a debilitating birth defect caused by multiple gene and environment interactions. Though SB shows non-Mendelian inheritance, genetic factors contribute to an estimated 70% of cases. Nevertheless, identifying human mutations conferring SB risk is challenging due to its relative rarity, genetic heterogeneity, incomplete penetranc...
Article
Full-text available
We present a global atlas of 4,728 metagenomic samples from mass-transit systems in 60 cities over 3 years, representing the first systematic, worldwide catalog of the urban microbial ecosystem. This atlas provides an annotated, geospatial profile of microbial strains, functional characteristics, antimicrobial resistance (AMR) markers, and genetic...
Article
Full-text available
Recent advances in metagenomic technology and computational prediction may inadvertently weaken an individual's reasonable expectation of privacy. Through cross-kingdom genetic and metagenomic forensics, we can already predict at least a dozen human phenotypes with varying degrees of accuracy. There is also growing potential to detect a "molecular...
Article
Full-text available
The past years have seen the rise of genomic biobanks and mega-scale meta-analysis of genomic data, which promises to reveal the genetic underpinnings of health and disease. However, the over-representation of Europeans in genomic studies not only limits the global understanding of disease risk but also inhibits viable research into the genomic dif...
Preprint
Full-text available
Principal Component Analysis (PCA) is a multivariate analysis that allows reduction of the complexity of datasets while preserving data's covariance and visualizing the information on colorful scatterplots, ideally with only a minimal loss of information. PCA applications are extensively used as the foremost analyses in population genetics and rela...
Article
Full-text available
In the last 15 years or so, soft selective sweep mechanisms have been catapulted from a curiosity of little evolutionary importance to a ubiquitous mechanism claimed to explain most adaptive evolution and, in some cases, most evolution. This transformation was aided by a series of articles by Daniel Schrider and Andrew Kern. Within this series, a p...
Article
Full-text available
Sudden infant death syndrome (SIDS) is the unexpected death of an infant under one year of age that remains unexplained after a thorough investigation. Despite SIDS remaining a diagnosis of exclusion with an unexplained etiology, it is widely accepted that SIDS can be caused by environmental and/or biological factors, with multiple underlying candi...
Article
Full-text available
(Cell Reports 33, 108456-1–108456-8.e1–e5; December 1, 2020) In the originally published version of this article, Eran Elhaik's name was spelled incorrectly in the author list. The corrected author list appears here and with the article online. The authors regret the error.
Preprint
Supervised machine learning (SML) is a powerful method for predicting a small number of well-defined output groups (e.g., potential buyers of a certain product) by taking as input a large number of known well-defined measurements (e.g., past purchases, income, ethnicity, gender, credit record, age, favorite color, favorite chewing gum). SML is pred...
Article
Full-text available
The rise of microbiomics and metagenomics has been driven by advances in genomic sequencing technology, improved microbial sampling methods, and fast-evolving approaches in bioinformatics. Humans are a host to diverse microbial communities in and on their bodies, which continuously interact with and alter the surrounding environments. Since informa...
Article
Full-text available
Ancient Y-Chromosomal DNA is an invaluable tool for dating and discerning the origins of migration routes and demographic processes that occurred thousands of years ago. Driven by the adoption of high-throughput sequencing and capture enrichment methods in paleogenomics, the number of published ancient genomes has nearly quadrupled within the last...
Article
SARS-coronavirus 2 (SARS-CoV-2) has rapidly caused a global pandemic associated with a novel respiratory infection now termed coronavirus disease-19 (COVID-19). ACE2 is necessary to facilitate SARS-CoV-2 infection, but due to its essential metabolic roles, it may be difficult to target it in therapies. TMPRSS2, which interacts with ACE2, may be a b...
Preprint
The past years saw the rise of genomic biobanks and mega-scale meta-analysis of genomic data that promise to reveal the genetic underpinnings of health and disease. However, the over-representation of Europeans in genomic studies not only limits the global understanding of disease risk and intervention efficacy, but also inhibits viable research into...
Preprint
Full-text available
Advancements in DNA methods and biotechnology have enabled forensic scientists to explore the DNA evidence found as part of a criminal investigation on a much more comprehensive and predictive level. This has led to a rise in research into DNA intelligence tools such as phenotypic prediction (i.e., eye and hair colour) and inference of biogeographi...
Preprint
Full-text available
Recently, Mikheyev et al. (2019) have produced a preprint study describing the genomes of nine Khazars archeologically dated from the 7th to the 9th centuries found in the Rostov county in modern-day Russia. Skull morphology indicated a mix of "Caucasoid" and "Mongoloid" shapes. The authors compared the samples to ancient and contemporary samples t...
Article
Purpose: Multiple myeloma (MM) is a plasma cell (PC) malignancy with an increasing incidence in the US. Epidemiological studies demonstrate a 2-3 fold higher incidence of the pre-malignant monoclonal gammopathy of undetermined significance (MGUS) and MM with a ~4-year younger age of onset among African Americans (AA) compared to European Americans...
Preprint
Full-text available
Radiocarbon dating is the gold standard in archaeology for estimating the age of skeletons, a key to studying their origins. Nearly half of all published ancient human genomes lack reliable and direct dates, which results in obscure and contradictory reports. Here, we developed the Temporal Population Structure (TPS), the first DNA-based dating method...
Article
Full-text available
Gastric cancer (GC) is the fifth most common type of cancer worldwide, with high incidence in Asian, Central American, and South American countries. This patchy distribution means that GC studies are neglected by large research centers from developed countries. The need for further understanding of this complex disease, including the local importance of epid...
Preprint
Full-text available
Although studies have shown that urban environments and mass-transit systems have geospatially distinct metagenomes, no study has ever systematically studied these dense, human/microbial ecosystems around the world. To address this gap in knowledge, we created a global metagenomic and antimicrobial resistance (AMR) atlas of urban mass transit syste...
Preprint
Full-text available
The Glucocorticoid Receptor (GR) co-ordinates metabolic and behavioural responses to stressors. We hypothesised that GR influences behaviour by modulating specific epigenetic and transcriptional processes in the brain. Using the zebrafish as a model organism, the brain methylomes of wild-type and gr s357 mutant adults were analysed and GR-sensitive...
Article
Full-text available
The rapid accumulation of ancient human genomes from various areas and time periods potentially enables the expansion of studies of biodiversity, biogeography, forensics, population history, and epidemiology into past populations. However, most ancient DNA (aDNA) data were generated through microarrays designed for modern-day populations, which are...
Article
Full-text available
Monoclonal gammopathies, including multiple myeloma (MM), represent a group of plasma cell (PC) disorders that comprise mostly incurable hematopoietic malignancies with an increasing incidence in the US. Previous epidemiological studies demonstrated a 2-3 fold higher incidence of monoclonal gammopathy of undetermined significance (MGUS) and a si...
Article
Full-text available
Motivation: In clinical trials, individuals are matched using demographic criteria, paired, and then randomly assigned to treatment and control groups to determine a drug's efficacy. A chief cause for the irreproducibility of results across pilot to Phase III trials is population stratification bias caused by the uneven distribution of ancestries...
Article
Full-text available
Multiple myeloma (MM) is two- to three-fold more common in African Americans (AAs) than in European Americans (EAs). This striking disparity, one of the highest of any cancer, may be due to differences in underlying genetic predisposition between these groups. There are multiple unique cytogenetic subtypes of MM, and it is likely that the disparity is associa...
Preprint
Full-text available
Sudden Infant Death Syndrome (SIDS) is the most common cause of postneonatal infant death. The allostatic load hypothesis posits that SIDS is the result of perinatal cumulative painful, stressful, or traumatic exposures that tax neonatal regulatory systems. To test it, we explored the relationships between SIDS and two common stressors, male neonat...
Article
Sudden Infant Death Syndrome (SIDS) is the most common cause of postneonatal infant death. The allostatic load hypothesis posits that SIDS is the result of perinatal cumulative painful, stressful, or traumatic exposures that tax neonatal regulatory systems. To test it, we explored the relationships between SIDS and two common stressors, male neonat...
Preprint
Full-text available
The rapid accumulation of ancient human genomes from various places and time periods, mainly from the past 15,000 years, allows us to probe the past with an unparalleled accuracy and reconstruct trends in human biodiversity. Alongside providing novel insights into the population history, population structure permits correcting for population strati...
Article
Full-text available
Background: Sudden infant death syndrome (SIDS) is the most common cause of postneonatal unexplained infant death. The allostatic load hypothesis posits that SIDS is the result of cumulative perinatal painful, stressful, or traumatic exposures that tax neonatal regulatory systems. Aims: To test the predictions of the allostatic load hypothesis we ex...
Article
Full-text available
The public commonly associates microorganisms with pathogens. This suspicion of microorganisms is understandable, as historically microorganisms have killed more humans than any other agent while remaining largely unknown until the late seventeenth century with the works of van Leeuwenhoek and Kircher. Despite our improved understanding regarding m...
Article
Full-text available
The human population displays wide variety in demographic history, ancestry, content of DNA derived from hominins or ancient populations, adaptation, traits, copy number variations (CNVs), drug response, and more. These polymorphisms are of broad interest to population geneticists, forensics investigators, and medical professionals. Historically, mu...
Article
Full-text available
Recently, the geographical origins of Ashkenazic Jews (AJs) and their native language Yiddish were investigated by applying the Geographic Population Structure (GPS) to a cohort of exclusively Yiddish-speaking and multilingual AJs. GPS localized most AJs along major ancient trade routes in northeastern Turkey adjacent to primeval villages with name...
Preprint
Full-text available
In clinical trials, individuals are matched for demographic criteria, paired, and then randomly assigned to treatment and control groups to determine a drug’s efficacy. The successful completion of pilot trials is a prerequisite to larger and more expensive Phase III trials. One of the chief causes for the irreproducibility of results across pilot...
Article
Full-text available
The Druze are an aggregate of communities in the Levant and Near East living almost exclusively in the mountains of Syria, Lebanon and Israel whose ~1,000-year-old religion formally opposes mixed marriages and conversions. Despite increasing interest in the genetics of the population structure of the Druze, their population history remains unknown. We i...
Article
Full-text available
Nature Communications 5: Article number: 3513, 10.1038/ncomms4513 (2014); Published: 29 April 2016; Updated: 31 October 2016. This article was published without any competing financial interests statement.
Article
Full-text available
Sudden infant death syndrome (SIDS) is the leading cause of death among US infants under 1 year of age, accounting for ~2,700 deaths per year. Although formally SIDS dates back at least 2,000 years and was even mentioned in the Hebrew Bible (Kings 3:19), its etiology remains unexplained, prompting the CDC to initiate a sudden unexpected infant death...
Article
Full-text available
The term 'ancient DNA' (aDNA) is coming of age, with over 1,200 hits in the PubMed database, beginning in the early 1980s with the studies of 'molecular paleontology'. Rooted in cloning and limited sequencing of DNA from ancient remains during the pre-PCR era, the field has made incredible progress since the introduction of PCR and next-generation...
Article
Full-text available
The debate as to whether Jewishness is a biological trait inherited from an “authentic” “Jewish type” (jüdische Typus) ancestor or a system of beliefs has been raging for over two centuries. While the accumulated biological and anthropological evidence supports the latter argument, recent genetic findings, bolstered by the direct-to-consumer genetic...
Article
Full-text available
Recently, we investigated the geographical origins of Ashkenazic Jews (AJs) and their native language Yiddish by applying a biogeographical tool, the Geographic Population Structure (GPS), to a cohort of 367 exclusively Yiddish-speaking and multilingual AJs genotyped on the Genochip microarray. GPS localized most AJs along major ancient trade route...
Article
Full-text available
The Metagenomics and Metadesign of the Subways and Urban Biomes (MetaSUB) International Consortium is a novel, interdisciplinary initiative comprised of experts across many fields, including genomics, data analysis, engineering, public health, and architecture. The ultimate goal of the MetaSUB Consortium is to improve city utilization and planning...
Article
Full-text available
The Yiddish language is over one thousand years old and incorporates German, Slavic, and Hebrew elements. The prevalent view claims Yiddish has a German origin, whereas the opposing view posits a Slavic origin with strong Iranian and weak Turkic substrata. One of the major difficulties in deciding between these hypotheses is the unknown geographica...
Article
Full-text available
In general, community similarity is thought to decay with distance; however, this view may be complicated by the relative roles of different ecological processes at different geographical scales, and by the compositional perspective (e.g. species, functional group and phylogenetic lineage) used. Coastal salt marshes are widely distributed worldwide...
Conference Paper
Full-text available
The Yiddish language is a curious amalgam of Hebrew, German, and Slavonic. It is written in Hebrew characters and spoken primarily by European Jews of Central and Eastern Europe. Due to its mixed nature, it is very difficult to trace the origin of this language using traditional linguistic approaches. Given the close association between languages,...
Article
Full-text available
For the past four decades the compositional organization of the mammalian genome posed a formidable challenge to molecular evolutionists attempting to explain it from an evolutionary perspective. Unfortunately, most of the explanations adhered to the "isochore theory," which has long been rebutted. Recently, an alternative compositional domain mode...
Article
Full-text available
Earlier this year, we published a scathing critique of a paper by Mendez et al. (2013) in which the claim was made that a Y chromosome was 237,000-581,000 years old. Elhaik et al. (2014) also attacked a popular article in Scientific American by the senior author of Mendez et al. (2013), whose title was "Sex with other human species might have been...
Article
Full-text available
The search for a method that utilizes biological information to predict humans' place of origin has occupied scientists for millennia. Over the past four decades, scientists have employed genetic data in an effort to achieve this goal but with limited success. While biogeographical algorithms using next-generation sequencing data have achieved an a...
Article
Full-text available
The search for a method that utilizes biological information to predict humans’ place of origin has occupied scientists for millennia. Over the past four decades, scientists have employed genetic data in an effort to achieve this goal but with limited success. While biogeographical algorithms using next-generation sequencing data have achieved an a...
Article
Full-text available
The first generation of genome sequence assemblies and annotations have had a significant impact upon our understanding of the biology of the sequenced species, the phylogenetic relationships among species, the study of populations within and across species, and have informed the biology of humans. As only a few Metazoan genomes are approaching fin...
Article
Full-text available
Mendez and colleagues reported the identification of a Y chromosome haplotype (the A00 lineage) that lies at the basal position of the Y chromosome phylogenetic tree. Incorporating this haplotype, the authors estimated the time to the most recent common ancestor (TMRCA) for the Y tree to be 338 000 years ago (95% CI=237 000-581 000). Such an extrao...