ABSTRACT: New findings:
What is the topic of this review? Tibetans have genetic adaptations that are hypothesized to underlie the distinct set of traits they exhibit at altitude. What advances does it highlight? Several adaptive signatures in the same genomic regions have been identified among Tibetan populations resident throughout the Qinghai-Tibetan Plateau. Many highland Tibetans exhibit a haemoglobin concentration within the range expected at sea level, and this trait is associated with putatively adaptive regions harbouring the hypoxia-inducible factor pathway genes EGLN1, EPAS1 and PPARA. Precise functional variants at adaptive loci and relationships to physiological traits, beyond haemoglobin concentration, are currently being examined in this population. Some native Tibetan, Andean and Ethiopian populations have lived at altitudes ranging from 3000 to >4000 m above sea level for hundreds of generations and exhibit distinct combinations of traits at altitude. It was long hypothesized that genetic factors contribute to adaptive differences in these populations, and recent advances in genomics provide evidence that some of the strongest signatures of positive selection in humans are those identified in Tibetans. Many of the top adaptive genomic regions highlighted thus far harbour genes related to hypoxia sensing and response. Putatively adaptive copies of three hypoxia-inducible factor pathway genes, EPAS1, EGLN1 and PPARA, are associated with sea-level range, rather than elevated, haemoglobin concentration observed in many Tibetans at high altitude, and recent studies provide insight into some of the precise adaptive variants, timing of adaptive events and functional roles. While several studies in highland Tibetans have converged on a few hypoxia-inducible factor pathway genes, additional candidates have been reported in independent studies of Tibetans located throughout the Qinghai-Tibetan Plateau. 
Various aspects of adaptive significance have yet to be identified, integrated, and fully explored. Given the rapid technological advances and interdisciplinary efforts in genomics, physiology and molecular biology, careful examination of Tibetans and comparisons with other distinctively adapted highland populations will provide valuable insight into evolutionary processes and models for both basic and clinical research.
ABSTRACT: The 1000 Genomes Project set out to provide a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations. Here we report completion of the project, having reconstructed the genomes of 2,504 individuals from 26 populations using a combination of low-coverage whole-genome sequencing, deep exome sequencing, and dense microarray genotyping. We characterized a broad spectrum of genetic variation, in total over 88 million variants (84.7 million single nucleotide polymorphisms (SNPs), 3.6 million short insertions/deletions (indels), and 60,000 structural variants), all phased onto high-quality haplotypes. This resource includes >99% of SNP variants with a frequency of >1% for a variety of ancestries. We describe the distribution of genetic variation across the global sample, and discuss the implications for common disease studies.
ABSTRACT: Most isolated congenital heart defects are thought to be sporadic and are often ascribed to multifactorial mechanisms with poorly understood genetics. Total Anomalous Pulmonary Venous Return (TAPVR) occurs in 1 in 15,000 live-born infants, either in isolation or as part of a syndrome involving aberrant left-right development. Previously, we reported causative links between TAPVR and the PDGFRA gene. TAPVR has also been linked to the ANKRD1/CARP genes. However, these genes only explain a small fraction of the heritability of the condition. By examination of phased single nucleotide polymorphism genotype data from 5 distantly related TAPVR patients we identified a single 25 cM Identical by Descent genomic segment on the short arm of chromosome 12 shared by 3 of the patients and their obligate-carrier parents. Whole genome sequence (WGS) analysis identified a non-synonymous variant within the shared segment in the retinol binding protein 5 (RBP5) gene. The RBP5 variant is predicted to be deleterious and is overrepresented in the TAPVR population. Gene expression and functional analysis of the zebrafish orthologue, rbp7, support the notion that RBP5 is a TAPVR susceptibility gene. Additional sequence analysis also uncovered deleterious variants in genes associated with retinoic acid signaling, including NODAL and retinol dehydrogenase 10. These data indicate that genetic variation in the retinoic acid signaling pathway confers, in part, susceptibility to TAPVR.
PLoS ONE 06/2015; 10(6):e0131514. DOI:10.1371/journal.pone.0131514 · 3.23 Impact Factor
ABSTRACT: Mutations in ATP1A3 cause Alternating Hemiplegia of Childhood (AHC) by disrupting function of the neuronal Na+/K+ ATPase. Published studies to date indicate 2 recurrent mutations, D801N and E815K, and a more severe phenotype in the E815K cohort. We performed mutation analysis and retrospective genotype-phenotype correlations in all eligible patients with AHC enrolled in the US AHC Foundation registry from 1997-2012. Clinical data were abstracted from standardized caregivers' questionnaires and medical records and confirmed by expert clinicians. We identified ATP1A3 mutations by Sanger and whole genome sequencing, and compared phenotypes within and between 4 groups of subjects: those with D801N, E815K, other ATP1A3 or no ATP1A3 mutations. We identified heterozygous ATP1A3 mutations in 154 of 187 (82%) AHC patients. Of 34 unique mutations, 31 (91%) are missense, and 16 (47%) had not been previously reported. Concordant with prior studies, more than 2/3 of all mutations are clustered in exons 17 and 18. Of 143 simplex occurrences, 58 had D801N (40%), 38 had E815K (26%) and 11 had G937R (8%) mutations. Patients with an E815K mutation demonstrate an earlier age of onset, more severe motor impairment and a higher prevalence of status epilepticus. This study further expands the number and spectrum of ATP1A3 mutations associated with AHC and confirms a more deleterious effect of the E815K mutation on selected neurologic outcomes. However, the complexity of the disorder and the extensive phenotypic variability among subgroups merit caution and emphasize the need for further studies.
PLoS ONE 05/2015; 10(5):e0127045. DOI:10.1371/journal.pone.0127045 · 3.23 Impact Factor
ABSTRACT: Preterm birth (PTB), defined as birth prior to a gestational age (GA) of 37 completed weeks, affects more than 10 % of births worldwide. PTB is the leading cause of neonatal mortality and is associated with a broad spectrum of lifelong morbidity in survivors. The etiology of spontaneous PTB (SPTB) is complex and has an important genetic component. Previous studies have compared monozygotic and dizygotic twin mothers and their families to estimate the heritability of SPTB, but these approaches cannot separate the relative contributions of the maternal and the fetal genomes to GA or SPTB. Using the Utah Population Database, we assessed the heritability of GA in more than 2 million post-1945 Utah births, the largest familial GA dataset ever assembled. We estimated a narrow-sense heritability of 13.3 % for GA and a broad-sense heritability of 24.5 %. A maternal effect (which includes the effect of the maternal genome) accounts for 15.2 % of the variance of GA, and the remaining 60.3 % is contributed by individual environmental effects. Given the relatively low heritability of GA and SPTB in the general population, multiplex SPTB pedigrees are likely to provide more power for gene detection than will samples of unrelated individuals. Furthermore, nongenetic factors provide important targets for therapeutic intervention.
Human Genetics 04/2015; 134(7). DOI:10.1007/s00439-015-1558-1 · 4.82 Impact Factor
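The variance figures quoted in the gestational-age abstract above can be checked with a few lines of arithmetic. The sketch below is only an illustration: it assumes the textbook reading that the non-additive genetic share is the broad-sense minus the narrow-sense heritability, which may not match the authors' exact model, and the function names are hypothetical.

```python
# Variance components for gestational age (GA), in percent, as quoted in the abstract.
NARROW_SENSE = 13.3   # additive genetic variance
BROAD_SENSE = 24.5    # total genetic variance
MATERNAL = 15.2       # maternal effect
ENVIRONMENT = 60.3    # individual environmental effects


def non_additive_share(broad, narrow):
    """Genetic variance not captured by additive effects (assumed: broad minus narrow)."""
    return round(broad - narrow, 1)


def total_share(broad, maternal, environment):
    """Sum of the broad-sense genetic, maternal and environmental components."""
    return round(broad + maternal + environment, 1)


if __name__ == "__main__":
    print("non-additive genetic share (%):", non_additive_share(BROAD_SENSE, NARROW_SENSE))
    print("sum of components (%):", total_share(BROAD_SENSE, MATERNAL, ENVIRONMENT))
```

Under these assumptions the broad-sense, maternal and environmental components account for the whole variance, which is consistent with the abstract describing the environmental share as "the remaining 60.3 %".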
ABSTRACT: Many studies of human populations have used the male-specific region of the Y chromosome (MSY) as a marker, but MSY sequence variants have traditionally been subject to ascertainment bias. Also, dating of haplogroups has relied on Y-specific short tandem repeats (STRs), involving problems of mutation rate choice, and possible long-term mutation saturation. Next-generation sequencing can ascertain single nucleotide polymorphisms (SNPs) in an unbiased way, leading to phylogenies in which branch-lengths are proportional to time, and allowing the times-to-most-recent-common-ancestor (TMRCAs) of nodes to be estimated directly. Here we describe the sequencing of 3.7 Mb of MSY in each of 448 human males at a mean coverage of 51 ×, yielding 13,261 high-confidence SNPs, 65.9% of which are previously unreported. The resulting phylogeny covers the majority of the known clades, provides date estimates of nodes, and constitutes a robust evolutionary framework for analysing the history of other classes of mutation. Different clades within the tree show subtle but significant differences in branch lengths to the root. We also apply a set of 23 Y-STRs to the same samples, allowing SNP- and STR-based diversity and TMRCA estimates to be systematically compared. Ongoing purifying selection is suggested by our analysis of the phylogenetic distribution of non-synonymous variants in 15 MSY single-copy genes.
ABSTRACT: Tibetans do not exhibit increased hemoglobin concentration at high altitude. We describe a high-frequency missense mutation in the EGLN1 gene, which encodes prolyl hydroxylase 2 (PHD2), that contributes to this adaptive response. We show that a variant in EGLN1, c.[12C>G; 380G>C], contributes functionally to the Tibetan high-altitude phenotype. PHD2 triggers the degradation of hypoxia-inducible factors (HIFs), which mediate many physiological responses to hypoxia, including erythropoiesis. The PHD2 p.[Asp4Glu; Cys127Ser] variant exhibits a lower Km value for oxygen, suggesting that it promotes increased HIF degradation under hypoxic conditions. Whereas hypoxia stimulates the proliferation of wild-type erythroid progenitors, the proliferation of progenitors with the c.[12C>G; 380G>C] mutation in EGLN1 is significantly impaired under hypoxic culture conditions. We show that the c.[12C>G; 380G>C] mutation originated ~8,000 years ago on the same haplotype previously associated with adaptation to high altitude. The c.[12C>G; 380G>C] mutation abrogates hypoxia-induced and HIF-mediated augmentation of erythropoiesis, which provides a molecular mechanism for the observed protection of Tibetans from polycythemia at high altitude.
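The reasoning step from "lower Km for oxygen" to "more HIF degradation under hypoxia" can be illustrated with a minimal sketch. It assumes simple Michaelis-Menten kinetics for PHD2 with respect to oxygen; the Km and oxygen values are hypothetical, in arbitrary units, and are not taken from the study.

```python
def hydroxylation_rate(oxygen, km, vmax=1.0):
    """Michaelis-Menten rate v = Vmax * [O2] / (Km + [O2]) (assumed kinetic model)."""
    return vmax * oxygen / (km + oxygen)


# Hypothetical illustrative values, arbitrary units (not measured data).
LOW_OXYGEN = 50.0       # hypoxic oxygen level
WILD_TYPE_KM = 200.0    # assumed Km of wild-type PHD2
VARIANT_KM = 100.0      # assumed lower Km of the variant

wild_type_rate = hydroxylation_rate(LOW_OXYGEN, WILD_TYPE_KM)
variant_rate = hydroxylation_rate(LOW_OXYGEN, VARIANT_KM)

if __name__ == "__main__":
    # With the same Vmax, the lower-Km enzyme runs closer to saturation at low oxygen,
    # so it hydroxylates HIF (marking it for degradation) faster.
    print("wild-type rate:", wild_type_rate)
    print("variant rate:", variant_rate)
```

With equal Vmax, any enzyme with a lower Km has the higher rate at a given sub-saturating oxygen level, which is the qualitative point the abstract makes.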
ABSTRACT: Background
The genetics involved in Ewing sarcoma susceptibility and prognosis are poorly understood. EWS/FLI and related EWS/ETS chimeras upregulate numerous gene targets via promoter-based GGAA-microsatellite response elements. These microsatellites are highly polymorphic in humans, and preliminary evidence suggests EWS/FLI-mediated gene expression is highly dependent on the number of GGAA motifs within the microsatellite.
Here we sought to examine the polymorphic spectrum of a GGAA-microsatellite within the NR0B1 promoter (a critical EWS/FLI target) in primary Ewing sarcoma tumors, and characterize how this polymorphism influences gene expression and clinical outcomes.
A complex, bimodal pattern of EWS/FLI-mediated gene expression was observed across a wide range of GGAA motifs, with maximal expression observed in constructs containing 20–26 GGAA motifs. Relative to white European and African controls, the NR0B1 GGAA-microsatellite in tumor cells demonstrated a strong bias for haplotypes containing 21–25 GGAA motifs suggesting a relationship between microsatellite function and disease susceptibility. This selection bias was not a product of microsatellite instability in tumor samples, nor was there a correlation between NR0B1 GGAA-microsatellite polymorphisms and survival outcomes.
These data suggest that GGAA-microsatellite polymorphisms observed in human populations modulate EWS/FLI-mediated gene expression and may influence disease susceptibility in Ewing sarcoma.
PLoS ONE 08/2014; 9(8):e104378. DOI:10.1371/journal.pone.0104378 · 3.23 Impact Factor
ABSTRACT: We report the whole-genome sequence of the common marmoset (Callithrix jacchus). The 2.26-Gb genome of a female marmoset was assembled using Sanger read data (6x) and a whole-genome shotgun strategy. A first analysis has permitted comparison with the genomes of apes and Old World monkeys and the identification of specific features that might contribute to the unique biology of this diminutive primate, including genetic changes that may influence body size, frequent twinning and chimerism. We observed positive selection in growth hormone/insulin-like growth factor genes (growth pathways), respiratory complex I genes (metabolic pathways), and genes encoding immunobiological factors and proteases (reproductive and immunity pathways). In addition, both protein-coding and microRNA genes related to reproduction exhibited evidence of rapid sequence evolution. This genome sequence for a New World monkey enables increased power for comparative analyses among available primate genomes and facilitates biomedical research application.
ABSTRACT: High-throughput sequencing of related individuals has become an important tool for studying human disease. However, owing to technical complexity and lack of available tools, most pedigree-based sequencing studies rely on an ad hoc combination of suboptimal analyses. Here we present pedigree-VAAST (pVAAST), a disease-gene identification tool designed for high-throughput sequence data in pedigrees. pVAAST uses a sequence-based model to perform variant and gene-based linkage analysis. Linkage information is then combined with functional prediction and rare variant case-control association information in a unified statistical framework. pVAAST outperformed linkage and rare-variant association tests in simulations and identified disease-causing genes from whole-genome sequence data in three human pedigrees with dominant, recessive and de novo inheritance patterns. The approach is robust to incomplete penetrance and locus heterogeneity and is applicable to a wide variety of genetic traits. pVAAST maintains high power across studies of monogenic, high-penetrance phenotypes in a single pedigree to highly polygenic, common phenotypes involving hundreds of pedigrees.
ABSTRACT: Phevor integrates phenotype, gene function, and disease information with personal genomic data for improved power to identify disease-causing alleles. Phevor works by combining knowledge resident in multiple biomedical ontologies with the outputs of variant-prioritization tools. It does so by using an algorithm that propagates information across and between ontologies. This process enables Phevor to accurately reprioritize potentially damaging alleles identified by variant-prioritization tools in light of gene function, disease, and phenotype knowledge. Phevor is especially useful for single-exome and family-trio-based diagnostic analyses, the most commonly occurring clinical scenarios and ones for which existing personal genome diagnostic tools are most inaccurate and underpowered. Here, we present a series of benchmark analyses illustrating Phevor's performance characteristics. Also presented are three recent Utah Genome Project case studies in which Phevor was used to identify disease-causing alleles. Collectively, these results show that Phevor improves diagnostic accuracy not only for individuals presenting with established disease phenotypes but also for those with previously undescribed and atypical disease presentations. Importantly, Phevor is not limited to known diseases or known disease-causing alleles. As we demonstrate, Phevor can also use latent information in ontologies to discover genes and disease-causing alleles not previously associated with disease.
The American Journal of Human Genetics 04/2014; 94(4):599-610. DOI:10.1016/j.ajhg.2014.03.010 · 10.93 Impact Factor
ABSTRACT: Objective
We hypothesized that genetic variation affects responsiveness to 17-alpha hydroxyprogesterone caproate (17P) for recurrent preterm birth prevention.
Women of European ancestry with ≥1 spontaneous singleton preterm birth at <34 weeks' gestation who received 17P were recruited prospectively and classified as a 17P responder or nonresponder by the difference in delivery gestational age between 17P-treated and -untreated pregnancies. Samples underwent whole exome sequencing. Coding variants were compared between responders and nonresponders with the use of the Variant Annotation, Analysis, and Search Tool (VAAST), which is a probabilistic search tool for the identification of disease-causing variants, and were compared with a Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway candidate gene list. Genes with the highest VAAST scores were then classified by the online Protein ANalysis THrough Evolutionary Relationships (PANTHER) system into known gene ontology molecular functions and biologic processes. Gene distributions within these classifications were compared with an online reference population to identify over- and under- represented gene sets.
Fifty women (9 nonresponders) were included. Responders delivered 9.2 weeks longer with 17P vs 1.3 weeks' gestation for nonresponders (P < .001). A genome-wide search for genetic differences implicated the NOS1 gene to be the most likely associated gene from among genes on the KEGG candidate gene list (P < .00095). PANTHER analysis revealed several over-represented gene ontology categories that included cell adhesion, cell communication, signal transduction, nitric oxide signal transduction, and receptor activity (all with significant Bonferroni-corrected probability values).
We identified sets of over-represented genes in key processes among responders to 17P, which is the first step in the application of pharmacogenomics to preterm birth prevention.
American journal of obstetrics and gynecology 04/2014; 210(4). DOI:10.1016/j.ajog.2014.01.013 · 4.70 Impact Factor
ABSTRACT: Recent studies have used a variety of analytical methods to identify genes targeted by selection in high-altitude populations located throughout the Tibetan Plateau. Despite differences in analytic strategies and sample location, hypoxia-related genes, including EPAS1 and EGLN1, were identified in multiple studies. By applying the same analytic methods to genome-wide SNP information used in our previous study of a Tibetan population (n = 31) from the township of Maduo, located in the northeastern corner of the Qinghai-Tibetan Plateau (4200 m), we have identified common targets of natural selection in a second geographically and linguistically distinct Tibetan population (n = 46) in the Tuo Tuo River township (4500 m). Our analyses provide evidence for natural selection based on iHS and XP-EHH signals in both populations at the p<0.02 significance level for EPAS1, EGLN1, HMOX2, and CYP17A1 and for PKLR, HFE, and HBB and HBG2, which have also been reported in other studies. We highlight differences (i.e., stratification and admixture) in the two distinct Tibetan groups examined here and report selection candidate genes common to both groups. These findings should be considered in the prioritization of selection candidate genes in future genetic studies in Tibet.
PLoS ONE 03/2014; 9(3):e88252. DOI:10.1371/journal.pone.0088252 · 3.23 Impact Factor