ABSTRACT: The protein-coding exome of a patient with a monogenic disease contains about 20,000 variants, only one or two of which are disease causing. We found that 58% of rare variants in the protein-coding exome of the general population are located in only 2% of the genes. Prompted by this observation, we aimed to develop a gene-level approach for predicting whether a given human protein-coding gene is likely to harbor disease-causing mutations. To this end, we derived the gene damage index (GDI): a genome-wide, gene-level metric of the mutational damage that has accumulated in the general population. We found that the GDI was correlated with selective evolutionary pressure, protein complexity, coding sequence length, and the number of paralogs. We compared GDI with the leading gene-level approaches, genic intolerance, and de novo excess, and demonstrated that GDI performed best for the detection of false positives (i.e., removing exome variants in genes irrelevant to disease), whereas genic intolerance and de novo excess performed better for the detection of true positives (i.e., assessing de novo mutations in genes likely to be disease causing). The GDI server, data, and software are freely available to noncommercial users from lab.rockefeller.edu/casanova/GDI.
Proceedings of the National Academy of Sciences 10/2015; 112(44). DOI:10.1073/pnas.1518646112 · 9.67 Impact Factor
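The observation that a small share of genes carries most rare variants can be illustrated with a short calculation. The sketch below is not the GDI itself (the abstract does not give its formula); it only computes, for a hypothetical table of per-gene variant counts, the fraction of all variants found in the most variant-rich genes. The function name, gene names, and counts are assumptions made for illustration.

```python
def share_in_top_genes(variant_counts, top_fraction):
    """Fraction of all variants carried by the most variant-rich genes.

    variant_counts: dict mapping gene name -> number of rare variants (toy data).
    top_fraction: fraction of genes to keep, e.g. 0.02 for the top 2%.
    """
    if not 0 < top_fraction <= 1:
        raise ValueError("top_fraction must be in (0, 1]")
    counts = sorted(variant_counts.values(), reverse=True)
    total = sum(counts)
    if total == 0:
        return 0.0
    # Keep at least one gene so that very short gene lists still give an answer.
    n_top = max(1, round(len(counts) * top_fraction))
    return sum(counts[:n_top]) / total


# Hypothetical data: 98 genes with one variant each, two genes with many.
toy_counts = {f"GENE{i}": 1 for i in range(98)}
toy_counts["GENE_A"] = 80
toy_counts["GENE_B"] = 56

top_share = share_in_top_genes(toy_counts, 0.02)
print(f"Share of variants in the top 2% of genes: {top_share:.1%}")
```

In this toy table, the two most variant-rich genes out of 100 hold 136 of the 234 variants, a concentration of the same order as the one reported in the abstract.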
ABSTRACT: The Roma, also known as 'Gypsies', represent the largest and the most widespread ethnic minority of Europe. There is increasing evidence, based on linguistic, anthropological and genetic data, to suggest that they originated from the Indian subcontinent, with subsequent bottlenecks and undetermined gene flow from/to hosting populations during their diaspora. Further support comes from the presence of Indian uniparentally inherited lineages, such as mitochondrial DNA M and Y-chromosome H haplogroups, in a significant number of Roma individuals. However, the limited resolution of most genetic studies so far, together with the restriction of the samples used, has prevented the detection of other non-Indian founder lineages that might have been present in the proto-Roma population. We performed a high-resolution study of the uniparental genomes of 753 Roma and 984 non-Roma hosting European individuals. Roma groups show lower genetic diversity and high heterogeneity compared with non-Roma samples as a result of lower effective population size and extensive drift, consistent with a series of bottlenecks during their diaspora. We found a set of founder lineages, present in the Roma and virtually absent in the non-Roma, for the maternal (H7, J1b3, J1c1, M18, M35b, M5a1, U3, and X2d) and paternal (I-P259, J-M92, and J-M67) genomes. This lineage classification allows us to identify extensive gene flow from non-Roma to Roma groups, whereas the opposite pattern, although not negligible, is substantially lower (up to 6.3%). Finally, the exact haplotype matching analysis of both uniparental lineages consistently points to a Northwestern origin of the proto-Roma population within the Indian subcontinent. European Journal of Human Genetics advance online publication, 16 September 2015; doi:10.1038/ejhg.2015.201.
European journal of human genetics: EJHG 09/2015; DOI:10.1038/ejhg.2015.201 · 4.35 Impact Factor
ABSTRACT: The optimal coordination of the transcriptional response of host cells to infection is essential for establishing appropriate immunological outcomes. In this context, the role of microRNAs (miRNAs) - important epigenetic regulators of gene expression - in regulating mammalian immune systems is increasingly well recognised. However, the expression dynamics of miRNAs, and that of their isoforms, in response to infection remains largely unexplored. Here, we characterized the genome-wide miRNA transcriptional responses of human dendritic cells, over time, to various mycobacteria differing in their virulence as well as to other bacteria outside the genus Mycobacterium, using small RNA-sequencing. We detected the presence of a core temporal response to infection, shared across bacteria, comprising 49 miRNAs, highlighting a set of miRNAs that may play an essential role in the regulation of basic cellular responses to stress. Despite such broadly shared expression dynamics, we identified specific elements of variation in the miRNA response to infection across bacteria, including a virulence-dependent induction of the miR-132/212 family in response to mycobacterial infections. We also found that infection has a strong impact on both the relative abundance of the miRNA hairpin arms and the expression dynamics of miRNA isoforms. That we observed broadly consistent changes in relative arm expression and isomiR distribution across bacteria suggests that this additional, internal layer of variability in miRNA responses represents an additional source of subtle miRNA-mediated regulation upon infection. Collectively, this study increases our understanding of the dynamism and role of miRNAs in response to bacterial infection, revealing novel features of their internal variability and identifying candidate miRNAs that may contribute to differences in the pathogenicity of mycobacterial infections.
ABSTRACT: High-frequency microsatellite haplotypes of the male-specific Y-chromosome can signal past episodes of high reproductive success of particular men and their patrilineal descendants. Previously, two examples of such successful Y-lineages have been described in Asia, both associated with Altaic-speaking pastoral nomadic societies, and putatively linked to dynasties descending, respectively, from Genghis Khan and Giocangga. Here we surveyed a total of 5321 Y-chromosomes from 127 Asian populations, including novel Y-SNP and microsatellite data on 461 Central Asian males, to ask whether additional lineage expansions could be identified. Based on the most frequent eight-microsatellite haplotypes, we objectively defined 11 descent clusters (DCs), each within a specific haplogroup, that represent likely past instances of high male reproductive success, including the two previously identified cases. Analysis of the geographical patterns and ages of these DCs and their associated cultural characteristics showed that the most successful lineages are found both among sedentary agriculturalists and pastoral nomads, and expanded between 2100 BCE and 1100 CE. However, those with recent origins in the historical period are almost exclusively found in Altaic-speaking pastoral nomadic populations, which may reflect a shift in political organisation in pastoralist economies and a greater ease of transmission of Y-chromosomes through time and space facilitated by the use of horses. European Journal of Human Genetics advance online publication, 14 January 2015; doi:10.1038/ejhg.2014.285.
European journal of human genetics: EJHG 01/2015; 23(10). DOI:10.1038/ejhg.2014.285 · 4.35 Impact Factor
ABSTRACT: Multi-parametric flow cytometry is a key technology for characterization of immune cell phenotypes. However, robust high-dimensional post-analytic strategies for automated data analysis in large numbers of donors are still lacking. Here, we report a computational pipeline, called FlowGM, which minimizes operator input, is insensitive to compensation settings, and can be adapted to different analytic panels. A Gaussian Mixture Model (GMM)-based approach was utilized for initial clustering, with the number of clusters determined using the Bayesian Information Criterion. Meta-clustering in a reference donor permitted automated identification of 24 cell types across four panels. Cluster labels were integrated into FCS files, thus permitting comparisons to manual gating. Cell numbers and coefficient of variation (CV) were similar between FlowGM and conventional gating for lymphocyte populations, but notably FlowGM provided improved discrimination of “hard-to-gate” monocyte and dendritic cell (DC) subsets. FlowGM thus provides rapid high-dimensional analysis of cell phenotypes and is amenable to cohort studies.
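Choosing the number of mixture components with the Bayesian Information Criterion can be sketched as follows. This is a minimal illustration, not the FlowGM implementation: it assumes full covariance matrices (the abstract does not state the covariance structure), and the log-likelihood values are hypothetical stand-ins for what a fitted GMM would report. The BIC is computed as p ln(n) - 2 ln(L), where p is the number of free parameters, n the number of events, and L the maximized likelihood; the model with the lowest value is retained.

```python
import math


def gmm_param_count(k, d):
    """Free parameters of a k-component, d-dimensional GMM with full covariances."""
    weights = k - 1                       # mixing weights sum to 1
    means = k * d                         # one d-dimensional mean per component
    covariances = k * d * (d + 1) // 2    # one symmetric d x d matrix per component
    return weights + means + covariances


def bic(log_likelihood, k, d, n):
    """BIC = p * ln(n) - 2 * ln(L); lower values indicate a better trade-off."""
    return gmm_param_count(k, d) * math.log(n) - 2.0 * log_likelihood


def select_k(log_likelihoods, d, n):
    """Return the component count with the lowest BIC.

    log_likelihoods: dict mapping k -> maximized log-likelihood of the fitted model.
    """
    return min(log_likelihoods, key=lambda k: bic(log_likelihoods[k], k, d, n))


# Hypothetical log-likelihoods from fitting k = 1..4 to 1,000 events in 2 dimensions.
toy_fits = {1: -5200.0, 2: -4700.0, 3: -4690.0, 4: -4685.0}
best_k = select_k(toy_fits, d=2, n=1000)
print("Selected number of clusters:", best_k)
```

With these toy values the likelihood improves sharply from one to two components and only marginally afterwards, so the parameter penalty makes two components the preferred model.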
ABSTRACT: Pathogens, and the infectious diseases they cause, have been paramount among the threats encountered by humans in their expansions throughout the globe. Numerous studies have identified immunity and host defence genes as being among the functions most strongly targeted by selection, most likely pathogen-driven. The dissection of the form and intensity of such selective pressures has increased our knowledge of the biological relevance of the underlying immunological mechanisms in host defence. Although the identities of the specific infectious agents imposing these selective pressures remain, in most cases, elusive, the impact of several pathogens, notably malaria and cholera, has been described. However, past selection against infectious diseases may have some fitness costs upon environmental changes, potentially leading to maladaptation and immunopathology.
Current Opinion in Genetics & Development 12/2014; 29:31–38. DOI:10.1016/j.gde.2014.07.004 · 7.57 Impact Factor
ABSTRACT: The progress of genomic technologies is allowing researchers to scan the genomes of different species for the occurrence of natural selection at an unprecedented level of resolution. These studies show that genes involved in immune processes are preferential targets of different forms of selection, some of which act to preserve immune diversity over time. Recent work in humans shows that this can be achieved either by inheriting advantageous immune variation from distant ancestral species, through long-term balancing selection, or by acquiring novel selected alleles through admixture with extinct hominins such as Neanderthals or Denisovans. These studies collectively increase our knowledge of immune genes for which maintaining the functional diversity has conferred a strong selective advantage for host survival.
Current Opinion in Immunology 09/2014; 30C(1):79-84. DOI:10.1016/j.coi.2014.08.002 · 7.48 Impact Factor
ABSTRACT: The evolutionary history of the human pygmy phenotype (small body size), a characteristic of African and Southeast Asian rainforest hunter-gatherers, is largely unknown. Here we use a genome-wide admixture mapping analysis to identify 16 genomic regions that are significantly associated with the pygmy phenotype in the Batwa, a rainforest hunter-gatherer population from Uganda (east central Africa). The identified genomic regions have multiple attributes that provide supporting evidence of genuine association with the pygmy phenotype, including enrichments for SNPs previously associated with stature variation in Europeans and for genes with growth hormone receptor and regulation functions. To test adaptive evolutionary hypotheses, we computed the haplotype-based integrated haplotype score (iHS) statistic and the level of population differentiation (FST) between the Batwa and their agricultural neighbors, the Bakiga, for each genomic SNP. Both |iHS| and FST values were significantly higher for SNPs within the Batwa pygmy phenotype-associated regions than the remainder of the genome, a signature of polygenic adaptation. In contrast, when we expanded our analysis to include Baka rainforest hunter-gatherers from Cameroon and Gabon (west central Africa) and Nzebi and Nzime neighboring agriculturalists, we did not observe elevated |iHS| or FST values in these genomic regions. Together, these results suggest adaptive and at least partially convergent origins of the pygmy phenotype even within Africa, supporting the hypothesis that small body size confers a selective advantage for tropical rainforest hunter-gatherers but raising questions about the antiquity of this behavior.
Proceedings of the National Academy of Sciences 08/2014; 111(35):E3596-E3603. DOI:10.1073/pnas.1402875111 · 9.67 Impact Factor
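As an illustration of the per-SNP population differentiation statistic mentioned above, the sketch below computes Wright's FST for a single biallelic SNP from the allele frequencies in two populations, weighting both populations equally. This textbook definition is an assumption made for illustration: the abstract does not state which estimator was used, and analyses of real data typically rely on sample-size-corrected estimators such as that of Weir and Cockerham. The allele frequencies in the example are hypothetical.

```python
def wright_fst(p1, p2):
    """Wright's FST for one biallelic SNP, two equally weighted populations.

    p1, p2: frequency of the same allele in population 1 and population 2.
    FST = (H_T - H_S) / H_T, with H_T the expected heterozygosity of the pooled
    population and H_S the mean expected heterozygosity within populations.
    """
    for p in (p1, p2):
        if not 0.0 <= p <= 1.0:
            raise ValueError("allele frequencies must lie in [0, 1]")
    p_bar = (p1 + p2) / 2
    h_total = 2 * p_bar * (1 - p_bar)
    if h_total == 0:
        # SNP is monomorphic in both populations: FST is undefined, report 0 here.
        return 0.0
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    return (h_total - h_within) / h_total


# Hypothetical allele frequencies at one SNP in two neighboring populations.
print(wright_fst(0.2, 0.8))
```

Identical frequencies give an FST of 0, and fixation of alternative alleles in the two populations gives an FST of 1.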
ABSTRACT: Identifying the genotypes underlying human disease phenotypes is a fundamental step in human genetics and medicine. High-throughput genomic technologies provide thousands of genetic variants per individual. The causal genes of a specific phenotype are usually expected to be functionally close to each other. According to this hypothesis, candidate genes are picked from high-throughput data on the basis of their biological proximity to core genes -- genes already known to be responsible for the phenotype. There is currently no effective gene-centric online interface for this purpose.
We describe here the human gene connectome server (HGCS), a powerful, easy-to-use interactive online tool enabling researchers to prioritize any list of genes according to their biological proximity to core genes associated with the phenotype of interest. We also make available an updated and extended version for all human gene-specific connectomes. The HGCS is freely available to noncommercial users from: http://hgc.rockefeller.edu/.
The HGCS should help investigators from diverse fields to identify new disease-causing candidate genes more effectively, via a user-friendly online interface.
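The idea of prioritizing candidate genes by their biological proximity to core genes can be sketched as a simple ranking. This is a hypothetical illustration, not the HGCS algorithm: the abstract does not describe the distance metric or the ranking rule, so the sketch assumes a precomputed table of pairwise distances and ranks each candidate by its smallest distance to any core gene. All gene names and distance values are invented.

```python
def rank_by_proximity(candidates, core_genes, distance):
    """Sort candidate genes from closest to farthest from the core genes.

    distance: dict mapping frozenset({gene_a, gene_b}) -> biological distance
              (hypothetical values; smaller means functionally closer).
    """
    def closest_core_distance(gene):
        # A candidate is scored by its nearest core gene (an assumed rule).
        return min(distance[frozenset((gene, core))] for core in core_genes)

    return sorted(candidates, key=closest_core_distance)


# Invented distances between three candidate genes and two core genes.
toy_distance = {
    frozenset(("CAND_A", "CORE1")): 5.0,
    frozenset(("CAND_A", "CORE2")): 2.0,
    frozenset(("CAND_B", "CORE1")): 1.0,
    frozenset(("CAND_B", "CORE2")): 4.0,
    frozenset(("CAND_C", "CORE1")): 3.0,
    frozenset(("CAND_C", "CORE2")): 3.5,
}

ranking = rank_by_proximity(
    ["CAND_A", "CAND_B", "CAND_C"], ["CORE1", "CORE2"], toy_distance
)
print(ranking)
```

Keying the table by unordered pairs avoids storing each distance twice; other scoring rules, such as the mean distance to all core genes, would slot into the same structure.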
ABSTRACT: Genome-wide scans for selection have identified multiple regions of the human genome as being targeted by positive selection. However, only a small proportion has been replicated across studies, and the prevalence of positive selection as a mechanism of adaptive change in humans remains controversial. Here we explore the power of two haplotype-based statistics - the integrated haplotype score (iHS) and the Derived Intra-allelic Nucleotide Diversity (DIND) test - in the context of next-generation sequencing data, and evaluate their robustness to demography and other selection modes. We show that these statistics are both powerful for the detection of recent positive selection, regardless of population history, and robust to variation in coverage, with DIND being insensitive to very low coverage. We apply these statistics to whole-genome sequence datasets from the 1000 Genomes Project and Complete Genomics. We found that putative targets of selection were highly significantly enriched in genic and non-synonymous SNPs, and that DIND was more powerful than iHS in the context of small sample sizes, low-quality genotype calling or poor coverage. As we excluded genomic confounders and alternative selection models, such as background selection, the observed enrichment attests to the action of recent, strong positive selection. Further support for the adaptive significance of these genomic regions came from their enrichment in functional variants detected by genome-wide association studies, informing the relationship between past selection and current benign and disease-related phenotypic variation. Our results indicate that hard sweeps targeting low-frequency standing variation have played a moderate, albeit significant, role in recent human evolution.
ABSTRACT: Standardization of immunophenotyping procedures has become a high priority. We have developed a suite of whole-blood, syringe-based assay systems that can be used to reproducibly assess induced innate or adaptive immune responses. By eliminating preanalytical errors associated with immune monitoring, we have defined the protein signatures induced by (1) medically relevant bacteria, fungi, and viruses; (2) agonists specific for defined host sensors; (3) clinically employed cytokines; and (4) activators of T cell immunity. Our results provide an initial assessment of healthy donor reference values for induced cytokines and chemokines and we report the failure to release interleukin-1α as a common immunological phenotype. The observed naturally occurring variation of the immune response may help to explain differential susceptibility to disease or response to therapeutic intervention. The implementation of a general solution for assessment of functional immune responses will help support harmonization of clinical studies and data sharing.
ABSTRACT: The emergence of agriculture in West-Central Africa approximately 5,000 years ago profoundly modified the cultural landscape and mode of subsistence of most sub-Saharan populations. How this major innovation has had an impact on the genetic history of rainforest hunter-gatherers (historically referred to as 'pygmies') and agriculturalists, however, remains poorly understood. Here we report genome-wide SNP data from these populations located west-to-east of the equatorial rainforest. We find that hunter-gathering populations present up to 50% of farmer genomic ancestry, and that substantial admixture began only within the last 1,000 years. Furthermore, we show that the historical population sizes characterizing these communities already differed before the introduction of agriculture. Our results suggest that the first socio-economic interactions between rainforest hunter-gatherers and farmers introduced by the spread of farming were not accompanied by immediate, extensive genetic exchanges and occurred on a backdrop of two groups already differentiated by their specialization in two ecotopes with differing carrying capacities.
ABSTRACT: MicroRNAs (miRNAs) are critical regulators of gene expression and their role in a wide variety of biological processes, including host antimicrobial defense, is increasingly well described. Consistent with their diverse functional effects, miRNA expression is highly context-dependent and shows marked changes upon cellular activation. However, the genetic control of miRNA expression in response to external stimuli and the impact of such perturbations on miRNA-mediated regulatory networks at the population level remain to be determined. Here we assessed changes in miRNA expression upon Mycobacterium tuberculosis infection and mapped expression quantitative trait loci (eQTL) in dendritic cells from a panel of healthy individuals. Genome-wide expression profiling revealed that ~40% of miRNAs are differentially expressed upon infection. We find that the expression of 3% of miRNAs is controlled by proximate genetic factors, which are enriched in a promoter-specific histone modification associated with active transcription. Notably, we identify two infection-specific response eQTLs, for miR-326 and miR-1260, providing an initial assessment of the impact of genotype-environment interactions on miRNA molecular phenotypes. Furthermore, we show that infection coincides with a marked remodeling of the genome-wide relationships between miRNA and mRNA expression levels. This observation, supplemented by experimental data using the model of miR-29a, sheds light on the role of a set of miRNAs in cellular responses to infection. Collectively, this study increases our understanding of the genetic architecture of miRNA expression in response to infection, and highlights the wide-reaching impact of altering miRNA expression on the transcriptional landscape of a cell.
Genome Research 01/2014; 24(5). DOI:10.1101/gr.161471.113 · 14.63 Impact Factor
ABSTRACT: Human genetic studies are rarely conducted for immunological purposes. Instead, they are typically driven by medical and evolutionary goals, such as understanding the predisposition or resistance to infectious or inflammatory diseases, the pathogenesis of such diseases, and human evolution in the context of the long-standing relationships between humans and their commensal and environmental microbes. However, the dissection of these experiments of Nature has also led to major immunological advances. In this review, we draw on some of the immunological lessons learned in the three branches of human molecular genetics most relevant to immunology: clinical genetics, epidemiological genetics, and evolutionary genetics. We argue that human genetics has become a new frontier not only for timely studies of specific features of human immunity, but also for defining general principles of immunity. These studies teach us about immunity as it occurs under "natural" conditions, through the transition from the almost complete wilderness that existed worldwide until about a century ago to the current unevenly distributed medically shaped environment. Hygiene, vaccines, antibiotics, and surgery have considerably decreased the burden of infection, but these interventions have been available only recently, so have yet to have a major impact on patterns of genomic diversity, making it possible to carry out unbiased evolutionary studies at the population level. Clinical genetic studies of childhood phenotypes have not been blurred by modern medicine either. Instead, medical advances have actually facilitated such studies, by making it possible for children with life-threatening infections to survive.
In addition, the prevention and treatment of infectious diseases have increased life expectancy at birth from ∼20 yr to ∼80 yr, providing unique opportunities to study the genetic basis of immunological phenomena against which there is no natural counterselection, such as reactivation and secondary infectious diseases and breakdown of self-tolerance manifesting as autoimmunity, in populations of adult and aging patients. Recently developed deep sequencing and stem cell technologies are of unprecedented power, and their application to human genetics is opening up exciting and timely possibilities for young immunologists seeking uncharted waters to explore.
Cold Spring Harbor Symposia on Quantitative Biology 10/2013; 78(1). DOI:10.1101/sqb.2013.78.019968
ABSTRACT: Demographic changes are known to leave footprints on genetic polymorphism. Together with the increased availability of large polymorphism datasets, coalescent-based methods allow inferring the past demography of populations from their present-day patterns of genetic diversity. Here, we analyzed both nuclear (20 non-coding regions) and mitochondrial (HVS-I) re-sequencing data to infer the demographic history of 66 African and Eurasian human populations presenting contrasting life-styles (nomadic hunter-gatherers, nomadic herders and sedentary farmers). This allowed us to investigate the relationship between life-style and demography, and to address the long-standing debate about the chronology of demographic expansions and the Neolithic transition. In Africa, we inferred expansion events for farmers, but constant population sizes or contraction events for hunter-gatherers. In Eurasia, we inferred higher expansion rates for farmers than herders with HVS-I data, except in Central Asia and Korea. Although isolation and admixture processes could have impacted our demographic inferences, these processes alone seem unlikely to explain the contrasted demographic histories inferred in populations with different life-styles. The small expansion rates or constant population sizes inferred for herders and hunter-gatherers may thus result from constraints linked to nomadism. However, autosomal data revealed contraction events for two sedentary populations in Eurasia, which may be caused by founder effects. Finally, the inferred expansions likely predated the emergence of agriculture and herding. This suggests that human populations could have started to expand in Paleolithic times, and that strong Paleolithic expansions in some populations may have ultimately favored their shift towards agriculture during the Neolithic.
ABSTRACT: The Y chromosome and the mitochondrial genome have been used to estimate when the common patrilineal and matrilineal ancestors of humans lived. We sequenced the genomes of 69 males from nine populations, including two in which we find basal branches of the Y-chromosome tree. We identify ancient phylogenetic structure within African haplogroups and resolve a long-standing ambiguity deep within the tree. Applying equivalent methodologies to the Y chromosome and the mitochondrial genome, we estimate the time to the most recent common ancestor (T(MRCA)) of the Y chromosome to be 120 to 156 thousand years and the mitochondrial genome T(MRCA) to be 99 to 148 thousand years. Our findings suggest that, contrary to previous claims, male lineages do not coalesce significantly more recently than female lineages.
ABSTRACT: The study of the genetic and selective landscape of immunity genes across primates can provide insight into the existing differences in susceptibility to infection observed between human and non-human primates. Here, we explored how selection has driven the evolution of a key family of innate immunity receptors, the Toll-like receptors (TLRs), in African great ape species. We sequenced the ten TLRs in various populations of chimpanzees and gorillas, and analysed these data jointly with a human dataset. We found that purifying selection has been more pervasive in great apes than in humans. Furthermore, in chimpanzees and gorillas, purifying selection has targeted TLRs irrespective of whether they are endosomal or cell-surface, in contrast with humans where strong selective constraints are restricted to endosomal TLRs. These observations suggest important differences in the relative importance of TLR-mediated pathogen sensing, such as that of recognition of flagellated bacteria by TLR5, between humans and great apes. Lastly, we used a population genetics-phylogenetics method that jointly analyses polymorphism and divergence data to detect fine-scale variation in selection pressures at specific codons within TLR genes. We identified different codons at different TLRs as being under positive selection in each species, highlighting that functional variation at these genes has conferred a selective advantage in immunity to infection to specific primate species. Overall, this study showed that the degree of selection driving the evolution of TLRs has largely differed between humans and non-human primates, increasing our knowledge of their respective biological contribution to host defence in the natural setting.
Human Molecular Genetics 07/2013; 22(23). DOI:10.1093/hmg/ddt335 · 6.39 Impact Factor