ABSTRACT: Date palms (Phoenix dactylifera) are the most significant perennial crop in arid regions of the Middle East and North Africa. Here, we present a comprehensive catalogue of approximately seven million single nucleotide polymorphisms in date palms based on whole genome re-sequencing of a collection of 62 cultivars. Population structure analysis indicates a major genetic divide between North Africa and the Middle East/South Asian date palms, with evidence of admixture in cultivars from Egypt and Sudan. Genome-wide scans for selection suggest at least 56 genomic regions associated with selective sweeps that may underlie geographic adaptation. We report candidate mutations for trait variation, including nonsense polymorphisms and presence/absence variation in gene content in pathways for key agronomic traits. We also identify a copia-like retrotransposon insertion polymorphism in the R2R3 myb-like orthologue of the oil palm virescens gene associated with fruit colour variation. This analysis documents patterns of post-domestication diversification and provides a genomic resource for this economically important perennial tree crop.
ABSTRACT: Many decades of theory have demonstrated that, in non-recombining systems, slightly deleterious mutations accumulate non-reversibly, potentially driving the extinction of many asexual species. Non-recombining chromosomes in sexual organisms are thought to have degenerated in a similar fashion; however, the extent to which damaging mutations accumulate along chromosomes with highly variable rates of crossing over is not clear. Using high-coverage sequencing data from over 1,400 individuals in the 1000 Genomes and CARTaGENE projects, we show that recombination rate modulates the distribution of putatively deleterious variants across the entire human genome. Exons in regions of low recombination are significantly enriched for deleterious and disease-associated variants, a signature varying in strength across worldwide human populations with different demographic histories. Regions with low recombination rates are enriched for highly conserved genes with essential cellular functions and show an excess of mutations with demonstrated effects on health, a phenomenon likely affecting disease susceptibility in humans.
ABSTRACT: Background
The silencing of tumor suppressor genes (TSGs) by aberrant DNA methylation occurs frequently in acute myeloid leukemia (AML). This epigenetic alteration can be reversed by 5-aza-2’-deoxycytidine (decitabine, 5-AZA-CdR). Although 5-AZA-CdR can induce complete remissions in patients with AML, most patients relapse. The effectiveness of this therapy may be limited by the inability of 5-AZA-CdR to reactivate all TSGs due to their silencing by other epigenetic mechanisms such as histone methylation or chromatin compaction. EZH2, a subunit of the polycomb repressive complex 2, catalyzes the methylation of histone H3 lysine 27 (H3K27) to H3K27me3. 3-Deazaneplanocin-A (DZNep), an inhibitor of methionine metabolism, can reactivate genes silenced by H3K27me3 by its inhibition of EZH2. In a previous report, we observed that 5-AZA-CdR, in combination with DZNep, shows synergistic antineoplastic action against AML cells. Gene silencing due to chromatin compaction is attributable to the action of histone deacetylases (HDAC). This mechanism of epigenetic gene silencing can be reversed by HDAC inhibitors such as trichostatin-A (TSA). Silent TSGs that cannot be reactivated by 5-AZA-CdR or DZNep have the potential to be reactivated by TSA. This provides a rationale for the use of HDAC inhibitors in combination with 5-AZA-CdR and DZNep to treat AML.
The triple combination of 5-AZA-CdR, DZNep, and TSA induced a remarkable synergistic antineoplastic effect against human AML cells as demonstrated by an in vitro colony assay. This triple combination also showed a potent synergistic activation of several key TSGs as determined by real-time PCR. The triple combination was more effective than the combination of two agents or a single agent. Microarray analysis showed that the triple combination generated remarkable changes in global gene expression.
Our data suggest that it may be possible to design a very effective therapy for AML using agents that target the reversal of the following three epigenetic “lock” mechanisms that silence gene expression: DNA methylation, histone methylation, and histone deacetylation. This approach merits serious consideration for clinical investigation in patients with advanced AML.
ABSTRACT: Mutations in the mitochondrial genome are associated with multiple diseases and biological processes; however, little is known about the extent of sequence variation in the mitochondrial transcriptome. By ultra-deeply sequencing mitochondrial RNA (>6000×) from the whole blood of ~1000 individuals from the CARTaGENE project, we identified remarkable levels of sequence variation within and across individuals, as well as sites that show consistent patterns of posttranscriptional modification. Using a genome-wide association study, we find that posttranscriptional modification of functionally important sites in mitochondrial transfer RNAs (tRNAs) is under strong genetic control, largely driven by a missense mutation in MRPP3 that explains ~22% of the variance. These results reveal a major nuclear genetic determinant of posttranscriptional modification in mitochondria and suggest that tRNA posttranscriptional modification may affect cellular energy production.
ABSTRACT: Sickle cell disease (SCD) is a congenital blood disease, affecting predominantly children from sub-Saharan Africa, but also populations world-wide. Although the causal mutation of SCD is known, the sources of clinical variability of SCD remain poorly understood, with only a few highly heritable traits associated with SCD having been identified. Phenotypic heterogeneity in the clinical expression of SCD is problematic for follow-up (FU), management, and treatment of patients. Here we used the joint analysis of gene expression and whole genome genotyping data to identify the genetic regulatory effects contributing to gene expression variation among groups of patients exhibiting clinical variability, as well as unaffected siblings, in Benin, West Africa. We characterized and replicated patterns of whole blood gene expression variation within and between SCD patients at entry to clinic, as well as in follow-up programs. We present a global map of genes involved in the disease through analysis of whole blood sampled from the cohort. Genome-wide association mapping of gene expression revealed 390 peak genome-wide significant expression SNPs (eSNPs) and 6 significant eSNP-by-clinical status interaction effects. The strong modulation of the transcriptome implicates pathways affecting core circulating cell functions and shows how genotypic regulatory variation likely contributes to the clinical variation observed in SCD.
Frontiers in Genetics 02/2014; 5:26. DOI:10.3389/fgene.2014.00026
ABSTRACT: Whole-exome or gene targeted resequencing in hundreds to thousands of individuals has shown that the majority of genetic variants are at low frequency in human populations. Rare variants are enriched for functional mutations and are expected to explain an important fraction of the genetic etiology of human disease, therefore having a potential medical interest. In this work, we analyze the whole-exome sequences of French-Canadian individuals, a founder population with a unique demographic history that includes an original population bottleneck less than 20 generations ago, followed by a demographic explosion, and the whole exomes of French individuals sampled from France. We show that in less than 20 generations of genetic isolation from the French population, the genetic pool of French-Canadians shows reduced levels of diversity, higher homozygosity, and an excess of rare variants with low variant sharing with Europeans. Furthermore, the French-Canadian population contains a larger proportion of putatively damaging functional variants, which could partially explain the increased incidence of genetic disease in the province. Our results highlight the impact of population demography on genetic fitness and the contribution of rare variants to the human genetic variation landscape, emphasizing the need for deep cataloguing of genetic variants by resequencing worldwide human populations in order to truly assess disease risk.
ABSTRACT: Regions of the genome that are under evolutionary constraint across multiple species have previously been used to identify functional sequences in the human genome. Furthermore, it is known that there is an inverse relationship between evolutionary constraint and the allele frequency of a mutation segregating in human populations, implying a direct relationship between interspecies divergence and fitness in humans. Here we utilise this relationship to test differences in the accumulation of putatively deleterious mutations both between populations and on the individual level.
Using whole genome and exome sequencing data from Phase 1 of the 1000 Genomes Project for 1,092 individuals from 14 worldwide populations, we show that minor allele frequency (MAF) varies as a function of constraint around both coding regions and non-coding sites genome-wide, implying that negative, rather than positive, selection primarily drives the distribution of alleles among individuals via background selection. We find a strong relationship between effective population size and the depth of depression in MAF around the most conserved genes, suggesting that populations with smaller effective size are carrying more deleterious mutations, which also translates into higher genetic load when considering the number of putatively deleterious alleles segregating within each population. Finally, given the extreme richness of the data, we are now able to classify individual genomes by the accumulation of mutations at functional sites using high coverage 1000 Genomes data. Using this approach we detect differences between 'healthy' individuals within populations for the distributions of putatively deleterious rare alleles they are carrying.
These findings demonstrate the extent of background selection in the human genome and highlight the role of population history in shaping patterns of diversity between human individuals. Furthermore, we provide a framework for the utility of personal genomic data for the study of genetic fitness and diseases.
ABSTRACT: Albendazole (ABZ), a benzimidazole (BZ) anthelmintic (AH), is commonly used for treatment of soil-transmitted helminths (STHs). Its regular use increases the possibility that BZ resistance may develop, which, in veterinary nematodes is caused by single nucleotide polymorphisms (SNPs) in the β-tubulin gene at positions 200, 167 or 198. The relative importance of these SNPs varies among the different parasitic nematodes of animals studied to date, and it is currently unknown whether any of these are influencing BZ efficacy against STHs in humans. We assessed ABZ efficacy and SNP frequencies before and after treatment of Ascaris lumbricoides, Trichuris trichiura and hookworm infections.
Studies were performed in Haiti, Kenya, and Panama. Stool samples were examined prior to ABZ treatment and two weeks (Haiti), one week (Kenya) and three weeks (Panama) after treatment to determine egg reduction rate (ERR). Eggs were genotyped and frequencies of each SNP assessed.
In T. trichiura, polymorphism was detected at codon 200. Following treatment, there was a significant increase, from 3.1% to 55.3%, of homozygous resistance-type in Haiti, and from 51.3% to 67.8% in Kenya (ERRs were 49.7% and 10.1%, respectively). In A. lumbricoides, a SNP at position 167 was identified at high frequency, both before and after treatment, but ABZ efficacy remained high. In hookworms from Kenya we identified the resistance-associated SNP at position 200 at low frequency before and after treatment while ERR values indicated good drug efficacy.
Albendazole was effective for A. lumbricoides and hookworms. However, ABZ exerts a selection pressure on the β-tubulin gene at position 200 in T. trichiura, possibly explaining only moderate ABZ efficacy against this parasite. In A. lumbricoides, the codon 167 polymorphism seemed not to affect drug efficacy whilst the polymorphism at codon 200 in hookworms was at such low frequency that conclusions cannot be drawn.
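The egg reduction rate (ERR) values quoted above can be illustrated with a small calculation. The sketch below assumes the commonly used group-level definition, ERR = 100 × (1 − mean post-treatment count / mean pre-treatment count), based on arithmetic means; the abstract does not state which variant the authors used, and the function name and egg counts are hypothetical.

```python
from statistics import mean


def egg_reduction_rate(pre_counts, post_counts):
    """Return ERR (%) from group arithmetic means of faecal egg counts.

    Assumed definition (not taken from the abstract):
    ERR = 100 * (1 - mean(post) / mean(pre)).
    """
    pre_mean = mean(pre_counts)
    post_mean = mean(post_counts)
    if pre_mean <= 0:
        raise ValueError("pre-treatment mean egg count must be positive")
    return 100.0 * (1.0 - post_mean / pre_mean)


# Hypothetical eggs-per-gram counts, for illustration only.
pre = [1200, 800, 400]
post = [600, 400, 200]
err = egg_reduction_rate(pre, post)
print(f"ERR = {err:.1f}%")
```

A value near 100% would indicate that treatment removed almost all eggs, whereas low values such as those reported for T. trichiura indicate limited efficacy.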
ABSTRACT: We describe a novel approach to capturing the covariance structure of peripheral blood gene expression that relies on the identification of highly conserved Axes of variation. Starting with a comparison of microarray transcriptome profiles for a new dataset of 189 healthy adult participants in the Emory-Georgia Tech Center for Health Discovery and Well-Being (CHDWB) cohort, with a previously published study of 208 adult Moroccans, we identify nine Axes each with between 99 and 1,028 strongly co-regulated transcripts in common. Each Axis is enriched for gene ontology categories related to sub-classes of blood and immune function, including T-cell and B-cell physiology and innate, adaptive, and anti-viral responses. Conservation of the Axes is demonstrated in each of five additional population-based gene expression profiling studies, one of which is robustly associated with Body Mass Index in the CHDWB as well as Finnish and Australian cohorts. Furthermore, ten tightly co-regulated genes can be used to define each Axis as "Blood Informative Transcripts" (BITs), generating scores that define an individual with respect to the represented immune activity and blood physiology. We show that environmental factors, including lifestyle differences in Morocco and infection leading to active or latent tuberculosis, significantly impact specific Axes, but that there is also significant heritability for the Axis scores. In the context of personalized medicine, reanalysis of the longitudinal profile of one individual during and after infection with two respiratory viruses demonstrates that specific Axes also characterize clinical incidents. This mode of analysis suggests the view that, rather than unique subsets of genes marking each class of disease, differential expression reflects movement along the major normal Axes in response to environmental and genetic stimuli.
ABSTRACT: Background
Congenital multiple intestinal atresia (MIA) is a severe, fatal neonatal disorder, involving the occurrence of obstructions in the small and large intestines ultimately leading to organ failure. Surgical interventions are palliative but do not provide long-term survival. Severe immunodeficiency may be associated with the phenotype. A genetic basis for MIA is likely. We had previously ascertained a cohort of patients of French-Canadian origin, most of whom were deceased as infants or in utero. The goal of the study was to identify the molecular basis for the disease in the patients of this cohort.
Methods
We performed whole exome sequencing on samples from five patients of four families. Validation of mutations and familial segregation was performed using standard Sanger sequencing in these and three additional families with deceased cases. Exon skipping was assessed by reverse transcription-PCR and Sanger sequencing.
Results
Five patients from four different families were each homozygous for a four base intronic deletion in the gene TTC7A, immediately adjacent to a consensus GT splice donor site. The deletion was demonstrated to have deleterious effects on splicing causing the skipping of the attendant upstream coding exon, thereby leading to a predicted severe protein truncation. Parents were heterozygous carriers of the deletion in these families and in two additional families segregating affected cases. In a seventh family, an affected case was compound heterozygous for the same 4bp deletion and a second missense mutation p.L823P, also predicted as pathogenic. No other sequenced genes possessed deleterious variants explanatory for all patients in the cohort. Neither mutation was seen in a large set of control chromosomes.
Conclusions
Based on our genetic results, TTC7A is the likely causal gene for MIA.
Journal of Medical Genetics 02/2013; 50(5). DOI:10.1136/jmedgenet-2012-101483 · 6.34 Impact Factor
ABSTRACT: One of the most rapidly evolving genes in humans, PRDM9, is a key determinant of the distribution of meiotic recombination events. Mutations in this meiotic-specific gene have previously been associated with male infertility in humans and recent studies suggest that PRDM9 may be involved in pathological genomic rearrangements. By studying genomes from families with children affected by B-cell precursor acute lymphoblastic leukemia (B-ALL), we characterized meiotic recombination patterns within a family with two siblings having hyperdiploid childhood ALL and observed unusual localization of maternal recombination events. The mother of the family carries a rare PRDM9 allele, potentially explaining the unusual patterns found. From exomes sequenced in 44 additional parents of children affected with B-ALL, we discovered a substantial and significant excess of rare allelic forms of PRDM9. The rare PRDM9 alleles are transmitted to the affected children in half the cases; nonetheless, there remains a significant excess of rare alleles among patients relative to controls. We successfully replicated this latter observation in an independent cohort of 50 children with B-ALL, where we found an excess of rare PRDM9 alleles in aneuploid and infant B-ALL patients. PRDM9 variability in humans is thought to influence genomic instability, and these data support a potential role for PRDM9 variation in risk of acquiring aneuploidies or genomic rearrangements associated with childhood leukemogenesis.
Genome Research 12/2012; 23(3). DOI:10.1101/gr.144188.112 · 14.63 Impact Factor
ABSTRACT: The CARTaGENE (CaG) study is both a population-based biobank and the largest ongoing prospective health study of men and women in Quebec. In population-based cohorts, participants are not recruited for a particular disease but represent a random selection among the population, minimizing the need to correct for bias in measured phenotypes. CaG targeted the segment of the population that is most at risk of developing chronic disorders, that is 40-69 years of age, from four metropolitan areas in Quebec. Over 20 000 participants consented to visiting 1 of 12 assessment sites where detailed health and socio-demographic information, physiological measures and biological samples (blood, serum and urine) were captured for a total of 650 variables. Significant correlations of diseases and chronic conditions are observed across these regions, implicating complex interactions, some of which we describe for major chronic conditions. The CaG study is one of the few population-based cohorts in the world where blood is stored not only for DNA and protein based science but also for gene expression analyses, opening the door for multiple systems genomics approaches that identify genetic and environmental factors associated with disease-related quantitative traits. Interested researchers are encouraged to submit project proposals on the study website (www.cartagene.qc.ca).
International Journal of Epidemiology 10/2012; 42(5). DOI:10.1093/ije/dys160 · 9.18 Impact Factor
ABSTRACT: The host mechanisms responsible for protection against malaria remain poorly understood, with only a few protective genetic effects mapped in humans. Here, we characterize a host-specific genome-wide signature in whole-blood transcriptomes of Plasmodium falciparum-infected West African children and report a demonstration of genotype-by-infection interactions in vivo. Several associations involve transcripts sensitive to infection and implicate complement system, antigen processing and presentation, and T-cell activation (i.e., SLC39A8, C3AR1, FCGR3B, RAD21, RETN, LRRC25, SLC3A2, and TAPBP), including one association that validated a genome-wide association candidate gene (SCO1), implicating binding variation within a noncoding regulatory element. Gene expression profiles in mice infected with Plasmodium chabaudi revealed and validated similar responses and highlighted specific pathways and genes that are likely important responders in both hosts. These results suggest that host variation and its interplay with infection affect children’s ability to cope with infection and suggest a polygenic model mounted at the transcriptional level for susceptibility.
Proceedings of the National Academy of Sciences 09/2012; 109(42). DOI:10.1073/pnas.1204945109 · 9.67 Impact Factor
ABSTRACT: DNA methylation and histone methylation are both involved in epigenetic regulation of gene expression and their dysregulation can play an important role in leukemogenesis. Aberrant DNA methylation has been reported to silence the expression of tumor suppressor genes in leukemia. Overexpression of the histone methyltransferase, EZH2, a subunit of the polycomb group repressive complex 2 (PRC2), was observed to promote oncogenesis. This is due to aberrant gene silencing by the trimethylation of histone H3 lysine 27 (H3K27me3) by EZH2. Since both these epigenetic silencing events are reversible, they are interesting targets for chemotherapeutic intervention by using an inhibitor of DNA methylation, such as 5-aza-2'-deoxycytidine (5-AZA-CdR), and 3-deazaneplanocin-A (DZNep), an inhibitor of EZH2. Human HL-60 and murine L1210 leukemic cells exposed in vitro to 5-AZA-CdR and DZNep in combination showed a synergistic loss of clonogenicity in a colony assay as compared to each agent alone. This positive chemotherapeutic interaction was also observed in mice with L1210 leukemia. Quantitative PCR showed that the combination also produced a remarkable synergistic activation of the tumor suppressor genes, CDKN1A and FBXO32. Microarray analysis showed that 5-AZA-CdR plus DZNep produced a synergistic activation of >150 genes. Our results indicate that 5-AZA-CdR plus DZNep can reactivate target genes that are silenced by two distinct epigenetic mechanisms leading to a loss of the proliferative potential of leukemic cells.
Leukemia research 04/2012; 36(8):1049-54. DOI:10.1016/j.leukres.2012.03.001 · 2.35 Impact Factor
ABSTRACT: The advent of next generation sequencing technologies has opened new possibilities in the analysis of human disease. In this review we present the main next-generation sequencing technologies, with their major contributions and possible applications to the study of the genetic etiology of complex diseases.
Journal of neuroimmunology 01/2012; 248(1-2):10-22. DOI:10.1016/j.jneuroim.2011.12.017 · 2.47 Impact Factor
ABSTRACT: Gene-environment interactions have long been recognized as a fundamental concept in evolutionary, quantitative, and medical genetics. In the genomics era, study of how environment and genome interact to shape gene expression variation is relevant to understanding the genetic architecture of complex phenotypes. While genetic analysis of gene expression variation focused on main effects, little is known about the extent of interaction effects implicating regulatory variants and their consequences on transcriptional variation. Here we survey the current state of the concept of transcriptional gene-environment interactions and discuss its utility for mapping disease phenotypes in light of the insights gained from genome-wide association studies of gene expression.
Frontiers in Genetics 01/2012; 3:228. DOI:10.3389/fgene.2012.00228
ABSTRACT: J.B.S. Haldane proposed in 1947 that the male germline may be more mutagenic than the female germline. Diverse studies have supported Haldane's contention of a higher average mutation rate in the male germline in a variety of mammals, including humans. Here we present, to our knowledge, the first direct comparative analysis of male and female germline mutation rates from the complete genome sequences of two parent-offspring trios. Through extensive validation, we identified 49 and 35 germline de novo mutations (DNMs) in two trio offspring, as well as 1,586 non-germline DNMs arising either somatically or in the cell lines from which the DNA was derived. Most strikingly, in one family, we observed that 92% of germline DNMs were from the paternal germline, whereas, in contrast, in the other family, 64% of DNMs were from the maternal germline. These observations suggest considerable variation in mutation rates within and between families.
ABSTRACT: A major objective of genomics is to elucidate the mapping between genotypic and phenotypic space as a step toward understanding how small changes in gene function can lead to elaborate phenotypic changes. One approach that has been utilized is to examine overall patterns of covariation between phenotypic variables of interest, such as morphology, physiology, and behavior, and underlying aspects of gene activity, in particular transcript abundance on a genome-wide scale. Numerous studies have demonstrated that such patterns of covariation occur, although these are often between samples with large numbers of unknown genetic differences (different strains or even species) or perturbations of large effect (sexual dimorphism or strong loss-of-function mutations) that may represent physiological changes outside of the normal experiences of the organism. We used weak mutational perturbations in genes affecting wing development in Drosophila melanogaster that influence wing shape relative to a co-isogenic wild type. We profiled transcription of 1150 genes expressed during wing development in 27 heterozygous mutants, as well as their co-isogenic wild type and one additional wild-type strain. Despite finding clear evidence of expression differences between mutants and wild type, transcriptional profiles did not covary strongly with shape, suggesting that information from transcriptional profiling may not generally be predictive of final phenotype. We discuss these results in the light of possible attractor states of gene expression and how this would affect interpretation of covariation between transcriptional profiles and other phenotypes.