ABSTRACT: Genetic mapping of quantitative traits requires genotypic data for large numbers of markers in many individuals. For such studies, the use of large single nucleotide polymorphism (SNP) genotyping arrays still offers the most cost-effective solution. Herein we report on the design and performance of a SNP genotyping array for Populus trichocarpa (black cottonwood). This genotyping array was designed with SNPs pre-ascertained in 34 wild accessions covering most of the species' latitudinal range. We adopted a candidate gene approach to the array design that resulted in the selection of 34 131 SNPs, the majority of which are located in, or within 2 kb of, 3543 candidate genes. A subset of the SNPs on the array (539) was selected based on patterns of variation among the SNP discovery accessions. We show that more than 95% of the loci produce high quality genotypes and that the genotyping error rate for these is likely below 2%. We demonstrate that even among small numbers of samples (n = 10) from local populations over 84% of loci are polymorphic. We also tested the applicability of the array to other species in the genus and found that the number of polymorphic loci decreases rapidly with genetic distance, with the largest numbers detected in other species in section Tacamahaca. Finally, we provide evidence for the utility of the array to address evolutionary questions such as intraspecific studies of genetic differentiation, species assignment and the detection of natural hybrids.
ABSTRACT: Pyrenophora tritici-repentis is the necrotrophic fungus that causes tan spot of wheat, a disease whose contribution to crop loss has increased significantly during the last few decades. Pathogenicity by this fungus is attributed to the production of host-selective toxins (HST), which are recognized by their host in a genotype-specific manner. To better understand the mechanisms that have led to the increase in disease incidence related to this pathogen, we sequenced the genomes of three P. tritici-repentis isolates. A pathogenic isolate that produces two known HSTs was used to assemble a reference nuclear genome of approximately 40 Mb composed of 11 chromosomes that encode 12,141 predicted genes. Comparison of the reference genome with those of a pathogenic isolate that produces a third HST, and a nonpathogenic isolate, showed the nonpathogen genome to be more diverged than those of the two pathogens. Examination of gene-coding regions has provided candidate pathogen-specific proteins and revealed gene families that may play a role in a necrotrophic lifestyle. Analysis of transposable elements suggests that their presence in the genome of pathogenic isolates contributes to the creation of novel genes, effector diversification, possible horizontal gene transfer events, identified copy number variation, and the first example of transduplication by DNA transposable elements in fungi. Overall, comparative analysis of these genomes provides evidence that pathogenicity in this species arose through an influx of transposable elements, which created a genetically flexible landscape that can easily respond to environmental changes.
ABSTRACT: The genomes of five Cochliobolus heterostrophus strains, two Cochliobolus sativus strains, three additional Cochliobolus species (Cochliobolus victoriae, Cochliobolus carbonum, Cochliobolus miyabeanus), and closely related Setosphaeria turcica were sequenced at the Joint Genome Institute (JGI). The datasets were used to identify SNPs between strains and species, unique genomic regions, core secondary metabolism genes, and small secreted protein (SSP) candidate effector encoding genes with a view towards pinpointing structural elements and gene content associated with specificity of these closely related fungi to different cereal hosts. Whole-genome alignment shows that three to five percent of each genome differs between strains of the same species, while a quarter of each genome differs between species. On average, SNP counts among field isolates of the same C. heterostrophus species are more than 25× higher than those between inbred lines and 50× lower than SNPs between Cochliobolus species. The suites of nonribosomal peptide synthetase (NRPS), polyketide synthase (PKS), and SSP-encoding genes are astoundingly diverse among species but remarkably conserved among isolates of the same species, whether inbred or field strains, except for defining examples that map to unique genomic regions. Functional analysis of several strain-unique PKSs and NRPSs reveals a strong correlation with a role in virulence.
ABSTRACT: The DNA sequences of chromosomes I and II of Rhodobacter sphaeroides strain 2.4.1 have been revised, and the annotation of the entire genomic sequence, including both chromosomes and the five plasmids, has been updated. Errors in the originally published sequence have been corrected, and ∼11% of the coding regions in the original sequence have been affected by the revised annotation.
Journal of bacteriology 12/2012; 194(24):7016-7. · 3.94 Impact Factor
ABSTRACT: • Plant population genomics informs evolutionary biology, breeding, conservation and bioenergy feedstock development. For example, the detection of reliable phenotype-genotype associations and molecular signatures of selection requires a detailed knowledge about genome-wide patterns of allele frequency variation, linkage disequilibrium and recombination. • We resequenced 16 genomes of the model tree Populus trichocarpa and genotyped 120 trees from 10 subpopulations using 29 213 single-nucleotide polymorphisms. • Significant geographic differentiation was present at multiple spatial scales, and range-wide latitudinal allele frequency gradients were strikingly common across the genome. The decay of linkage disequilibrium with physical distance was slower than expected from previous studies in Populus, with r² dropping below 0.2 within 3-6 kb. Consistent with this, estimates of recent effective population size from linkage disequilibrium (Nₑ ≈ 4000-6000) were remarkably low relative to the large census sizes of P. trichocarpa stands. Fine-scale rates of recombination varied widely across the genome, but were largely predictable on the basis of DNA sequence and methylation features. • Our results suggest that genetic drift has played a significant role in the recent evolutionary history of P. trichocarpa. Most importantly, the extensive linkage disequilibrium detected suggests that genome-wide association studies and genomic selection in undomesticated populations may be more feasible in Populus than previously assumed.
New Phytologist 08/2012; 196(3):713-725. · 6.74 Impact Factor
ABSTRACT: MOTIVATION: The sequencing of over a thousand natural strains of the model plant Arabidopsis thaliana is producing unparalleled information at the genetic level for plant researchers. To enable the rapid exploitation of these data for functional proteomics studies, we have created a resource for the visualization of protein information and proteomic datasets for sequenced natural strains of A. thaliana. RESULTS: The 1001 Proteomes portal can be used to visualize amino acid substitutions or non-synonymous single-nucleotide polymorphisms in individual proteins of A. thaliana based on the reference genome Col-0. We have used the available processed sequence information to analyze the conservation of known residues subject to protein phosphorylation among these natural strains. The substitution of amino acids in A. thaliana natural strains is heavily constrained and is likely a result of the conservation of functional attributes within proteins. At a practical level, we demonstrate that this information can be used to clarify ambiguously defined phosphorylation sites from phosphoproteomic studies. Protein sets of available natural variants are available for download to enable proteomic studies on these accessions. Together this information can be used to uncover the possible roles of specific amino acids in determining the structure and function of proteins in the model plant A. thaliana. An online portal to enable the community to exploit these data can be accessed at http://1001proteomes.masc-proteomics.org/
ABSTRACT: The ND18 strain of Barley stripe mosaic virus (BSMV) infects several lines of Brachypodium distachyon, a recently developed model system for genomics research in cereals. Among the inbred lines tested, Bd3-1 is highly resistant at 20 to 25 °C, whereas Bd21 is susceptible and infection results in an intense mosaic phenotype accompanied by high levels of replicating virus. We generated an F6:7 recombinant inbred line (RIL) population from a cross between Bd3-1 and Bd21 and used the RILs, and an F2 population of a second Bd21 × Bd3-1 cross, to evaluate the inheritance of resistance. The results indicate that resistance segregates as expected for a single dominant gene, which we have designated Barley stripe mosaic virus resistance 1 (Bsr1). We constructed a genetic linkage map of the RIL population using SNP markers to map this gene to within 705 kb of the distal end of the top of chromosome 3. Additional CAPS and Indel markers were used to fine map Bsr1 to a 23 kb interval containing five putative genes. Our study demonstrates the power of using RILs to rapidly map the genetic determinants of BSMV resistance in Brachypodium. Moreover, the RILs and their associated genetic map, when combined with the complete genomic sequence of Brachypodium, provide new resources for genetic analyses of many other traits.
PLoS ONE 01/2012; 7(6):e38333. · 3.73 Impact Factor
ABSTRACT: Classical forward genetics has been foundational to modern biology, and has been the paradigm for characterizing the role of genes in shaping phenotypes for decades. In recent years, reverse genetics has been used to identify the functions of genes, via the intentional introduction of variation and subsequent evaluation in physiological, molecular, and even population contexts. These approaches are complementary and whole genome analysis serves as a bridge between the two. We report in this article the whole genome sequencing of eighteen classical mutant strains of Neurospora crassa and the putative identification of the mutations associated with corresponding mutant phenotypes. Although some strains carry multiple unique nonsynonymous, nonsense, or frameshift mutations, the combined power of limiting the scope of the search based on genetic markers and of using a comparative analysis among the eighteen genomes provides strong support for the association between mutation and phenotype. For ten of the mutants, the mutant phenotype is recapitulated in classical or gene deletion mutants in Neurospora or other filamentous fungi. From thirteen to 137 nonsense mutations are present in each strain and indel sizes are shown to be highly skewed in gene coding sequence. Significant additional genetic variation was found in the eighteen mutant strains, and this variability defines multiple alleles of many genes. These alleles may be useful in further genetic and molecular analysis of known and yet-to-be-discovered functions and they invite new interpretations of molecular and genetic interactions in classical mutant strains.
ABSTRACT: Agenesis of the corpus callosum (AgCC) is a congenital brain malformation that occurs in approximately 1:1,000-1:6,000 births. Several syndromes associated with AgCC have been traced to single gene mutations; however, the majority of AgCC causes remain unidentified. We investigated a mother and two children who all shared complete AgCC and a chromosomal deletion at 1q42. We fine mapped this deletion and show that it includes Disrupted-in-Schizophrenia 1 (DISC1), a gene implicated in schizophrenia and other psychiatric disorders. Furthermore, we report a de novo chromosomal deletion at 1q42.13 to q44, which includes DISC1, in another individual with AgCC. We resequenced DISC1 in a cohort of 144 well-characterized AgCC individuals and identified 20 sequence changes, of which 4 are rare potentially pathogenic variants. Two of these variants were undetected in 768 control chromosomes. One of these is a splice site mutation at the 5' boundary of exon 11 that dramatically reduces full-length mRNA expression of DISC1, but not of shorter forms. We investigated the developmental expression of mouse DISC1 and find that it is highly expressed in the embryonic corpus callosum at a critical time for callosal formation. Taken together our results suggest a significant role for DISC1 in corpus callosum development.
American Journal of Medical Genetics Part A 08/2011; 155A(8):1865-76. · 2.30 Impact Factor
ABSTRACT: While the bulk of the finished microbial genomes sequenced to date are derived from cultured bacterial and archaeal representatives, the vast majority of microorganisms elude current culturing attempts, severely limiting the ability to recover complete or even partial genomes from these environmental species. Single cell genomics is a novel culture-independent approach, which enables access to the genetic material of an individual cell. No single cell genome has to our knowledge been closed and finished to date. Here we report the completed genome from an uncultured single cell of Candidatus Sulcia muelleri DMIN. Digital PCR on single symbiont cells isolated from the bacteriome of the green sharpshooter Draeculacephala minerva allowed us to assess that this bacterium is polyploid with genome copies ranging from approximately 200-900 per cell, making it a most suitable target for single cell finishing efforts. For single cell shotgun sequencing, an individual Sulcia cell was isolated and whole genome amplified by multiple displacement amplification (MDA). Sanger-based finishing methods allowed us to close the genome. To verify the correctness of our single cell genome and exclude MDA-derived artifacts, we independently shotgun sequenced and assembled the Sulcia genome from pooled bacteriomes using a metagenomic approach, yielding a nearly identical genome. Four variations we detected appear to be genuine biological differences between the two samples. Comparison of the single cell genome with bacteriome metagenomic sequence data detected two single nucleotide polymorphisms (SNPs), indicating extremely low genetic diversity within a Sulcia population. This study demonstrates the power of single cell genomics to generate a complete, high quality, non-composite reference genome within an environmental sample, which can be used for population genetic analyses.
PLoS ONE 01/2010; 5(4):e10314. · 3.73 Impact Factor
ABSTRACT: Trichoderma reesei (teleomorph Hypocrea jecorina) is the main industrial source of cellulases and hemicellulases harnessed for the hydrolysis of biomass to simple sugars, which can then be converted to biofuels such as ethanol and other chemicals. The highly productive strains in use today were generated by classical mutagenesis. To learn how cellulase production was improved by these techniques, we performed massively parallel sequencing to identify mutations in the genomes of two hyperproducing strains (NG14, and its direct improved descendant, RUT C30). We detected a surprisingly high number of mutagenic events: 223 single nucleotide variants, 15 small deletions or insertions, and 18 larger deletions, leading to the loss of more than 100 kb of genomic DNA. From these events, we report previously undocumented non-synonymous mutations in 43 genes that are mainly involved in nuclear transport, mRNA stability, transcription, secretion/vacuolar targeting, and metabolism. This homogeneity of functional categories suggests that multiple changes are necessary to improve cellulase production and not simply a few clear-cut mutagenic events. Phenotype microarrays show that some of these mutations result in strong changes in the carbon assimilation pattern of the two mutants with respect to the wild-type strain QM6a. Our analysis provides genome-wide insights into the changes induced by classical mutagenesis in a filamentous fungus and suggests areas for the generation of enhanced T. reesei strains for industrial applications such as biofuel production.
Proceedings of the National Academy of Sciences 09/2009; 106(38):16151-6. · 9.74 Impact Factor
ABSTRACT: Genes with overlapping expression and function may gradually diverge despite retaining some common functions. To test whether such genes show distinct patterns of molecular evolution within species, we examined sequence variation at the bric à brac (bab) locus of Drosophila melanogaster. This locus is composed of two anciently duplicated paralogs, bab1 and bab2, which are involved in patterning the adult abdomen, legs, and ovaries. We have sequenced the 148 kb genomic region spanning the bab1 and bab2 genes from 94 inbred lines of D. melanogaster sampled from a single location. Two non-coding regions, one in each paralog, appear to be under selection. The strongest evidence of directional selection is found in a region of bab2 that has no known functional role. The other region is located in the bab1 paralog and is known to contain a cis-regulatory element that controls sex-specific abdominal pigmentation. The coding region of bab1 appears to be under stronger functional constraint than the bab2 coding sequences. Thus, the two paralogs are evolving under different selective regimes in the same natural population, illuminating the different evolutionary trajectories of partially redundant duplicate genes.
Journal of Molecular Evolution 09/2009; 69(2):194-202. · 2.15 Impact Factor
ABSTRACT: N-myristoylation is a common form of co-translational protein fatty acylation resulting from the attachment of myristate to a required N-terminal glycine residue. We show that aberrantly acquired N-myristoylation of SHOC2, a leucine-rich repeat-containing protein that positively modulates RAS-MAPK signal flow, underlies a clinically distinctive condition of the neuro-cardio-facial-cutaneous disorders family. Twenty-five subjects with a relatively consistent phenotype previously termed Noonan-like syndrome with loose anagen hair (MIM607721) shared the 4A>G missense change in SHOC2 (producing an S2G amino acid substitution) that introduces an N-myristoylation site, resulting in aberrant targeting of SHOC2 to the plasma membrane and impaired translocation to the nucleus upon growth factor stimulation. Expression of SHOC2(S2G) in vitro enhanced MAPK activation in a cell type-specific fashion. Induction of SHOC2(S2G) in Caenorhabditis elegans engendered protruding vulva, a neomorphic phenotype previously associated with aberrant signaling. These results document the first example of an acquired N-terminal lipid modification of a protein causing human disease.
ABSTRACT: We have generated extreme ionizing radiation resistance in a relatively sensitive bacterial species, Escherichia coli, by directed evolution. Four populations of Escherichia coli K-12 were derived independently from strain MG1655, with each specifically adapted to survive exposure to high doses of ionizing radiation. D37 values for strains isolated from two of the populations approached that exhibited by Deinococcus radiodurans. Complete genomic sequencing was carried out on nine purified strains derived from these populations. Clear mutational patterns were observed that both pointed to key underlying mechanisms and guided further characterization of the strains. In these evolved populations, passive genomic protection is not in evidence. Instead, enhanced recombinational DNA repair makes a prominent but probably not exclusive contribution to genome reconstitution. Multiple genes, multiple alleles of some genes, multiple mechanisms, and multiple evolutionary pathways all play a role in the evolutionary acquisition of extreme radiation resistance. Several mutations in the recA gene and a deletion of the e14 prophage both demonstrably contribute to and partially explain the new phenotype. Mutations in additional components of the bacterial recombinational repair system and the replication restart primosome are also prominent, as are mutations in genes involved in cell division, protein turnover, and glutamate transport. At least some evolutionary pathways to extreme radiation resistance are constrained by the temporally ordered appearance of specific alleles.
Journal of bacteriology 07/2009; 191(16):5240-52. · 3.94 Impact Factor
ABSTRACT: Forward genetic screens with ENU (N-ethyl-N-nitrosourea) mutagenesis can facilitate gene discovery, but mutation identification is often difficult. We present the first study in which an ENU-induced mutation was identified by massively parallel DNA sequencing. This mutation causes heterotaxy and complex congenital heart defects and was mapped to a 2.2-Mb interval on mouse chromosome 7. Massively parallel sequencing of the entire 2.2-Mb interval identified 2 single-base substitutions, one in an intergenic region and a second causing replacement of a highly conserved cysteine with arginine (C193R) in the gene Megf8. Megf8 is evolutionarily conserved from human to fruit fly, and is observed to be ubiquitously expressed. Morpholino knockdown of Megf8 in zebrafish embryos resulted in a high incidence of heterotaxy, indicating a conserved role in laterality specification. Megf8(C193R) mouse mutants show normal breaking of symmetry at the node, but Nodal signaling failed to be propagated to the left lateral plate mesoderm. Videomicroscopy showed that nodal cilia motility, which is required for left-right patterning, is unaffected. Although this protein is predicted to have receptor function based on its amino acid sequence, confocal imaging surprisingly showed that it is translocated into the nucleus, where it is colocalized with Gfi1b and Baf60C, two proteins involved in chromatin remodeling. Overall, through the recovery of an ENU-induced mutation, we uncovered Megf8 as an essential regulator of left-right patterning.
Proceedings of the National Academy of Sciences 03/2009; 106(9):3219-24. · 9.74 Impact Factor
ABSTRACT: It has been suggested that autism, like other complex genetic disorders, may benefit from the study of rare or Mendelian variants associated with syndromic or non-syndromic forms of the disease. However, there are few examples in which common variation in genes causing a Mendelian neuropsychiatric disorder has been shown to contribute to disease susceptibility in an allied common condition. Joubert syndrome (JS) is a rare recessively inherited disorder, with mutations reported at several loci including the gene Abelson's Helper Integration 1 (AHI1). A significant proportion of patients with JS, in some studies up to 40%, have been diagnosed with autism spectrum disorder (ASD) and several linkage studies in ASD have nominally implicated the region on 6q where AHI1 resides. To evaluate AHI1 in ASD, we performed a three-stage analysis of AHI1 as an a priori candidate gene for autism. Re-sequencing was first used to screen AHI1, followed by two subsequent association studies, one limited and one covering the gene more completely, in Autism Genetic Resource Exchange (AGRE) families. In stage 3, we found evidence of an associated haplotype in AHI1 with ASD after correction for multiple comparisons, in a region of the gene that had been previously associated with schizophrenia. These data suggest a role for AHI1 in common disorders affecting human cognition and behavior.
Human Molecular Genetics 10/2008; 17(24):3887-96. · 7.69 Impact Factor
ABSTRACT: Noonan and LEOPARD syndromes are developmental disorders with overlapping features, including cardiac abnormalities, short stature and facial dysmorphia. Increased RAS signaling owing to PTPN11, SOS1 and KRAS mutations causes approximately 60% of Noonan syndrome cases, and PTPN11 mutations cause 90% of LEOPARD syndrome cases. Here, we report that 18 of 231 individuals with Noonan syndrome without known mutations (corresponding to 3% of all affected individuals) and two of six individuals with LEOPARD syndrome without PTPN11 mutations have missense mutations in RAF1, which encodes a serine-threonine kinase that activates MEK1 and MEK2. Most mutations altered a motif flanking Ser259, a residue critical for autoinhibition of RAF1 through 14-3-3 binding. Of 19 subjects with a RAF1 mutation in two hotspots, 18 (or 95%) showed hypertrophic cardiomyopathy (HCM), compared with the 18% prevalence of HCM among individuals with Noonan syndrome in general. Ectopically expressed RAF1 mutants from the two HCM hotspots had increased kinase activity and enhanced ERK activation, whereas non-HCM-associated mutants were kinase impaired. Our findings further implicate increased RAS signaling in pathological cardiomyocyte hypertrophy.