-
Jo Ann Banks,
Tomoaki Nishiyama,
Mitsuyasu Hasebe,
John L Bowman,
Michael Gribskov,
Claude dePamphilis,
Victor A Albert,
Naoki Aono,
Tsuyoshi Aoyama,
Barbara A Ambrose, [......],
Uffe Hellsten,
Dominique Loqué,
Robert Otillar,
Asaf Salamov,
Jeremy Schmutz,
Harris Shapiro,
Erika Lindquist,
Susan Lucas,
Daniel Rokhsar,
Igor V Grigoriev
ABSTRACT: Vascular plants appeared ~410 million years ago, then diverged into several lineages of which only two survive: the euphyllophytes (ferns and seed plants) and the lycophytes. We report here the genome sequence of the lycophyte Selaginella moellendorffii (Selaginella), the first nonseed vascular plant genome reported. By comparing gene content in evolutionarily diverse taxa, we found that the transition from a gametophyte- to a sporophyte-dominated life cycle required far fewer new genes than the transition from a nonseed vascular to a flowering plant, whereas secondary metabolic genes expanded extensively and in parallel in the lycophyte and angiosperm lineages. Selaginella differs in posttranscriptional gene regulation, including small RNA regulation of repetitive elements, an absence of the trans-acting small interfering RNA pathway, and extensive RNA editing of organellar genes.
Science 05/2011; 332(6032):960-3. · 31.20 Impact Factor
-
ABSTRACT:
Background
Although the overwhelming majority of genes found in angiosperms are members of gene families, and both gene and genome duplication are pervasive forces in plant genomes, some genes are sufficiently distinct from all other genes in a genome that they can be operationally defined as 'single copy'. Using the gene clustering algorithm MCL-tribe, we have identified a set of 959 genes that are single copy in each of the genomes of Arabidopsis thaliana, Populus trichocarpa, Vitis vinifera and Oryza sativa. To characterize these genes, we have performed a number of analyses examining GO annotations, coding sequence length, number of exons, number of domains, and presence in distant lineages such as Selaginella and Physcomitrella, together with phylogenetic analyses to estimate copy number in other seed plants and to demonstrate their phylogenetic utility. We then provide examples of how these genes may be used in phylogenetic analyses to reconstruct organismal history, both by using extant coverage in EST databases for seed plants and by de novo amplification via RT-PCR in the family Brassicaceae.
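The selection step described above can be illustrated with a minimal sketch. It assumes the clustering output has already been parsed into a mapping from cluster id to (species, gene) pairs; the function name, species labels and toy identifiers are hypothetical and are not taken from the study, and copies in genomes outside the four reference species are simply ignored here.

```python
from collections import Counter

# Hypothetical labels for the four reference genomes named in the abstract.
REFERENCE_GENOMES = ("Arabidopsis", "Populus", "Vitis", "Oryza")

def shared_single_copy(clusters, genomes=REFERENCE_GENOMES):
    """Return ids of clusters with exactly one member gene in every listed genome.

    `clusters` maps a cluster id to a list of (species, gene_id) pairs.
    Genes from species outside `genomes` do not affect the decision.
    """
    selected = []
    for cluster_id, members in clusters.items():
        counts = Counter(species for species, _gene in members)
        if all(counts[g] == 1 for g in genomes):
            selected.append(cluster_id)
    return selected

# Toy input with invented identifiers (not real data).
toy_clusters = {
    "c1": [("Arabidopsis", "a1"), ("Populus", "p1"), ("Vitis", "v1"), ("Oryza", "o1")],
    "c2": [("Arabidopsis", "a2"), ("Arabidopsis", "a3"), ("Populus", "p2"),
           ("Vitis", "v2"), ("Oryza", "o2")],          # duplicated in Arabidopsis
    "c3": [("Arabidopsis", "a4"), ("Populus", "p3"), ("Vitis", "v3")],  # absent in Oryza
    "c4": [("Arabidopsis", "a5"), ("Populus", "p4"), ("Vitis", "v4"), ("Oryza", "o3"),
           ("Selaginella", "s1"), ("Selaginella", "s2")],
}
print(shared_single_copy(toy_clusters))  # ['c1', 'c4']
```

Counting members per species keeps the rule explicit: a cluster qualifies only when each reference genome contributes exactly one gene, so both duplicated and missing genes exclude it.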
Results
There are 959 single copy nuclear genes shared in Arabidopsis, Populus, Vitis and Oryza ["APVO SSC genes"]. The majority of these genes are also present in the Selaginella and Physcomitrella genomes. Public EST sets for 197 species suggest that most of these genes are present across a diverse collection of seed plants, and appear to exist as single or very low copy genes, though exceptions are seen in recently polyploid taxa and in lineages where there is significant evidence for a shared large-scale duplication event. Genes encoding proteins localized in organelles are more commonly single copy than expected by chance, but the evolutionary forces responsible for this bias are unknown.
Regardless of the evolutionary mechanisms responsible for the large number of shared single copy genes in diverse flowering plant lineages, these genes are valuable for phylogenetic and comparative analyses. Eighteen of the APVO SSC genes were amplified in the Brassicaceae using RT-PCR and directly sequenced. Alignments of these sequences provide improved resolution of Brassicaceae phylogeny compared to recent studies using plastid and ITS sequences. An analysis of sequences from 13 APVO SSC genes from 69 species of seed plants, derived mainly from public EST databases, yielded a phylogeny that was largely congruent with prior hypotheses based on multiple plastid sequences. Whereas single gene phylogenies that rely on EST sequences have limited bootstrap support as the result of limited sequence information, concatenated alignments result in phylogenetic trees with strong bootstrap support for already established relationships. Overall, these single copy nuclear genes are promising markers for phylogenetics, and contain a greater proportion of phylogenetically informative sites than commonly used protein-coding sequences from the plastid or mitochondrial genomes.
Conclusions
Putatively orthologous, shared single copy nuclear genes provide a vast source of new evidence for plant phylogenetics, genome mapping, and other applications, as well as a substantial class of genes for which functional characterization is needed. Preliminary evidence indicates that many of the shared single copy nuclear genes identified in this study may be well suited as markers for addressing phylogenetic hypotheses at a variety of taxonomic levels.
BMC Evolutionary Biology. 01/2010;
-
ABSTRACT: Whole genome doubling (WGD), a frequent occurrence during the evolution of the angiosperms, complicates ancestral gene order reconstruction due to the multiplicity of solutions to the genome halving process. Using the genome of a related species (the outgroup) to guide the halving of a WGD descendant attenuates this problem. We investigate a battery of techniques for further improvement, including an unbiased version of the guided genome halving algorithm, reference to two related genomes instead of only one to guide the reconstruction, use of draft genome sequences in contig form only, incorporation of incomplete sets of homology correspondences among the genomes, and addition of large numbers of "singleton" correspondences. We make use of genomic distance, breakpoint reuse rate, dispersion of sets of alternate solutions, and other means to evaluate these techniques, and employ the papaya (Carica papaya) and grapevine (Vitis vinifera) genomes to reconstruct the pre-WGD ancestor of poplar (Populus trichocarpa), as well as an early rosid ancestor. A significant result is that the papaya genome has rearranged at a greater rate from the rosid ancestor than phylogenetic relationships would predict.
Journal of computational biology: a journal of computational molecular cell biology 10/2009; 16(10):1353-67. · 1.69 Impact Factor
-
P Kerr Wall,
Jim Leebens-Mack,
André S Chanderbali,
Abdelali Barakat,
Erik Wolcott,
Haiying Liang,
Lena Landherr,
Lynn P Tomsho,
Yi Hu,
John E Carlson,
Hong Ma,
Stephan C Schuster,
Douglas E Soltis,
Pamela S Soltis,
Naomi Altman,
Claude W dePamphilis
ABSTRACT: We have developed a simulation approach to help determine the optimal mixture of sequencing methods for the most complete and cost-effective transcriptome sequencing. We compared simulation results for traditional capillary sequencing with "Next Generation" (NG) ultra high-throughput technologies. The simulation model was parameterized using mappings of 130,000 cDNA sequence reads to the Arabidopsis genome (NCBI Accession SRA008180.19). We also generated 454-GS20 sequences and de novo assemblies for the basal eudicot California poppy (Eschscholzia californica) and the magnoliid avocado (Persea americana) using a variety of methods for cDNA synthesis.
The Arabidopsis reads tagged more than 15,000 genes, including new splice variants and extended UTR regions. Of the total 134,791 reads (13.8 MB), 119,518 (88.7%) mapped exactly to known exons, while 1,117 (0.8%) mapped to introns, 11,524 (8.6%) spanned annotated intron/exon boundaries, and 3,066 (2.3%) extended beyond the end of annotated UTRs. Sequence-based inference of relative gene expression levels correlated significantly with microarray data. As expected, NG sequencing of normalized libraries tagged more genes than non-normalized libraries, although non-normalized libraries yielded more full-length cDNA sequences. The Arabidopsis data were used to simulate additional rounds of NG and traditional EST sequencing, and various combinations of each. Our simulations suggest a combination of FLX and Solexa sequencing for optimal transcriptome coverage at modest cost. We have also developed ESTcalc (http://fgp.huck.psu.edu/NG_Sims/ngsim.pl), an online web tool that allows users to explore the results of this study by specifying individualized costs and sequencing characteristics.
NG sequencing technologies are a highly flexible set of platforms that can be scaled to suit different project goals. In terms of sequence coverage alone, NG sequencing is a dramatic advance over capillary-based sequencing, but it also presents significant challenges in assembly and sequence accuracy due to short read lengths, method-specific sequencing errors, and the absence of physical clones. These problems may be overcome by hybrid sequencing strategies using a mixture of sequencing methodologies, by new assemblers, and by sequencing more deeply. Sequencing and microarray outcomes from multiple experiments suggest that our simulator will be useful for guiding NG transcriptome sequencing projects in a wide range of organisms.
BMC Genomics 09/2009; 10:347. · 4.07 Impact Factor
-
ABSTRACT: Whole genome doubling (WGD), a frequent occurrence during the evolution of the angiosperms, complicates ancestral gene order reconstruction due to the multiplicity of solutions to the genome halving process. Using the genome of a related species (the outgroup) to guide the halving of a WGD descendant attenuates this problem. We investigate a battery of techniques for further improvement, including an unbiased version of the guided genome halving algorithm, reference to two related genomes instead of only one to guide the reconstruction, use of draft genome sequences in contig form only, incorporation of incomplete sets of homology correspondences among the genomes and addition of large numbers of “singleton” correspondences. We make use of genomic distance, breakpoint reuse rate, dispersion of sets of alternate solutions and other means to evaluate these techniques, while reconstructing the pre-WGD ancestor of Populus trichocarpa as well as an early rosid ancestor.
09/2008: pages 252-264;
-
Dawn Field,
George Garrity,
Tanya Gray,
Norman Morrison,
Jeremy Selengut,
Peter Sterk,
Tatiana Tatusova,
Nicholas Thomson,
Michael J Allen,
Samuel V Angiuoli, [......],
Yoshio Tateno,
Adrian Tett,
Sarah Turner,
David Ussery,
Bob Vaughan,
Naomi Ward,
Trish Whetzel,
Inigo San Gil,
Gareth Wilson,
Anil Wipat
ABSTRACT: With the quantity of genomic data increasing at an exponential rate, it is imperative that these data be captured electronically, in a standard format. Standardization activities must proceed within the auspices of open-access and international working bodies. To tackle the issues surrounding the development of better descriptions of genomic investigations, we have formed the Genomic Standards Consortium (GSC). Here, we introduce the minimum information about a genome sequence (MIGS) specification with the intent of promoting participation in its development and discussing the resources that will be required to develop improved mechanisms of metadata capture and exchange. As part of its wider goals, the GSC also supports improving the 'transparency' of the information contained in existing genomic databases.
Nature Biotechnology 06/2008; 26(5):541-7. · 29.50 Impact Factor
-
ABSTRACT: We improve on guided genome halving algorithms so that several thousand gene sets, each containing two paralogs in the descendant T of the doubling event and their single ortholog from an undoubled reference genome R, can be analyzed to reconstruct the ancestor A of T at the time of doubling. At the same time, large numbers of defective gene sets, either missing one paralog from T or missing their ortholog in R, may be incorporated into the analysis in a consistent way. We apply this genomic rearrangement distance-based approach to the recently sequenced poplar (Populus trichocarpa) and grapevine (Vitis vinifera) genomes, as T and R respectively.
Computational systems bioinformatics / Life Sciences Society. Computational Systems Bioinformatics Conference 02/2008; 7:261-71.
-
ABSTRACT:
Background
Genome evolution is shaped not only by nucleotide substitutions, but also by structural changes including gene and genome duplications, insertions, deletions and gene order rearrangements. The most popular methods for reconstructing phylogeny from genome rearrangements include GRAPPA and MGR. However, these methods are limited to cases where equal gene content or few deletions can be assumed. Since conserved duplicated regions are present in many chloroplast genomes, the inference of inverted repeats is needed in chloroplast phylogeny analysis and ancestral genome reconstruction.
Results
We extend GRAPPA and develop a new method, GRAPPA-IR, to handle chloroplast genomes. A test of GRAPPA-IR using divergent chloroplast genomes from land plants and green algae recovers a phylogeny congruent with prior studies, while analyses that do not consider IR structure fail to obtain the accepted topology. Our extensive simulation study also confirms that GRAPPA has better accuracy than the existing methods.
Conclusions
Tests on a biological and a simulated dataset show GRAPPA-IR can accurately recover the genome phylogeny as well as ancestral gene orders. Close analysis of the ancestral genome structure suggests that genome rearrangement in chloroplasts is probably limited by inverted repeats with a conserved core region. In addition, the boundaries of inverted repeats are hot spots for gene duplications or deletions. The new GRAPPA-IR is available from http://phylo.cse.sc.edu.
BMC Genomics. 01/2008;
-
ABSTRACT: Through multifaceted genome-scale research involving phylogenomics, targeted gene surveys, and gene expression analyses in diverse basal lineages of angiosperms, our studies provide insights into the most recent common ancestor of all extant flowering plants. MADS-box gene duplications have played an important role in the origin and diversification of angiosperms. Furthermore, early angiosperms possessed a diverse tool kit of floral genes and exhibited developmental 'flexibility', with broader patterns of expression of key floral organ identity genes than are found in eudicots. In particular, homologs of B-function MADS-box genes are more broadly expressed across the floral meristem in basal lineages. These results prompted formulation of the 'fading borders' model, which states that the gradual transitions in floral organ morphology observed in some basal angiosperms (e.g. Amborella) result from a gradient in the level of expression of floral organ identity genes across the developing floral meristem.
Trends in Plant Science 09/2007; 12(8):358-67. · 11.05 Impact Factor
-
ABSTRACT:
Background
Members of the genus Cuscuta L. (Convolvulaceae), commonly known as dodders, are epiphytic vines that invade the stems of their hosts with haustorial feeding structures at the points of contact. Although they lack expanded leaves, some species are noticeably chlorophyllous, especially as seedlings and in maturing fruits. Some species are reported as crop pests of worldwide distribution, whereas others are extremely rare and have local distributions and apparent niche specificity. A strong phylogenetic framework for this large genus is essential to understand the interesting ecological, morphological and molecular phenomena that occur within these parasites in an evolutionary context.
Results
Here we present a well-supported phylogeny of Cuscuta using sequences of the nuclear ribosomal internal transcribed spacer and plastid rps2, rbcL and matK from representatives across most of the taxonomic diversity of the genus. We use the phylogeny to interpret morphological and plastid genome evolution within the genus. At least three currently recognized taxonomic sections are not monophyletic and subgenus Cuscuta is unequivocally paraphyletic. Plastid genes are extremely variable with regard to evolutionary constraint, with rbcL exhibiting even higher levels of purifying selection in Cuscuta than in photosynthetic relatives. Nuclear genome size is highly variable within Cuscuta, particularly within subgenus Grammica, and in some cases may indicate the existence of cryptic species in this large clade of morphologically similar species.
Conclusion
Some morphological characters traditionally used to define major taxonomic splits within Cuscuta are homoplastic and are of limited use in defining true evolutionary groups. The chloroplast genome seems to have evolved in a punctuated fashion, with episodes of loss involving suites of genes or tRNAs followed by stabilization of gene content in major clades. Nearly all species of Cuscuta retain some photosynthetic ability, most likely for nutrient apportionment to their seeds, while complete loss of photosynthesis and possible loss of the entire chloroplast genome is limited to a single small clade of outcrossing species found primarily in western South America.
BMC Biology. 01/2007;
-
ABSTRACT:
Background
Some of the most difficult phylogenetic questions in evolutionary biology involve identification of the free-living relatives of parasitic organisms, particularly those of parasitic flowering plants. Consequently, the number of origins of parasitism and the phylogenetic distribution of the heterotrophic lifestyle among angiosperm lineages are unclear.
Results
Here we report the results of a phylogenetic analysis of 102 species of seed plants designed to infer the position of all haustorial parasitic angiosperm lineages using three mitochondrial genes: atp1, coxI, and matR. Overall, the mtDNA phylogeny agrees with independent studies in terms of non-parasitic plant relationships and reveals at least 11 independent origins of parasitism in angiosperms, eight of which consist entirely of holoparasitic species that lack photosynthetic ability. From these results, it can be inferred that modern-day parasites have disproportionately evolved in certain lineages and that the endoparasitic habit has arisen by convergence in four clades. In addition, reduced taxon, single gene analyses revealed multiple horizontal transfers of atp1 from host to parasite lineage, suggesting that parasites may be important vectors of horizontal gene transfer in angiosperms. Furthermore, in Pilostyles we show evidence for a recent host-to-parasite atp1 transfer based on a chimeric gene sequence that indicates multiple historical xenologous gene acquisitions have occurred in this endoparasite. Finally, the phylogenetic relationships inferred for parasites indicate that the origins of parasitism in angiosperms are strongly correlated with horizontal acquisitions of the invasive coxI group I intron.
Conclusion
Collectively, these results indicate that the parasitic lifestyle has arisen repeatedly in angiosperm evolutionary history and results in increasing parasite genomic chimerism over time.
BMC Evolutionary Biology. 01/2007;
-
ABSTRACT:
Background
MicroRNAs (miRNAs) are small RNAs (sRNA) ~21 nucleotides in length that negatively control gene expression by cleaving or inhibiting the translation of target gene transcripts. miRNAs have been extensively analyzed in Arabidopsis and rice and partially investigated in other non-model plant species. To date, 109 and 62 miRNA families have been identified in Arabidopsis and rice respectively. However, only 33 miRNAs have been identified from the genome of the model tree species (Populus trichocarpa), of which 11 are Populus specific. The low number of miRNA families previously identified in Populus, compared with the number of families identified in Arabidopsis and rice, suggests that many miRNAs still remain to be discovered in Populus. In this study, we analyzed expressed small RNAs from leaves and vegetative buds of Populus using high throughput pyrosequencing.
Results
Analysis of almost eighty thousand small RNA reads allowed us to identify 123 new sequences belonging to previously identified miRNA families as well as 48 new miRNA families that could be Populus-specific. Comparison of the organization of miRNA families in Populus, Arabidopsis and rice showed that miRNA family sizes were generally expanded in Populus. The putative targets of non-conserved miRNA include both previously identified targets as well as several new putative target genes involved in development, resistance to stress, and other cellular processes. Moreover, almost half of the genes predicted to be targeted by non-conserved miRNAs appear to be Populus-specific. Comparative analyses showed that genes targeted by conserved and non-conserved miRNAs are biased mainly towards development, electron transport and signal transduction processes. Similar results were found for non-conserved miRNAs from Arabidopsis.
Conclusion
Our results suggest that while there is a conserved set of miRNAs among plant species, a large fraction of miRNAs vary among species. The non-conserved miRNAs may regulate cellular, physiological or developmental processes specific to the taxa that produce them, as appears likely to be the case for those miRNAs that have only been observed in Populus. Non-conserved and conserved miRNAs seem to target genes with similar biological functions, indicating that similar selection pressures are acting on both types of miRNAs. The expansion in the number of most conserved miRNAs in Populus relative to Arabidopsis may be linked to the recent genome duplication in Populus, the slow evolution of the Populus genome, or to differences in the selection pressure on duplicated miRNAs in these species.
BMC Genomics. 01/2007;
-
Jim Leebens-Mack,
Todd Vision,
Eric Brenner,
John E Bowers,
Steven Cannon,
Mark J Clement,
Clifford W Cunningham,
Claude dePamphilis,
Rob deSalle,
Jeff J Doyle, [......],
J Chris Pires,
Yin-Long Qiu,
Seung Y Rhee,
Kimmen Sjölander,
Douglas E Soltis,
Pamela S Soltis,
Dennis W Stevenson,
Kerr Wall,
Tandy Warnow,
Christian Zmasek
ABSTRACT: In the eight years since phylogenomics was introduced as the intersection of genomics and phylogenetics, the field has provided fundamental insights into gene function, genome history and organismal relationships. The utility of phylogenomics is growing with the increase in the number and diversity of taxa for which whole genome and large transcriptome sequence sets are being generated. We assert that the synergy between genomic and phylogenetic perspectives in comparative biology would be enhanced by the development and refinement of minimal reporting standards for phylogenetic analyses. Encouraged by the development of the Minimum Information About a Microarray Experiment (MIAME) standard, we propose a similar roadmap for the development of a Minimal Information About a Phylogenetic Analysis (MIAPA) standard. Key in the successful development and implementation of such a standard will be broad participation by developers of phylogenetic analysis software, phylogenetic database developers, practitioners of phylogenomics, and journal editors.
Omics A Journal of Integrative Biology 02/2006; 10(2):231-7. · 2.44 Impact Factor
-
ABSTRACT:
Background
The magnoliids with four orders, 19 families, and 8,500 species represent one of the largest clades of early diverging angiosperms. Although several recent angiosperm phylogenetic analyses supported the monophyly of magnoliids and suggested relationships among the orders, the limited number of genes examined resulted in only weak support, and these issues remain controversial. Furthermore, considerable incongruence resulted in phylogenetic reconstructions supporting three different sets of relationships among magnoliids and the two large angiosperm clades, monocots and eudicots. We sequenced the plastid genomes of three magnoliids, Drimys (Canellales), Liriodendron (Magnoliales), and Piper (Piperales), and used these data in combination with 32 other angiosperm plastid genomes to assess phylogenetic relationships among magnoliids and to examine patterns of variation of GC content.
Results
The Drimys, Liriodendron, and Piper plastid genomes are very similar in size at 160,604 bp, 159,886 bp, and 160,624 bp, respectively. Gene content and order are nearly identical to many other unrearranged angiosperm plastid genomes, including Calycanthus, the other published magnoliid genome. Overall GC content ranges from 34–39%, and coding regions have a substantially higher GC content than non-coding regions. Among protein-coding genes, GC content varies by codon position (1st position > 2nd position > 3rd position), and it varies by functional group, with photosynthetic genes having the highest percentage and NADH genes the lowest. Phylogenetic analyses using parsimony and likelihood methods and sequences of 61 protein-coding genes provided strong support for the monophyly of magnoliids, and two strongly supported groups were identified, the Canellales/Piperales and the Laurales/Magnoliales. Strong support is reported for monocots and eudicots as sister clades, with magnoliids diverging before the monocot-eudicot split. The trees also provided moderate or strong support for the position of Amborella as sister to a clade including all other angiosperms.
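The per-position GC comparison reported above can be sketched as follows. This is only an illustration of the measurement, assuming an in-frame coding sequence given as a plain string; the function name and the toy sequence are invented, and the authors' actual pipeline is not described in the abstract.

```python
def gc_by_codon_position(cds):
    """Return the GC fraction at codon positions 1, 2 and 3 of an in-frame CDS."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    fractions = []
    for offset in range(3):
        bases = cds[offset:usable:3]  # every third base, starting at this position
        gc = sum(base in "GC" for base in bases)
        fractions.append(gc / len(bases) if bases else 0.0)
    return tuple(fractions)

# Toy sequence (invented): codons ATG GCA TTA GCT TAA
print(gc_by_codon_position("ATGGCATTAGCTTAA"))  # (0.4, 0.4, 0.2)
```

Applying the same function to the concatenated coding sequences of each functional group would give the kind of by-group comparison mentioned in the text.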
Conclusion
Evolutionary comparisons of three new magnoliid plastid genome sequences, combined with other published angiosperm genomes, confirm that GC content is unevenly distributed across the genome by location, codon position, and functional group. Furthermore, phylogenetic analyses provide the strongest support so far for the hypothesis that the magnoliids are sister to a large clade that includes both monocots and eudicots.
BMC Evolutionary Biology. 01/2006;
-
ABSTRACT:
Background
Genome rearrangements influence gene order and configuration of gene clusters in all genomes. Most land plant chloroplast DNAs (cpDNAs) share a highly conserved gene content and, with notable exceptions, a largely co-linear gene order. Conserved gene orders may reflect a slow intrinsic rate of neutral chromosomal rearrangements, or selective constraint. It is unknown to what extent observed changes in gene order are random or adaptive. We investigate the influence of natural selection on gene order in association with an increased rate of chromosomal rearrangement. We use a novel parametric bootstrap approach to test if directional selection is responsible for the clustering of functionally related genes observed in the highly rearranged chloroplast genome of the unicellular green alga Chlamydomonas reinhardtii, relative to ancestral chloroplast genomes.
Results
Ancestral gene orders were inferred and then subjected to simulated rearrangement events under the random breakage model with varying ratios of inversions and transpositions. We found that adjacent chloroplast genes in C. reinhardtii were located on the same strand much more frequently than in simulated genomes that were generated under a random rearrangement process (increased sidedness; p < 0.0001). In addition, functionally related genes were found to be more clustered than those evolved under random rearrangements (p < 0.0001). We report evidence of co-transcription of neighboring genes, which may be responsible for the observed gene clusters in C. reinhardtii cpDNA.
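The sidedness comparison above can be sketched in simplified form. The code below is an assumption-laden illustration, not the authors' parametric bootstrap: it treats the genome as a linear signed gene order (sign = strand), simulates only random inversions (the study also varied the proportion of transpositions), and all function names and parameters are hypothetical.

```python
import random

def sidedness(order):
    """Fraction of adjacent gene pairs that lie on the same strand."""
    pairs = list(zip(order, order[1:]))
    same = sum((a > 0) == (b > 0) for a, b in pairs)
    return same / len(pairs)

def random_inversion(order, rng):
    """Invert a random non-empty segment: reverse its order and flip each strand."""
    i, j = sorted(rng.sample(range(len(order) + 1), 2))
    return order[:i] + [-g for g in reversed(order[i:j])] + order[j:]

def empirical_p(observed, ancestor, n_events, n_sims, seed=0):
    """Share of simulated genomes whose sidedness is at least the observed value."""
    rng = random.Random(seed)
    target = sidedness(observed)
    hits = 0
    for _ in range(n_sims):
        genome = list(ancestor)
        for _ in range(n_events):
            genome = random_inversion(genome, rng)
        if sidedness(genome) >= target:
            hits += 1
    return hits / n_sims

# Toy signed gene order (invented): two of the four adjacent pairs share a strand.
print(sidedness([1, 2, -3, -4, 5]))  # 0.5
```

A small empirical p-value from `empirical_p` would indicate that the observed genome keeps neighbouring genes on the same strand more often than random inversions alone would produce, which is the kind of evidence the abstract summarizes.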
Conclusion
Simulations and experimental evidence suggest that both selective maintenance and directional selection for gene clusters are determinants of chloroplast gene order.
BMC Evolutionary Biology. 01/2006;
-
ABSTRACT: Polyploidy events have played an important role in the evolution of angiosperm genomes. Here, we demonstrate how genomic histories can increase phylogenetic resolution in a gene family, specifically the expansin superfamily of cell wall proteins. There are 36 expansins in Arabidopsis and 58 in rice. Traditional sequence-based phylogenetic trees yield poor resolution below the family level. To improve upon these analyses, we searched for gene colinearity (microsynteny) between Arabidopsis and rice genomic segments containing expansin genes. Multiple rounds of genome duplication and extensive gene loss have obscured synteny. However, by simultaneously aligning groups of up to 10 potentially orthologous segments from the two species, we traced the history of 49 out of 63 expansin-containing segments back to the ancestor of monocots and eudicots. Our results indicate that this ancestor had 15-17 expansin genes, each ancestral to an extant clade. Some clades have strikingly different growth patterns in the rice and Arabidopsis lineages, with more than half of all rice expansins arising from two ancestral genes. Segmental duplications, most of them part of polyploidy events, account for 12 out of 21 new expansin genes in Arabidopsis and 16 out of 44 in rice. Tandem duplications explain most of the rest. We were also able to estimate a minimum of 28 gene deaths in the Arabidopsis lineage and nine in rice. This analysis greatly clarifies expansin evolution since the last common ancestor of monocots and eudicots and the method should be broadly applicable to many other gene families.
The Plant Journal 12/2005; 44(3):409-19. · 6.16 Impact Factor
-
ABSTRACT:
Background
Rates of synonymous nucleotide substitutions are, in general, exceptionally low in plant mitochondrial genomes, several times lower than in chloroplast genomes, 10–20 times lower than in plant nuclear genomes, and 50–100 times lower than in many animal mitochondrial genomes. Several cases of moderate variation in mitochondrial substitution rates have been reported in plants, but these mostly involve correlated changes in chloroplast and/or nuclear substitution rates and are therefore thought to reflect whole-organism forces rather than ones impinging directly on the mitochondrial mutation rate. Only a single case of extensive, mitochondrial-specific rate changes has been described, in the angiosperm genus Plantago.
Results
We explored a second potential case of highly accelerated mitochondrial sequence evolution in plants. This case was first suggested by relatively poor hybridization of mitochondrial gene probes to DNA of Pelargonium hortorum (the common geranium). We found that all eight mitochondrial genes sequenced from P. hortorum are exceptionally divergent, whereas chloroplast and nuclear divergence is unexceptional in P. hortorum. Two mitochondrial genes were sequenced from a broad range of taxa of variable relatedness to P. hortorum, and absolute rates of mitochondrial synonymous substitutions were calculated on each branch of a phylogenetic tree of these taxa. We infer one major, ~10-fold increase in the mitochondrial synonymous substitution rate at the base of the Pelargonium family Geraniaceae, and a subsequent ~10-fold rate increase early in the evolution of Pelargonium. We also infer several moderate to major rate decreases following these initial rate increases, such that the mitochondrial substitution rate has returned to normally low levels in many members of the Geraniaceae. Finally, we find unusually little RNA editing of Geraniaceae mitochondrial genes, suggesting high levels of retroprocessing in their history.
Conclusion
The existence of major, mitochondrial-specific changes in rates of synonymous substitutions in the Geraniaceae implies major and reversible underlying changes in the mitochondrial mutation rate in this family. Together with the recent report of a similar pattern of rate heterogeneity in Plantago, these findings indicate that the mitochondrial mutation rate is a more plastic character in plants than previously realized. Many molecular factors could be responsible for these dramatic changes in the mitochondrial mutation rate, including nuclear gene mutations affecting the fidelity and efficacy of mitochondrial DNA replication and/or repair and – consistent with the lack of RNA editing – exceptionally high levels of "mutagenic" retroprocessing. That the mitochondrial mutation rate has returned to normally low levels in many Geraniaceae raises the possibility that, akin to the ephemerality of mutator strains in bacteria, selection favors a low mutation rate in plant mitochondria.
BMC Evolutionary Biology. 01/2005;
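The abstract above refers to absolute synonymous substitution rates calculated per branch and to ~10-fold rate changes between branches. A minimal sketch of that arithmetic, assuming a branch length in synonymous substitutions per site and a branch duration in millions of years are already available; the function names and the example numbers are hypothetical and are not taken from the paper:

```python
def absolute_rate(branch_ds, branch_duration_myr):
    """Substitutions per synonymous site per year along one branch.

    branch_ds: branch length in synonymous substitutions per site (assumed input).
    branch_duration_myr: time spanned by the branch, in millions of years (assumed input).
    """
    if branch_duration_myr <= 0:
        raise ValueError("branch duration must be positive")
    return branch_ds / (branch_duration_myr * 1e6)


def fold_change(rate, baseline_rate):
    """How many times faster (or slower) a branch evolves than a baseline branch."""
    if baseline_rate <= 0:
        raise ValueError("baseline rate must be positive")
    return rate / baseline_rate


# Hypothetical example: two branches of equal duration with different lengths.
slow = absolute_rate(0.005, 10.0)
fast = absolute_rate(0.05, 10.0)
print(fold_change(fast, slow))
```

Dividing branch length by branch duration is the simplest per-branch estimate; it ignores uncertainty in both the branch length and the divergence date.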
-
ABSTRACT:
Background
The analysis of synonymous and nonsynonymous rates of DNA change can help in the choice among competing explanations for rate variation, such as differences in constraint, mutation rate, or the strength of genetic drift. Nonphotosynthetic plants of the Orobanchaceae have increased rates of DNA change. In this study 38 taxa of Orobanchaceae and relatives were used and 3 plastid genes were sequenced for each taxon.
Results
Phylogenetic reconstructions of relative rates of sequence evolution for three plastid genes (rbcL, matK and rps2) show significant rate heterogeneity among lineages and among genes. Many of the non-photosynthetic plants have increases in both synonymous and nonsynonymous rates, indicating that both (1) selection is relaxed, and (2) there has been a change in the rate at which mutations are entering the population in these species. However, rate increases are not always immediate upon loss of photosynthesis. Overall there is a poor correlation of synonymous and nonsynonymous rates. There is, however, a strong correlation of synonymous rates across the 3 genes studied and the lineage-specific pattern for each gene is strikingly similar. This indicates that the causes of synonymous rate variation are affecting the whole plastid genome in a similar way. There is a weaker correlation across genes for nonsynonymous rates. Here the picture is more complex, as could be expected if there are many causes of variation, differing from taxon to taxon and gene to gene.
Conclusions
The distinctive pattern of rate increases in Orobanchaceae has at least two causes. It is clear that there is a relaxation of constraint in many (though not all) non-photosynthetic lineages. However, some force is also affecting synonymous sites. At this point, it is not possible to tell whether it is generation time, speciation rate, mutation rate, DNA repair efficiency or some combination of these factors.
BMC Evolutionary Biology. 01/2005;
-
Victor Albert,
Douglas Soltis,
John Carlson,
William Farmerie,
P Kerr Wall,
Daniel Ilut,
Teri Solow,
Lukas Mueller,
Lena Landherr,
Yi Hu, [......],
Rafael Perl-Treves,
Scott Schlarbaum,
Barbara Bliss,
Xiaohong Zhang,
Steven Tanksley,
David Oppenheimer,
Pamela Soltis,
Hong Ma, Claude dePamphilis,
James Leebens-Mack
ABSTRACT:
Background
The Floral Genome Project was initiated to bridge the genomic gap between the most broadly studied plant model systems. Arabidopsis and rice, although now completely sequenced and under intensive comparative genomic investigation, are separated by at least 125 million years of evolutionary time, and cannot in isolation provide a comprehensive perspective on structural and functional aspects of flowering plant genome dynamics. Here we discuss new genomic resources available to the scientific community, comprising cDNA libraries and Expressed Sequence Tag (EST) sequences for a suite of phylogenetically basal angiosperms specifically selected to bridge the evolutionary gaps between model plants and provide insights into gene content and genome structure in the earliest flowering plants.
Results
Random sequencing of cDNAs from representatives of phylogenetically important eudicot, non-grass monocot, and gymnosperm lineages has so far (as of 12/1/04) generated 70,514 ESTs and 48,170 assembled unigenes. Efficient sorting of EST sequences into putative gene families based on whole Arabidopsis/rice proteome comparison has permitted ready identification of cDNA clones for finished sequencing. Preliminarily, (i) proportions of functional categories among sequenced floral genes seem representative of the entire Arabidopsis transcriptome, (ii) many known floral gene homologues have been captured, and (iii) phylogenetic analyses of ESTs are providing new insights into the process of gene family evolution in relation to the origin and diversification of the angiosperms.
Conclusion
Initial comparisons illustrate the utility of the EST data sets toward discovery of the basic floral transcriptome. These first findings also afford the opportunity to address a number of conspicuous evolutionary genomic questions, including reproductive organ transcriptome overlap between angiosperms and gymnosperms, genome-wide duplication history, lineage-specific gene duplication and functional divergence, and analyses of adaptive molecular evolution. Since not all genes in the floral transcriptome will be associated with flowering, these EST resources will also be of interest to plant scientists working on other functions, such as photosynthesis, signal transduction, and metabolic pathways.
BMC Plant Biology. 01/2005;
-
ABSTRACT: Loss of selective constraint on a gene may be expected following changes in the environment or life history that render its function unnecessary. The long-term persistence of protein-coding genes after the loss of known functional necessity can occur by chance or because of selective maintenance of an unknown gene function. The selective maintenance of an alternative gene function is not demonstrated by the failure of statistical tests to reject the hypothesis that there has been no change in the degree of constraint on the evolution of coding genes. Maintenance may be inferred, however, when power analyses of such tests demonstrate that there has been a sufficient number of nucleotide substitutions to detect the loss of selective constraint. Here, we describe a power analysis for tests of loss of constraint on protein-coding genes. The power analysis was applied to loss-of-constraint tests for opsin gene evolution in cave-dwelling crayfish and rbcL evolution in nonphotosynthetic parasitic plants. The power of previously applied tests for loss of constraint on cave crayfish opsin genes was insufficient to distinguish between chance retention and selective maintenance of opsin genes. However, codon-based likelihood ratio tests for change in d(N)/d(S) (=omega), the ratio of nonsynonymous to synonymous change, did have sufficient power to detect a loss of constraint on rbcL associated with a loss of photosynthesis in most examples but failed to detect such a change in three independent lineages. We conclude that rbcL has been selectively maintained in these holoparasitic plant lineages. This conclusion suggests that either these taxa are photosynthetic for at least a part of their life or rbcL may have an unknown function in these plants unrelated to photosynthesis.
Molecular Biology and Evolution 09/2002; 19(8):1292-302. · 5.55 Impact Factor
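The likelihood ratio test for a change in omega mentioned in the abstract above compares a null model (one omega throughout) with an alternative model (omega allowed to differ on selected branches). A minimal sketch of the final comparison step, assuming the two maximized log-likelihoods have already been obtained from a codon-model program and that the models differ by exactly one free parameter; the function name and the example values are hypothetical and do not come from the paper:

```python
import math


def lrt_p_value_df1(lnl_null, lnl_alt):
    """Likelihood ratio test for nested models differing by one parameter.

    lnl_null: maximized log-likelihood of the constrained model (assumed input).
    lnl_alt: maximized log-likelihood of the less constrained model (assumed input).
    Returns the test statistic and its p-value under a chi-square
    distribution with one degree of freedom.
    """
    # Twice the log-likelihood difference; clamp tiny negative values
    # that can arise from optimizer noise.
    stat = max(2.0 * (lnl_alt - lnl_null), 0.0)
    # For one degree of freedom the chi-square survival function
    # equals erfc(sqrt(stat / 2)), so no external library is needed.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical log-likelihoods for illustration only.
stat, p = lrt_p_value_df1(-1000.0, -998.0)
print(stat, round(p, 4))
```

A non-significant result from this test is informative only when the test has enough power, which is the point the abstract makes about distinguishing chance retention from selective maintenance.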