Article

The complete plastome of Blidingia marginata and comparative analysis with the relative species in Ulvales


Abstract

In the present study, the whole chloroplast (cp) genome of B. marginata was characterized for the first time and its genomic features were compared with those of six related species in Ulvales. The cp genome of B. marginata was 170,562 bp in length, with a general structure similar to that of Ulva and Pseudendoclonium but a different GC content. A total of 113 unique genes were annotated, including 84 protein-coding genes, 26 tRNA genes and 3 rRNA genes. Comparison of the locally collinear blocks revealed a high level of rearrangement and small syntenic blocks. Codon usage analysis identified 30 biased codons ending in A or U. Sequence analysis detected a total of 23 forward repeats, 18 palindromic repeats, 8 reverse repeats and 38 SSRs of different types. The phylogenetic analyses based on the entire cp genome suggested that Blidingia is closer to Ulva than to Pseudendoclonium. The entire cp genome of Blidingia provides a valuable resource for further studies.
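The abstract reports GC content and SSR counts without giving the detection criteria. The following is a minimal Python sketch of how such statistics are commonly computed. The function names and the minimum repeat thresholds in `MIN_REPEATS` are illustrative assumptions, not the authors' settings or software.

```python
import re

def gc_content(seq: str) -> float:
    """Return the GC percentage of a DNA sequence, counting only A/C/G/T."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / acgt

# Hypothetical minimum repeat counts per motif length (assumed, not from the paper).
MIN_REPEATS = {1: 10, 2: 5, 3: 4, 4: 3, 5: 3, 6: 3}

def find_ssrs(seq: str, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in sorted(min_repeats.items()):
        # A motif of unit_len bases followed by at least (min_rep - 1) more copies.
        pattern = re.compile(r"(([ACGT]{%d})\2{%d,})" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(2)
            # Skip motifs that are repeats of a shorter unit (e.g. "AA", "ATAT"),
            # so each tract is reported once under its shortest motif.
            if any(motif == motif[:k] * (unit_len // k)
                   for k in range(1, unit_len) if unit_len % k == 0):
                continue
            hits.append((match.start(), motif, len(match.group(1)) // unit_len))
    return sorted(hits)
```

On a real plastome the sequence would be read from a FASTA or GenBank file. The thresholds strongly affect how many SSRs are reported, so counts from different studies are comparable only when the criteria match.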


... The total GC contents of the 30 Ulva chloroplast genomes range from 23.7% to 28.95% (Table 1), indicating a strong preference for A/T bases in Ulva chloroplast genomes. A strong A/T bias was also found in the chloroplast genomes of other Ulvophyceae groups such as Blidingia marginata [68], Trentepohlia [69], Cephaleuros [69], and Halimedineae [70], showing that closely related Ulvophyceae taxa have similar codon usage biases. This A/T bias in chloroplast genomes is absent from Ulva nuclear genomes [71,72]. ...
Article
Full-text available
Background: Ulva is a globally distributed genus with ecological and economic significance, yet the codon usage bias of the Ulva chloroplast genome remains poorly understood. Methods: We assessed the Ulva chloroplast genome codon usage patterns and their drivers by analyzing 30 genomes across 16 Ulva species. Results: The nucleotide composition analysis demonstrated that Ulva chloroplast genomes are rich in A/T and prefer codons ending in A/T. The relative synonymous codon usage analysis suggested that related species have similar codon usage patterns. A total of 25 high-frequency codons and 7–14 optimal codons were identified in these chloroplast genomes. The ENC values ranged from 31.40 to 32.76, all of which are less than 35, illustrating a strong codon bias in the genus Ulva. Our comparative analyses suggested that natural selection played the main role in the formation of the codon usage bias. Furthermore, the correlation analysis indicated that base composition and gene expression levels influence the codon usage bias. Conclusions: This study provides the first comprehensive analysis of the codon usage patterns in Ulva chloroplast genomes, improving our understanding of the genetics and evolution of these economically and ecologically important macroalgae.
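The relative synonymous codon usage (RSCU) mentioned above is conventionally the observed count of a codon divided by the mean count of all synonymous codons for the same amino acid, so values above 1 mark codons used more often than expected under equal usage. Below is a minimal Python sketch under that standard definition. The function name `rscu` and the toy input are illustrative assumptions and do not reproduce the study's pipeline. The codon table is built from the standard code, whose codon assignments are the same as those of the bacterial and plant plastid code.

```python
from collections import Counter
from itertools import product

# Standard genetic code in NCBI order (bases T, C, A, G at each position).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def rscu(cds_list):
    """Return {codon: RSCU} pooled over a list of in-frame coding sequences."""
    counts = Counter()
    for cds in cds_list:
        cds = cds.upper().replace("U", "T")
        # Walk the sequence codon by codon, ignoring a trailing partial codon.
        for i in range(0, len(cds) - len(cds) % 3, 3):
            codon = cds[i:i + 3]
            if codon in CODON_TABLE:  # skips codons with ambiguous bases
                counts[codon] += 1

    # Group sense codons into synonymous families by amino acid.
    families = {}
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families.setdefault(aa, []).append(codon)

    values = {}
    for codons in families.values():
        total = sum(counts[c] for c in codons)
        if total == 0 or len(codons) == 1:
            continue  # unused amino acids and single-codon families (Met, Trp)
        for c in codons:
            # observed count / (family total / family size)
            values[c] = counts[c] * len(codons) / total
    return values
```

For example, `rscu(["GCTGCTGCTGCA"])` gives GCT an RSCU of 3.0 and GCA an RSCU of 1.0, because alanine has four synonymous codons and GCT accounts for three of the four observed alanine codons.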
... Considering the chloroplast infA genes lack this intron in all other chlorophycean taxa whose plastomes have been sequenced thus far (e.g. Turmel et al., 2016; Gao et al., 2022; Liu et al., 2023), the 100% occurrence frequency of this intron observed in Ulva plastomes indicated that this intron should have been acquired by the common ancestor of Ulva species. ...
Article
Full-text available
Chloroplast intron infA-62, a degenerated group II intron family, was previously observed to exist specifically in infA genes of chloroplast/plastid genomes (plastomes) in the genus Ulva (Ulvophyceae, Chlorophyta). To understand the occurrence frequency, molecular evolution and phylogenetic utility of this intron family in Ulva species, in this study, we conducted more sampling tests based on newly designed specific primers, analyzed evolutionary features of its secondary structures, and employed intron infA-62 for phylogenetic analysis of Ulva species. The 100% occurrence frequency of this intron has been observed in Ulva plastomes, supporting its acquisition by the earliest progenitor of Ulva species. The GC content of this intron family is unprecedentedly low (21.0-25.2%) for group II introns. The intron infA-62 family is classified as an atypical form of ORF-less group IIB-like secondary structures. Some new evolutionary features have been revealed in this intron family, including the extremely low GC content in some domains (e.g. domains IB, ICa, ID2, IDa, II and IV), a very short stem in domain I, a drastically changing domain IC2, and a completely degenerated domain IV. Secondary structures of this intron family showed progressive RNA structural deviations and species-specific variations during Ulva evolution. Nine mutation hotspots have been detected in loop regions of domains IA, IB, IC1, IC2, ICa, IDa, II, IV and VI. The ML phylogenetic tree constructed from the nucleotide sequences of intron infA-62 showed that Ulva species were classified into two clades representing two Ulva lineages, Ulva I and II, which was consistent with trees based on organelle multigene datasets. Our evidence shows that intron infA-62 coevolved with the plastomes during the evolution and speciation of Ulva species. The intron infA-62, which combines primary sequence and secondary structure, can be used as an efficient phylogenetic marker for identification and classification of Ulva species.
Article
Full-text available
To understand the evolutionary driving forces of chloroplast (or plastid) genomes (plastomes) in the green macroalgal genus Ulva (Ulvophyceae, Chlorophyta), in this study, we sequenced and constructed seven complete chloroplast genomes from five Ulva species, and conducted comparative genomic analysis of Ulva plastomes in Ulvophyceae. Ulva plastome evolution reflects the strong selection pressure driving the compactness of genome organization and the decrease of overall GC composition. The overall plastome sequences including canonical genes, introns, derived foreign sequences and non-coding regions show a synergetic decrease in GC content at varying degrees. Fast degeneration of plastome sequences including non-core genes (minD and trnR3), derived foreign sequences, and noncoding spacer regions was accompanied by the marked decrease of their GC composition. Plastome introns preferentially resided in conserved housekeeping genes with high GC content and long length, as might be related to high GC content of target site sequences recognized by intron-encoded proteins (IEPs), and to more target sites contained by long GC-rich genes. Many foreign DNA sequences integrated into different intergenic regions contain some homologous specific orfs with high similarity, indicating that they could have been derived from the same origin. The invasion of foreign sequences seems to be an important driving force for plastome rearrangement in these IR-lacking Ulva cpDNAs. Gene partitioning pattern has changed and distribution range of gene clusters has expanded after the loss of IR, indicating that genome rearrangement was more extensive and more frequent in Ulva plastomes, which was markedly different from that in IR-containing ulvophycean plastomes. These new insights greatly enhance our understanding of plastome evolution in ecologically important Ulva seaweeds.
Article
Full-text available
The plastid organelle is essential for many vital cellular processes and the growth and development of plants. The availability of a large number of complete plastid genomes could be effectively utilized to understand the evolution of the plastid genomes and phylogenetic relationships among plants. We comprehensively analyzed the plastid genomes of Viridiplantae comprising 3,654 taxa from 298 families and 111 orders and compared the genomic organizations in their plastid genomic DNA among major clades, which include gene gain/loss, gene copy number, GC content, and gene blocks. We discovered that some important genes that exhibit similar functions likely formed gene blocks, such as the psb family presumably showing co-occurrence and forming gene blocks in Viridiplantae. The inverted repeats (IRs) in plastid genomes have doubled in size across land plants, and their GC content is substantially higher than non-IR genes. By employing three different data sets [all nucleotide positions (nt123), only the first and second codon positions (nt12), and amino acids (AA)], our phylogenomic analyses revealed Chlorokybales + Mesostigmatales as the earliest-branching lineage of streptophytes. Hornworts, mosses, and liverworts forming a monophylum were identified as the sister lineage of tracheophytes. Based on nt12 and AA data sets, monocots, Chloranthales and magnoliids are successive sister lineages to the eudicots + Ceratophyllales clade. The comprehensive taxon sampling and analysis of different data sets from plastid genomes recovered well-supported relationships of green plants, thereby contributing to resolving some long-standing uncertainties in the plant phylogeny.
Article
Full-text available
To understand the evolution of Ulva chloroplast genomes at intraspecific and interspecific levels, in this study, three complete chloroplast genomes of Ulva compressa Linnaeus were sequenced and compared with the available Ulva cpDNA data. Our comparative analyses unveiled many noticeable findings. First, genome size variations of Ulva cpDNAs at intraspecific and interspecific levels were mainly caused by differences in gain or loss of group I/II introns, integration of foreign DNA fragments, and content of non-coding intergenic spacer regions. Second, chloroplast genomes of U. compressa shared the same 100 conserved genes as other Ulva cpDNA, whereas Ulva flexuosa appears to be the only Ulva species with the minD gene retained in its cpDNA. Third, five types of group I introns, most of which carry a LAGLIDADG or GIY-YIG homing endonuclease, and three of group II introns, usually encoding a reverse transcriptase/maturase, were detected at 26 insertion sites of 14 host genes in the 23 Ulva chloroplast genomes, and many intron insertion-sites have been found for the first time in Chlorophyta. Fourth, one degenerate group II intron previously ignored has been detected in the infA genes of all Ulva species, but not in the closest neighbor, Pseudoneochloris marina, and the other chlorophycean taxa, indicating that it should be the result of an independent invasion event that occurred in a common ancestor of Ulva species. Finally, the seven U. compressa cpDNAs represented a novel gene order which was different from that of other Ulva cpDNAs. The structure of Ulva chloroplast genomes is not conserved, but remarkably plastic, due to multiple rearrangement events.
Article
Full-text available
In temperate and subarctic regions of the Northern Hemisphere, green algae of the genus Blidingia are a substantial and environment-shaping component of the upper and mid-supralittoral zones. However, taxonomic knowledge on these important green algae is still sparse. In the present study, the molecular diversity and distribution of Blidingia species in the German State of Schleswig-Holstein was examined for the first time, including Baltic Sea and Wadden Sea coasts and the offshore island of Helgoland (Heligoland). In total, three entities were delimited by DNA barcoding, and their respective distributions were verified (in decreasing order of abundance: Blidingia marginata, Blidingia cornuta sp. nov. and Blidingia minima). Our molecular data revealed strong taxonomic discrepancies with historical species concepts, which were mainly based on morphological and ontogenetic characters. Using a combination of molecular, morphological and ontogenetic approaches, we were able to disentangle previous mis-identifications of B. minima and demonstrate that the distribution of B. minima is more restricted than expected within the examined area. Blidingia minima, the type of the genus name Blidingia, is epitypified within this study by material collected at the type locality Helgoland. In contrast with B. minima, B. marginata shows a higher phenotypic plasticity and is more widely distributed in the study area than previously assumed. The third entity, Blidingia cornuta sp. nov., is clearly delimited from other described Blidingia species, due to unique characters in its ontogenetic development and morphology as well as by its tufA and rbcL sequences.
Article
Full-text available
Since 2007, the annual green tide disaster in the Yellow Sea has brought serious economic losses to China. No research has yet addressed the genetic similarities of the four constituent species of green tide algae at the genomic level. We previously determined the mitochondrial genomes of Ulva prolifera, Ulva linza and Ulva flexuosa. In the present work, the mitochondrial genome of another green tide alga (Ulva compressa) was sequenced and analyzed. At 62,311 bp in length, it contained 29 coding genes, 26 tRNAs and 10 open reading frames. By comparing these four mitochondrial genomes, we found that U. compressa was quite different from the other three Ulva species. However, U. prolifera and U. linza were similar in the number, distribution and homology of open reading frames, the evolution and codon variation of tRNAs, and the evolutionary relationships and selection pressure of coding genes. Analysis of simple sequence repeats, tandem repeats and forward repeats further suggested that they evolved from a common origin. In addition, we directly analyzed gene homologies and translocations among the four green tide algae by Mauve alignment and found gene order rearrangements among them. With fast-evolving genomes, these four green algal mitochondria show both conservation and variation, thus opening another window for understanding the origin and evolution of Ulva.
Article
Full-text available
Ulva is a green macroalga often causing a macroalgal bloom, ‘green tide’. Ulva ohnoi is a major species composing the green tide of the southern coastal regions of Japan. Here, we sequenced the complete mitochondrial and chloroplast genomes of the authentic strain of U. ohnoi. The mitochondrial and chloroplast genomes were of 65,326 bp and 103,313 bp, respectively, and the gene content was highly conserved in the Ulva species. The phylogenetic analyses using mitochondrial or chloroplast proteins represented the same topology with high supporting values. These results show that mitochondrial and chloroplast genomes can be used as reliable phylogenetic markers.
Article
Full-text available
Gentiana section Cruciata is widely distributed across Eurasia at high altitudes, and some species in this section are used as traditional Chinese medicine. Accurate identification of these species is important for their utilization and conservation. Due to similar morphological and chemical characteristics, correct discrimination of these species still remains problematic. Here, we sequenced three complete chloroplast (cp) genomes (G. dahurica, G. siphonantha and G. officinalis). We further compared them with the previously published plastomes from sect. Cruciata and developed highly polymorphic molecular markers for species authentication. The eight cp genomes shared the highly conserved structure and contained 112 unique genes arranged in the same order, including 78 protein-coding genes, 30 tRNAs, and 4 rRNAs. We analyzed the repeats and nucleotide substitutions in these plastomes and detected several highly variable regions. We found that four genes (accD, clpP, matK and ycf1) were subject to positive selection, and sixteen InDel-variable loci with high discriminatory powers were selected as candidate barcodes. Our phylogenetic analyses based on plastomes further confirmed the monophyly of sect. Cruciata and primarily elucidated the phylogeny of Gentianales. This study indicated that cp genomes can provide more integrated information for better elucidating the phylogenetic pattern and improving discriminatory power during species authentication.
Article
Full-text available
Buddleja colvilei Hook.f. & Thomson (Scrophulariaceae) is a threatened alpine plant with a distribution throughout the Himalayas, also used as an ornamental plant. The name Buddleja sessilifolia B.S. Sun ex S.Y. Pao was assigned in 1983 to a plant distributed throughout the Gaoligong Mountains, but the name was later placed in synonymy with B. colvilei in the Flora of China. In this study we sequenced the complete chloroplast (cp) genomes of two individuals of B. colvilei and three individuals of B. sessilifolia from across the range. Both molecular and morphological analysis support the revision of B. sessilifolia. The phylogenetic analysis constructed with the whole cp genomes, the large single-copy regions (LSC), small single-copy regions (SSC), inverted repeat (IR) and the nuclear genes 18S/ITS1/5.8S/ITS2/28S all supported B. sessilifolia as a distinct species. Additionally, coalescence-based species delimitation methods (bGMYC, bPTP) using the whole chloroplast datasets also supported B. sessilifolia as a distinct species. The results suggest that the B. sessilifolia lineage was early diverging among the Asian Buddleja species. Overall gene contents were similar and gene arrangements were found to be highly conserved in the two species, however, fixed differences were found between the two species. A total of 474 single nucleotide polymorphisms (SNPs) were identified between the two species. The Principal Coordinate Analysis of the morphological characters resolved two groups and supported B. sessilifolia as a distinct species. Discrimination of B. colvilei and B. sessilifolia using morphological characters and the redescription of B. sessilifolia are detailed here.
Article
Full-text available
DAMBE is a comprehensive software package for genomic and phylogenetic data analysis on Windows, Linux and Macintosh computers. New functions include imputing missing distances and phylogeny simultaneously (paving the way to build large phage and transposon trees), new bootstrapping/jackknifing methods for PhyPA (phylogenetics from pairwise alignments), and an improved function for fast and accurate estimation of the shape parameter of the gamma distribution for fitting rate heterogeneity over sites. The previous method corrects multiple hits for each site independently, whereas DAMBE's new method uses all sites simultaneously for correction. DAMBE, featuring a user-friendly graphic interface, is freely available from http://dambe.bio.uottawa.ca.
Article
Full-text available
Chloroplast genomes have undergone tremendous alterations through the evolutionary history of the green algae (Chloroplastida). This study focuses on the evolution of chloroplast genomes in the siphonous green algae (order Bryopsidales). We present five new chloroplast genomes, which along with existing sequences yield a dataset representing all but one family of the order. Using comparative phylogenetic methods, we investigated the evolutionary dynamics of genomic features in the order. Our results show extensive variation in chloroplast genome architecture and intron content. Variation in genome size is accounted for by the amount of intergenic space and freestanding open reading frames that do not show significant homology to standard plastid genes. We show the diversity of these non-standard genes based on their conserved protein domains, which are often associated with mobile functions (reverse transcriptase/intron maturase, integrases, phage- or plasmid-DNA primases, transposases, ligases). Investigation of the introns showed proliferation of group II introns in the early evolution of the order and their subsequent loss in the core Halimedineae, possibly through RT-mediated intron loss.
Article
Full-text available
Forsythia suspensa is an important medicinal plant and traditionally applied for the treatment of inflammation, pyrexia, gonorrhea, diabetes, and so on. However, there is limited sequence and genomic information available for F. suspensa. Here, we produced the complete chloroplast genomes of F. suspensa using Illumina sequencing technology. F. suspensa is the first sequenced member within the genus Forsythia (Oleaceae). The gene order and organization of the chloroplast genome of F. suspensa are similar to other Oleaceae chloroplast genomes. The F. suspensa chloroplast genome is 156,404 bp in length, exhibits a conserved quadripartite structure with a large single-copy (LSC; 87,159 bp) region, and a small single-copy (SSC; 17,811 bp) region interspersed between inverted repeat (IRa/b; 25,717 bp) regions. A total of 114 unique genes were annotated, including 80 protein-coding genes, 30 tRNA, and four rRNA. The low GC content (37.8%) and codon usage bias for A- or T-ending codons may largely affect gene codon usage. Sequence analysis identified a total of 26 forward repeats, 23 palindrome repeats with lengths >30 bp (identity > 90%), and 54 simple sequence repeats (SSRs) with an average rate of 0.35 SSRs/kb. We predicted 52 RNA editing sites in the chloroplast of F. suspensa, all for C-to-U transitions. IR expansion or contraction and the divergent regions were analyzed among several species including the reported F. suspensa in this study. Phylogenetic analysis based on whole-plastome revealed that F. suspensa, as a member of the Oleaceae family, diverged relatively early from Lamiales. This study will contribute to strengthening medicinal resource conservation, molecular phylogenetic, and genetic engineering research investigations of this species.
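The figures in this abstract can be cross-checked with simple arithmetic: a quadripartite plastome consists of one LSC, one SSC and two IR copies, and SSR density is the SSR count divided by genome length in kilobases. The short Python sketch below uses only numbers quoted in the abstract. The helper names are illustrative assumptions.

```python
def quadripartite_length(lsc: int, ssc: int, ir: int) -> int:
    """Total plastome length: LSC + SSC + two copies of the inverted repeat."""
    return lsc + ssc + 2 * ir

def ssr_density_per_kb(n_ssrs: int, genome_length_bp: int) -> float:
    """SSR count per kilobase of genome."""
    return n_ssrs / (genome_length_bp / 1000)

# Values reported for the F. suspensa chloroplast genome.
total = quadripartite_length(lsc=87_159, ssc=17_811, ir=25_717)
print(total)                                     # 156404
print(round(ssr_density_per_kb(54, total), 2))   # 0.35
```

Both results agree with the reported genome length of 156,404 bp and the stated average of 0.35 SSRs/kb.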
Article
Full-text available
Ulva flexuosa, a green tide alga, has broken out in the Yellow Sea of China during the past ten years. In the present study, we sequenced the chloroplast genome of U. flexuosa and then annotated it and compared it with other chloroplast genomes. The results indicated that chloroplast genomes are highly conserved among Ulva spp. but highly rearranged relative to genomes outside the genus. Although U. flexuosa was closer to U. linza than to U. fasciata in the phylogenetic tree, the average Ka/Ks between U. flexuosa and U. linza, assessed from 67 protein-coding genes, was higher than that between U. flexuosa and the other Ulva species, owing to variation in psbZ, psbM and ycf20. Our results lay the foundation for future studies on the evolution of Ulva chloroplast genomes, as well as for the molecular identification of U. flexuosa varieties.
Article
Full-text available
Ulva linza is one of the causal species of the macroalgal blooms around the Yellow Sea, China. These blooms have become the world’s largest green tide and are disastrous for the ecosystem. We analyzed the whole chloroplast genome sequence for the first time (GenBank accession number KX058323). The circular genome was 86,726 bp long and included 67 protein-coding genes. We then concatenated the amino acid alignments of 44 genes shared among Chlorophyta species to build a phylogenetic tree, which shows that Chlorophyceae and Trebouxiophyceae cluster separately, except for Leptosira terrestris. In this tree, Ulvophyceae, Pedinophyceae, Prasinophytes and Chlorophyta incertae sedis each form an independent cluster and are closer to Trebouxiophyceae in origin.
Article
Full-text available
Chloroplasts play a crucial role in sustaining life on earth. The availability of over 800 sequenced chloroplast genomes from a variety of land plants has enhanced our understanding of chloroplast biology, intracellular gene transfer, conservation, diversity, and the genetic basis by which chloroplast transgenes can be engineered to enhance plant agronomic traits or to produce high-value agricultural or biomedical products. In this review, we discuss the impact of chloroplast genome sequences on understanding the origins of economically important cultivated species and changes that have taken place during domestication. We also discuss the potential biotechnological applications of chloroplast genomes.
Article
Full-text available
We present the 96 005 bp circular chloroplast genome (cpDNA) of Ulva fasciata. This cpDNA was ∼4000 bp smaller than the cpDNA of Ulva sp. UNA00071828; however, this cpDNA was AT rich (75.1%) similar to Ulva sp. The U. fasciata cpDNA was also similar in gene content (101 identified genes) compared to Ulva sp., which included 71 protein-coding genes, 3 ribosomal RNAs (rRNAs) and 27 transfer RNAs (tRNAs). Only one tRNA, trnN(AUU), that was present in Ulva sp. was absent in U. fasciata. Five introns were present in the following genes of U. fasciata: petB (2), psbD (1), psaB (1) and rrl (1). Ulva sp. lacked introns in psbD and psaB, and introns present in atpA and psbB in Ulva sp. were absent in the homologous genes of U. fasciata. A gene arrangement comparison of both Ulva species showed that a ∼27 000 bp segment of DNA consisting of genes psbB to trnT(UGU) was inverted. Furthermore, a phylogenetic analysis of 1135 bp of the rbcL gene confirmed that this cpDNA and the previously published mitochondrial genome from this sample were indeed from U. fasciata.
Article
Full-text available
Previous studies of trebouxiophycean chloroplast genomes revealed little information regarding the evolutionary dynamics of this genome because taxon sampling was too sparse and the relationships between the sampled taxa were unknown. We recently sequenced the chloroplast genomes of 27 trebouxiophycean and two pedinophycean green algae to resolve the relationships among the main lineages recognized for the Trebouxiophyceae. These taxa and the previously sampled members of the Pedinophyceae and Trebouxiophyceae are included in the comparative chloroplast genome analysis we report here. The 38 genomes examined display considerable variability at all levels, except gene content. Our results highlight the high propensity of the rDNA-containing large inverted repeat (IR) to vary in size, gene content and gene order as well as the repeated losses it experienced during trebouxiophycean evolution. Of the seven predicted IR losses, one event demarcates a superclade of 11 taxa representing five late-diverging lineages. IR expansions/contractions account not only for changes in gene content in this region, but also for changes in gene order and gene duplications. Inversions also led to gene rearrangements within the IR, including the reversal or disruption of the rDNA operon in some lineages. Most of the 20 IR-less genomes are more rearranged compared to their IR-containing homologs and tend to show an accelerated rate of sequence evolution. In the IR-less superclade, several ancestral operons were disrupted, a few genes were fragmented, and a subgroup of taxa features a G+C-biased nucleotide composition. Our analyses also unveiled putative cases of gene acquisitions through horizontal transfer. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Article
Full-text available
Actinidia chinensis is an important economic plant belonging to the basal lineage of the asterids. Availability of a complete Actinidia chloroplast genome sequence is crucial to understanding phylogenetic relationships among major lineages of angiosperms and facilitates kiwifruit genetic improvement. We report here the complete nucleotide sequences of the chloroplast genomes for Actinidia chinensis and A. chinensis var. deliciosa obtained through de novo assembly of Illumina paired-end reads produced by total DNA sequencing. The total genome size ranges from 155,446 to 157,557 bp, with an inverted repeat (IR) of 24,013 to 24,391 bp, a large single copy region (LSC) of 87,984 to 88,337 bp and a small single copy region (SSC) of 20,332 to 20,336 bp. The genome encodes 113 different genes, including 79 unique protein-coding genes, 30 tRNA genes and 4 ribosomal RNA genes, with 16 duplicated in the inverted repeats, and a tRNA gene (trnfM-CAU) duplicated once in the LSC region. Comparisons of IR boundaries among four asterid species showed that IR/LSC borders were extended into the 5' portion of the psbA gene and IR contraction occurred in Actinidia. The clpP gene has been lost from the chloroplast genome in Actinidia, and may have been transferred to the nucleus during chloroplast evolution. Twenty-seven polymorphic simple sequence repeat (SSR) loci were identified in the Actinidia chloroplast genome. Maximum parsimony analyses of a 72-gene, 16 taxa angiosperm dataset strongly support the placement of Actinidiaceae in Ericales within the basal asterids.
Article
Full-text available
Bignoniaceae is a Pantropical plant family that is especially abundant in the Neotropics. Members of the Bignoniaceae are diverse in many ecosystems and represent key components of the Tropical flora. Despite the ecological importance of the Bignoniaceae and all the efforts to reconstruct the phylogeny of this group, whole chloroplast genome information has not yet been reported for any members of the family. Here, we report the complete chloroplast genome sequence of Tanaecium tetragonolobum (Jacq.) L.G. Lohmann, which was reconstructed using de novo and reference-based assembly of single-end reads generated by shotgun sequencing of total genomic DNA on an Illumina platform. The gene order and organization of the chloroplast genome of T. tetragonolobum exhibit the general structure of flowering plants and are similar to those of other Lamiales chloroplast genomes. The chloroplast genome of T. tetragonolobum is a circular molecule of 153,776 base pairs (bp) with a quadripartite structure containing two single copy regions, a large single copy region (LSC, 84,612 bp) and a small single copy region (SSC, 17,586 bp) separated by inverted repeat regions (IRs, 25,789 bp). In addition, the chloroplast genome of T. tetragonolobum has 38.3% GC content and includes 121 genes, of which 86 are protein-coding, 31 are transfer RNA, and four are ribosomal RNA. The chloroplast genome of T. tetragonolobum presents a total of 47 tandem repeats and 347 simple sequence repeats (SSRs) with mononucleotides being the most common and di-, tri-, tetra-, and hexanucleotides occurring with less frequency. The results obtained here were compared to other chloroplast genomes of Lamiales available to date, providing new insight into the evolution of chloroplast genomes within Lamiales. Overall, the evolutionary rates of genes in Lamiales are lineage-, locus-, and region-specific, indicating that the evolutionary pattern of nucleotide substitution in chloroplast genomes of flowering plants is complex. The discovery of tandem repeats within T. tetragonolobum and the presence of divergent regions between chloroplast genomes of Lamiales provide the basis for the development of markers at various taxonomic levels. The newly developed markers have the potential to greatly improve the resolution of molecular phylogenies.
Article
Full-text available
Species of Bryopsidales form ecologically important components of seaweed communities worldwide. These siphonous macroalgae are composed of a single giant tubular cell containing millions of nuclei and chloroplasts, and harbor diverse bacterial communities. Little is known about the diversity of chloroplast genomes (cpDNAs) in this group, and about the possible consequences of intracellular bacteria on genome composition of the host. We present the complete cpDNAs of Bryopsis plumosa and Tydemania expeditionis, as well as a re-annotated cpDNA of B. hypnoides, which was shown to contain a higher number of genes than originally published. Chloroplast genomic data were also used to evaluate phylogenetic hypotheses in the Chlorophyta, such as monophyly of the Ulvophyceae (the class in which the order Bryopsidales is currently classified). Both DNAs are circular and lack a large inverted repeat. The cpDNA of B. plumosa is 106,859 bp long and contains 115 unique genes. A 13 kb region was identified with several freestanding open reading frames (ORFs) of putative bacterial origin, including a large ORF (>8 kb) closely related to bacterial rhs-family genes. The cpDNA of T. expeditionis is 105,200 bp long and contains 125 unique genes. As in B. plumosa, several regions were identified with ORFs of possible bacterial origin, including genes involved in mobile functions (transposases, integrases, phage/plasmid DNA primases), and ORFs showing close similarity with bacterial DNA methyltransferases. The cpDNA of B. hypnoides differs from that of B. plumosa mainly in the presence of long intergenic spacers, and a large tRNA region. Chloroplast phylogenomic analyses were largely inconclusive with respect to monophyly of the Ulvophyceae, and the relationship of the Bryopsidales within the Chlorophyta. The cpDNAs of B. plumosa and T. expeditionis are amongst the smallest and most gene dense chloroplast genomes in the core Chlorophyta. 
The presence of bacterial genes, including genes typically found in mobile elements, suggests that these have been acquired through horizontal gene transfer, which may have been facilitated by the occurrence of obligate intracellular bacteria in these siphonous algae.
Article
Sequencing mitochondrial and chloroplast genomes has become an integral part of understanding the genomic machinery and the phylogenetic histories of green algae. Previously, only three chloroplast genomes (Oltmannsiellopsis viridis, Pseudendoclonium akinetum, and Bryopsis hypnoides) and two mitochondrial genomes (O. viridis and P. akinetum) from the class Ulvophyceae had been published. Here, we present the first chloroplast and mitochondrial genomes from the ecologically and economically important marine, green algal genus Ulva. The chloroplast genome of Ulva sp. was 99,983 bp in a circular-mapping molecule that lacked inverted repeats, and thus far, was the smallest ulvophycean plastid genome. This cpDNA was a highly compact, AT-rich genome that contained a total of 102 identified genes (71 protein-coding genes, 28 tRNA genes, and three ribosomal RNA genes). Additionally, five introns were annotated in four genes: atpA (1), petB (1), psbB (2), and rrl (1). The circular-mapping mitochondrial genome of Ulva sp. was 73,493 bp and follows the expanded pattern also seen in other ulvophyceans and trebouxiophyceans. The Ulva sp. mtDNA contained 29 protein-coding genes, 25 tRNA genes, and two rRNA genes for a total of 56 identifiable genes. Ten introns were annotated in this mtDNA: cox1 (4), atp1 (1), nad3 (1), nad5 (1), and rrs (3). Double-cut-and-join (DCJ) values showed that organellar genomes across Chlorophyta are highly rearranged, in contrast to the highly conserved organellar genomes of the red algae (Rhodophyta). A phylogenomic investigation of 51 plastid protein-coding genes showed that Ulvophyceae is not monophyletic, and also placed Oltmannsiellopsis (Oltmannsiellopsidales) and Tetraselmis (Chlorodendrophyceae) close to Ulva (Ulvales) and Pseudendoclonium (Ulothrichales).
Article
Premise of the study: To study population genetics, phylogeography, and hybridization of Nelumbo (Nelumbonaceae), chloroplast microsatellite markers were developed. Methods and results: Seventeen microsatellite loci were identified from the chloroplast genomes of N. nucifera and N. lutea. Polymorphisms were assessed in three populations of N. nucifera and one population of N. lutea. Nine loci were found to be polymorphic in N. nucifera, and all 17 loci were found to be polymorphic in N. lutea. In N. nucifera, the number of alleles per locus ranged from two to six, and the unbiased haploid diversity per locus ranged from 0.198 to 0.790. In N. lutea, the number of alleles ranged from two to four, and the unbiased haploid diversity per locus ranged from 0.245 to 0.694. Conclusions: The identified chloroplast simple sequence repeat markers will be useful for the study of genetic diversity, phylogeography, and identification of Nelumbo cultivars.
Article
The lion's share of bacteria in various environments cannot be cloned in the laboratory and thus cannot be sequenced using existing technologies. A major goal of single-cell genomics is to complement gene-centric metagenomic data with whole-genome assemblies of uncultivated organisms. Assembly of single-cell data is challenging because of highly non-uniform read coverage as well as elevated levels of sequencing errors and chimeric reads. We describe SPAdes, a new assembler for both single-cell and standard (multicell) assembly, and demonstrate that it improves on the recently released E+V-SC assembler (specialized for single-cell data) and on popular assemblers Velvet and SoapDeNovo (for multicell data). SPAdes generates single-cell assemblies, providing information about genomes of uncultivatable bacteria that vastly exceeds what may be obtained via traditional metagenomics studies. SPAdes is available online (http://bioinf.spbau.ru/spades). It is distributed as open source software.
Article
The second Internal Transcribed Spacer (ITS2) is a fast evolving part of the nuclear-encoded rRNA operon located between the 5.8S and 28S rRNA genes. Based on crossing experiments it has been proposed that even a single Compensatory Base Change (CBC) in helices 2 and 3 of the ITS2 indicates sexual incompatibility and thus separates biological species. Taxa without any CBC in these ITS2 regions were designated as a 'CBC clade'. However, in-depth comparative analyses of ITS2 secondary structures, ITS2 phylogeny, the origin of CBCs, and their relationship to biological species have rarely been performed. To gain 'close-up' insights into ITS2 evolution, (1) 86 sequences of ITS2 including secondary structures have been investigated in the green algal order Ulvales (Chlorophyta, Viridiplantae), (2) after recording all existing substitutions, CBCs and hemi-CBCs (hCBCs) were mapped upon the ITS2 phylogeny, rather than merely comparing ITS2 characters among pairs of taxa, and (3) the relation between CBCs, hCBCs, CBC clades, and the taxonomic level of organisms was investigated in detail. High sequence and length conservation allowed the generation of an ITS2 consensus secondary structure, and introduction of a novel numbering system of ITS2 nucleotides and base pairs. Alignments and analyses were based on this structural information, leading to the following results: (1) in the Ulvales, the presence of a CBC is not linked to any particular taxonomic level, (2) most CBC 'clades' sensu Coleman are paraphyletic, and should rather be termed CBC grades, (3) the phenetic approach of pairwise comparison of sequences can be misleading, and thus, CBCs/hCBCs must be investigated in their evolutionary context, including homoplasy events, and (4) CBCs and hCBCs in ITS2 helices evolved independently, and we found no evidence for a CBC that originated via a two-fold hCBC substitution. Our case study revealed several discrepancies between ITS2 evolution in the Ulvales and generally accepted assumptions underlying ITS2 evolution, such as the CBC clade concept. Therefore, we developed a suite of methods providing a critical 'close-up' view into ITS2 evolution by directly tracing the evolutionary history of individual positions, and we caution against a non-critical use of the ITS2 CBC clade concept for species delimitation.
Article
Despite their name, synonymous mutations have significant consequences for cellular processes in all taxa. As a result, an understanding of codon bias is central to fields as diverse as molecular evolution and biotechnology. Although recent advances in sequencing and synthetic biology have helped to resolve longstanding questions about codon bias, they have also uncovered striking patterns that suggest new hypotheses about protein synthesis. Ongoing work to quantify the dynamics of initiation and elongation is as important for understanding natural synonymous variation as it is for designing transgenes in applied contexts.
Article
The green algae belonging to the Chlorophyta, the lineage sister to that comprising the land plants and their charophycean green algal relatives (Streptophyta), have been subdivided into four classes (Prasinophyceae, Ulvophyceae, Trebouxiophyceae, and Chlorophyceae). Yet the Pedinomonadales, an assemblage consisting of tiny, naked uniflagellates with a second basal body, has no clear affiliation with these classes and the branching order of the crown chlorophytes remains unknown. To gain an insight into the phylogenetic position of the Pedinomonadales and the relationships among the recognized chlorophyte classes, we have sequenced the chloroplast genomes of Pedinomonas minor (Pedinomonadales) and of two trebouxiophyceans belonging to the Chlorellales, Parachlorella kessleri (Chlorellaceae) and Oocystis solitaria (Oocystaceae), and compared these genomes with those of previously examined streptophytes and chlorophytes, including Chlorella vulgaris (Chlorellaceae). Unlike their Chlorella homolog, the three newly investigated chloroplast DNAs (cpDNAs) carry a large rRNA-encoding inverted repeat (IR) that divides the genome into large and small single-copy regions. In contrast to the situation found for ulvophycean and chlorophycean cpDNAs, the gene contents of the IR and single-copy regions are strikingly similar to that inferred for the common ancestor of chlorophytes and streptophytes. The intronless 98,340-bp Pedinomonas genome is among the chlorophyte cpDNAs featuring the smallest size and most ancestral gene organization. All 105 conserved genes encoded by this genome are included in the gene repertoires of Oocystis (111 genes) and Chlorella (113 genes), with just trnR(ccg) missing from Parachlorella cpDNA. Trees inferred from 71 cpDNA-encoded genes/proteins of 16 chlorophytes and nine streptophytes showed that Pedinomonas is nested in the Chlorellales, a group of algae lacking flagella.
This phylogenetic conclusion is independently supported by uniquely shared gene linkages. We hypothesize that chlorellalean and pedinomonadalean green algae are reduced forms of a distant biflagellate ancestor that might have also given rise to the other known trebouxiophycean lineages. Our structural cpDNA data suggest that the Chlorellales and Pedinomonadales represent a deep branch of core chlorophytes, strengthening the notion that the Trebouxiophyceae emerged before the Ulvophyceae and Chlorophyceae. Our results further emphasize the importance of secondary reduction at both the cellular and genome levels during chlorophyte evolution.
Article
Codon usage data have been compiled for 110 yeast genes. Cluster analysis on relative synonymous codon usage revealed two distinct groups of genes. One group corresponds to highly expressed genes, and has much more extreme synonymous codon preference. The pattern of codon usage observed is consistent with that expected if a need to match abundant tRNAs, and intermediacy of tRNA-mRNA interaction energies are important selective constraints. Thus codon usage in the highly expressed group shows a higher correlation with tRNA abundance, a greater degree of third base pyrimidine bias, and a lesser tendency to the A+T richness which is characteristic of the yeast genome. The cluster analysis can be used to predict the likely level of gene expression of any gene, and identifies the pattern of codon usage likely to yield optimal gene expression in yeast.
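As a rough illustration of the relative synonymous codon usage (RSCU) measure mentioned in this abstract, the sketch below computes RSCU as the observed count of a codon divided by the mean count of all synonymous codons for the same amino acid. It assumes the standard genetic code and in-frame coding sequences; the function name `rscu` and the table construction are illustrative choices, not taken from the cited work.

```python
from collections import Counter, defaultdict

# Standard genetic code, laid out in the conventional TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def rscu(cds_sequences):
    """Return {codon: RSCU} pooled over in-frame coding sequences (hypothetical helper)."""
    counts = Counter()
    for seq in cds_sequences:
        seq = seq.upper().replace("U", "T")
        usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
        for i in range(0, usable, 3):
            codon = seq[i:i + 3]
            if codon in CODON_TABLE:  # skips codons with ambiguous bases
                counts[codon] += 1

    # Group sense codons by the amino acid they encode.
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families[aa].append(codon)

    values = {}
    for aa, codons in families.items():
        total = sum(counts[c] for c in codons)
        if total == 0:
            continue  # amino acid absent from the input
        for c in codons:
            # observed count / (family total / number of synonymous codons)
            values[c] = counts[c] * len(codons) / total
    return values
```

An RSCU above 1 indicates a codon used more often than expected under equal synonymous usage; a value below 1 indicates under-use.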
Article
A software tool was developed for the identification of simple sequence repeats (SSRs) in a barley (Hordeum vulgare L.) EST (expressed sequence tag) database comprising 24,595 sequences. In total, 1,856 SSR-containing sequences were identified. Trimeric SSR repeat motifs appeared to be the most abundant type. A subset of 311 primer pairs flanking SSR loci have been used for screening polymorphisms among six barley cultivars, being parents of three mapping populations. As a result, 76 EST-derived SSR-markers were integrated into a barley genetic consensus map. A correlation between polymorphism and the number of repeats was observed for SSRs built of dimeric up to tetrameric units. 3'-ESTs yielded a higher portion of polymorphic SSRs (64%) than 5'-ESTs did. The estimated PIC (polymorphic information content) value was 0.45 ± 0.03. Approximately 80% of the SSR-markers amplified DNA fragments in Hordeum bulbosum, followed by rye, wheat (both about 60%) and rice (40%). A subset of 38 EST-derived SSR-markers comprising 114 alleles were used to investigate genetic diversity among 54 barley cultivars. In accordance with a previous, RFLP-based, study, spring and winter cultivars, as well as two- and six-rowed barleys, formed separate clades upon PCoA analysis. The results show that: (1) with the software tool developed, EST databases can be efficiently exploited for the development of cDNA-SSRs, (2) EST-derived SSRs are significantly less polymorphic than those derived from genomic regions, (3) a considerable portion of the developed SSRs can be transferred to related species, and (4) compared to RFLP-markers, cDNA-SSRs yield similar patterns of genetic diversity.
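The kind of SSR search described here can be approximated with a regular-expression scan for perfect tandem repeats. The sketch below is a simplified stand-in, not the software tool from the abstract: the minimum repeat counts in `MIN_REPEATS` are assumed values chosen for illustration, and compound or imperfect repeats are not handled.

```python
import re

# Minimum number of tandem copies per motif length.
# These thresholds are illustrative assumptions, not values from the abstract.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def is_primitive(motif):
    """True if the motif is not itself a repetition of a shorter unit."""
    return motif not in (motif + motif)[1:-1]


def find_ssrs(seq, min_repeats=None):
    """Return sorted (start, motif, copies) tuples for perfect SSRs (0-based start)."""
    thresholds = MIN_REPEATS if min_repeats is None else min_repeats
    seq = seq.upper()
    hits = []
    for k, n in sorted(thresholds.items()):
        # One k-base motif followed by at least n-1 further identical copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if is_primitive(motif):  # e.g. skip "AA", which is a run of "A"
                hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)
```

For example, a sequence containing a run of twelve A residues and seven copies of AT would yield one mononucleotide and one dinucleotide hit under these thresholds.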
Article
A computer program, ARAGORN, identifies tRNA and tmRNA genes. The program employs heuristic algorithms to predict tRNA secondary structure, based on homology with recognized tRNA consensus sequences and ability to form a base‐paired cloverleaf. tmRNA genes are identified using a modified version of the BRUCE program. ARAGORN achieves a detection sensitivity of 99% from a set of 1290 eubacterial, eukaryotic and archaeal tRNA genes and detects all complete tmRNA sequences in the tmRNA database, improving on the performance of the BRUCE program. Recently discovered tmRNA genes in the chloroplasts of two species from the ‘green’ algae lineage are detected. The output of the program reports the proposed tRNA secondary structure and, for tmRNA genes, the secondary structure of the tRNA domain, the tmRNA gene sequence, the tag peptide and a list of organisms with matching tmRNA peptide tags.
Article
One major lineage of green plants, the Chlorophyta, is represented by the green algal classes Prasinophyceae, Ulvophyceae, Trebouxiophyceae, and Chlorophyceae. The Prasinophyceae occupies the most basal position in the Chlorophyta, but the branching order of the Ulvophyceae, Trebouxiophyceae, and Chlorophyceae remains unresolved. The chloroplast genome sequences currently available for representatives of three chlorophyte classes have revealed that this genome is highly plastic, with Chlamydomonas (Chlorophyceae) and Chlorella (Trebouxiophyceae) showing fewer ancestral features than Nephroselmis (Prasinophyceae). We report the 195,867-bp chloroplast DNA (cpDNA) sequence of Pseudendoclonium akinetum (Ulvophyceae), a member of the class that has not been previously examined for detailed cpDNA analysis. This genome shares common evolutionary trends with its Chlorella and Chlamydomonas homologs. The gene content, number of ancestral gene clusters, and abundance of short dispersed repeats in Pseudendoclonium cpDNA are intermediate between those observed for Chlorella and Chlamydomonas cpDNAs. Although Pseudendoclonium cpDNA features a large inverted repeat, its quadripartite structure is unusual in displaying an rRNA operon transcribed toward the large single-copy (LSC) region and a small single-copy region containing 14 genes that are normally found in the LSC region. Twenty-seven group I introns lie in nine genes and fall within four subgroups (IA1, IA2, IA3, and IB); 19 encode putative homing endonucleases, and 7 have homologs at identical insertion sites in other chlorophyte or streptophyte organelle genomes. The high similarity observed among the 14 IA1 and 7 IA2 introns and their encoded endonucleases suggests that many introns arose from intragenomic proliferation of a few founding introns in the lineage leading to Pseudendoclonium. 
Interestingly, one intron (in atpA) and some of the dispersed repeats also reside in Pseudendoclonium mitochondria, providing strong evidence for interorganellar lateral transfer of these genetic elements. Phylogenetic analyses of 58 cpDNA-encoded proteins and genes support the hypothesis that the Ulvophyceae is sister to the Trebouxiophyceae but cannot eliminate the hypothesis that the Ulvophyceae is sister to the Chlorophyceae. We favor the latter hypothesis because it is strongly supported by phylogenetic analyses of gene order data and by independent structural evidence based on shared gene losses and rearrangement break points within ancestrally conserved gene clusters.
Article
Transfer RNAs (tRNAs) and small nucleolar RNAs (snoRNAs) are two of the largest classes of non-protein-coding RNAs. Conventional gene finders that detect protein-coding genes do not find tRNA and snoRNA genes because they lack the codon structure and statistical signatures of protein-coding genes. Previously, we developed tRNAscan-SE, snoscan and snoGPS for the detection of tRNAs, methylation-guide snoRNAs and pseudouridylation-guide snoRNAs, respectively. tRNAscan-SE is routinely applied to completed genomes, resulting in the identification of thousands of tRNA genes. Snoscan has successfully detected methylation-guide snoRNAs in a variety of eukaryotes and archaea, and snoGPS has identified novel pseudouridylation-guide snoRNAs in yeast and mammals. Although these programs have been quite successful at RNA gene detection, their use has been limited by the need to install and configure the software packages on UNIX workstations. Here, we describe online implementations of these RNA detection tools that make these programs accessible to a wider range of research biologists. The tRNAscan-SE, snoscan and snoGPS servers are available at http://lowelab.ucsc.edu/tRNAscan-SE/, http://lowelab.ucsc.edu/snoscan/ and http://lowelab.ucsc.edu/snoGPS/, respectively.
Article
The phylum Chlorophyta contains the majority of the green algae and is divided into four classes. The basal position of the Prasinophyceae has been well documented, but the divergence order of the Ulvophyceae, Trebouxiophyceae and Chlorophyceae is currently debated. The four complete chloroplast DNA (cpDNA) sequences presently available for representatives of these classes have revealed extensive variability in overall structure, gene content, intron composition and gene order. The chloroplast genome of Pseudendoclonium (Ulvophyceae), in particular, is characterized by an atypical quadripartite architecture that deviates from the ancestral type by a large inverted repeat (IR) featuring an inverted rRNA operon and a small single-copy (SSC) region containing 14 genes normally found in the large single-copy (LSC) region. To gain insights into the nature of the events that led to the reorganization of the chloroplast genome in the Ulvophyceae, we have determined the complete cpDNA sequence of Oltmannsiellopsis viridis, a representative of a distinct, early diverging lineage. The 151,933 bp IR-containing genome of Oltmannsiellopsis differs considerably from Pseudendoclonium and other chlorophyte cpDNAs in intron content and gene order, but shares close similarities with its ulvophyte homologue at the levels of quadripartite architecture, gene content and gene density. Oltmannsiellopsis cpDNA encodes 105 genes, contains five group I introns, and features many short dispersed repeats. As in Pseudendoclonium cpDNA, the rRNA genes in the IR are transcribed toward the single copy region featuring the genes typically found in the ancestral LSC region, and the opposite single copy region harbours genes characteristic of both the ancestral SSC and LSC regions. The 52 genes that were transferred from the ancestral LSC to SSC region include 12 of those observed in Pseudendoclonium cpDNA. 
Surprisingly, the overall gene organization of Oltmannsiellopsis cpDNA more closely resembles that of Chlorella (Trebouxiophyceae) cpDNA. The chloroplast genome of the last common ancestor of Oltmannsiellopsis and Pseudendoclonium contained a minimum of 108 genes, carried only a few group I introns, and featured a distinctive quadripartite architecture. Numerous changes were experienced by the chloroplast genome in the lineages leading to Oltmannsiellopsis and Pseudendoclonium. Our comparative analyses of chlorophyte cpDNAs support the notion that the Ulvophyceae is sister to the Chlorophyceae.
Article
RAxML-VI-HPC (randomized axelerated maximum likelihood for high performance computing) is a sequential and parallel program for inference of large phylogenies with maximum likelihood (ML). Low-level technical optimizations, a modification of the search algorithm, and the use of the GTR+CAT approximation as replacement for GTR+Gamma yield a program that is between 2.7 and 52 times faster than the previous version of RAxML. A large-scale performance comparison with GARLI, PHYML, IQPNNI and MrBayes on real data containing 1000 up to 6722 taxa shows that RAxML requires at least 5.6 times less main memory and yields better trees in similar times than the best competing program (GARLI) on datasets up to 2500 taxa. On datasets of ≥4000 taxa it also runs 2-3 times faster than GARLI. RAxML has been parallelized with MPI to conduct parallel multiple bootstraps and inferences on distinct starting trees. The program has been used to compute ML trees on two of the largest alignments to date containing 25,057 (1463 bp) and 2182 (51,089 bp) taxa, respectively. Availability: icwww.epfl.ch/~stamatak
Article
Phylogenetic relationships within the green algal phylum Chlorophyta have proven difficult to resolve. The core Chlorophyta include Chlorophyceae, Ulvophyceae, Trebouxiophyceae, Pedinophyceae and Chlorodendrophyceae, but the relationships among these classes remain unresolved and the monophyly of Ulvophyceae and Trebouxiophyceae is highly controversial. We analyzed a dataset of 101 green algal species and 73 protein-coding genes sampled from complete and partial chloroplast genomes, including six newly sequenced ulvophyte genomes (Blidingia minima NIES-1837, Ulothrix zonata, Halochlorococcum sp. NIES-1838, Scotinosphaera sp. NIES-154, Caulerpa brownii and Cephaleuros sp. HZ-2017). We applied the Tree Certainty (TC) score to quantify the level of incongruence between phylogenetic trees in chloroplast genomic datasets, and show that the conflicting phylogenetic trees of core Chlorophyta stem from the most GC-heterogeneous sites. After removing the most GC-heterogeneous sites, our chloroplast phylogenomic analyses using heterogeneous models consistently support monophyly of the Chlorophyceae and of the Trebouxiophyceae, but the Ulvophyceae was resolved as polyphyletic. Our analytical framework provides an efficient approach to recover the optimal phylogenetic relationships by minimizing conflicting signals.
Article
Nautilus pompilius has been listed as an endangered species by CITES due to low fecundity and overfishing. In this study, its whole circular mitochondrial genome (15,693 bp) was determined by the polymerase chain reaction method. It contained the typical mitogenome gene set of 22 tRNAs, 13 protein coding genes (NAD1-6, NAD4L, COX1-3, Cob, ATP6 and ATP8), two rRNAs (rrnS and rrnL) and the non-coding A+T-rich region. The nucleotide composition was asymmetric with an obvious bias towards A and T (60.2%). Phylogenetic analysis revealed that N. pompilius is closely related to Nautilus macromphalus as expected. The complete mitochondrial genome of Nautilus pompilius will provide valuable information for corresponding conservation genetic studies.
Article
Organelle phylogenomic analysis requires precisely constructed multi-gene alignment matrices concatenated by pre-aligned single gene datasets. For non-bioinformaticians, it can take days to weeks to manually create high-quality multi-gene alignments comprising tens or hundreds of homologous genes. Here, we describe a new and highly efficient pipeline, HomBlocks, which uses a homologous block searching method to construct multiple sequence alignment. This approach can automatically recognize locally collinear blocks among organelle genomes and excavate phylogenetically informative regions to construct multiple sequence alignment in a few hours. In addition, HomBlocks supports organelle genomes without annotation and makes adjustment to different taxon datasets, thereby enabling the inclusion of as many common genes as possible. Topology comparison of trees built by conventional multi-gene and HomBlocks alignments implemented in different taxon categories shows that the same efficiency can be achieved by HomBlocks as when using the traditional method. The availability of HomBlocks makes organelle phylogenetic analyses more accessible to non-bioinformaticians, thereby promising to lead to a better understanding of phylogenetic relationships at an organelle genome level. Availability and implementation: HomBlocks is implemented in Perl and is supported by Unix-like operating systems, including Linux and macOS. The Perl source code is freely available for download from https://github.com/fenghen360/HomBlocks.git, and documentation and tutorials are available at https://github.com/fenghen360/HomBlocks. Contact: yxmao@ouc.edu.cn or fenghen360@126.com.
Article
Red algal plastid genomes are often considered ancestral and evolutionarily stable, and thus more closely resembling the last common ancestral plastid genome of all photosynthetic eukaryotes [1, 2]. However, sampling of red algal diversity is still quite limited (e.g., [2, 3, 4, 5]). We aimed to remedy this problem. To this end, we sequenced six new plastid genomes from four undersampled and phylogenetically disparate red algal classes (Porphyridiophyceae, Stylonematophyceae, Compsopogonophyceae, and Rhodellophyceae) and discovered an unprecedented degree of genomic diversity among them. These genomes are rich in introns, enlarged intergenic regions, and transposable elements (in the rhodellophycean Bulboplastis apyrenoidosa), and include the largest and most intron-rich plastid genomes ever sequenced (that of the rhodellophycean Corynoplastis japonica; 1.13 Mbp). Sophisticated phylogenetic analyses accounting for compositional heterogeneity show that these four “basal” red algal classes form a larger monophyletic group, Proteorhodophytina subphylum nov., and confidently resolve the large-scale relationships in the Rhodophyta. Our analyses also suggest that secondary red plastids originated before the diversification of all mesophilic red algae. Our genomic survey has challenged the current paradigmatic view of red algal plastid genomes as “living fossils” [1, 2, 6] by revealing an astonishing degree of divergence in size, organization, and non-coding DNA content. A closer look at red algae shows that they comprise the most ancestral (e.g., [2, 7, 8]) as well as some of the most divergent plastid genomes known.
Article
The genetic diversity and DNA fingerprinting of 15 elite rice genotypes using 30 SSR primers on chromosome numbers 7-12 was investigated. The results revealed that all the primers showed distinct polymorphism among the cultivars studied indicating the robust nature of microsatellites in revealing polymorphism. Cluster analysis grouped the rice genotypes into 10 classes in which japonica types DH-1 (Azucena) and Moroborekan clustered separately from indica types. Principal component analysis was done to visualize genetic relationships among the elite breeding lines. The results were similar to UPGMA results. Based on this study, the larger range of similarity values for related cultivars using microsatellites provides greater confidence for the assessment of genetic diversity and relationships. The information obtained from the DNA fingerprinting studies helps to distinctly identify and characterize 9 varieties using 18 different RM primers. This information can be used in background selections during backcross breeding programs.
Article
We cultured and sequenced newly collected material of Percursaria dawsonii Hollenberg et I. A. Abbott, a poorly known epibiont of limpets found along the west coast of North America. Zoospores, parthenogametes, and zygotes exhibited empty-spore germination, which produced a prostrate system that expanded by stolon-like development. A dense carpet of initially uniseriate filaments arose from the prostrate system. These upright filaments soon became biseriate proximally and pluriseriate distally, where they were flat, twisted and distromatic for approximately four-fifths of their length and rarely became hollow in larger individuals. These fronds became fertile distally and released quadriflagellate zoospores and/or biflagellate isogametes. Individual gametophytic fronds were dioecious, and alternated with isomorphic sporophytic fronds. The gametes moved rapidly and often exhibited an unusual side-to-side vibration that made them appear like winged insects. Comparisons of 18S rRNA gene and ITS sequences indicated that P. dawsonii should be included in the genus Blidingia, family Kornmanniaceae.
Article
Codon usage in the chloroplast genomes of six seed plants (Arabidopsis thaliana, Populus alba, Zea mays, Triticum aestivum, Pinus koraiensis and Cycas taitungensis) was analyzed to find general patterns of codon usage in chloroplast genomes of seed plants. The results show that chloroplast genomes of the six seed plants had similar codon usage patterns, with a strong bias towards a high representation of NNA and NNT codons. In chloroplast genomes of the six seed plants, the effective number of codons (ENC) for most genes was similar to that of the expected ENC based on the GC content at the third codon position, but several genes with low ENC values lay below the expected curve. All of these data indicate that codon usage was dominated by a mutational bias in chloroplast genomes of seed plants and that selection appeared to be limited to a subset of genes and to only subtly affect codon usage. In addition, four, six, eight, nine, ten and 12 codons were defined as the optimal codons in the chloroplast genomes of the six seed plants.
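The "expected ENC based on the GC content at the third codon position" can be sketched as below. The abstract does not state the formula, so this assumes the commonly used expectation from Wright's ENC formulation, ENC = 2 + s + 29 / (s² + (1 − s)²), where s is GC3. The function names `gc3` and `expected_enc` are hypothetical.

```python
def gc3(cds):
    """Fraction of G or C at third codon positions of an in-frame coding sequence."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    third = cds[2:usable:3]
    if not third:
        raise ValueError("sequence shorter than one codon")
    return sum(base in "GC" for base in third) / len(third)


def expected_enc(s):
    """Expected ENC when codon usage is shaped only by GC3 composition (assumed formula)."""
    if not 0.0 <= s <= 1.0:
        raise ValueError("GC3 must lie between 0 and 1")
    return 2.0 + s + 29.0 / (s ** 2 + (1.0 - s) ** 2)
```

In an ENC plot, a gene whose observed ENC falls well below `expected_enc(gc3(cds))` is a candidate for influences beyond base composition, which is how the abstract interprets the genes lying below the curve.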
Article
In phylogenetic analyses of molecular sequence data, partitioning involves estimating independent models of molecular evolution for different sets of sites in a sequence alignment. Choosing an appropriate partitioning scheme is an important step in most analyses because it can affect the accuracy of phylogenetic reconstruction. Despite this, partitioning schemes are often chosen without explicit statistical justification. Here, we describe two new objective methods for the combined selection of best-fit partitioning schemes and nucleotide substitution models. These methods allow millions of partitioning schemes to be compared in realistic time frames and so permit the objective selection of partitioning schemes even for large multilocus DNA data sets. We demonstrate that these methods significantly outperform previous approaches, including both the ad hoc selection of partitioning schemes (e.g., partitioning by gene or codon position) and a recently proposed hierarchical clustering method. We have implemented these methods in an open-source program, PartitionFinder. This program allows users to select partitioning schemes and substitution models using a range of information-theoretic metrics (e.g., the Bayesian information criterion, Akaike information criterion [AIC], and corrected AIC). We hope that PartitionFinder will encourage the objective selection of partitioning schemes and thus lead to improvements in phylogenetic analyses. PartitionFinder is written in Python and runs under Mac OSX 10.4 and above. The program, source code, and a detailed manual are freely available from www.robertlanfear.com/partitionfinder.
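The information-theoretic metrics named in this abstract have standard definitions, which the sketch below applies to rank candidate partitioning schemes. It is not PartitionFinder's code or search strategy. The scheme names and numbers in the usage note are hypothetical, and `k` is assumed to be the number of free parameters, `n` the number of alignment sites.

```python
import math


def aic(log_lik, k):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2.0 * k - 2.0 * log_lik


def aicc(log_lik, k, n):
    """Corrected AIC with the small-sample penalty; requires n > k + 1."""
    if n - k - 1 <= 0:
        raise ValueError("AICc is undefined when n <= k + 1")
    return aic(log_lik, k) + (2.0 * k * (k + 1)) / (n - k - 1)


def bic(log_lik, k, n):
    """Bayesian information criterion: k ln n - 2 ln L."""
    return k * math.log(n) - 2.0 * log_lik


def best_scheme(schemes, n, criterion=bic):
    """Pick the scheme name with the lowest score.

    `schemes` maps a name to (log_likelihood, parameter_count).
    """
    def score(item):
        log_lik, k = item[1]
        if criterion is aic:
            return aic(log_lik, k)
        return criterion(log_lik, k, n)

    return min(schemes.items(), key=score)[0]
```

For instance, with made-up values `{"by_gene": (-10250.0, 40), "by_codon": (-10180.0, 75)}` and `n = 5000`, `best_scheme` returns whichever scheme has the lower BIC; a richer scheme wins only if its likelihood gain outweighs the parameter penalty.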
Article
Chloroplast genomes have retained a core set of genes from their cyanobacterial ancestor, most of them required for the light reactions of photosynthesis or functions connected with transcription and translation. Other genes have been transferred to the nucleus or were lost in a lineage-specific manner. The genomes are distinguished by the selection of genes retained, whether or not transcripts are edited, presence/absence of introns and small repeats and their physical organization. Plants and green algae have kept fewer plastid genes than either the red algae or the chromistan algae, which obtained their plastids from red algae by secondary endosymbiosis. Photosynthetic dinoflagellates have the fewest (fewer than 20), but still grow photoautotrophically. All chloroplast genomes map as a circle, but there have been extensive rearrangements of gene order even between related species. Genome sizes vary much more than gene content, depending on the extent of gene duplication and small repeats and the size of intergenic spacers.
Article
The repetitive structure of genomic DNA holds many secrets to be discovered. A systematic study of repetitive DNA on a genomic or inter-genomic scale requires extensive algorithmic support. The REPuter program described herein was designed to serve as a fundamental tool in such studies. Efficient and complete detection of various types of repeats is provided together with an evaluation of significance and interactive visualization. This article circumscribes the wide scope of repeat analysis using applications in five different areas of sequence analysis: checking fragment assemblies, searching for low copy repeats, finding unique sequences, comparing gene structures and mapping of cDNA/EST sequences.
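To make the repeat types concrete, here is a naive sketch that finds exact forward, reverse, and palindromic (reverse-complement) repeats of a fixed length by indexing k-mers. It is not REPuter's algorithm and does not evaluate significance or handle degenerate repeats; the function name and the fixed-length restriction are assumptions made for this illustration.

```python
from collections import defaultdict

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def find_exact_repeats(seq, k):
    """Return pairs of start positions (i, j), i < j, of exact length-k repeats.

    forward:     seq[i:i+k] == seq[j:j+k]
    reverse:     seq[i:i+k] == reversed seq[j:j+k]
    palindromic: seq[i:i+k] == reverse complement of seq[j:j+k]
    """
    seq = seq.upper()
    index = defaultdict(list)
    for i in range(len(seq) - k + 1):
        index[seq[i:i + k]].append(i)

    hits = {"forward": [], "reverse": [], "palindromic": []}
    for kmer, positions in index.items():
        # Forward repeats: the same k-mer at two different positions.
        for a in range(len(positions)):
            for b in range(a + 1, len(positions)):
                hits["forward"].append((positions[a], positions[b]))
        reverse = kmer[::-1]
        revcomp = reverse.translate(COMPLEMENT)
        for kind, partner in (("reverse", reverse), ("palindromic", revcomp)):
            # A k-mer equal to its own partner would only repeat the forward hits.
            if partner == kmer:
                continue
            for i in positions:
                for j in index.get(partner, []):
                    if i < j:
                        hits[kind].append((i, j))
    return hits

if __name__ == "__main__":
    print(find_exact_repeats("ATGCAAGCAT", 4))
```

Indexing every k-mer takes time and memory proportional to the sequence length, which is workable for a plastome-sized sequence but reports overlapping hits and does not extend matches to their maximal length.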
Article
In most bacteria, synonymous codons are not used with equal frequencies. Different factors have been proposed to contribute to codon usage preference, including translational selection, GC composition, strand-specific mutational bias, amino acid conservation, protein hydropathy, transcriptional selection and even RNA stability. The review discusses these factors and their contribution to bias in synonymous codon usage in bacterial genomes.
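Unequal use of synonymous codons is commonly quantified with relative synonymous codon usage (RSCU): the observed count of a codon divided by the count expected if all codons for that amino acid were used equally. The sketch below is an assumption-laden illustration rather than a method taken from the review. It uses the standard genetic code, ignores stop codons and any codon containing ambiguous bases, and the function name `rscu` is hypothetical.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def rscu(cds):
    """Return {codon: RSCU} for every sense codon whose amino acid occurs in cds."""
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3  # drop a trailing partial codon
    codons = (cds[i:i + 3] for i in range(0, usable, 3))
    counts = Counter(c for c in codons if c in CODON_TABLE)

    # Group codons into synonymous families by amino acid.
    families = defaultdict_families()
    values = {}
    for aa, family in families.items():
        total = sum(counts[c] for c in family)
        if aa == "*" or total == 0:
            continue
        for codon in family:
            # Observed count over the mean count within the family.
            values[codon] = counts[codon] * len(family) / total
    return values

def defaultdict_families():
    """Map each amino acid (or '*') to the list of codons that encode it."""
    families = {}
    for codon, aa in CODON_TABLE.items():
        families.setdefault(aa, []).append(codon)
    return families

if __name__ == "__main__":
    example = "GCTGCTGCTGCAAAAAAG"  # made-up sequence: Ala x4, Lys x2
    for codon, value in sorted(rscu(example).items()):
        print(codon, round(value, 2))
```

An RSCU of 1 means no bias within the codon family, values above 1 mark preferred codons, and values below 1 mark avoided ones.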
Article
Chloroplasts originated from cyanobacteria only once, but have been laterally transferred to other lineages by symbiogenetic cell mergers. Such secondary symbiogenesis is rarer and chloroplast losses commoner than often assumed.
Article
The influence of local base composition on mutations in chloroplast DNA (cpDNA) is studied in detail and the resulting, empirically derived, mutation dynamics are used to analyze both base composition and codon usage bias. A 4 x 4 substitution matrix is generated for each of the 16 possible flanking base combinations (contexts) using 17,253 noncoding sites, 1309 of which are variable, from an alignment of three complete grass chloroplast genome sequences. It is shown that substitution bias at these sites is correlated with flanking base composition and that the A+T content of these flanking sites as well as the number of flanking pyrimidines on the same strand appears to have general influences on substitution properties. The context-dependent equilibrium base frequencies predicted from these matrices are then applied to two analyses. The first examines whether or not context dependency of mutations is sufficient to generate average compositional differences between noncoding cpDNA and silent sites of coding sequences. It is found that these two classes of sites exist, on average, in very different contexts and that the observed mutation dynamics are expected to generate significant differences in overall composition bias that are similar to the differences observed in cpDNA. Context dependency, however, cannot account for all of the observed differences: although silent sites in coding regions appear to be at the equilibrium predicted, noncoding cpDNA has a significantly lower A+T content than expected from its own substitution dynamics, possibly due to the influence of indels. The second study examines the codon usage of low-expression chloroplast genes. When context is accounted for, codon usage is very similar to what is predicted by the substitution dynamics of noncoding cpDNA. However, certain codon groups show significant deviation when followed by a purine in a manner suggesting some form of weak selection other than translation efficiency. 
Overall, the findings indicate that a full understanding of mutational dynamics is critical to understanding the role selection plays in generating composition bias and sequence structure.
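The step from a substitution matrix to predicted equilibrium base frequencies can be sketched as follows, assuming the matrix is expressed as row-stochastic substitution probabilities (row = current base, column = next base). The equilibrium vector is then the stationary distribution, found here by repeated multiplication. The matrix values are invented for illustration and are not taken from the study, which also derives a separate matrix for each flanking-base context.

```python
def equilibrium_frequencies(P, tol=1e-12, max_iter=10000):
    """Stationary distribution pi of a row-stochastic matrix P (pi = pi P),
    found by power iteration starting from uniform frequencies."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(max_iter):
        nxt = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(pi, nxt)) < tol:
            return nxt
        pi = nxt
    raise ValueError("power iteration did not converge")

# Hypothetical 4 x 4 matrix in A, C, G, T order with a bias toward A and T.
EXAMPLE_P = [
    [0.97, 0.01, 0.01, 0.01],
    [0.03, 0.93, 0.01, 0.03],
    [0.03, 0.01, 0.93, 0.03],
    [0.01, 0.01, 0.01, 0.97],
]

if __name__ == "__main__":
    freqs = equilibrium_frequencies(EXAMPLE_P)
    for base, f in zip("ACGT", freqs):
        print(base, round(f, 3))
```

Comparing such predicted frequencies with the observed composition of a class of sites is the kind of check the abstract describes: a gap between the two suggests that something other than the measured substitution dynamics is shaping base composition.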
Mauve: multiple alignment of conserved genomic sequence with rearrangements
  • Darling
Die Chlorophyceen der schwedischen Westküste
  • Kylin
Codon usage in the chloroplast genome of rice (Oryza sativa L. ssp. japonica)
  • Liu