Dynamics of Actin Evolution in Dinoflagellates
Sunju Kim,†,1 Tsvetan R. Bachvaroff,*,1 Sara M. Handy,‡,2 and Charles F. Delwiche2,3
1 Smithsonian Environmental Research Center, Edgewater, Maryland
2 Department of Cell Biology and Molecular Genetics
3 The Maryland Agricultural Experiment Station, University of Maryland
† Present address: Department of Life Science, Gongju National University, Gongju, Chungnam, Republic of Korea.
‡ Present address: U.S. FDA Center for Food Safety and Applied Nutrition, College Park, Maryland.
*Corresponding author: E-mail: bachvarofft@si.edu
Associate editor: Andrew Roger
Abstract
Dinoflagellates have unique nuclei and intriguing genome characteristics with very high DNA content making complete
genome sequencing difficult. In dinoflagellates, many genes are found in multicopy gene families, but the processes
involved in the establishment and maintenance of these gene families are poorly understood. Understanding the dynamics
of gene family evolution in dinoflagellates requires comparisons at different evolutionary scales. Studies of closely related
species provide fine-scale information relative to species divergence, whereas comparisons of more distantly related species
provide broad context. We selected the actin gene family as a highly expressed conserved gene previously studied in
dinoflagellates. Of the 142 sequences determined in this study, 103 were from the two closely related species, Dinophysis
acuminata and D. caudata, including full length and partial cDNA sequences as well as partial genomic amplicons. For
these two Dinophysis species, at least three types of sequences could be identified. Most copies (79%) were relatively
similar and in nucleotide trees, the sequences formed two bushy clades corresponding to the two species. In comparisons
within species, only eight to ten nucleotide differences were found between these copies. The two remaining types formed
clades containing sequences from both species. One type included the most similar sequences in between-species
comparisons with as few as 12 nucleotide differences between species. The second type included the most divergent
sequences in comparisons between and within species with up to 93 nucleotide differences between sequences. In all the
sequences, most variation occurred in synonymous sites or the 5′ untranslated region (UTR), although there was still
limited amino acid variation between most sequences. Several potential pseudogenes were found (approximately 10% of
all sequences depending on species) with incomplete open reading frames due to frameshifts or early stop codons. Overall,
variation in the actin gene family fits best with the "birth and death" model of evolution based on recent duplications,
pseudogenes, and incomplete lineage sorting. Divergence between species was similar to variation within species, so that
actin may be too conserved to be useful for phylogenetic estimation of closely related species.
Key words: actin, birth-and-death evolution, dinoflagellate, phylogeny, Dinophysis, dinokaryon.
Introduction
Dinoflagellates are important primary producers in the
oceans and are infamous for creating toxins. They also have
a proclivity for endosymbiosis and have adopted plastids of
almost every major pigment type (Delwiche 1999; Schnepf
and Elbrächter 1999). The dinoflagellate genus Dinophysis is
a good example of both traits because Dinophysis species
produce diarrheic shellfish poison, and some species seem
to have adopted plastids from cryptophyte algae (Schnepf
and Elbrächter 1988). The Dinophysis plastid has been fairly
well studied based on pigment analysis, ultrastructure, plastid
gene phylogeny, and finally by culturing on cryptophyte-fed
ciliates (Lucas and Vesk 1990; Takishita et al. 2002; Park
et al. 2006).
Dinophysis species also contain the characteristic dinoflagellate
nucleus, the so-called dinokaryon (Spector 1984). The
dinokaryotic nucleus differs from other eukaryotes because
it is packed with massive condensed chromosomes, visible
under light microscopy throughout the cell cycle. When
examined with transmission electron microscopy, these
chromosomes contain banded DNA fibrils without the
nucleosomal-histone DNA packaging typical of most other
eukaryotes. Dinoflagellate nuclei are also notable because
of large genome size, with some dinoflagellates containing
hundreds of picograms of DNA per nucleus (100 pg ≈ 10¹¹
bases), and the DNA contains a significant fraction of the
modified base 5-hydroxymethyluracil in place of thymine
(Rae 1976).
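The genome size figure quoted above can be checked with a short calculation. The following is a minimal sketch, assuming the commonly used conversion of about 0.978 × 10⁹ base pairs per picogram of double-stranded DNA; that factor and the function name are not taken from this article.

```python
# Assumed conversion factor (not from the article): 1 pg of double-stranded
# DNA corresponds to roughly 0.978e9 base pairs.
BP_PER_PG = 0.978e9


def picograms_to_base_pairs(picograms: float) -> float:
    """Approximate number of base pairs represented by a DNA mass in pg."""
    return picograms * BP_PER_PG


if __name__ == "__main__":
    # Print the approximate base-pair content for a few nuclear DNA masses.
    for pg in (3.0, 100.0, 200.0):
        print(f"{pg:6.1f} pg ~ {picograms_to_base_pairs(pg):.2e} bp")
```

Under this assumption, 100 pg comes out just under 10¹¹ base pairs, in line with the approximation given in the text.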
Although genome size makes complete sequencing difficult,
cDNA sequencing has been done for a few dinoflagellate
species (Bachvaroff et al. 2004; Hackett et al. 2004;
Lidie et al. 2005; Patron et al. 2005, 2006). Analysis of highly
expressed genes and their genomic complements suggests
that some genes are in large multicopy gene families often
with subtle nucleotide and amino acid variation between
copies (Rowan et al. 1996; Reichman et al. 2003; Bachvaroff
et al. 2004, 2009; Zhang et al. 2006; Bachvaroff and Place
2008; Moustafa et al. 2010). However, the evolution of the
multicopy gene families in dinoflagellates is not well
understood, with possible mechanisms underlying the diversity
including whole-genome duplication, concerted
evolution, and single gene duplication. In general, gene family
evolution can be fit to two competing models: concerted
evolution versus the birth and death model.
© The Author 2010. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please
e-mail: journals.permissions@oup.com
Mol. Biol. Evol. 28(4):1469–1480. 2011 doi:10.1093/molbev/msq332 Advance Access publication December 13, 2010
Research article
Under concerted evolution, gene duplication and revi-
sion are driven by large-scale events at the level of whole
genes, such as gene conversion via crossing over and re-
combination, as reviewed by Nei and Rooney (2005). These
events are predicted to occur within and between the pop-
ulations of gene copies in each cell. Therefore, changes be-
tween gene copies should occur at the scale of a whole gene
and tend to co-occur and spread by gene conversion across
many gene copies in "concert." Concerted evolution helps
to explain distinct lineage sorting, particularly of neutral
sites such as ribosomal RNA (rRNA) Internal Transcribed
Spacer regions between sibling species, and fits the data
well when large numbers of similar or identical gene copies
are maintained in the genome. In contrast, under the birth
and death model, after "birth" (or duplication), individual
gene copies are expected to vary independently across the
genome and differences between copies can accumulate at
individual nucleotide sites. After many changes accumulate,
some gene copies can lose proper function, become
pseudogenes, and effectively "die." Birth and death models
explain observations such as incomplete lineage sorting,
pseudogenes, and changes between gene copies primarily
in synonymous or neutral sites.
Here, we examine variation in the actin multicopy gene
family in two closely related Dinophysis species to better
understand the evolution of gene families in dinoflagellates
and to assess the utility of this molecule for phylogenetic
studies. Actin is a conserved and ubiquitously expressed
multicopy gene that has already been deeply sequenced
in another dinoflagellate (Bachvaroff and Place 2008),
and we here generated small data sets from another five
dinoflagellate species as well as larger data sets from
two Dinophysis species. These latter two species, Dinophysis
acuminata and D. caudata are morphologically distinct but
closely related with 99.4% similar small subunit ribosomal
DNA sequences (Handy et al. 2009). Of particular interest
in this context was the question of whether the different
actin gene copies would be clearly distinct between the two
closely related species and change in a concerted manner
within each species. Alternatively, if sequences from the two
species intermingled in phylogenetic trees, incomplete
lineage sorting and a birth and death mode of evolution
would be favored (Rooney 2004). The results provide insight
into the tempo and mode of gene family evolution
in Dinophysis and other dinoflagellates.
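The contrast between species-specific clades and intermingled sequences can be illustrated with a toy check. This is only a sketch under simplifying assumptions: trees are written as rooted nested tuples (the trees in this study are unrooted maximum likelihood trees), and the leaf labels with prefixes `Dac_` and `Dca_` are hypothetical names invented for the example.

```python
def leaves(tree):
    """Return the set of leaf labels under a nested-tuple tree."""
    if isinstance(tree, str):
        return {tree}
    result = set()
    for child in tree:
        result |= leaves(child)
    return result


def subtrees(tree):
    """Yield every subtree, including single leaves and the tree itself."""
    yield tree
    if not isinstance(tree, str):
        for child in tree:
            yield from subtrees(child)


def is_monophyletic(tree, prefix):
    """True if all leaves whose label starts with `prefix` form one clade."""
    target = {leaf for leaf in leaves(tree) if leaf.startswith(prefix)}
    if not target:
        return False
    return any(leaves(sub) == target for sub in subtrees(tree))


# Hypothetical toy trees: copies sorted by species vs. intermingled copies.
sorted_tree = ((("Dac_1", "Dac_2"), "Dac_3"), (("Dca_1", "Dca_2"), "Dca_3"))
mixed_tree = ((("Dac_1", "Dca_1"), "Dac_2"), (("Dca_2", "Dca_3"), "Dac_3"))

if __name__ == "__main__":
    print(is_monophyletic(sorted_tree, "Dac_"))
    print(is_monophyletic(mixed_tree, "Dac_"))
```

In the first toy tree each species forms its own clade, the pattern expected if copies are homogenized within species; in the second, copies from the two species intermingle, the pattern associated in the text with incomplete lineage sorting.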
Materials and Methods
Sample Collection and Cultures
Dinophysis acuminata and D. caudata cell cultures were gen-
erously donated by Prof. Myung G. Park at Chonnam National
University in the Republic of Korea. Dinophysis acuminata
and D. caudata cells were taken a week after being fed on
the prey ciliate Myrionecta rubra, which in turn had been
raised on cryptophyte prey (Park et al. 2006), and harvested
by centrifugation at 1,200 × g for 10 min. The pellets were
stored in 1.5 ml microfuge tubes containing RNAlater
(Ambion AM7020, Austin, TX) at −80 °C for 1 week. The
cells were washed with Tris–ethylenediaminetetraacetic acid
buffer (10 mM Tris-HCl pH 8.0, 1 mM ethylenediaminetetraacetic
acid [EDTA]) four times to remove RNAlater using
centrifugation and aspiration of the supernatant to exchange
the washing fluid.
Dinophysis caudata cells collected with a 30 µm plankton
net from Ft. Pierce Inlet Pier (FPIP), Florida, in the
United States (27°27′550″N, 80°19′067″W) were individually
isolated using a capillary pipette, washed six times with
0.45 µm filtered seawater, and filtered onto 5 µm Millipore
polycarbonate filters. The filters were placed into 2.0 ml
centrifuge tubes containing a nonionic detergent solution
(Galuzzi et al. 2004) and stored at −20 or 4 °C until processed.
Blastodinium crassum infecting the copepod Paracalanus
parvus was isolated from the waters off of La Paz in
Baja California South, Mexico (Coats et al. 2008). The dinoflagellate
was preserved in nonacid Lugol’s solution
(4% w/v iodine + 6% w/v potassium iodide) and stored
at 4 °C until processed. The Lugol’s preserved cells were
washed six times with distilled water and placed into
a 1.5 ml microfuge tube with 100 µl distilled water.
Gymnodinium catenatum, Karlodinium veneficum, and
Katodinium rotundatum were grown in natural seawater
with a salinity of 15 supplemented with f/2 nutrients at
20 °C with a light:dark cycle of 14:10 under approximately
100 µEin m⁻² s⁻¹. Fifty to one hundred milliliter volumes of
exponential phase cultures (approximately 10,000 cells/ml)
were harvested by centrifugation in a clinical centrifuge
at 3,000 × g for 10 min. The resulting pellets were stored
at −80 °C until processed.
Extraction of RNA and Genomic DNA
Total RNA and genomic DNA of D. acuminata and D. caudata
were extracted from cell pellets using TRIzol Reagent
(Invitrogen, Carlsbad, CA) according to the manufacturer’s
instructions. The cells were lysed in TRIzol, chloroform was
added, and the aqueous phase was removed for RNA and
the interphase retained for DNA extraction (see below).
The RNA was precipitated from the aqueous phase with
two volumes of isopropanol, and the RNA pellet was further
purified using lithium precipitation. One-fifth volume
of 12 M LiCl was added to the RNA and the mixture was
incubated at −20 °C for 30 min, pelleted by centrifugation,
washed with 70% ethanol, and resuspended in RNase-free
water. The DNA was isolated from the TRIzol aqueous:organic
interphase. The remaining aqueous and organic
phases were completely removed and DNA was precipitated
from the aqueous:organic interphase with two volumes
of ethanol. The isolated DNA pellet was washed four
times with a solution of 0.1 M trisodium citrate and 10% ethanol,
followed by a single wash with 75% ethanol. The DNA
pellet was resuspended in 8 mM NaOH adjusted to pH
8.4 using 4-(2-hydroxyethyl)-1-piperazineethanesulfonic acid
(free acid). The quantity and quality of DNA and RNA were
determined using a spectrophotometer (NanoDrop, ND-
1000, Thermo Scientific). When the DNA quality was low
based on 260:280 nm absorbance ratios, an additional chloro-
form purification and ethanol precipitation were done.
For the uncultured dinoflagellates D. caudata FPIP and
B. crassum, the preserved cells were washed in deionized
fresh water, briefly sonicated using a probe-tipped sonicator
(Heat Systems Ultrasonic, Plainview, NY) as previously
described in Handy et al. (2009), and the resulting sonicate
was used as polymerase chain reaction (PCR) template. Genomic
DNA from cultures of G. catenatum, Kar. veneficum,
and Kat. rotundatum was extracted using cetyltrimethylammonium
bromide (CTAB) buffer (100 mM Tris-HCl
pH 8.0, 0.7 M NaCl, 2% [w/v] CTAB [Sigma, St Louis,
MO], 20 mM EDTA) (Doyle and Doyle 1987). Each cell pellet
was resuspended in 1 ml of CTAB buffer and incubated
at 50 °C for 10 min. The DNAs were purified by a single
extraction with chloroform, precipitated with isopropanol,
washed with 70% ethanol, and resuspended in 50–100 µl of
water.
Reverse Transcription of RNA and PCR
Total RNA was reverse transcribed with a poly T primer,
RTpr (5′-CGAATTGTCGACTAGTACTTTTTTTTTTTTTT-
TT-3′) using the AccuScript High Fidelity Reverse Transcriptase-PCR
System (Stratagene, La Jolla, CA) according
to the manufacturer’s instructions but modified by skipping
the 65 °C incubation prior to the synthesis step at
42 °C and adding 1 µl (40 U) of RNase OUT (Invitrogen,
Carlsbad, CA). The additional sequence before the polyT
sequence in the primer provided a 3′ UTR priming site.
PCR Amplification
The quality of cDNA was then checked using a PCR with
the spliced leader primer SL1 (5′-TCCGTAGCCATTTTGGCTCAA-3′)
and the reverse transcription primer NDTRN
that is included in the polyT reverse transcription primer
(5′-CGAATTGTCGACTAGTACTTT-3′). The spliced leader
is a conserved sequence found at the 5′ end of a diverse
array of dinoflagellate mRNAs (Lidie and Van Dolah 2007;
Zhang et al. 2007). The PCR conditions were initial denaturation
at 94 °C for 2 min; 35 cycles of 94 °C for 30 s, 57 °C
for 30 s, and 72 °C for 2 min; and final extension at 72 °C for
5 min. Reactions were run in a total volume of 20 µl containing
500 mg/ml bovine serum albumin (Sigma A2053),
50 mM Tris-HCl (pH 8.3), 3 mM Mg, 10 µM deoxyribonucleotides,
and 0.12 units of Promega Go-Taq. Two microliters
of the first-strand cDNA reaction were added as
template. Agarose gel electrophoresis followed by ethidium
bromide staining was used to see if reverse transcription
and amplification were successful.
Partial actin cDNA (approximately 774 nt) was amplified
with the forward primer ACT AF2 (5′-ATGACKCAGATYATGTTYGA-3′)
and reverse primer ACT OR1 (5′-TCAGAAGCACTTCCTGTGCAC-3′)
(fig. 1) using 2 µl of the first-strand
cDNA reaction with the same PCR program as above.
The complete coding region (approximately 1,199–1,248
nt) was amplified from cDNA with the general spliced
leader primer SL1 and gene-specific reverse primer OR1
(fig. 1) using the same PCR program as above. The same
actin primers and PCR conditions were used to amplify
genomic versions of actin using either DNA purified from
cultures (50 ng of DNA added) or sonicated cells (4 µl of
sonicate) as template.
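The forward primer ACT AF2 contains degenerate positions (K and Y), so it stands for a small pool of oligonucleotides. The sketch below expands a degenerate primer into its explicit sequences using the standard IUPAC nucleotide codes; the function name is illustrative and this is not a tool used in the study.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(primer):
    """Return every explicit sequence encoded by a degenerate primer."""
    pools = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*pools)]


# Forward primer ACT AF2 as given in the text (5' to 3').
ACT_AF2 = "ATGACKCAGATYATGTTYGA"
variants = expand_degenerate(ACT_AF2)

if __name__ == "__main__":
    print(len(variants), "explicit primer sequences")
    for seq in variants:
        print(seq)
```

With one K and two Y positions, each standing for two bases, the primer expands to 2 × 2 × 2 = 8 sequences.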
The PCR products were visualized with agarose gel elec-
trophoresis followed by ethidium bromide staining. Most
PCR products were purified using PolyEthylene Glycol
(PEG) precipitation (20% w/v PEG, mw 8000, 2.5 M NaCl
solution) (Morgan and Soltis 1995). When multiple bands
were present, only correctly sized products were purified
with a QIAEX II agarose gel extraction kit (QIAGEN, Valen-
cia, CA). Purified PCR products were cloned with the
pGEM-T Easy Vector and competent cells (Promega, Mad-
ison, WI). White colonies were randomly picked for PCR
with M13 vector primers (as above) and sequenced with
Big Dye Terminator Cycle Sequencing Kit version 3.1 (Applied
Biosystems, Foster City, CA) on an ABI 3730 sequencer.
Sequence Analyses and Alignments
Sequencher v.4.8 (Genecodes, Ann Arbor, MI) was used to
trim vector and ambiguous sequences and to assemble bi-
directional sequencing reads for each clone. The resulting
nucleotide sequences were verified by Blast searches in
National Center for Biotechnology Information (NCBI)
and screened for potential chimeric sequences with the
Bellerophon program (Huber et al. 2004).
The actin sequences generated in this study were aligned
with the actin sequences from GenBank using MacClade
v.4.08 (Maddison and Maddison 2002). The nucleotide
alignment contained 277 sequences and was trimmed to
774 nucleotide comparable positions. Four separate data
sets were used for phylogenetic analysis: one, D. acuminata
alone; two, D. caudata alone; three, D. acuminata and
D. caudata together; and four, all eight dinoflagellate species.
The aligned nucleotide data set was also translated into
amino acids (257 amino acids) with MacClade. Pairwise distances
between gene copies were calculated with PAUP* v.
4b10. Different taxon categories were extracted using Perl
scripts. Histograms of the distances were calculated with
IGOR Pro v. 5.04B software (Oregon) and graphed using
Sigma Plot 11 (Systat software, San Jose, CA). The guanine
and cytosine (GC) content of all positions and of third
codon positions was analyzed with codonw
(http://mobyle.pasteur.fr/cgi-bin/postal.py?form=codonw). The number of
synonymous substitutions per synonymous site (dS) and
nonsynonymous substitutions per nonsynonymous site (dN) (Nei
and Gojobori 1986) were estimated using PAML 3.14 (Yang
et al. 1997).
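As an illustration of the GC measures mentioned above, the following sketch computes overall GC content and GC content at third codon positions for an in-frame coding sequence. It is a plain count and may not match the exact definitions used by codonw; the function names and the example sequence are hypothetical.

```python
def gc_fraction(seq):
    """Fraction of G or C among unambiguous bases (A, C, G, T)."""
    counted = [base for base in seq.upper() if base in "ACGT"]
    if not counted:
        return 0.0
    return sum(base in "GC" for base in counted) / len(counted)


def gc3_fraction(coding_seq):
    """GC fraction at third codon positions.

    Assumes the sequence starts at the first position of a codon
    and contains no gaps.
    """
    return gc_fraction(coding_seq[2::3])


if __name__ == "__main__":
    example = "ATGGCCAAGCTG"  # hypothetical in-frame fragment
    print(f"GC  = {gc_fraction(example):.2f}")
    print(f"GC3 = {gc3_fraction(example):.2f}")
```

Third codon positions are mostly synonymous, which is why GC3 can be far more biased than overall GC content, as in the values reported in table 1.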
The optimal trees were found from the nucleotide
and amino acid data sets with the program RAxML 7.0.4
(Stamatakis 2006) under the optimal GTR + I + Γ model
for nucleotide alignments and the PROTGAMMAJTT
model for amino acid alignments with 100 bootstrap replicates
in both cases.
Results
Novel Dinoflagellate Actin Sequences
A total of 142 partial and complete actin sequences (ap-
proximately 774 and 1131 nt, respectively) were deter-
mined from five different dinoflagellate species, of which
128 were unique sequences (table 1 and supplementary table
S1, Supplementary Material online): 11 from B. crassum
(HQ391454–HQ391464), 47 from D. acuminata (HQ391359–
HQ391384 and HQ391405–HQ391420), 56 from D. caudata
(HQ391385–HQ391404 and HQ391421–HQ391453), 8
from G. catenatum (HQ391465–HQ391472), and 20 from
Kat. rotundatum of which 14 were unique (HQ391473–
HQ391486). One 275 base intron was found when aligning
the genomic and cDNA sequences in the beginning of
a Kat. rotundatum actin genomic amplicon (HQ391477).
Rarefaction Analysis of Sequences
Rarefaction analysis of the sequences cloned from both
Dinophysis species suggested that redundancy between
clones was low because the rarefaction curve was still in
a near linear stage with 41 unique sequences of 47 for D.
acuminata and 53 unique sequences of 56 for D. caudata.
Errors due to amplification are likely to be approximately
one error per 774 base amplicon (Eun 1996). The rarefaction
analysis was then repeated assuming that sequences differing
by fewer than 2 bases in pairwise comparisons were the
same. Under this assumption, there were 23 of 47 unique
sequences for D. acuminata and 43 of 56 for D. caudata,
still suggesting that redundancy was low. Increasing the
number of allowable differences stepwise from 1 to 13 nucleotides
only showed a notable decrease in the number of
unique sequences between four and five nucleotides (fig. 2).
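The idea of counting unique sequences while tolerating a few differences can be sketched as follows. This is an assumption-laden illustration, not the procedure used in the study: it uses a simple greedy pass in which each clone is compared with previously accepted representatives, and it builds a collector's curve in the given clone order rather than averaging over random subsamples. All names and the toy clones are hypothetical.

```python
def hamming(a, b):
    """Number of mismatched positions between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def count_unique(seqs, max_diff=0):
    """Greedy count of sequences differing by more than `max_diff` bases
    from every previously accepted representative."""
    representatives = []
    for seq in seqs:
        if not any(hamming(seq, rep) <= max_diff for rep in representatives):
            representatives.append(seq)
    return len(representatives)


def collector_curve(seqs, max_diff=0):
    """Unique-sequence count after each additional clone, in input order."""
    return [count_unique(seqs[:n], max_diff) for n in range(1, len(seqs) + 1)]


# Hypothetical toy clones (real amplicons here were 774 bases long).
clones = ["AAAA", "AAAT", "AAAA", "CCCC", "CCGG"]

if __name__ == "__main__":
    for tolerance in (0, 1, 2):
        print(tolerance, collector_curve(clones, tolerance))
```

A curve that keeps rising nearly linearly, as described above for both Dinophysis species, indicates that additional clones would still be likely to recover new sequences.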
FIG. 1. Actin gene amplification schematic. The relative positions of the different primers used in the study are placed on a model of the actin
gene. The genomic and mRNA versions of the gene are shown. Panels: (A) genomic DNA; (B) mRNA and, after reverse transcription, cDNA. Labeled features: 5′ UTR, 3′ UTR, poly(A), and trans-spliced leader (SL); primers AF2, OR1, SL1, and RTpr; amplicon lengths of 774 and 1131 bases; and a span of 86–117 bases. Key: coding region (no intron), SL trans-spliced leader, start codon, stop codon, PCR primer.
Table 1. The Number of Actin Gene Sequences in This Study, GC Content, and the Average Nonsynonymous (dN) and Synonymous (dS) Substitution Rates between Actin Gene Copies in Dinoflagellate Species.
Species                   Genomic / Partial cDNA / Trans-Spliced cDNA   GC Contentᵃ    dN ± SEᵇ          dS ± SEᵇ          dN/dSᵇ
Amphidinium carteraeᶜ     111                                           0.54 (0.66)    0.004 ± 0.0000    0.103 ± 0.0013    0.091 ± 0.0021
Blastodinium crassum      11                                            0.58 (0.80)    0.033 ± 0.0035    0.532 ± 0.0446    0.132 ± 0.0301
Dinophysis acuminata      21 (16) / 14 / 12                             0.59 (0.82)    0.007 ± 0.0002    0.090 ± 0.0043    0.232 ± 0.0074
D. caudata                36 (33) / 7 / 13                              0.61 (0.87)    0.009 ± 0.0001    0.211 ± 0.0051    0.157 ± 0.0054
Gymnodinium catenatum     8                                             0.59 (0.71)    0.048 ± 0.0050    0.173 ± 0.0160    0.306 ± 0.0243
Karenia brevisᶜ           4                                             0.53 (0.60)    0.024 ± 0.0097    0.431 ± 0.1469    0.044 ± 0.0088
Karlodinium veneficumᶜ    19                                            0.52 (0.58)    0.004 ± 0.0002    0.280 ± 0.0068    0.013 ± 0.0009
Katodinium rotundatum     20 (14)                                       0.63 (0.90)    0.004 ± 0.0002    0.034 ± 0.0065    0.419 ± 0.0400
ᵃ The mean overall GC content, with third codon position GC content in parentheses, was estimated based on the partial actin-coding region (774 nt).
ᵇ Synonymous (dS) and nonsynonymous (dN) substitutions per site and their ratio (dN/dS) are shown as mean ± standard error (SE), excluding infinite values where dS = 0. Note that the average dN and dS values for the two Dinophysis species do not reflect the complicated distributions shown in figure 3.
ᶜ These multiple gene copies were accessed from GenBank.
Bellerophon identified 11 possible chimeric sequences
for D. caudata and none from D. acuminata. Of the puta-
tive chimeric sequences from D. caudata, seven were de-
rived from the same pair of parents, and two others
were derived from one of these two parent sequences.
None of the chimeras was an exact match to either parent
sequence on the two sides of the putative break point, sug-
gesting that chimera check might not accurately recover
potential chimeras in actin gene families, as has been noted
by Lahr and Katz (2009). Therefore, all sequences were re-
tained for further analysis.
Gene Copy Diversity within Species
Pairwise nucleotide and amino acid differences between
gene copies within each of the two Dinophysis species were
calculated. In order to compare different size amplicons,
the nucleotide alignment was trimmed to 774 bases (fig. 1).
The pairwise nucleotide differences between different
genomic actin copies from D. caudata showed a bimodal
distribution with two distinct peaks around 10 and 52 nucleotides,
whereas the full-length and partial cDNA sequences
exhibited a less even distribution up to 92 nucleotide
differences (fig. 3A). Similarly, the D. acuminata actin sequences
had a bimodal pattern with two peaks around
8 and 59 nucleotide differences (fig. 3B). Despite the substantial
differences between gene copies in nucleotide
comparisons, translation into amino acids reduced the differences
to a single peak centered around three to five
amino acids for both species (fig. 3C and D).
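The pairwise comparisons summarized in these histograms amount to counting mismatches between every pair of aligned gene copies. The sketch below shows one way to do this for a small aligned set and to express a difference count as a percentage of the 774-base alignment; it is an illustration with hypothetical names and toy sequences, not the PAUP* and Perl workflow used in the study.

```python
from collections import Counter
from itertools import combinations


def pairwise_differences(aligned):
    """Map each pair of sequence names to its count of mismatched sites."""
    lengths = {len(seq) for seq in aligned.values()}
    if len(lengths) > 1:
        raise ValueError("all sequences must have the same aligned length")
    return {
        (a, b): sum(x != y for x, y in zip(aligned[a], aligned[b]))
        for a, b in combinations(sorted(aligned), 2)
    }


def difference_histogram(aligned):
    """Count how many sequence pairs show each number of differences."""
    return Counter(pairwise_differences(aligned).values())


def percent_divergence(n_diff, length=774):
    """Express a raw difference count as a percentage of alignment length."""
    return 100.0 * n_diff / length


# Hypothetical toy alignment of three gene copies.
toy = {"copy1": "ATGACC", "copy2": "ATGACT", "copy3": "TTGACT"}

if __name__ == "__main__":
    print(pairwise_differences(toy))
    print(dict(difference_histogram(toy)))
    print(f"{percent_divergence(55):.1f}% for 55 differences over 774 bases")
```

A bimodal histogram like the ones described above arises when most pairs fall within one cluster of near-identical copies and the remaining pairs span two different clusters.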
The full-length spliced leader amplified cDNA sequences
included the nontranslated 5′ UTR excluded from the analysis
above because one of the primers was designed from
the 5′ spliced leader (fig. 1). In D. caudata, the 5′ UTR
region varied from 64 to 96 bases with a total of six sequence
types differing in sequence and length (data not
shown). In D. acuminata, all the 5′ UTRs were 56 bases long,
and all but one had very similar sequences (0–2 base
differences). Clustering based on 5′ UTR length and sequence
correlated well with placement on the trees described
below using the partial coding region.
Gene Copy Diversity between Species
When comparing sequences between the two species in
phylogenetic trees, there were three major groups, ranked
here by the amount of divergence between species. First,
the majority of sequences appeared to be recently dupli-
cated and were termed type 1. These bushy clusters of se-
quences formed two clades corresponding to the two
species. In between-species comparisons, the sequences
differed by on average 55 nucleotides or 7.1% (both median
and mode) between the two clades. Although the majority
of sequences were tightly clustered with each other, these
sequences were not the most closely related in compari-
sons between species (fig. 4). The type 2 sequences were
the most similar copies in comparisons between species.
These sequences differed by as few as 12 nucleotides be-
tween the two species (1.5% of 774 bases) and represented
only a small fraction of sequences recovered (9/103). Fi-
nally, the type 3 sequences were a few very divergent copies
(12/103), mostly from D. caudata (9/12) that differed from
D. acuminata by up to 86 bases.
The three types of sequences were defined on the basis
of between-species comparisons but were similarly distributed
into three categories on the distance histograms comparing
sequences within species (fig. 3E and F). For example,
the labeled colored branches a and d from figure 4 can be
compared with the distributions on figure 3E and F. Similarly,
the type 2 and 3 sequences could be mapped onto
specific branches and compared with distributions of dN
and dS values in the within-species comparisons.
The distribution of sequences across the distance histograms
(fig. 3) and tree (fig. 4) seemed to be more influenced
by the PCR template (cDNA vs. genomic DNA) than by the
primers (gene-specific vs. general spliced leader forward
primer) used for amplification. Most type 3 or divergent
sequences were recovered from cDNA template, with exactly
half of these sequences coming from the two different
primer sets. Also, the type 2 or most similar sequences in
between-species comparisons were overwhelmingly
drawn from genomic DNA (8/9). Pseudogenes were common
in types 1 and 2 (9/81 and 3/9, respectively) but were
not found in the divergent type 3 group.
FIG. 2. Rarefaction analysis of the actin gene sequences from the two
Dinophysis species. Individual clones from each species were
compared allowing increasing numbers of differences between
sequences. The number of observed sequences was plotted against
the number of unique sequences. Panels: Dinophysis acuminata and Dinophysis caudata; axes, number of observed sequences and number of unique sequences; separate curves are labeled by the number of nucleotide differences allowed.
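Potential pseudogenes are described in this article as sequences with incomplete open reading frames caused by frameshifts or early stop codons. A minimal screening sketch along those lines is shown below. It assumes the standard stop codons, an input that starts in frame, and a final codon that may be the genuine stop; for partial amplicons, a length that is not a multiple of three is only a hint and would need to be confirmed against an alignment. The function name and test sequences are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code assumed


def orf_problems(coding_seq):
    """List reasons an in-frame coding sequence may have an incomplete ORF."""
    seq = coding_seq.upper().replace("-", "")
    problems = []
    if len(seq) % 3 != 0:
        problems.append("length not a multiple of three (possible frameshift)")
    usable = len(seq) - len(seq) % 3
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    # The last complete codon is skipped because it may be the true stop.
    for index, codon in enumerate(codons[:-1]):
        if codon in STOP_CODONS:
            problems.append(f"premature stop codon {codon} at codon {index + 1}")
            break
    return problems


if __name__ == "__main__":
    for name, seq in [
        ("intact", "ATGGCCAAGTGA"),
        ("early stop", "ATGTAAGCCTGA"),
        ("one base deleted", "ATGGCAAGTGA"),
    ]:
        print(name, orf_problems(seq) or "no problems found")
```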
Another feature of the sequence data was differences
between the species in GC content particularly in the third
codon position (GC3). Dinophysis caudata was more GC
rich (overall 60%, GC3 86%) than D. acuminata (59%
and 81%). Dividing the sequences by type, the bushy type
1 sequences from D. caudata (60% and 85%) and D. acuminata
(59% and 81%) were less GC biased than the type 2,
similar sequences (62% and 87%), and the type 3 divergent
sequences (62% and 90%) (supplementary fig. S2, Supplementary
Material online). When third codon positions
were excluded, the overall ML tree was similar to figure 4,
although bootstrap support was lower: the two type 1 single-species
clades were well supported (99% bootstrap for
D. acuminata and 76% for D. caudata), as was the type 2
clade (92%), and the long branch type 3 clade was present
but not supported.
Phylogenetic Analyses of Dinoflagellate Actin Genes
Maximum likelihood trees were made using a nucleotide
alignment from the 142 actin gene copies from this study
and an additional 135 dinoflagellate sequences from NCBI
(fig. 5). Except for the two closely related Dinophysis species,
the sequences from each species formed monophyletic
groups with bootstrap support ≥70%. The type 1
conserved, tightly clustered sequences with short branch
lengths from D. acuminata and D. caudata again formed
two distinct well-supported monophyletic groups. However,
the type 2 and 3 sequences from the two species showed
interspecific clustering of putatively orthologous sequences,
as was seen in the tree containing only Dinophysis species
(fig. 4). Trees made from the translated alignment also contained
intermingled sequences from the two Dinophysis
species, and the two fucoxanthin-containing species, Kar. veneficum
and Karenia brevis, were nested within the Dinophysis clade
but not with strong bootstrap support (supplementary fig.
S2, Supplementary Material online). The remaining species
were recovered as monophyletic clades.
Two more distantly related species, D. caudata and
A. carterae, were selected for comparison of raw pairwise
nucleotide and amino acid differences. These distributions
showed that nucleotide differences between species were
larger than within species, but that amino acid differences
between species clearly overlapped with the variation
[Figure 3. Panels A–F: histograms of the number of pairwise comparisons against pairwise nucleotide differences (A, B), pairwise amino acid differences (C, D), and substitutions per site, dS and dN (E, F), for Dinophysis caudata (A, C, E) and Dinophysis acuminata (B, D, F). Legend: genomic DNA, trans-spliced cDNA, partial cDNA. Lowercase letters a–f mark specific comparisons in panels E and F.]
FIG. 3. Histograms showing pairwise comparisons of gene copies within the two Dinophysis species. These histograms show the distribution of
nucleotide and amino acid differences when comparing different amplicons within the two Dinophysis species. Each different primer and
template combination were compared including partial genomic and cDNA amplicons with gene-specific primers and full-length amplicons
from cDNA. The top panels (A and B) depict nucleotide differences when comparing different gene copies. The middle panels (C and D) depict
amino acid differences, and the bottom panels (E and F) depict synonymous and nonsynonymous substitution rates per site. The left (A, C, and E)
and the right panels (B, D, and F) are the gene copies amplified from D. caudata and D. acuminata, respectively. The lowercase letters (a–f) in
panels E and F refer to specific branch lengths in figure 4.
Kim et al. ·doi:10.1093/molbev/msq332 MBE
within a species (fig. 5, inset). The average pairwise distance
for A. carterae and D. caudata was nine amino acids but
within-species distributions centered around three or five
amino acids. Branch lengths in the amino acid tree were
quite short, as reflected by the overlapping distributions
in raw pairwise distances.
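The raw pairwise comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sequences are toy strings rather than actin data, the helper names (`count_differences`, `within_differences`, `between_differences`) are hypothetical, and the sequences are assumed to be already aligned to equal length without gaps.

```python
from collections import Counter
from itertools import combinations, product


def count_differences(a, b):
    """Count positions that differ between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for x, y in zip(a, b) if x != y)


def within_differences(seqs):
    """Raw differences for every pair of copies drawn from one species."""
    return [count_differences(a, b) for a, b in combinations(seqs, 2)]


def between_differences(seqs_a, seqs_b):
    """Raw differences for every pair with one copy from each species."""
    return [count_differences(a, b) for a, b in product(seqs_a, seqs_b)]


# Toy aligned sequences (hypothetical, not real actin copies).
species_x = ["ATGGCCGAT", "ATGGCTGAT", "ATGGCCGAC"]
species_y = ["ATGGCAGAA", "ATGGCAGAG"]

within_x = within_differences(species_x)
between_xy = between_differences(species_x, species_y)

# Counter gives the histogram: number of pairwise comparisons per difference count.
print("within species X:", sorted(Counter(within_x).items()))
print("between X and Y: ", sorted(Counter(between_xy).items()))
```

If the within-species and between-species histograms share difference values, as they do in this toy case, the two distributions overlap in the sense used in the text. The same functions can be applied to translated sequences to compare amino acid differences.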
[Figure 4. Unrooted maximum likelihood tree with three bracketed groups: type 1, recently duplicated copies; type 2, similar copies; type 3, divergent copies. Scale bars: 0.01. Symbols distinguish SL cDNA, partial cDNA, genomic DNA, and pseudogene sequences from D. acuminata and D. caudata; lowercase letters a–f label selected branches.]
FIG. 4. Unrooted maximum likelihood tree of two Dinophysis species using a nucleotide alignment. The most likely unrooted tree using 103
actin gene copies from the two Dinophysis species is shown with selected bootstrap values (>70%). The tree can be divided into three
categories based on comparisons between the two species. Most copies form tight clusters of similar sequences (top) that vary by eight to ten
nucleotide differences (type 1). In the middle are the type 2 sequences that are most similar in between-species comparisons. At the bottom
are the type 3 divergent copies, the sequences that are highly divergent in comparisons between the two species. The different types of amplicons and
pseudogenes are shown with symbols; blue and red colors refer to D. caudata and D. acuminata, respectively (see supplementary table S1,
Supplementary Material online, for more details). The lowercase letters on colored branches refer to specific branches corresponding to
comparisons within species shown in figure 3.
Gene Copies from Other Species
When sequences from the six other species were examined,
similar trends of within-species nucleotide changes were seen
with synonymous differences exceeding nonsynonymous
changes (table 1). Here, we used the dS (synonymous substitution
rate) and dN (nonsynonymous substitution rate)
measures calculated in the context of a phylogenetic tree.
Overall, the mean values of dS between copies in a species
were much greater than those of dN, with dS values ranging from
0.034 to 0.532 (fig. 3E and F). The dN values were 1 order of
magnitude lower than dS and ranged from 0.004 to 0.048.
Gymnodinium catenatum, B. crassum, and A. carterae genomic
amplicons had sequences with frameshifts that we interpret
as pseudogenes (five of eight sequences for
G. catenatum) and showed relatively high dS values (fig. 5
and table 1). The corresponding dN values for these species
were also large. In contrast, the highly conserved Kat. rotundatum
sequences had the lowest dS value, but the dS value in
this species was still an order of magnitude greater than the dN
value.
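To make the distinction between synonymous and nonsynonymous change concrete, the sketch below classifies differing codons between two in-frame, aligned coding sequences. It is a crude codon-level count, not the tree-based dS and dN estimation used for the values reported above, and it does not normalize by the number of synonymous or nonsynonymous sites. The function name, the toy sequences, and the use of the standard genetic code are assumptions made for illustration.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order (assumed here).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_codon_differences(seq_a, seq_b):
    """Return (synonymous, nonsynonymous) counts of differing codons.

    A differing codon is counted as synonymous when both versions encode the
    same amino acid, and as nonsynonymous otherwise.
    """
    if len(seq_a) != len(seq_b) or len(seq_a) % 3 != 0:
        raise ValueError("sequences must be in frame and of equal length")
    synonymous = nonsynonymous = 0
    for i in range(0, len(seq_a), 3):
        codon_a, codon_b = seq_a[i:i + 3], seq_b[i:i + 3]
        if codon_a == codon_b:
            continue
        if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            synonymous += 1
        else:
            nonsynonymous += 1
    return synonymous, nonsynonymous


# Toy copies (hypothetical): GCC/GCT and GAT/GAC are silent, GAA/AAA is not.
copy_1 = "ATGGCCGATGAA"
copy_2 = "ATGGCTGACAAA"
result = classify_codon_differences(copy_1, copy_2)
print("synonymous, nonsynonymous codon differences:", result)
```

A pattern in which the first count consistently exceeds the second across many pairs is the qualitative signal described in the text; quantitative rates require a per-site method such as those implemented in PAML.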
The pseudogene sequences of G. catenatum were found
on long branches and contained substantial amino acid
substitutions (fig. 5 and supplementary fig. S2, Supplementary
Material online). Blastodinium crassum was also highly
divergent and appeared to have accumulated amino acid
substitutions. In contrast, the highly diverged
nucleotide sequences of Kar. veneficum were mostly
[Figure 5. Maximum likelihood nucleotide tree with clades labeled Dinophysis caudata, Dinophysis acuminata, Gymnodinium catenatum, Katodinium rotundatum, Blastodinium crassum, Karlodinium veneficum, Karenia brevis, and Amphidinium carterae, with Perkinsus marinus as outgroup; scale bar 0.05. The three Dinophysis sequence types (type 1, recently duplicated copies; type 2, similar copies; type 3, divergent copies) are bracketed, and pseudogenes are marked. Inset: histograms of the number of pairwise comparisons against nucleotide differences and amino acid differences for A. carterae, D. caudata, and A. carterae vs. D. caudata.]
FIG. 5. Maximum likelihood nucleotide tree of actin genes in eight dinoflagellate species. The most likely tree using 276 sequences from eight
species with a 774 nucleotide alignment is shown. The three types of Dinophysis sequences are bracketed. Inset into the figure are histograms
showing raw pairwise nucleotide and amino acid differences within and between Amphidinium carterae and Dinophysis caudata.
identical after translation into amino acids with only very
few amino acid differences.
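The criteria used here to flag potential pseudogenes (frameshifts or early stop codons that interrupt the open reading frame) can be illustrated with a simple check. This is a minimal sketch, not the procedure used in this study: the function name and toy sequences are hypothetical, the input is assumed to start in frame, and a frameshift is detected only through a length that is not a multiple of three, which would miss compensating insertions and deletions.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def open_reading_frame_problems(cds):
    """List reasons why an in-frame coding sequence may not encode a full protein."""
    problems = []
    if len(cds) % 3 != 0:
        problems.append("length not a multiple of 3 (possible frameshift)")
    # Split into complete codons, ignoring any trailing partial codon.
    usable = len(cds) - len(cds) % 3
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    # The final codon is skipped because a terminal stop codon is expected there.
    for index, codon in enumerate(codons[:-1]):
        if codon in STOP_CODONS:
            problems.append(f"premature stop codon {codon} at codon {index + 1}")
            break
    return problems


# Toy sequences (hypothetical).
intact = "ATGGCCGATGAATAA"
early_stop = "ATGGCCTAAGAATAA"
shifted = "ATGGCGATGAATAA"  # one base deleted relative to `intact`

for name, seq in [("intact", intact), ("early_stop", early_stop), ("shifted", shifted)]:
    print(name, open_reading_frame_problems(seq))
```

In practice, frameshifts are more reliably identified by comparing each copy with an alignment of intact sequences, since a partial amplicon need not begin or end on a codon boundary.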
Discussion
Dinoflagellate Nuclear Genomics
The nuclear genome of dinoflagellates remains among the
most poorly understood of any major eukaryote lineage.
With genome sizes ranging around 100 pg, direct
complete genomic sequencing and assembly would chal-
lenge even the most modern sequencing methods and
deepest pockets. However, Expressed Sequence Tag
(EST) sequencing from dinoflagellates has been a popular
tool to bypass the daunting genome (Bachvaroff et al. 2004;
Hackett et al. 2004; Lidie et al. 2005; Patron et al. 2005,
2006), and at least one directed study has compared cDNA
sequences with their genomic counterparts (Bachvaroff
and Place 2008). From these sequencing efforts, a rough
model can be described for dinoflagellate genomes. Many,
but not all, genes were found to have variation between
gene copies, often with most differences found in synonymous
sites (Reichman et al. 2003; Zhang et al. 2006;
Bachvaroff and Place 2008; Bachvaroff et al. 2009). Here,
we use a targeted approach to examine the evolution of
a representative dinoflagellate gene family. Two closely re-
lated but morphologically distinct species were selected,
and a conserved multicopy gene family, actin, was used
to understand the mode and tempo of gene amplification
in dinoflagellate nuclear genomes.
Gene Duplication in Dinoflagellates Compared with
Related Genomes
Sequencing multiple versions of actin from different dino-
flagellate species clearly demonstrates the abundance and
diversity of copies in the genome, especially when compared
with genomes from other alveolates. In a genomic
study of the related Apicomplexa, only a few gene families
were multicopy (Brayton et al. 2007). In the third alveolate
lineage, the ciliates, three rounds of whole genome dupli-
cation have been described in the ciliate Paramecium tet-
raurelia (Aury et al. 2006), and gene duplication and
diversification in the ciliate somatic genome appear to have
evolved independently in at least two ciliate lineages
(Robinson and Katz 2007). By contrast, the dinoflagellate
Dinophysis genome is large and has multiple actin gene
copies, but dinoflagellates do not appear to maintain so-
matic and germinal nuclei, as do ciliates.
The Actin Gene Complement in Other Genomes
Actin is often a multicopy gene. However, comparison of
the multicopy actin genes in dinoflagellates with the actin
gene complement in completely sequenced genomes sug-
gests distinct differences in the dinoflagellate duplication
pattern. In multicellular animals, the actin gene family phylogeny
corresponds well with distinct functional categories,
for example, cytosolic, smooth muscle, or cardiac muscle
actins (OOta and Saitou 1999). More broadly, many eukaryotes
have a complement of more divergent actin-related
proteins (arp1–11), each with distinct functions (Muller
et al. 2005).
As an example of actin gene diversity, we can look to
a recent study of the actin gene family from the complete
genome of the ‘‘slime mold’’ Dictyostelium discoideum. The
total complement of actin and arp genes in D. discoideum is
quite large, with 41 different genes, of which eight have
been identified as arp genes (Joseph et al. 2008). However,
of the 29 canonical actins from D. discoideum, only 21 are
contiguously alignable without gaps, and 17 encode iden-
tical proteins. The remaining ‘‘orphans’’ are divergent actin
genes relative to canonical actins but not clearly related to
an arp. Expression is overwhelmingly drawn from the pool
of identical amino acid copies, and four of the additional
actin copies may not be expressed (Joseph et al. 2008).
Thus, in D. discoideum, both expression and conservation
of amino acid sequence act together to restrict actin var-
iation in the proteome.
In dinoflagellates, by contrast, almost all canonical actin
protein sequences vary subtly from copy to copy, and there
seems to be no correlation of conservation and expression.
In the species that have been examined (here the two
Dinophysis species and previously A. carterae; Bachvaroff
and Place 2008), expression from the pool of total genomic
actin gene copies included variant gene copies, sometimes
even apparent pseudogenes. However, the amino acid vari-
ation has a limit: the sequences can be readily aligned without
gaps, and the distribution of amino acid differences was
clearly restricted, suggesting that we are only examining
canonical actins. Clearly, deeper proteomic and subcellular
localization methods would be required to prove similarity
or difference in function, but the apparent continuous dis-
tribution of amino acid differences suggests that we are
studying a single population of copies without distinct func-
tional differences. The rarefaction analysis suggests that the
sequences presented here are likely subsamples of the overall
set of copies, so it is likely that many more actin gene copies
are present in some dinoflagellates than in D. discoideum.
Clearly, deep EST sequencing and more difficult contiguous
genome sequencing would be required to fully sample both
conserved and divergent actin gene copies in dinoflagellates
as well as to sample the actin-related proteins.
To investigate the possibility that some unknown ampli-
fication, sequencing, or cloning artifact is at least partially
responsible for our observations of actin gene diversity, we
constructed rarefaction curves across a broad range of sequence
difference thresholds, from 0 to 7 base differences. Sequencing or
amplification error estimates would be on average less than
0.1%, or up to 2 bases in pairwise comparisons, and most
sequences differed much more. The rarefaction curves
showed substantial diversity between sequences and did
not collapse until sequences with 4 or more base differen-
ces were assumed to be identical (fig. 2). Furthermore, the
sequence differences were mostly synonymous; clearly, se-
quencing or amplification error would not lead to such
a pattern. The degree of variation differed from taxon to
taxon, with less variation observed in Kat. rotundatum
and more in B. crassum. Overall, our sampling method,
although not unbiased, appears to represent the underlying
features of the different genomes we sampled. Chimerism,
or PCR recombination, is another possible source of error,
but in this case, we would expect to see sequences intermediate
between the two parent sequences in phylogenetic
trees (Bachvaroff and Place 2008; Lahr and Katz
2009). Thus, chimerism would not explain the observed di-
vergent sequences in our trees.
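The rarefaction test described above can be sketched as follows: sequences that differ by no more than a chosen number of bases are treated as identical, and the expected number of distinct types is then computed for subsamples of increasing size. This is a minimal illustration under assumptions, not the exact procedure used in this study. The single-linkage grouping, the analytical rarefaction formula, the helper names, and the toy sequences are all choices made here for the example.

```python
from math import comb


def count_differences(a, b):
    """Count differing positions between two aligned, equal-length sequences."""
    return sum(1 for x, y in zip(a, b) if x != y)


def cluster_sizes(seqs, threshold):
    """Group sequences by single linkage; pairs within `threshold` differences merge."""
    parent = list(range(len(seqs)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(len(seqs)):
        for j in range(i + 1, len(seqs)):
            if count_differences(seqs[i], seqs[j]) <= threshold:
                parent[find(i)] = find(j)

    sizes = {}
    for i in range(len(seqs)):
        root = find(i)
        sizes[root] = sizes.get(root, 0) + 1
    return sorted(sizes.values(), reverse=True)


def rarefaction_curve(sizes):
    """Expected number of distinct types in a random subsample of n sequences."""
    total = sum(sizes)
    return [
        sum(1 - comb(total - size, n) / comb(total, n) for size in sizes)
        for n in range(1, total + 1)
    ]


# Toy aligned sequences (hypothetical).
seqs = ["AAAAAAAA", "AAAAAAAT", "AAAATTTT", "AAAATTTA", "CCCCCCCC"]

for threshold in (0, 1, 3):
    sizes = cluster_sizes(seqs, threshold)
    curve = rarefaction_curve(sizes)
    print(threshold, sizes, [round(value, 2) for value in curve])
```

A curve that keeps rising with sample size indicates that more distinct copies remain unsampled, whereas a curve that flattens at a low value as the threshold grows corresponds to the collapse mentioned in the text.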
Other Gene Families in Dinoflagellates
The pattern of actin gene family evolution probably applies
to other dinoflagellate genes. Here, we used the actin gene
as a typical highly expressed gene or as an example for de-
scribing dinoflagellate genome evolution. In the present
study, we have shown a diverse multicopy actin gene family
in five additional dinoflagellate species, and this pattern
may also be true for other highly expressed dinoflagellate
genes (Bachvaroff and Place 2008; Moustafa et al. 2010).
The pattern we describe here of multicopy divergent gene
families seems to apply well to a class of dinoflagellate
genes including the peridinin chlorophyll protein (PCP)
gene in Symbiodinium sp. (Reichman et al. 2003), the pro-
liferating cell nuclear antigen gene in Pfiesteria piscicida
(Zhang et al. 2006), and perhaps other gene families in dinoflagellates
including rubisco (Rowan et al. 1996; Zhang
and Lin 2003; Moustafa et al. 2010). Many plastid-associated
genes such as psbO and light harvesting complex (LHC)
genes were also found in multicopy variant families
(Bachvaroff et al. 2004). Interestingly, data from the syndinian
dinoflagellate Amoebophrya sp. did not reveal multiple
divergent gene copies for a small set of genes (Bachvaroff
et al. 2009). Thus, the dinoflagellate gene family pattern
described here may be restricted to the dinokaryotes or
those dinoflagellates with full-fledged dinokaryotic chromosome
organization. In this context, Oxyrrhis marina,
a free-living heterotrophic dinoflagellate that does not have
a clear affinity either with dinokaryotic or syndinian dino-
flagellates (Saldarriaga et al. 2003) also has multiple actin
copies (Sano and Kato 2009).
Many of these multicopy genes, including actin in A.
carterae and Kar. veneficum, were found in tandem arrays
of slightly varying copies, a feature we were unable to con-
firm using PCR strategies in these two Dinophysis species
(data not shown). Another example would be the PCP gene
in Lingulodinium polyedra that seems to form a homogeneous
repeat array of thousands of identical copies (Le et al.
1997). Repeats of individual protein units within a single
transcript appear to also be common, as was found for
rubisco (Rowan et al. 1996; Zhang and Lin 2003), LHC
(Hiller et al. 1995), and luciferase genes (Li et al. 1997).
Mode and Tempo of Gene Family Evolution
Comparisons of the two closely related Dinophysis species
illuminate the dynamics of dinoflagellate gene family evo-
lution. Actin gene family evolution in dinoflagellates likely
represents a pattern typical of some other dinoflagellate
genes, and comparisons of two closely related species
should expose the dynamics of gene family evolution more
clearly than studies of a single species could. The two spe-
cies allow the different gene copies to be calibrated in time
relative to a speciation event. In this context, we selected
two clearly distinct but closely related species based on cell
form. Our reference frame is relative to the unknown di-
vergence time of the two species, not units of absolute
time.
Strikingly, pairwise distances among sequences do not
show a monotonic increase in distance, but rather segre-
gate into three distinct distance types, suggesting episodic
diversification of sequences. The different actin gene copies
seem to be drawn from three different populations based
on between-species divergence. These three types of
sequences appear to have diverged from each other at dif-
ferent time points during and after speciation. First, the
most commonly found sequences were recently duplicated
after speciation and seem to have diverged consistently in
both species (termed here “type 1”). Variation within the
pool of recently duplicated copies was quite similar in both
species and was centered around eight to ten nucleotide
differences in within-species comparisons. The majority of sequences
from both species were sampled from this pool suggesting
recent parallel amplification in both species of a specific
gene copy.
A second category of sequences, termed here “type 2,”
included those sequences most similar in comparisons between
the species. Although these copies could potentially
be the result of gene transfer between species, either via
lateral transfer or gene flow, their similarity more likely
reflects limited divergence since speciation: these copies
were probably shared between the two species right up
until species divergence.
Finally, there is a set of very divergent sequences, termed
here “type 3.” These copies seem to represent a different
class of actin gene copies, likely present in the most recent
common ancestor of the two species, and either subject to
a faster relative mutation rate or a longer divergence time than
the other gene copies. These sequences do not have any
clear functional difference from canonical actins because
most differences are only present at the nucleotide level.
However, some of the divergent cDNA sequences from D. caudata
do appear to form the tail of the distribution when
comparing amino acid differences (fig. 3C), suggesting potential
phenotypic differences.
The highly conserved nature of the actin amino acid se-
quence helps to confirm that orthologous sequences, or in-
paralogs, are being compared, as was demonstrated in the
phylogenetic trees. Thus, most of the variation is neutral or
near neutral, in terms both of selection and of phenotype.
Models of Gene Family Evolution in Dinoflagellates
Taken together, the three actin gene types indicated in-
complete lineage sorting during the divergence of the
two Dinophysis species and inconsistent divergence time
or rate of change between gene copies. From the presence
of three categories of actin sequences, we infer that multiple
rounds of gene duplication events have occurred and
that the most recent common ancestor likely contained
multiple paralogous actin gene copies. This is a special case
of incomplete lineage sorting, which has been described as
paralogous lineage sorting (Rooney and Ward 2008). The
data suggest discrete saltatory gene amplification events
rather than a continuous production of new gene copies,
although deeper sampling might produce a more contin-
uous distribution of differences. Also apparent from our
data is parallel selection in both species of a particular
set of gene copies for duplication. We conclude that the
most recent common ancestor of these two species likely
contained a diverse suite of actin gene copies and that gene
family expansion and contraction have continued during
and after the two species became isolated.
Overall, the number and diversity of gene copies, the
accumulation of neutral mutations, the presence of many
pseudogenes, and the pattern of incomplete lineage sorting
fit well with the birth and death model of gene family evo-
lution, especially when compared with the concerted evo-
lution model (Nei and Rooney 2005). The data suggest that
new gene copies are produced over time and that diver-
gence between the gene copies seems to happen indepen-
dently. Incomplete lineage sorting is considered to be
a strong indicator of birth and death evolution (Rooney
2004). However, we have to keep in mind that in dinoflagellates,
this mode of gene family evolution seems to apply
to more genes, and to larger numbers of copies, than
in other eukaryotic gene families, although apparently
not as a byproduct of whole genome duplication. Also, the
apparent concerted evolution of synonymous sites in actin
gene copies in the dinoflagellate A. carterae, and the iden-
tity of the PCP gene family in L. polyedra argue for differ-
ences in specific lineages and genes. This suggests that
a mixed model with some features of concerted evolution,
or a birth and death model with strong purifying selection
coupled with rapid gene turnover may best explain these
data (Rooney and Ward 2005). Certainly, deeper study of
dinoflagellate populations including several genes from mul-
tiple closely related species would provide deeper insight.
In some sense, these multicopy gene families can operate
as a population within an individual. Each individual can
contain a specific and dynamic population of gene copies,
creating an additional hierarchy to consider in the context
of standard population genetics models.
Implications for Phylogeny
Amino acid–based phylogenetic trees using multiple actin
gene copies showed that actin may not be useful for esti-
mating phylogeny of closely related dinoflagellate species.
Because most variation between actin copies within a spe-
cies was synonymous, translation into amino acids would
eliminate most within-species variability. However, in the
case of actin, amino acid variation between distantly re-
lated species overlaps with variation within a species
(fig. 5), and the two closely related Dinophysis species were
intermingled in amino acid trees (supplementary fig. S2,
Supplementary Material online). Thus, actin may be too
conserved to be useful for phylogeny when comparing
closely related species within the dinokaryotes. In general,
phylogenies using protein-coding genes in dinoflagellates
will require estimates of within- and between-species var-
iation for that gene.
Supplementary Material
Supplementary figures S1 and S2 and table S1 are available
at Molecular Biology and Evolution online
(http://www.mbe.oxfordjournals.org/).
Acknowledgments
The authors wish to thank Dr D. Wayne Coats of the
Smithsonian Environmental Research Center for mentoring
and support. The authors thank G. Concepcion, G. Mendez,
W. Macturk, and R. Timme of the University of Maryland
and two anonymous reviewers for critical reviews of the
manuscript. This project was funded in part by a National
Science Foundation grant from the Assembling the Tree of
Life project (#EF-0629624). This work was supported by
a Smithsonian Postdoctoral Fellowship grant to S.K.
References
Aury JM, Jaillon O, Duret L, et al. (43 co-authors). 2006. Global
trends of whole-genome duplications revealed by the ciliate
Paramecium tetraurelia. Nature 444:171–178.
Bachvaroff TR, Concepcion GT, Rogers CR, Delwiche CF. 2004.
Dinoflagellate EST data indicate massive transfer of chloroplast
genes to the nucleus. Protist 55:65–78.
Bachvaroff TR, Place AR. 2008. From stop to start: tandem gene
arrangement, copy number and trans-splicing sites in the
dinoflagellate Amphidinium carterae. PLoS One. 3:e2929.
Bachvaroff TR, Place AR, Coats DW. 2009. Expressed sequence tags
from Amoebophrya sp. infecting Karlodinium veneficum:
comparing host and parasite sequences. J Eukaryot Microbiol.
56:531–541.
Brayton KA, Lau AO, Herndon DR, et al. (28 co-authors). 2007.
Genome sequence of Babesia bovis and comparative analysis of
apicomplexan hemoprotozoa. PLoS Pathog. 3:1401–1413.
Coats DW, Bachvaroff TR, Handy SM, Kim S, Garate-Lizarrage I,
Delwiche CF. 2008. Prevalence and phylogeny of parasitic
dinoflagellates (Genus Blastodinium) infecting copepods in the
Gulf of California. Oceanides 23:63–77.
Delwiche CF. 1999. Tracing the thread of plastid diversity through
the tapestry of life. Am Nat. 154:S164–S177.
Doyle JJ, Doyle JL. 1987. A rapid DNA isolation procedure for small
quantities of fresh leaf tissue. Phytochem Bull. 19:11–15.
Eun HM. 1996. Enzymology primer for recombinant DNA
technology. San Diego (CA): Academic Press.
Galuzzi L, Penna A, Bertozzini E, Vila M, Garces E, Magnani M. 2004.
Development of a real-time PCR assay for rapid detection and
quantification of Alexandrium minutum (a dinoflagellate). Appl
Environ Microbiol. 70:1199–1206.
Hackett J, Yoon H, Soares M, Bonaldo M, Casavant T, Scheetz T,
Nosenko T, Bhattacharya D. 2004. Migration of the plastid
genome to the nucleus in a peridinin dinoflagellate. Curr Biol.
14:213–218.
Handy SM, Bachvaroff TR, Timme RE, Coats DW, Kim S,
Delwiche CF. 2009. Phylogeny of four dinophysiacean genera
(Dinophyceae, Dinophysiales) based on rDNA sequences from
single cells and environmental samples. J Phycol. 45:1163–1174.
Hiller RG, Wrench PM, Sharples FP. 1995. The light harvesting
chlorophyll a-c-binding protein of dinoflagellates: a putative
polyprotein. FEBS Lett. 363:175–178.
Huber T, Faulkner G, Hugenholtz P. 2004. Bellerophon: a program to
detect chimeric sequences in multiple sequence alignments.
Bioinformatics 20:2317–2319.
Joseph JM, Fey P, Ramalingam N, Liu X, Rohlfs M, Noegel AA,
Müller-Taubenberger A, Glöckner G, Schleicher M. 2008. The actinome of
Dictyostelium discoideum in comparison to actins and
actin-related proteins from other organisms. PLoS One. 3:e2654.
Lahr DJ, Katz LA. 2009. Reducing the impact of PCR-mediated
recombination in molecular evolution and environmental
studies using a new-generation high-fidelity DNA polymerase.
Biotechniques 47:857–866.
Le QH, Markovic P, Hastings JW, Jovine RV, Morse D. 1997. Structure
and organization of the peridinin-chlorophyll a-binding protein
gene in Gonyaulax polyedra. Mol Gen Genet. 255:595–604.
Li L, Hong R, Hastings J. 1997. Three functional luciferase domains in
a single polypeptide chain. Proc Natl Acad Sci. 94:8954–8958.
Lidie KB, Ryan JC, Barbier M, Van Dolah FM. 2005. Gene expression
in Florida red tide dinoflagellate Karenia brevis: analysis of an
expressed sequence tag library and development of a DNA
microarray. Mar Biotechnol. 7:481–493.
Lidie KB, Van Dolah FM. 2007. Spliced leader RNA-mediated trans-splicing
in a dinoflagellate, Karenia brevis. J Eukaryot Microbiol.
54:427–435.
Lucas I, Vesk A. 1990. The fine structure of two photosynthetic species
of Dinophysis (Dinophysiales, Dinophyceae). J Phycol. 26:345–357.
Maddison WP, Maddison PR. 2002. MacClade version 4: analysis of
phylogeny and character evolution. Sunderland (MA): Sinauer
Associates.
Morgan DR, Soltis DE. 1995. Phylogenetic relationships among
members of Saxifragaceae sensu lato based on rbcL sequence
data. Ann Mo Bot Gard. 82:208–234.
Moustafa A, Evans AN, Kulis DM, Hackett JD, Erdner DL, Anderson DM,
Bhattacharya D. 2010. Transcriptome profiling of a toxic
dinoflagellate reveals a gene-rich protist and a potential impact
on gene expression due to bacterial presence. PLoS One. 5:e9688.
Muller J, Oma Y, Vallar L, Friederich E, Poch O, Winsor B. 2005.
Sequence and comparative genomic analysis of actin-related
proteins. Mol Biol Cell. 16:5736–5748.
Nei M, Gojobori T. 1986. Simple methods for estimating the
numbers of synonymous and nonsynonymous nucleotide
substitutions. Mol Biol Evol. 3:418–426.
Nei M, Rooney AP. 2005. Concerted and birth-and-death evolution
of multigene families. Annu Rev Genet. 39:121–152.
OOta S, Saitou N. 1999. Phylogenetic relationship of muscle tissues
deduced from superimposition of gene trees. Mol Biol Evol.
16:856–867.
Park MG, Kim S, Kim HS, Myung G, Kang YG, Yih W. 2006. First
successful culture of the marine dinoflagellate Dinophysis
acuminata. Aquat Microb Ecol. 45:101–106.
Patron NJ, Waller RF, Keeling PJ. 2005. Complex protein targeting to
dinoflagellate plastids. J Mol Biol. 348:1015–1024.
Patron NJ, Waller RF, Keeling PJ. 2006. A tertiary plastid uses genes
from two endosymbionts. J Mol Biol. 357:1373–1382.
Rae P. 1976. Hydroxymethyluracil in eukaryote DNA: a natural feature
of the pyrrophyta (dinoflagellates). Science 194:1062–1064.
Reichman J, Wilcox T, Vize P. 2003. PCP gene family in
Symbiodinium from Hippopus hippopus: low level of concerted
evolution, isoform diversity and spectral tuning of chromo-
phores. Mol Biol Evol. 20:2143–2154.
Robinson T, Katz LA. 2007. Non-mendelian inheritance of paralogs
of 2 cytoskeletal genes in the ciliate Chilodonella uncinata. Mol
Biol Evol. 24:2495–2503.
Rooney AP. 2004. Mechanisms underlying the evolution and
maintenance of functionally heterogeneous 18S rRNA genes in
Apicomplexans. Mol Biol Evol. 21:1704–1711.
Rooney AP, Ward TJ. 2005. Evolution of a large ribosomal RNA
multigene family in filamentous fungi: birth and death of
a concerted evolution paradigm. Proc Natl Acad Sci U S A. 102:
5084–5089.
Rooney AP, Ward TJ. 2008. Birth-and-death evolution of the
internalin multigene family in Listeria. Gene 427:124–128.
Rowan R, Whitney SM, Fowler A, Yellowlees D. 1996. Rubisco in
marine symbiotic dinoflagellates: form II enzymes in eukaryotic
oxygenic phototrophs encoded by a nuclear multigene family.
Plant Cell. 8:539–553.
Saldarriaga JF, McEwan ML, Fast NM, Taylor FJR, Keeling PJ. 2003.
Multiple protein phylogenies show that Oxyrrhis marina and
Perkinsus marinus are early branches of the dinoflagellate
lineage. Int J Syst Evol Microbiol. 53:355–365.
Sano J, Kato K. 2009. Localization and copy number of the
protein-coding genes actin, a-tubulin, and hsp90 in the
nucleus of a primitive dinoflagellate, Oxyrrhis marina. Zool
Sci. 26:745–753.
Schnepf E, Elbrächter M. 1988. Cryptophycean-like double
membrane-bound chloroplast in the dinoflagellate, Dinophysis
Ehrenb.: evolutionary, phylogenetic and toxicological implications.
Botanica Acta. 101:196–203.
Schnepf E, Elbrächter M. 1999. Dinophyte chloroplasts and
phylogeny—a review. Grana 38:81–97.
Spector DL. 1984. Dinoflagellate nuclei. In: Spector DL, editor.
Dinoflagellates. London: Academic Press. p. 107–147.
Stamatakis A. 2006. RAxML-VI-HPC: maximum likelihood-based
phylogenetic analysis with thousands of taxa and mixed models.
Bioinformatics 22:2688–2690.
Takishita K, Koike K, Maruyama T, Ogata T. 2002. Molecular
evidence for plastid robbery (Kleptoplastidy) in Dinophysis,
a dinoflagellate causing diarrhetic shellfish poisoning. Protist
153(3):293–302.
Yang Z, Nelsen R, Goldman N, Pedersen AK. 1997. PAML: a program
package for phylogenetic analysis by maximum likelihood.
Comput Appl Biosci. 13:555–556.
Zhang H, Hou Y, Lin S. 2006. Isolation and characterization of
proliferating cell nuclear antigen from the dinoflagellate
Pfiesteria piscicida. J Eukaryot Microbiol. 53:142–150.
Zhang J, Hou Y, Miranda L, Campbell D, Sturm N, Gaasterland T,
Lin S. 2007. Spliced leader RNA trans-splicing in dinoflagellates.
Proc Natl Acad Sci. 104:4618–4623.
Zhang H, Lin S. 2003. Complex gene structure of the form II Rubisco
in the dinoflagellate Prorocentrum minimum (Dinophyceae).
J Phycol. 39:1160–1171.
Kim et al. ·doi:10.1093/molbev/msq332 MBE
1480
... Under concerted evolution, the duplicated genes were extensively homogenized due to gene conversion or unequal crossover, resulting in a pattern of intra-species gene clustering (Nei and Rooney, 2005;Zhu et al., 2013). In the latter model, individual gene copies were expected to vary independently across the evolution after duplication, some gene copies produced novel genes, some of which existed in the genome for a long time, while others were lost or degenerated into pseudogenes through deletion events (Kim et al., 2011;Nei and Rooney, 2005). Our phylogenetic analyses revealed an obvious subfamily clustering pattern rather than intra-species clustering pattern for ciliates (Fig. 2, Supplementary Fig. S1). ...
... In all, variation in the ciliate actin gene family evolution fit best with the birth and death model based on subfamily clustering pattern, strong negative selection, recent duplications, pseudogenes, and incomplete lineage sorting. Thus, we speculated that the birth and death model might provide an important evolutionary pathway to increase the functional complexity of actin gene family as reported in dinoflagellates and primates (Eyun et al., 2017;Kim et al., 2011;Zhu et al., 2013). ...
Article
Actin gene family is a divergent and ancient eukaryotic cellular cytoskeletal gene family, and participates in many essential cellular processes. Ciliated protists offer us an excellent opportunity to investigate gene family evolution, since their gene families evolved faster in ciliates than in other eukaryotes. Nonetheless, actin gene family is well studied in few model ciliate species but little is known about its evolutionary patterns in ciliates. Here, we analyzed the evolutionary pattern of eukaryotic actin gene family based on genomes/transcriptomes of 36 species covering ten ciliate classes, as well as those of nine non-ciliate eukaryotic species. Results showed: (1) Except for conventional actins and actin-related proteins (Arps) shared by various eukaryotes, at least four ciliate-specific subfamilies occurred during evolution of actin gene family. Expansions of Act2 and ArpC were supposed to have happen in the ciliate common ancestor, while expansions of ActI and ActII may have occurred in the ancestor of Armophorea, Muranotrichea, and Spirotrichea. (2) The number of actin isoforms varied greatly among ciliate species. Environmental adaptability, whole genome duplication (WGD) or segmental duplication events, distinct spatial and temporal patterns of expression might play driving forces for the increasement of isoform numbers. (3) The 'birth and death' model of evolution could explain the evolution of actin gene family in ciliates. And actin genes have been generally under strong negative selection to maintain protein structures and physiological functions. Collectively, we provided meaningful information for understanding the evolution of eukaryotic actin gene family.
... Though this process ensures even the most divergent rDNA alleles remain species-specific, it is not perfect. Consequently, some dinoflagellate species exhibit much higher inter-allelic sequence variation in the rDNA genes, with numerous pseudogene copies often observed, though they still segregate into species-specific clades [26, 76,81,82] (Figs 1 and 2). ...
... Another subgroup of species falling into this group are those which are morphologically distinct but have indistinguishable rDNA sequences. A good example is some of the described species in the genus Dinophysis [19,78,80-82,103-106] (S3 Table in S1 File). This is best illustrated in a recent study by Wolny et al. (2020) [107]. ...
Article
Dinoflagellate species are traditionally defined using morphological characters, but molecular evidence accumulated over the past several decades indicates many morphologically-based descriptions are inaccurate. This recognition led to an increasing reliance on DNA sequence data, particularly rDNA gene segments, in defining species. The validity of this approach assumes the divergence in rDNA or other selected genes parallels speciation events. Another concern is whether single gene rDNA phylogenies by themselves are adequate for delineating species or if multigene phylogenies are required instead. Currently, few studies have directly assessed the relative utility of multigene versus rDNA-based phylogenies for distinguishing species. To address this, the current study examined D1-D3 and ITS/5.8S rDNA gene regions, a multi-gene phylogeny, and morphological characters in Gambierdiscus and other related dinoflagellate genera to determine if they produce congruent phylogenies and identify the same species. Data for the analyses were obtained from previous sequencing efforts and publicly available dinoflagellate transcriptomic libraries, as well as from transcriptomic libraries of nine additional well-characterized Gambierdiscus species generated in this study. The D1-D3 and ITS/5.8S phylogenies successfully identified the described Gambierdiscus and Alexandrium species. Additionally, the data showed that the D1-D3 and multigene phylogenies were equally capable of identifying the same species. The multigene phylogenies, however, showed different relationships among species and are likely to prove more accurate at determining phylogenetic relationships above the species level. These data indicated that D1-D3 and ITS/5.8S rDNA region phylogenies are generally successful for identifying species of Gambierdiscus, and likely those of other dinoflagellates.
To assess how general this finding is likely to be, rDNA molecular phylogenies from over 473 manuscripts representing 232 genera and 863 described species of dinoflagellates were reviewed. Results showed the D1-D3 rDNA and ITS phylogenies in combination are capable of identifying 97% of dinoflagellate species, including all the species belonging to the genera Alexandrium, Ostreopsis and Gambierdiscus, although it should be noted that multi-gene phylogenies are preferred for inferring relationships among these species. A protocol is presented for determining when D1-D3, confirmed by ITS/5.8S rDNA sequence data, would take precedence over morphological features when describing new dinoflagellate species. This protocol addresses situations such as: a) when a new species is both morphologically and molecularly distinct from other known species; b) when a new species and closely related species are morphologically indistinguishable, but genetically distinct; and c) how to handle potentially cryptic species and cases where morphotypes are clearly distinct but have the same rDNA sequence. The protocol also addresses other molecular, morphological, and genetic approaches required to resolve species boundaries in the small minority of species where the D1-D3/ITS region phylogenies fail.
... They were referred to as Dino- or Mesokaryota, and sometimes viewed as a fourth domain of life unto themselves [1]. These perplexing nuclear characteristics include large genomes, modified DNA bases, permanently condensed liquid-crystalline cholesteric-like chromosomes, a lack of nucleosomes, highly duplicated genes found in tandem arrays, a gene organization lacking typical eukaryotic conserved motifs, and a massive transfer of plastid genes to the nuclear genome [10,13,17,19,29-32]. The application of phylogenetic methods and molecular systematic data revealed that dinoflagellates reside firmly in the crown of the eukaryotes, among the Alveolates rather than belonging to a unique domain of life or even a basal lineage of eukaryotes [4,33]. ...
... Whether the gene duplication of members of a tandem array in dinoflagellates has arisen due to concerted evolution or whether it represents a birth-death model has been examined in some detail for both actin and Peridinin Chlorophyll-a binding protein genes of Amphidinium carterae and Symbiodinium respectively [31,48]. In most eukaryotes, the sequence uniformity in tandem arrays of rRNA genes is thought to be maintained by concerted evolution. ...
Thesis
Dinoflagellates possess large genomes in which most genes are present in many copies. This has made studies of their genomic organization and phylogenetics challenging. Recent advances in sequencing technology have made deep sequencing of dinoflagellate transcriptomes feasible. This dissertation investigates the genomic organization of dinoflagellates to better understand the challenges of assembling dinoflagellate transcriptomic and genomic data from short read sequencing methods, and develops new techniques that utilize deep sequencing data to identify orthologous genes across a diverse set of taxa. To better understand the genomic organization of dinoflagellates, a genomic cosmid clone of the tandemly repeated gene Alcohol Dehydrogenase (AHD) was sequenced and analyzed. The organization of this clone was found to be counter to prevailing hypotheses of genomic organization in dinoflagellates. Further, a new non-canonical splicing motif was described that could greatly improve the automated modeling and annotation of genomic data. A custom phylogenetic marker discovery pipeline, incorporating methods that leverage the statistical power of large data sets, was written. A case study on Stramenopiles was undertaken to test its utility in resolving relationships between known groups as well as the phylogenetic affinity of seven unknown taxa. The pipeline generated a set of 373 genes useful as phylogenetic markers that successfully resolved relationships among the major groups of Stramenopiles, and placed all unknown taxa on the tree with strong bootstrap support. This pipeline was then used to discover 668 genes useful as phylogenetic markers in dinoflagellates. Phylogenetic analysis of 58 dinoflagellates, using this set of markers, produced a phylogeny with good support of all branches. The Suessiales were found to be sister to the Peridiniales. The Prorocentrales formed a monophyletic group with the Dinophysiales that was sister to the Gonyaulacales.
The Gymnodiniales was found to be paraphyletic, forming three monophyletic groups. While this pipeline was used to find phylogenetic markers, it will likely also be useful for finding orthologs of interest for other purposes, for the discovery of horizontally transferred genes, and for the separation of sequences in metagenomic data sets.
... Genetic duplication is known to impact genome evolution of dinoflagellates, with genes occurring in high copy numbers implicating essential functions (e.g. [22,23]), possibly facilitated by the introgression of transcripts into the genome following trans-splicing of spliced leader in transcription [24,25]. We investigated the evolution of protein families to search for evidence of functional innovation and divergence within species, and its potential connection to lifestyle. ...
Article
Dinoflagellates in the order Suessiales include the family Symbiodiniaceae, which have essential roles as photosymbionts in corals, and their cold-adapted sister group, Polarella glacialis. These diverse taxa exhibit extensive genomic divergence, although their genomes are relatively small (haploid size < 3 Gbp) when compared with most other free-living dinoflagellates. Different strains of Symbiodiniaceae form symbiosis with distinct hosts and exhibit different regimes of gene expression, but intraspecific whole-genome divergence is poorly understood. Focusing on three Symbiodiniaceae species (the free-living Effrenium voratum and the symbiotic Symbiodinium microadriaticum and Durusdinium trenchii) and the free-living outgroup P. glacialis, for which whole-genome data from multiple isolates are available, we assessed intraspecific genomic divergence with respect to sequence and structure. Our analysis, based on alignment and alignment-free methods, revealed a greater extent of intraspecific sequence divergence in Symbiodiniaceae than in P. glacialis. Our results underscore the role of gene duplication in generating functional innovation, with a greater prevalence of tandemly duplicated single-exon genes observed in the genomes of free-living species than in symbionts. These results demonstrate the remarkable intraspecific genomic divergence in dinoflagellates under the constraint of reduced genome sizes, shaped by genetic duplications and symbiogenesis events during the diversification of Symbiodiniaceae.
... A possible explanation for this difference is that the DinoSL added to the transcripts contains a potential promoter motif, TTT(G), which is then retroposed into the genome together with coding sequences (Figure 1). This hypothesis is supported by several observations, the first being that the DinoSL relicts located between −50 and −100 from the start codon [the usual length of the 5′ UTR in dinoflagellates (Zhang et al., 2007; Kim et al., 2010)] are more conserved regardless of their ages. In addition, the potential promoter motif, TTT(G), is more conserved than other motifs in DinoSL relicts upstream of the retrogene. ...
Article
The birth and evolution of retrogenes have played crucial roles in genome evolution. Dinoflagellates represent a unique lineage for retrogene research because the retrogenes can be reliably identified by the presence of a 22 nucleotide splice leader called DinoSL, which is post-transcriptionally added to the 5′ terminus of all mRNAs. Compared to studies of retrogenes conducted in other model genomes, dinoflagellate retrogenes can potentially be more comprehensively characterized because intron-containing retrogenes have already been detected. Unfortunately, dinoflagellate retrogene research has long been neglected. Here, we review the work on dinoflagellate retrogenes and show their distinct character. Like the dinoflagellate genome itself, dinoflagellate retrogenes are also characterized by many unusual features, including a high survival rate and large numbers in the genome. These data are critical complements to what we know about retrogenes, and will further frame our understanding of retroposition and its roles in genome evolution, as well as providing new insights into retrogene studies in other genomes.
... We found the highest degree of conservation in DinoSLs located between −50 and −100 (fig. 1D), a location where dinoflagellate promoters are likely to be found based on the usual length of the 5′-UTR (Zhang et al. 2007; Kim et al. 2011). ...
Article
Gene retroposition is an important mechanism of genome evolution but the role it plays in dinoflagellates, a critical player in marine ecosystems, is not known. Until recently, when the genomes of two coral-symbiotic dinoflagellates, Symbiodinium kawagutii and S. minutum, were released, it has not been possible to systematically study these retrogenes. Here we examine the abundant retrogenes (23% of the total genes) in these species. The hallmark of retrogenes in the genome is the presence of DCCGTAGCCATTTTGGCTCAAG, a spliced leader (DinoSL) constitutively trans-spliced to the 5′-end of all nucleus-encoded mRNAs. Although the retrogenes have often lost part of the 22-nt DinoSL, the putative promoter motif from the DinoSL, TTT(T/G), is consistently retained in the upstream region of these genes, providing an explanation for the high survival rate of retrogenes in dinoflagellates. Our analysis of DinoSL sequence divergence revealed two major bursts of retroposition in the evolutionary history of Symbiodinium, occurring at 60 and 6 Ma. Reconstruction of the evolutionary trajectory of the Symbiodinium genomes mapped these 2 times to the origin and rapid radiation of this dinoflagellate lineage, respectively. GO analysis revealed differential functional enrichment of the retrogenes between the two episodes, with a broad impact on transport in the first bout and more localized influence on symbiosis-related processes such as cell adhesion in the second bout. This study provides the first evidence of large-scale retroposition as a major mechanism of genome evolution for any organism and sheds light on evolution of coral symbiosis. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
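The abstract above identifies retrogenes by the 22-nt DinoSL sequence and by the retained TTT(T/G) motif upstream of the gene. As a rough illustration only, and not the authors' pipeline, the sketch below shows how such degenerate motifs could be located in a sequence. It assumes the standard IUPAC nucleotide codes (so the leading D means A, G or T, and TTT(T/G) is written TTTK); the function names and the toy sequences in the usage note are hypothetical.

```python
import re

# Standard IUPAC nucleotide codes expressed as regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

# 22-nt spliced leader as quoted in the abstract.
DINOSL = "DCCGTAGCCATTTTGGCTCAAG"
# Putative promoter motif TTT(T/G), written with the IUPAC code K (G or T).
PROMOTER_MOTIF = "TTTK"


def iupac_to_regex(motif):
    """Translate an IUPAC motif into a regular expression string."""
    return "".join(IUPAC[base] for base in motif.upper())


def find_motif(sequence, motif):
    """Return 0-based start positions of all matches, including overlapping ones."""
    # A zero-width lookahead lets finditer report overlapping matches.
    pattern = re.compile("(?=" + iupac_to_regex(motif) + ")")
    return [m.start() for m in pattern.finditer(sequence.upper())]


def upstream_hits(upstream, motif=PROMOTER_MOTIF):
    """Report motif starts relative to the start codon.

    `upstream` is assumed to be the sequence immediately 5' of the start
    codon, so its last base is position -1 (an illustrative convention).
    """
    n = len(upstream)
    return [pos - n for pos in find_motif(upstream, motif)]
```

For example, `find_motif("AAAA" + "TCCGTAGCCATTTTGGCTCAAG" + "CCCC", DINOSL)` locates the full leader at index 4 of a made-up sequence, and `upstream_hits("ACGTTTGAC")` reports the motif at position −6. Real retrogenes often carry only partial DinoSL relicts, which exact matching like this would miss.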
... Since 5′- Fig. 1 The alignment of SL-RNAs. SL-RNA sequences were found either at the NCBI (the National Center for Biotechnology Information) website and/or directly in publications (Brehm et al. 2000; Davis 1997; Derelle et al. 2010; Douris et al. 2010; Ebel et al. 1999; Evans et al. 1997; Ganot et al. 2004; Gibson et al. 2000; Kim et al. 2011; Marlétaz et al. 2008; Miller et al. 1986; Muhich et al. 1987; Pouchkina-Stantcheva and Tunnacliffe 2005; Rajkovic et al. 1990; Yang et al. 2015; Yeats et al. 2010; Zayas et al. 2005; Zhang et al. 2009). The ClustalO alignment (Sievers et al. 2011) of SL-RNAs was manually edited in BioEdit (Hall 1999) with respect to the fact that the splice-donor site should be GU in the majority of species and the putative Sm-binding site at the 3′-end of the SL-RNA molecule should resemble that of trypanosomatids (Gibson et al. 2000). ...
Article
Trans-splicing is a process by which 5′- and 3′-ends of two pre-RNA molecules transcribed from different sites of the genome can be joined together to form a single RNA molecule. The spliced leader (SL) trans-splicing is mediated by the spliceosome and it allows the replacement of 5′-end of pre-mRNA by 5′(SL)-end of SL-RNA. This form of splicing has been observed in many phylogenetically unrelated eukaryotes. Either the SL trans-splicing (SLTS) originated in the last eukaryotic common ancestor (LECA) (or even earlier) and it was lost in most eukaryotic lineages, or this mechanism of RNA processing evolved several times independently in various unrelated eukaryotic taxa. The bioinformatic comparisons of SL-RNAs from various eukaryotic taxonomic groups have revealed the similarities of secondary structures of most SL-RNAs and a relative conservation of their splice sites (SSs) and Sm-binding sites (SmBSs). We propose that such structural and functional similarities of SL-RNAs are unlikely to have evolved repeatedly many times. Hence, we favor the scenario of an early evolutionary origin for the SLTS and multiple losses of SL-RNAs in various eukaryotic lineages.
... During the process of establishing the SRD of actin, up to 24 different actin subtypes were deposited for one species, which suggested that the actin copy number for that species was at least 24. Considering that there is high variation among actin copies within one species [12-14], it was not unexpected that some actin copy types were not included in the sequence dataset and thus could not be annotated to genus. In the study of Kermarrec et al. [3], taxa outside the sequence reference dataset (SRD) were annotated when sequencing the SSU rDNA, rbcL and cox1 amplicons of diatom environmental samples by 454 sequencing. ...
Article
MiSeq sequencing and data analysis were carried out for the actin gene and the V9 region of 18S rDNA in seven simulated samples consisting of different mixtures of dinoflagellates and diatoms. Not all species were detectable in all the 18S V9 samples, and the sequence percentages in the V9 samples were not consistent with the corresponding cell percentages, which may suggest that 18S rDNA copy number differed greatly among cells of these species and resulted in large amplification deviations. In addition, 18S rDNA amplification of the microalgae was prone to fungal contamination. All actin gene amplicons were from dinoflagellates because of the targeted degenerate primers. All the dinoflagellate actin sequences were detected in the act samples except act4, and the sequence percentages of the dinoflagellates in the act samples were not completely consistent with the dinoflagellate percentages of the cell samples, showing certain amplification deviations. Alpha diversity indexes from actin gene sequencing may better reflect community structure, and beta diversity analysis clustered dinoflagellate samples with identical or similar composition together and distinguished them from bloom-simulating samples at the generic level. Hence, the actin gene was more suitable than rDNA as a molecular marker for community analysis of dinoflagellates.
... Valiadi et al. (2012) found multiple non-identical copies of lcf within some dinoflagellate strains, with variation up to c. 9% among sequences of an A. fundyense strain. Large variation among gene copies is common in dinoflagellates and the degree of variation in lcf is in line with other studies (Tanikawa et al., 2004; Kim et al., 2011; ...). However, polymorphisms among gene copies are often species-specific and have been observed for other genes particularly in A. fundyense (Miranda et al., 2012). ...
Article
The toxic dinoflagellate Alexandrium ostenfeldii is the only bioluminescent bloom-forming phytoplankton in coastal waters of the Baltic Sea. We analysed partial luciferase gene (lcf) sequences and bioluminescence production in Baltic A. ostenfeldii bloom populations to assess the distribution and consistency of the trait in the Baltic Sea, and to evaluate applications for early detection of toxic blooms. Lcf was consistently present in 61 Baltic Sea A. ostenfeldii strains isolated from six separate bloom sites. All Baltic Sea strains except one produced bioluminescence. In contrast, the presence of lcf and the ability to produce bioluminescence did vary among strains from other parts of Europe. In phylogenetic analyses, lcf sequences of Baltic Sea strains clustered separately from North Sea strains, but variation between Baltic Sea strains was not sufficient to distinguish between bloom populations. Clustering of the lcf marker was similar to internal transcribed spacer (ITS) sequences with differences being minor and limited to the lowest hierarchical clusters, indicating a similar rate of evolution of the two genes. In relation to monitoring, the consistent presence of lcf and close coupling of lcf with bioluminescence suggests that bioluminescence can be used to reliably monitor toxic bloom-forming A. ostenfeldii in the Baltic Sea.
Article
Dinoflagellates are algae of tremendous importance to ecosystems and to public health. The cell biology and genome organisation of dinoflagellate species is highly unusual. For example, the plastid genomes of peridinin-containing dinoflagellates encode only a minimal number of genes arranged on small elements termed "minicircles". Previous studies of peridinin plastid genes have found evidence for divergent sequence evolution, including extensive substitutions, novel insertions and deletions, and use of alternative translation initiation codons. Understanding the extent of this divergent evolution has been hampered by the lack of characterised peridinin plastid sequences. We have identified over 300 previously unannotated peridinin plastid mRNAs from published transcriptome projects, vastly increasing the number of sequences available. Using these data, we have produced a well-resolved phylogeny of peridinin plastid lineages, which uncovers several novel relationships within the dinoflagellates. This enables us to define changes to plastid sequences that occurred early in dinoflagellate evolution, and that have contributed to the subsequent diversification of individual dinoflagellate clades. We find that the origin of the peridinin dinoflagellates was specifically accompanied by elevations both in the overall number of substitutions that occurred on plastid sequences, and in the Ka/Ks ratio associated with plastid sequences, consistent with changes in selective pressure. These substitutions, alongside other changes, have accumulated progressively in individual peridinin plastid lineages. Throughout our entire dataset, we identify a persistent bias towards non-synonymous substitutions occurring on sequences encoding photosystem I subunits and stromal regions of peridinin plastid proteins, which may have underpinned the evolution of this unusual organelle.
Article
In an attempt to elucidate relationships among the morphologically diverse members of Saxifragaceae sensu lato, phylogenetic analyses of rbcL sequence data were conducted on representative genera of 16 of the 17 subfamilies. Also included were many putatively related families, as well as a diverse array of dicotyledonous flowering plants. Our phylogenetic analyses suggest that taxa of Saxifragaceae sensu lato are allied with at least 10 separate, often distantly related, lineages of several subclasses of flowering plants. Sequence data, in combination with other lines of evidence, suggest that Saxifragaceae sensu stricto should consist only of subfamily Saxifragoideae, a group of about 30 herbaceous genera that form the core of Saxifragaceae sensu lato. These data also suggest that potential close relatives of Saxifragaceae sensu stricto include Iteoideae, Pterostemonoideae, and Ribesoideae and possibly Penthoroideae and Tetracarpaeoideae, all traditional subfamilies of Saxifragaceae sensu lato, as well as Crassulaceae. These members of Saxifragaceae sensu lato, along with Saxifragaceae sensu stricto, Crassulaceae, and several genera from the subclass Hamamelidae, are basal to a large assemblage of taxa, most of which are usually placed in Rosidae. Within this primarily rosid alliance, representatives of four other subfamilies of Saxifragaceae sensu lato (Francooideae, Baueroideae, Parnassioideae, and Lepuropetaloideae) are allied with the rosid families Greyiaceae, Cunoniaceae, and Celastraceae. According to rbcL sequence evidence, Hydrangeoideae and Cornaceae are closely related members of a clade that is basal to a large group of taxa primarily from subclass Asteridae. Representative genera of four subfamilies of Saxifragaceae sensu lato (Phyllonomoideae, Escallonioideae, Montinioideae, and Vahlioideae) are in our results allied with taxa usually included in Asteridae. 
Significantly, relationships of Saxifragaceae sensu lato suggested by rbcL sequence data are in very close agreement with those supported by several other lines of evidence, especially embryology, serology, and iridoid chemistry.
Article
The dinoflagellate genus Dinophysis includes several species that cause diarrhetic shellfish poisoning, none of which have yet been established in culture. We report on the maintenance of Dinophysis acuminata cultures that were established in December 2005 and also on its feeding mechanism and growth rates when fed the ciliate prey Myrionecta rubra with and without the addition of the cryptophyte Teleaulax sp. D. acuminata grew well (growth rate of 0.95 d⁻¹) in laboratory culture when supplied with the marine ciliate M. rubra as prey, reaching a maximum concentration of about 2400 cells ml⁻¹ at the end of the feeding experiment. In contrast, D. acuminata did not show sustained growth in the absence of the ciliate or when provided the cryptophyte Teleaulax sp. as prey (D. acuminata used its peduncle to extract the cell contents of the prey organism, M. rubra). Based on the prey-predator interactions occurring among D. acuminata, M. rubra, and Teleaulax sp. in this study, establishment of permanent culture of the dinoflagellate D. acuminata may facilitate a better understanding of the ecophysiology, biology, and toxicology of Dinophysis species, as well as the evolution of dinoflagellate plastids.
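To put the reported growth rate in context, the sketch below converts a rate into a doubling time. It assumes the 0.95 d⁻¹ figure is a specific growth rate under exponential growth, calculated with natural logarithms; the abstract does not state the formula used, so this is an assumption, and the function names are hypothetical.

```python
import math


def specific_growth_rate(n0, n1, days):
    """Specific growth rate (per day) from two cell densities.

    Assumes exponential growth between the two counts: mu = ln(n1 / n0) / days.
    """
    return math.log(n1 / n0) / days


def doubling_time(mu_per_day):
    """Days needed for the population to double at a constant rate mu."""
    return math.log(2) / mu_per_day


def projected_density(n0, mu_per_day, days):
    """Cell density after `days` of exponential growth starting from n0."""
    return n0 * math.exp(mu_per_day * days)
```

Under these assumptions, `doubling_time(0.95)` is about 0.73 days, meaning the culture would double roughly every 17.5 hours while growth remained exponential.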
Article
Dinophytes acquired chloroplasts obviously early in evolution and later lost them multiple times. Most families and genera contain both photosynthetic and heterotrophic species. Chloroplasts enveloped by three membranes with thylakoids in stacks of three, containing peridinin as the main pigment, are regarded as the original dinophyte plastids. Pyrenoids are generally present. Stigmata, if present, are usually parts of the chloroplast or are modified original plastids. The form II type RUBISCO found in the dinophytes is unique for eukaryotes, otherwise known only in some anaerobic bacteria. It is disputed whether the original dinophyte chloroplasts are derived from a prokaryotic or a eukaryotic endosymbiosis. Various dinoflagellates contain aberrant chloroplasts. Glenodinium foliaceum and Peridinium balticum have a single complete endosymbiont, originally a pennate diatom. Podolampas bipes houses several dictyophycean symbiont cells. The "symbionts" of Lepidodinium viride and Gymnodinium chlorophorum are highly reduced prasinophyte cells. The chloroplasts of Gymnodinium mikimotoi have aberrant pigments (fucoxanthin derivatives, no peridinin) and fine structure. The dinoflagellate hosts do not seem to contain any parts of the former endosymbiont except the chloroplasts. Photosynthetic Dinophysis species have cryptophycean-like chloroplasts, whereas symbiotic cyanobacteria are found in other members of the Dinophysiales, e.g., Ornithocercus. Various dinophytes, e.g. Gymnodinium aeruginosum, use kleptochloroplasts from ingested cryptophytes transiently for photosynthesis. Original or secondarily acquired chloroplasts can only be used for phylogenetic considerations in exceptional cases: it seems unlikely that the Prorocentrales have evolved from the Dinophysiales because all Prorocentrales possess original dinoflagellate chloroplasts, whereas no member of the Dinophysiales has such chloroplasts.
Article
Genes encoding ribulose-1,5-bisphosphate carboxylase/oxygenase (Rubisco) were cloned from dinoflagellate symbionts (Symbiodinium spp) of the giant clam Tridacna gigas and characterized. Strikingly, Symbiodinium Rubisco is completely different from other eukaryotic (form I) Rubiscos: it is a form II enzyme that is ∼65% identical to Rubisco from Rhodospirillum rubrum (Rubisco forms I and II are ∼25 to 30% identical); it is nuclear encoded by a multigene family; and the predominantly expressed Rubisco is encoded as a precursor polyprotein. One clone appears to contain a predominantly expressed Rubisco locus (rbcA), as determined by RNA gel blot analysis of Symbiodinium RNA and sequencing of purified Rubisco protein. Another contains an enigmatic locus (rbcG) that exhibits an unprecedented pattern of amino acid replacement but does not appear to be a pseudogene. The expression of rbcG has not been analyzed; it was detected only in the minor of two taxa of Symbiodinium that occur together in T. gigas. This study confirms and describes a previously unrecognized branch of Rubisco's evolution: a eukaryotic form II enzyme that participates in oxygenic photosynthesis and is encoded by a diverse, nuclear multigene family.
Article
Dinoflagellates of the genus Dinophysis are responsible for diarrhetic shellfish poisoning. Phototrophic species have an orange primary fluorescence indicating the presence of phycobilins. The chloroplasts greatly resemble cryptophycean chloroplasts having pairs of thylakoids and electron-dense material in the thylakoid lumen. They are bound by only two membranes, in contrast to the blue-green chloroplasts of Amphidinium wigrense Woloszynsk, which are enveloped by three membranes (Wilcox and Wedemayer, 1985). Possible ways of evolution of the Dinophysis chloroplasts, phylogenetical questions and implications for the monitoring of toxic dinoflagellates are discussed.
Article
The dinoflagellate genus Dinophysis contains species known to cause diarrhetic shellfish poisoning. Although most photosynthetic dinoflagellates have plastids with peridinin, photosynthetic Dinophysis species have cryptophyte-like plastids containing phycobilin rather than peridinin. We sequenced nuclear- and plastid-encoded SSU rDNA from three photosynthetic species of Dinophysis for phylogenetic analyses. In the tree of nuclear SSU rDNA, Dinophysis was a monophyletic group nested with peridinin-containing dinoflagellates. However, in the tree of plastid SSU rDNA, the Dinophysis plastid lineage was within the radiation of cryptophytes and was closely related to Geminigera cryophila. These analyses indicate that an ancestor of Dinophysis, which may have originally possessed peridinin-type plastid and lost it subsequently, adopted a new plastid from a cryptophyte. Unlike dinoflagellates with fully integrated plastids, the Dinophysis plastid SSU rDNA sequences were identical among the three species examined, while there were species-specific base substitutions in their nuclear SSU rDNA sequences. Queries of the DNA database showed that the plastid SSU rDNA sequence of Dinophysis is almost identical to that of an environmental DNA clone of a <10 µm sized plankter, possibly a cryptophyte and a likely source of the Dinophysis plastid. The present findings suggest that these Dinophysis species engulfed and temporarily retained plastids from a cryptophyte.
Article
Dinoflagellate algae are important primary producers and of significant ecological and economic impact because of their ability to form “red tides” [1]. They are also models for evolutionary research because of an unparalleled ability to capture photosynthetic organelles (plastids) through endosymbiosis [2]. The nature and extent of the plastid genome in the dominant peridinin-containing dinoflagellates remain, however, two of the most intriguing issues in plastid evolution. The plastid genome in these taxa is reduced to single-gene minicircles [3, 4] encoding an incomplete (until now 15) set of plastid proteins. The location of the remaining photosynthetic genes is unknown. We generated a data set of 6,480 unique expressed sequence tags (ESTs) from the toxic dinoflagellate Alexandrium tamarense (for details, see the Experimental Procedures in the Supplemental Data) to find the missing plastid genes and to understand the impact of endosymbiosis on genome evolution. Here we identify 48 of the non-minicircle-encoded photosynthetic genes in the nuclear genome of A. tamarense, accounting for the majority of the photosystem. Fifteen genes that are always found on the plastid genome of other algae and plants have been transferred to the nucleus in A. tamarense. The plastid-targeted genes have red and green algal origins. These results highlight the unique position of dinoflagellates as the champions of plastid gene transfer to the nucleus among photosynthetic eukaryotes.