
Antiquity and Evolution of the MADS-Box Gene Family Controlling Flower Development in Plants

MADS-box genes in plants control various aspects of development and reproductive processes including flower formation. To obtain some insight into the roles of these genes in morphological evolution, we investigated the origin and diversification of floral MADS-box genes by conducting molecular evolutionary genetics analyses. Our results suggest that the most recent common ancestor of today's floral MADS-box genes evolved roughly 650 MYA, much earlier than the Cambrian explosion. They also suggest that the functional classes T (SVP), B (and Bs), C, F (AGL20 or TM3), A, and G (AGL6) of floral MADS-box genes diverged sequentially in this order from the class E gene lineage. The divergence between the class G and E genes apparently occurred around the time of the angiosperm/gymnosperm split. Furthermore, the ancestors of three classes of genes (class T genes, class B/Bs genes, and the common ancestor of the other classes of genes) might have existed at the time of the Cambrian explosion. We also conducted a phylogenetic analysis of MADS-domain sequences from various species of plants and animals and presented a hypothetical scenario of the evolution of MADS-box genes in plants and animals, taking into account paleontological information. Our study supports the idea that there are two main evolutionary lineages (type I and type II) of MADS-box genes in plants and animals.
Jongmin Nam, Claude W. dePamphilis, Hong Ma, and Masatoshi Nei
Institute of Molecular Evolutionary Genetics and Department of Biology, Pennsylvania State University
Introduction
MADS-box genes encode transcription factors and
have been found in three eukaryotic kingdoms, plants,
animals, and fungi. In plants, MADS-box genes include
developmental regulatory genes comparable to homeobox
genes in animals. The protein region encoded by the
highly conserved MADS-box is called the MADS-domain
and is part of the DNA-binding domain. It is composed of
approximately 55 amino acids (aa). It has been proposed
that there are at least 2 lineages (type I and type II) of
MADS-box genes in plants, animals, and fungi (fig. 1;
Alvarez-Buylla et al. 2000b). Most of the well-studied
plant genes are type II genes and have three more domains
than type I genes: an intervening (I) domain (~30 codons),
a keratin-like coiled-coil (K) domain (~70 codons), and
a C-terminal (C) domain (variable length). These genes are
called the MIKC-type and are specific to plants.
The plant-specific MIKC-type MADS-box genes
were first discovered in flowering plants (angiosperms).
They can be divided into at least nine classes on the basis
of their functions and expression patterns (table 1). In
angiosperms, several classes of MADS-box genes control
flower formation and are often referred to as floral MADS-
box genes. In particular, the ‘‘ABC’’ model of flower
formation proposes that the four floral components
(organs) are controlled by the interactions of three classes
of floral MADS-box genes, A, B, and C (Weigel and
Meyerowitz 1994; Ma and dePamphilis 2000). More
recently, this ABC model was amended to include an
interaction with an additional class of genes, called class E
genes (Theissen 2001). According to this amended model,
called the ‘‘quartet model,’’ the combinatorial tetramers of
four classes of floral MADS-domain proteins regulate the
development of the four floral components (Honma and
Goto 2001; Theissen 2001): sepals by class A genes, petals
by class A, B, and E genes, stamens by class B, C, and E
genes, and carpels by class C and E genes (table 1). Class
A, C, and E genes are also involved in floral meristem
development.
Other classes include the class D genes, which are the
close relatives of class C genes and control ovule
development (Theissen 2001). The recently proposed class
B-sister (Bs) genes also appear to control the development
of ovule and seed coat, though their protein sequences are
quite different from those of D genes (Becker et al. 2002;
Nesi et al. 2002). In addition, another group of MADS-box
genes that includes AGL20 (AGAMOUS-LIKE 20) in
Arabidopsis thaliana (thale cress; hereafter called Arabi-
dopsis) plays a pivotal role in flower activation as an
integrator of genetic and environmental flowering path-
ways (Lee et al. 2000). This group of genes will be called
‘‘class F genes’’ instead of the TM3 or orphan group as
previously named (Purugganan 1997; Becker et al. 2000).
Several genes such as AGL6 in Arabidopsis seem to be
involved in the development of both flowers and vegetative
organs (Alvarez-Buylla et al. 2000a). We call these genes
‘‘class G genes.’’ Furthermore, there is a group of genes
that trigger flowering as an initiator or a repressor. Loss of
function of some of these genes resulted in late flowering
or early flowering (Hartmann et al. 2000; Michaels et al.
2003). We call these genes ‘‘class T genes.’’
All the above genes are directly involved in flower
formation of angiosperms. We therefore call them ‘‘floral
MADS-box genes’’ in this article, though this terminology
is usually used for the class A, B, C, and E genes. Note
that our classification of MADS-box genes is for
simplifying the explanation of our study rather than for
proposing new terminologies. There are large numbers of
other MADS-box genes in angiosperms. Some of them
appear to control flowering time or formation of leaves,
fruits, roots, etc. (Zhang and Forde 1998; Michaels and
Amasino 1999; Sheldon et al. 1999; Alvarez-Buylla et al.
2000a; Hartmann et al. 2000), but the functions of other
genes are unknown.
Key words: MADS-box genes, molecular evolution, flower development,
divergence time, evolutionary developmental biology.
E-mail: nxm2@psu.edu.
Mol. Biol. Evol. 20(9):1435–1447. 2003
DOI: 10.1093/molbev/msg152
Molecular Biology and Evolution, Vol. 20, No. 9,
© Society for Molecular Biology and Evolution 2003; all rights reserved.
The primary purpose of this article is to investigate
the evolutionary relationships and divergence times of
floral MADS-box genes. However, because most floral
MADS-box genes are known to exist in gymnosperms as
well (e.g., Winter et al. 1999; Becker et al. 2000), we
consider the genes from both angiosperms and gymno-
sperms. Previously, Purugganan (1997) studied a similar
problem, but this problem should be reexamined because
extensive data on MADS-box genes have become avail-
able in recent years. Furthermore, to understand the long-
term evolution of MADS-box genes, we will also
investigate the evolutionary relationships of MADS-
domain sequences from plants and animals.
Materials and Methods
Floral MADS-Box Genes Used
At present, MIKC-type MADS-box gene sequences
are available from various species of angiosperms,
gymnosperms, ferns, club mosses, and mosses (GenBank,
TIGR). There are more than 70 MADS-box genes
annotated in Arabidopsis (The Arabidopsis Genome
Initiative 2000 and our unpublished study). Similarly, we
have identified about 70 genes from rice by conducting
a TBLASTN search in the Rice Genome Database of
China (Yu et al. 2002) and the TIGR Rice Genome
Database. From these databases, we compiled 293 full-
length MIKC-type MADS-box genes. In the phylogenetic
study of floral MADS-box genes, we used 23 reproductive
genes, covering all classes of genes shared by angiosperms
and gymnosperm species (class B, Bs, C, F, G, and T
genes). These genes were chosen from the well-studied
eudicot species Arabidopsis, monocot species Oryza sativa
(rice) and Zea mays (maize), and gymnosperm species
Pinus radiata (Monterey pine), Picea abies (Norway
spruce), and Gnetum gnemon (table 1). We did not include
the gymnosperm class E gene (PrMADS1) reported from
the pine Pinus radiata, because this sequence appears to be
a contaminant from Eucalyptus grandis introduced at the time
of experimentation (G. Theissen, personal communication).
Class A and E genes from angiosperms were also
included in our analysis because of their importance
during flower development, though these genes have not
been found in gymnosperms. Class D genes were excluded
from the analysis, because their protein sequences were
close to C gene sequences and the distinction between C
and D genes was not always clear.
Protein sequences of these genes were obtained from
GenBank or TIGR. The names of the proteins and their
GenBank accession numbers or TIGR locus numbers are
as follows: AGL9 (At1g24260), AGL6 (At2g45650),
AGL20 (At2g45660), APETALA1 (AP1) (At1g69120),
APETALA3 (AP3) (At3g54340), PISTILLATA (PI)
(At5g20240), AGAMOUS (AG) (At4g18960), SVP
(At2g22540), OsMADS3 (S59480), OsMADS4
(T03902), OsMADS8 (AAC49817), OsMADS14
(AAF19047), OsMADS16 (AAD19872), OsMADS17
(AAF21900), OsMADS50 (BAA81886), OsMADS54
(BAA81880), DAL1 (T14846), DAL2 (S51934), DAL3
(T14848), DAL13 (AAF18377), GGM13 (CAB44459),
ZMM17 (CAC81053), ABS (At5g23260), and LAMB1
(AAG08991). As is shown in table 1, the protein sequence
of a class T gene from G. gnemon, GGM12, is available,
but it was not used in our analysis because it was
a fragmentary sequence. In this article we have used
simplified gene notations to make the study understand-
able for a wide audience.
Phylogenetic Analysis of MIKC-Type Genes
We used protein sequences for our phylogenetic
analysis, because the evolutionary pattern of protein
sequences appears to be simpler than that of DNA
sequences (Nei and Kumar 2000, chapter 2) and protein
sequences often give more satisfactory results than DNA
sequences in the study of long-term evolution (Hashimoto
et al. 1994; Russo, Takezaki, and Nei 1996; Glazko and
Nei 2003). In the present case, we could minimize the
effect of variation in the GC content at third codon position
by using protein sequences.
We aligned 293 protein sequences using the computer
program ClustalX (Thompson et al. 1997) with default
parameters except the gap opening parameter of 2.0. We
then constructed a preliminary Neighbor-Joining (NJ) tree
with Poisson-correction (PC) distance using the computer
program MEGA2 (version 2.1) (Kumar et al. 2001). (In
MEGA2, taxon input orders are randomized for all
bootstrap replications.) According to this tree, we divided
293 protein sequences into 18 groups and aligned them
separately with the same parameters using ClustalX. These
aligned groups were again aligned to each other using the
profile alignment option in this program. After elimination
of gaps in this alignment, we constructed an initial NJ tree
using PC distance. As mentioned above, we selected 24
representative sequences of 142 amino acid sites without
gaps, including the MADS-domain, the K-domain, and the
conserved region of the I-domain. Using MEGA2, we then
constructed NJ trees with p-distance (proportion of
different amino acids), PC distance, and PC gamma
distance (Nei and Kumar 2000, chapter 2). In addition, we
constructed maximum-likelihood (ML) trees using the
PROTML program with the Poisson and JTT models
(Adachi and Hasegawa 1996) and maximum-parsimony
(MP) trees using the PAUP* program with the stepwise
addition and tree-bisection-reconnection (TBR) algorithm
with 500 bootstrap resamplings (Swofford 1998).
FIG. 1.—Schematic diagram of two types (types I and II) of MADS-
box genes in plants and animals. The plant-specific MIKC-type MADS-
domain proteins are presented with the name and function of each
conserved domain. A broken line indicates the DNA-binding region, and
a dotted line the protein-protein interaction region. This figure has been
modified from Alvarez-Buylla et al. (2000b).
A
distantly related MADS-box gene, LAMB1, from the club
moss Lycopodium annotinum, was used as the outgroup in
this study. According to our phylogenetic analysis, this
gene was closely related to type I genes (see Supplementary
Material online at the journal’s Web site:
http://www.molbiolevol.org). Alvarez-Buylla et al. (2000b) have
suggested that type I proteins do not have the K-domain
(putative coiled-coil structure). However, the LAMB1
protein has a domain similar to the K-domain, including
regularly spaced hydrophobic amino acids (e.g., leucine,
isoleucine, and valine), which are known to be important
for protein-protein interaction (Moon et al. 1999).
Therefore, we could align the LAMB1 protein sequence
with other MADS-domain protein sequences. Moreover,
LAMB1 has been suggested to be a new MIKC-type
MADS-box gene designated as MIKC*-type, whereas the
other 23 genes were classical MIKC genes (MIKC^c-type;
Henschel et al. 2002). There are two more MIKC*-type
genes (PPM3 and PPM4) reported from the moss
Physcomitrella patens (Henschel et al. 2002). Use of
these genes as the outgroups produced essentially the same
topology for the floral MADS-box genes.
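The distance measures named above can be illustrated with a minimal sketch. It assumes complete deletion of gapped columns followed by the standard definitions of p-distance and Poisson-correction (PC) distance, d = -ln(1 - p). The function names and the toy sequences are hypothetical; this is not the MEGA2 implementation and the residues are not real MADS-domain data.

```python
import math


def drop_gap_columns(seqs, gap="-"):
    """Keep only alignment columns in which no sequence has a gap."""
    keep = [i for i in range(len(seqs[0])) if all(s[i] != gap for s in seqs)]
    return ["".join(s[i] for i in keep) for s in seqs]


def p_distance(a, b):
    """Proportion of sites at which two aligned sequences differ."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be aligned and non-empty")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def pc_distance(p):
    """Poisson-correction distance: d = -ln(1 - p)."""
    return -math.log(1.0 - p)


# Toy alignment (made-up residues, for illustration only).
aligned = ["MGRGK-EIKR", "MGRGKVELKR", "MARGKIQIKR"]
clean = drop_gap_columns(aligned)   # the gapped column is removed
p = p_distance(clean[0], clean[1])  # one difference in nine sites
d = pc_distance(p)
print(f"sites={len(clean[0])} p={p:.4f} PC distance={d:.4f}")
```

With real data the same two steps would be applied to every pair of the 24 sequences to fill the distance matrix that the NJ method takes as input.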
Once the topology of the phylogenetic tree was
determined, we estimated the times of divergence between
various types of genes using the linearized tree method
(Takezaki, Rzhetsky, and Nei 1995; see program LINTREE
in http://mep.bio.psu.edu). With the LINTREE
method, the time scale constructed does not apply to the
outgroup. We also used Yoder and Yang’s (2000)
likelihood method implemented in the computer program
PAML (Yang 2002) with a different evolutionary rate for
class B genes of angiosperms from the rate used with the
remaining genes. Sanderson’s (2003) penalized likelihood
method was also used.
Phylogenetic Analysis of MADS-Domains from
Plants and Animals
The animal species studied so far seem to have at
least one type I gene and one type II MADS-box gene, but
the number of the genes is generally very small (Alvarez-
Buylla et al. 2000b). All of the well-studied plant MADS-
box genes are type II genes, and there are many other type
II genes in angiosperms and gymnosperms. The existence
of plant type I genes has not been well established, except
in Arabidopsis, rice, and club moss (Alvarez-Buylla et al.
2000b and our unpublished study).
To study the evolutionary relationships of type I and
type II MADS-box genes, we used the MADS-domain
sequences (~55 aa) of 87 representative genes from plants
(Arabidopsis, rice, spruce, pine, gnetum, fern, club moss,
and moss) and animals (human, mouse, zebrafish, fruitfly,
mosquito, and nematode) (see Supplementary Material
online). In this study we used only MADS-domain
sequences, because animal genes do not have the IKC
domain. The 87 MADS-domain sequences were aligned
by using ClustalX, and the evolutionary relationships of
the genes were examined by constructing a NJ tree with
p-distance for 55 shared amino acids.
Table 1
Representatives of Different Classes of MADS-Box Genes Considered in This Study
| Class | Arabidopsis (Eudicots) | Rice or Maize (Monocots) | Norway Spruce, Monterey Pine, or Gnetum (Gymnosperms) | Function in Arabidopsis |
|---|---|---|---|---|
| Class A (AP1 or SQUA) | Arabi A (APETALA1 or AP1) | Rice A (OsMADS14) | Unknown | Sepal and petal development, floral meristem development (Weigel and Meyerowitz 1994) |
| Class B (AP3/PI or DEF/GLO) | Arabi B-AP3 (AP3); Arabi B-PI (PISTILLATA or PI) | Rice B-AP3 (OsMADS16); Rice B-PI (OsMADS4) | Spruce B (DEFICIENS-AGAMOUS-LIKE 13 or DAL13) | Petal and stamen development (Weigel and Meyerowitz 1994) |
| Class Bs | Arabi Bs (ABS) | Maize Bs (ZMM17) | Gnetum Bs (GGM13) | Ovule and seed coat development (Nesi et al. 2002) |
| Class C (AG or PLENA) | Arabi C (AGAMOUS or AG) | Rice C (OsMADS3) | Spruce C (DAL2) | Stamen and carpel development, floral meristem development (Weigel and Meyerowitz 1994) |
| Class D (not used in this study) | Arabi D (AGL11) | Rice D (OsMADS13) | Unknown | Ovule development (Theissen 2001) |
| Class E (AGL2/4/9) | Arabi E (AGL9) | Rice E (OsMADS8) | Unknown | Petal, stamen, carpel and floral meristem development (Theissen 2001) |
| Class F (AGL20 or TM3) | Arabi F (AGL20) | Rice F (OsMADS50) | Spruce F (DAL3) | Flowering activation (integrator of genetic and environmental flowering pathways) (Lee et al. 2000) |
| Class G (AGL6) | Arabi G (AGL6) | Rice G (OsMADS17) | Spruce G (DAL1) | Expressed in both vegetative and reproductive tissues (Alvarez-Buylla et al. 2000a) |
| Class T (SVP or STMADS11) | Arabi T (SVP) | Rice T (OsMADS54) | Gnetum T (GGM12, partial sequence, not used in this study) | Flowering repression (Hartmann et al. 2000) |
NOTE.—We used simplified gene names. Commonly used names and their abbreviations are given in parentheses. The function of each class of genes is based on the
studies in Arabidopsis. ‘‘Arabi’’ indicates Arabidopsis. All classes of genes are members of floral MADS-box genes.
Results
Phylogenetic Tree of MIKC-Type Genes
The phylogenetic tree of 24 representative MADS-
box genes from eudicots, monocots, and gymnosperms is
presented in figure 2. This tree was obtained by the NJ
method with PC distance, but very similar trees were
obtained by NJ with p-distance and PC gamma distance,
and by ML and MP methods (see Supplementary Material
online). Although the bootstrap values for interior branch
a-b, as well as for the B or Bs gene clades of this tree, are
very low, the other clades involving class E, G, A, F, and
C genes are supported with reasonably high bootstrap
values (>70%). Similar patterns were observed in trees
obtained by other tree-building methods. Therefore, the
portion of the tree containing the class E, G, A, F, and C
genes appears to be reliable.
This tree suggests that after separation of the class T
genes from the non-T floral MADS-box genes, class B/Bs
genes were the first to diverge from the rest of non-T floral
MADS-box genes, although this finding is still pro-
visional. Class C genes then separated from the genes
belonging to class F, A, G, and E genes. The next group of
genes to diverge was class F genes. Moreover, the
taxonomic distribution of functional classes of floral
MADS-box genes (table 1) suggests that class E and G
genes, which diverged most recently, diverged around the
time of angiosperm/gymnosperm split. Several class-
specific or taxon-specific amino acids have been reported
(e.g., Huang et al. 1995; Kramer, Dorit, and Irish 1998),
but we did not find any key features of conserved amino
acids supporting any clade of the tree in figure 2. We also
compared the positions of introns among all classes of
genes, but the positions were too conserved to be
informative for inferring the phylogenetic relationships
of MADS-box genes (data not shown).
Estimates of Divergence Times
Although molecular estimates of divergence times
between genes or species depend on a number of
assumptions and are generally very crude (Nei, Xu, and
Glazko 2001; Glazko and Nei 2003), they are still useful
for obtaining a rough idea of the evolutionary history of
genes or species. With this caveat in mind, we estimated
the times of divergence between different classes of genes.
In the estimation of divergence times, the hypothesis of
constant evolutionary rate should first be tested, and then
the sequences whose evolutionary rate significantly
deviates from constancy should be eliminated (Takezaki,
Rzhetsky, and Nei 1995). In this case a number of authors
have used Yang’s (2002) or Gu and Zhang’s (1997)
likelihood method for estimating gamma parameter α.
However, for the purpose of time estimation, these
methods, particularly the former method, tend to give
underestimates of α, and this often leads to overestimation
of divergence times when ancient divergence times are
estimated (Nei, Xu, and Glazko 2001; Glazko and Nei
2003). This seems to be particularly true for slowly
evolving genes such as cytochrome c. Dickerson (1971)
showed that in cytochrome c and hemoglobin the number
of amino acid substitutions estimated by PC distance
(α = ∞) is nearly proportional to the time since species
divergence up to about 500 MYA. Nei (1987, pp. 47–50)
also showed that variation in evolutionary rate among
amino acid sites has a relatively small effect on time
estimates unless the sequence divergence is very high. We
have therefore decided to use primarily PC distance for
estimating divergence times. However, we also used
Dayhoff’s distance to take into account backward and
parallel mutations. According to Nei and Kumar (2000,
chapter 2), Dayhoff’s distance can be computed by a PC
gamma distance with α = 2.25. We therefore used this
method. Note that the use of these distances gives
conservative estimates of divergence times compared with
those obtained by the PC gamma distance with a likelihood
estimate of α (see below).
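The relationship among these distances can be written out explicitly. The block below is a sketch that assumes the standard forms of the Poisson-correction and gamma distances for a proportion p of differing sites; the symbols d_PC and d_Γ are notation introduced here for illustration.

```latex
\begin{align}
d_{\mathrm{PC}} &= -\ln(1 - p), \\
d_{\Gamma}(\alpha) &= \alpha\left[(1 - p)^{-1/\alpha} - 1\right], \\
\lim_{\alpha \to \infty} d_{\Gamma}(\alpha) &= d_{\mathrm{PC}}.
\end{align}
```

For a fixed p, the gamma distance grows as α shrinks. At p = 0.5, for example, the distance is about 0.69 for α = ∞, about 0.81 for α = 2.25, and about 0.98 for α = 1.06. The inflation is larger for larger p, which is why an underestimated α tends to push ancient nodes further back in time than recent ones.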
We used the two-cluster test of Takezaki, Rzhetsky,
and Nei (1995) to examine the applicability of the
molecular clock for the tree in figure 2 and found that
the four B genes (2 AP3 genes and 2 PI genes) evolved
significantly faster than other genes at the 3% level. We
therefore eliminated these four genes and constructed
a linearized tree with PC distance for the remaining genes
(fig. 3A).
FIG. 2.—Phylogenetic tree of nine classes of MADS-box genes (A,
B, Bs, C, D, E, F, G, and T) from monocots, dicots, and gymnosperms
with a gene from the club moss Lycopodium annotinum, LAMB1, used as
the outgroup. The number for each interior branch is a percent bootstrap
value (500 resamplings). The scale bar indicates the estimated number of
amino acid substitutions per site. The number of amino acids used was
142 without gaps per sequence. AP3 and PI are abbreviations of
APETALA3 and PISTILLATA, respectively. Gene names were simplified
to make the paper understandable to a wide audience (see table 1).
Calibration points used for estimating divergence times are marked with
an asterisk.
The two-cluster test also showed that the spruce
C gene evolved significantly more slowly than the
Arabidopsis and rice C genes at the 5% level, but we
retained this gene because it was important for calibration
of the time scale, and because a relatively small deviation
of a sequence from rate constancy does not affect time
estimates seriously (Nei and Kumar 2000, pp. 200–202).
In addition to the four B genes, we also eliminated all Bs
genes because of the uncertain phylogenetic position of the
genes (fig. 2). To compare our results with previous
estimates of divergence times for floral MADS-box genes
by Purugganan (1997), we constructed a linearized tree for
a simplified Purugganan tree topology. Purugganan
studied the phylogenetic tree of many floral MADS-box
genes, but the bootstrap values of the interior branches
were so low that he merged several interior nodes. If we
use only 24 genes, as in our study, the linearized
Purugganan tree becomes as given in part B of figure 3.
We therefore estimated the divergence time for the merged
node (a-b-c-d).
To calibrate the time scale of the linearized tree,
a calibration point is necessary. For our data set, the
divergence times between ‘‘eudicots’’ and ‘‘monocots’’
and between ‘‘gymnosperms’’ and ‘‘angiosperms’’ may be
used as the calibration point. However, there is no good
fossil record for the divergence of eudicots and monocots,
and other authors have used various values (131–200
MYA) for this divergence (Wolfe et al. 1989; Laroche, Li,
and Bousquet 1995; Soltis et al. 2002). This calibration
point also gives some unreasonable time estimates for our
data set (see below). By contrast, there seems to be
a consensus about the divergence time between angio-
sperms and gymnosperms, which is about 300 MYA. This
estimate is supported by both paleontological data and
molecular time estimates (Stewart and Rothwell 1993,
pp. 505–512; Savard et al. 1994; Goremykin, Hansmann,
and Martin 1997; Soltis et al. 2002). In addition, the
angiosperm/gymnosperm split calibration will produce
smaller standard errors of time estimates than the monocot/
eudicot split calibration, because the former is a more
ancient evolutionary event than the latter (Glazko and Nei
2003). We have therefore decided to use this time as the
calibration point.
Figure 3A shows that each of class G, F, and C genes
included one gymnosperm gene and two angiosperm
genes. We therefore computed the average PC distance (d)
between the gymnosperm and angiosperm genes and
obtained d = 0.372. This gives an estimate of the rate of
amino acid substitution (r) to be r = d/(2 × 300) per
million years or r = 6.2 × 10^-10 per year. The timescales
for trees A and B in figure 3 were obtained by using this
rate of amino acid substitution. The times of divergence
between different classes of genes can then be estimated
from these linearized trees. The results obtained are
presented in table 2, which also includes time estimates
obtained by using Dayhoff and PC gamma distances.
When PC distance is used, the time of divergence between
the T and the non-T floral MADS-box genes is estimated
to be about 652 MYA. This is well before the time of the
Cambrian explosion (about 545 MYA; see fig. 4). Table 2
also suggests that the divergence between class B genes
and other non-T floral MADS-box genes (612 MYA)
occurred before the Cambrian explosion. The divergence
between class C genes and the remaining non-T floral
genes (537 MYA) again appears to have occurred around
the Cambrian explosion. This might sound strange,
because most animal and plant phyla are believed to have
evolved no earlier than the time of the Cambrian
explosion. However, recent paleontological data (Xiao,
Zhang, and Knoll 1998) suggest that, by this time, green
algae had already evolved. The fossil record suggests that
the first land plants such as bryophytes appeared around
450 MYA. Our estimates in table 2 suggest that class A, G,
and E gene lineages originated after the occurrence of land
plants. Table 2 also includes an estimate (556 MYA) of the
divergence time between B and Bs genes. In the estimation
of this divergence time, the class B genes from
angiosperms were excluded because of their faster rate
of evolution compared to other genes, and the divergence
time was estimated by dividing the distance between the B
and Bs genes by 2r, where r = 6.2 × 10^-10 per year.
FIG. 3.—Linearized trees used for estimating divergence times. The time scale is based on the results with PC distance. A. Topology from figure 2.
B. Topology when the interior branches between nodes a, b, c, and d are collapsed.
This
estimate suggests that the gymnosperm B and Bs genes
diverged a long time ago, if they are clearly definable
separate gene groups.
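The calibration arithmetic in this section can be checked with a short sketch. It assumes the simple clock relation d = 2rT between a pairwise distance d, a per-lineage rate r, and a divergence time T. The function names are hypothetical, and the distance for the B/Bs comparison is back-calculated from the reported 556 MYA rather than taken from the data.

```python
def substitution_rate(d, t_mya):
    """Per-site rate per year, given distance d between two lineages
    that split t_mya million years ago (d = 2 * r * T)."""
    return d / (2.0 * t_mya * 1e6)


def divergence_time_mya(d, rate_per_year):
    """Divergence time in MYA implied by distance d at the given rate."""
    return d / (2.0 * rate_per_year) / 1e6


# Calibration from the text: average gymnosperm/angiosperm PC distance
# of 0.372 and a split placed at about 300 MYA.
d_calib = 0.372
r = substitution_rate(d_calib, 300.0)

# Sanity check: the calibration distance maps back to 300 MYA.
t_back = divergence_time_mya(d_calib, r)

# Assumption for illustration: the PC distance that would correspond to
# the reported B/Bs estimate of 556 MYA under this rate.
d_for_556 = 2.0 * r * 556e6

print(f"r = {r:.2e} per site per year")
print(f"calibration check: {t_back:.0f} MYA")
print(f"distance implied by 556 MYA: {d_for_556:.3f}")
```

In the linearized tree itself the times come from node heights averaged over clusters, so this pairwise version is only an approximation of the procedure.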
Because many of the above estimates of divergence
times far exceed the times of first appearance of land plants
in the fossil record (450 MYA), they might be over-
estimates. However, if we use Dayhoff distance or PC
gamma distance with an ML estimate (1.06) of α obtained
by Gu and Zhang’s method, the divergence time estimates
become even greater (table 2). This was especially so
when PC gamma distance was used. In this case branch
points a and b were estimated to be 816 and 743 MYA,
respectively. We also used Yoder and Yang’s method
without eliminating B genes but with the assumption that
these genes evolved faster than the other genes (two rates
model). This method also gave greater estimates than those
obtained by PC distance even when the Poisson model
(α = ∞), Dayhoff model, or Poisson gamma model (α =
1.06) was used (table 2). Sanderson’s penalized likelihood
method gave even greater estimates than other methods
(see Supplementary Material online). Therefore, our
estimates obtained from the linearized tree method with
PC distance are the most conservative.
One might wonder whether we used most closely
related copies (orthologous genes) of the class G, F, and C
genes between angiosperms and gymnosperms for com-
puting the time scale. Actually we tried to do so, but there
is no guarantee for the use of real orthologous genes, in
part because no complete genome sequence is yet available
from any gymnosperm species and in part because it is not
easy to determine orthologous genes even in the presence
of complete genome sequences (Theissen 2002). However,
if we had used nonorthologous genes for any of these gene
classes, our estimates would have been lower than
unbiased estimates, because the rate of amino acid
substitution should have been overestimated. This factor
also tends to make our estimates conservative.
As already mentioned, some authors have used the
monocot/eudicot divergence (200 MYA) as the calibration
point. In our data set, however, the use of this calibration
point gave a divergence time estimate of 251 MYA
between the angiosperms and the gymnosperms. (The
average distance of the angiosperm and gymnosperm
genes from class C, F, and G genes was used.) When we
used a calibration point of 150 MYA for the monocot/
eudicot divergence, we obtained an estimate of divergence
of 188 MYA for the angiosperm and gymnosperm split.
These estimates are clearly unreasonable, because angio-
sperms and gymnosperms are believed to have diverged
about 300 MYA. We therefore decided not to use the
monocot/eudicot calibration point. Incidentally, if we use
the angiosperm/gymnosperm divergence (300 MYA) as
the calibration point, we obtain an expected divergence
time of 239 MYA between monocots and eudicots.
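The two calibrations can be compared with a few lines of arithmetic. Under a strict clock, all node times on a linearized tree scale by the same factor when the calibration point changes, so the reported figures should be recoverable from one another. The sketch below assumes exactly that proportionality; the variable names are hypothetical.

```python
# Reported in the text: with the angiosperm/gymnosperm split fixed at
# 300 MYA, the monocot/eudicot split is estimated at 239 MYA.
AG_CALIBRATION_MYA = 300.0
ME_UNDER_AG_CALIBRATION_MYA = 239.0


def ag_split_given_me(me_calibration_mya):
    """Angiosperm/gymnosperm time implied by fixing the monocot/eudicot
    split instead, assuming all node times rescale proportionally."""
    scale = me_calibration_mya / ME_UNDER_AG_CALIBRATION_MYA
    return AG_CALIBRATION_MYA * scale


ag_if_me_200 = ag_split_given_me(200.0)
ag_if_me_150 = ag_split_given_me(150.0)

print(f"monocot/eudicot = 200 MYA -> A/G split = {ag_if_me_200:.0f} MYA")
print(f"monocot/eudicot = 150 MYA -> A/G split = {ag_if_me_150:.0f} MYA")
```

The rescaled values round to 251 and 188 MYA, matching the estimates quoted above, so the three figures are mutually consistent under a single clock.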
In figure 3B, we have Purugganan’s topology. If we
estimate the branch point (a-b-c-d) of this topology, we
obtain 575 MYA. This is considerably greater than
Purugganan’s estimate (476 MYA). This difference has
occurred in part because Purugganan used the monocot/
eudicot divergence (200 MYA) as the calibration point and
in part because he used paralogous genes of E genes
between monocots and eudicots.
Phylogenetic Tree of 87 MADS-Domains from
Plants and Animals
Figure 5 shows a NJ tree of type I and type II MADS-
domain sequences from plant and animal species. Type I
and type II genes form their own clades, and these clades
are quite well supported by the bootstrap test. Type II
genes are further divided into plant and animal genes. The
monophyletic cluster of animal type II genes is well
supported. Plant type II genes also form a monophyletic
cluster, although the bootstrap support is rather weak
(51%). Animal type I genes form a monophyletic group. In
contrast, plant type I genes do not form a monophyletic
cluster, although genes from Arabidopsis and rice form
a well-supported cluster. This failure of plant type I genes
to form a monophyletic cluster could be due to the small
number of amino acids used.
Table 2
Estimates of Divergence Times (±SE) of Floral MADS-Box Genes
| Node | Linearized tree: PC distance | Linearized tree: Dayhoff distance (α = 2.25) | Linearized tree: PC gamma distance (α = 1.06) | Maximum likelihood: Poisson model (α = ∞) | Maximum likelihood: Dayhoff model | Maximum likelihood: Poisson + gamma model (α = 1.06) |
|---|---|---|---|---|---|---|
| (a) T/(others) | 652 ± 72 | 721 ± 91 | 816 ± 120 | 816 | 813 | 836 |
| (b) B/(C/D-F-A-G-E) | 612 ± 62 | 668 ± 77 | 743 ± 99 | 749 | 775 | 772 |
| (c) (C/D)/(F-A-G-E) | 537 ± 44 | 573 ± 54 | 612 ± 68 | 630 | 647 | 631 |
| (d) F/(A-G-E) | 502 ± 42 | 531 ± 50 | 566 ± 62 | 564 | 569 | 586 |
| (e) A/(G-E) | 374 ± 39 | 380 ± 45 | 388 ± 51 | 428 | 406 | 422 |
| (f) G/E | 289 ± 29 | 286 ± 31 | 282 ± 35 | 341 | 327 | 340 |
| (g) B/Bs | 556 ± 65 | 598 ± 78 | 646 ± 94 | 656 | 662 | 714 |
| Node a-b-c-d (Purugganan tree) | 575 ± 49 | 623 ± 59 | 684 ± 75 | 689 | 701 | 706 |
NOTE.—Unit of time estimates is MYA. The gymnosperm/angiosperm split (ca. 300 MYA) in classes C, F, and G was used for calibrating the time scale. Dayhoff
distance was computed by using PC gamma distance with α = 2.25 (Nei and Kumar 2000, chapter 2). In the linearized tree method, time estimates for nodes (a) to (f) were
computed by using 16 genes. (Three Bs genes and 4 angiosperm B genes were excluded.) The time of divergence between the B and Bs genes was estimated separately (see
text). Because the ML method does not give proper standard errors (Yoder and Yang 2000), those values are not presented.
Although our results are somewhat ambiguous, they
generally support the view put forth by Alvarez-Buylla
et al. (2000b), that the type I and type II genes were
generated by a gene duplication that occurred before the
plant/animal divergence. Animal type I genes control
very basic transcription processes involved in
cell growth, differentiation, and
neuronal transmission, whereas animal type II genes are
responsible for muscle development (Shore and Sharrocks
1995). The function of plant type I genes is not
well understood, and these genes have only been
identified by genomic sequencing of Arabidopsis and
rice, although the LAMB1 gene in the club moss has
been suspected to be a type I gene. Many plant type II
genes in figure 5 belong to one of the nine classes of
MIKC-type MADS-box genes considered in figure 2.
However, there are additional MADS-box genes that
control various developmental processes such as root
formation.
Plant type II genes form many clades of a few genes,
and many of these clades are statistically supported
relatively well. However, their inter-clade relationships
are poorly supported. In particular, B/Bs genes are no
longer monophyletic. Nevertheless, the relationships of
the genes belonging to floral MADS-box gene classes A,
C, E, F, G, and T are virtually the same as those in figure
2. Therefore, the tree in figure 5 may reflect the
evolutionary history of MADS-box domains to some
extent. The low bootstrap values for these relationships
occurred primarily because we used many sequences with
only 55 aa, and because there are many other MADS-box
genes which are closely related to but are distinct from
floral MADS-box genes in plant genomes. It is possible
that the nine classes of floral MADS-box genes were
derived from some of these distinct MADS-box genes
nearly independently. In the present case it is not
meaningful to try to estimate the divergence times of
these genes, because the number of amino acids per
sequence is very small.
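The effect of sequence length can be made concrete with the binomial sampling error of the p-distance, SE(p) = sqrt[p(1 − p)/n], where n is the number of amino acid sites compared. The Python sketch below contrasts a 55-aa domain with a longer alignment. The proportion p = 0.30 and the length of 170 aa are arbitrary illustrative values, not data from this study.

```python
import math


def p_distance_se(p, n):
    """Binomial standard error of a p-distance estimated from n sites."""
    return math.sqrt(p * (1.0 - p) / n)


p = 0.30  # hypothetical proportion of differing amino acids
for n in (55, 170):  # 55-aa MADS domain vs. a hypothetical longer alignment
    print(f"n = {n:3d}  SE(p) = {p_distance_se(p, n):.3f}")
```

With only 55 sites the standard error is a sizable fraction of the distance itself, which is consistent with the weak bootstrap support for inter-clade relationships in figure 5.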
Discussion
Reliability of Estimates of Divergence Times
The fact that nonflowering gymnosperms have most
classes (B, Bs, C, G, and T) of floral MADS-box genes
indicates that the gene duplications that generated these
genes occurred long before their angiosperm-specific
functions were established. It is not clear what kinds of
function these floral MADS-box genes had before their
functional diversification, but they were probably involved
in the regulation of broad developmental and reproductive
processes, as was suggested by Becker et al. (2000). This
evolutionary pattern is similar to that of homeobox genes
that control segmentation of animal body structure (Zhang
and Nei 1996; Purugganan 1998). Cnidarian species such
as jellyfish do not have a segmented body structure, yet
they have hox genes (Ferrier and Holland 2001). Actually,
similar evolutionary patterns are observed with several
other gene families controlling development (e.g., Burglin
1997; Meyerowitz 2002), and it appears that the
occurrence of gene duplication before functional diversi-
fication is a generalized phenomenon with gene families
controlling development.
Our conservative estimates suggest that class A and B
floral genes diverged about 612 MYA, which is about twice
as old as the paleontological estimate of the divergence
time between gymnosperms and angiosperms. It also far
exceeds the paleontological estimate of the time of the first
land plants (mosses) (ca. 450 MYA). However, mosses are
known to have at least two genes that are homologous to
classical MIKC-type genes (Henschel et al. 2002). It
should also be noted that classical MIKC-type genes have
been identified even in green algae such as Chara,
Coleochaete, and Closterium (M. Hasebe, personal
communication), all of which evolved earlier than land
plants. Note that the oldest fossil record of green algae is
700–750 Myr old (Chen and Xiao 1991; Butterfield 2000),
although green algae do not appear to be monophyletic.
These observations suggest that our estimate of the time of
origin of floral MADS-box genes may not be too early.

FIG. 4.—Schematic representation of the evolution of floral MADS-box genes. Divergence time estimates (MYA) are indicated for each node of
the tree in figure 3A. The divergence time for node g was estimated separately (see text). Several important events in plant evolution are indicated to the
left of the time scale. The time estimates of these major events are taken from Stewart and Rothwell (1993, pp. 505–512).

FIG. 5.—Phylogenetic tree of 87 MADS-domain sequences from Arabidopsis, rice, gymnosperms, ferns, club mosses, mosses, and animals. This
tree was constructed by the NJ method with p-distance for a 55-aa domain. The number for each interior branch is the percent bootstrap value (500
resamplings), and only values greater than 50% are shown. The names of plant species used are the same as those in figure 2, except for ferns and
mosses. Those of the remaining species are as follows: fern, Ceratopteris richardii; moss, Physcomitrella patens; human, Homo sapiens; mouse, Mus
musculus; zebrafish, Danio rerio; nematode, Caenorhabditis elegans; mosquito, Anopheles gambiae; fly, Drosophila melanogaster.

In this discussion we have used the most conservative
estimates of divergence times obtained by PC distance. If
we use PC gamma distance or Yoder and Yang’s method,
estimates of the time of origin of floral MADS-box genes
become greater than 800 MYA. These estimates appear to
be too early if we consider the fossil record of land plants
and green algae, but we cannot rule out this possibility
because the fossil record is notoriously incomplete. It is
worth noting that, until recently, all or most orders of
placental mammals were believed to have diverged only
about 65 MYA. At present, however, we know of the
fossil remains of a placental mammal that is about 125
Myr old (Ji et al. 2002). The notion of the Cambrian
explosion, in which most visible eukaryotic organisms are
believed to have been absent before 545 MYA, is also
slowly changing. We now know of 570-Myr-old fossils of
animal eggs (Xiao, Zhang, and Knoll 1998), 900–1,200-Myr-old
fossils of red algae (Butterfield 2000), and 1,100–1,200-Myr-old
trace fossils of worms (Seilacher, Bose, and
Pfluger 1998; Rasmussen et al. 2002), although the
authenticity of these trace fossils has been questioned
(Conway Morris 2002).
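To illustrate why PC gamma distance tends to push deep nodes further back in time than PC distance, the following Python sketch applies the standard PC distance, d = −ln(1 − p), and the gamma distance, d = a[(1 − p)^(−1/a) − 1] (Nei and Kumar 2000), to two hypothetical p-distances. The values p = 0.20 and p = 0.40 and the simple ratio-based dating are illustrative assumptions, not the linearized-tree procedure used for table 2.

```python
import math


def pc_distance(p):
    """Poisson-correction (PC) distance: d = -ln(1 - p)."""
    return -math.log(1.0 - p)


def pc_gamma_distance(p, a):
    """Gamma distance with shape parameter a: d = a * ((1 - p)**(-1/a) - 1)."""
    return a * ((1.0 - p) ** (-1.0 / a) - 1.0)


# Hypothetical p-distances for the calibration node and a deeper node.
p_cal, p_deep = 0.20, 0.40
t_cal = 300.0  # MYA, the gymnosperm/angiosperm calibration

# Under a clock, the age of the deep node is t_cal times the distance ratio.
t_pc = t_cal * pc_distance(p_deep) / pc_distance(p_cal)
t_gamma = t_cal * pc_gamma_distance(p_deep, 1.06) / pc_gamma_distance(p_cal, 1.06)
print(f"PC: {t_pc:.0f} MYA; PC gamma (a = 1.06): {t_gamma:.0f} MYA")
```

Because the gamma correction inflates large distances proportionally more than small ones, the deeper node is estimated to be older under PC gamma distance than under PC distance for the same calibration.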
Nevertheless, it is not clear what kind of function the
MIKC-type genes had in ancestral non-seed plants. In
recent years intensive efforts have been made to identify
genes orthologous to floral MADS-box genes in non-seed
plants, but these efforts have not been very successful (e.g.,
Münster et al. 1997; Hasebe et al. 1998; Hohe et al. 2002;
Svensson and Engström 2002). What are the possible
reasons for these negative results? There seem to be at
least five. First, the orthologs of floral MADS-box genes in
the non-seed plants so far studied might have been lost in the
course of evolution. Second, the orthologs in non-seed plants
may be so different from the floral MADS-box genes that it is
now difficult to identify them. Third, our molecular time
estimates may be too old, even though we used the most
conservative method.
This may happen if the rate of amino acid substitution was
faster in the early stage of evolution of floral MADS-box
genes than in the later stage. Fourth, the current fossil
record is incomplete, and land plants might have evolved
earlier than currently believed. Fifth, the set of genes so far
studied may be incomplete, and a complete genome search
may find the orthologs. At present, however, it is difficult to
resolve the discrepancy between the theoretical and
experimental studies.
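The third possibility can be illustrated with a simple two-rate calculation. The Python sketch below assumes, purely for illustration, that a lineage evolved at the calibrated rate r during the most recent 300 Myr and at k times that rate before then. A strict clock calibrated at 300 MYA would then return 300 + k(T − 300) for a true age T. The function name and all numerical values are hypothetical.

```python
def clock_estimate(true_age, t_cal, early_rate_factor):
    """Age (MYA) inferred by a strict clock calibrated at t_cal.

    Assumes the calibrated rate r applies to the most recent t_cal Myr and
    the rate early_rate_factor * r applies before that.
    """
    # Distance in units of r: t_cal at rate r plus the older part at k * r.
    return t_cal + early_rate_factor * (true_age - t_cal)


# Hypothetical example: a duplication 450 MYA with a twofold faster early rate.
print(clock_estimate(true_age=450.0, t_cal=300.0, early_rate_factor=2.0))  # 600.0
```

In this hypothetical case a duplication that actually occurred 450 MYA would be dated at 600 MYA, so even a moderate early acceleration could account for a large part of the discrepancy.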
Long-term Evolution of MADS-Box Genes
As mentioned, MADS-box genes are highly con-
served, and the MADS-domain sequences are shared by
plants, animals, and fungi, indicating that MADS-box
genes have an ancient history. Therefore, by studying the
history of MADS-box genes, we should be able to obtain
some insight into the evolution of morphological characters
in eukaryotes. Unfortunately, our knowledge about the
MADS-box genes and their function in early eukaryotes is
quite limited. Nevertheless, it would be interesting to
speculate about the evolution of MADS-box genes in
eukaryotes, taking into account both paleontological
information and molecular dating. Having a plausible
scenario may give some useful information for future
experimental studies. Here we consider only the evolution
of plant and animal genes, because MADS-box genes in
fungi other than the budding yeast are not well studied.
We can see from figure 5 that both plants and animals
have two different types of MADS-box genes, type I and
type II genes. As indicated by Alvarez-Buylla et al.
(2000b), this suggests that these two types of genes
diverged by a gene duplication that occurred before the
plant/animal divergence (fig. 6). The oldest geological
evidence of eukaryotes is given by a lipid biomarker,
which has been dated 2,700 MYA (Brocks et al. 1999).
There are also eukaryotic fossils that have been dated
2,100 MYA (Han and Runnegar 1992). There is no fossil
record that indicates the time of divergence between plants
and animals, but molecular data suggest that the di-
vergence time is about 1,400 MYA (Feng, Cho, and
Doolittle 1997; Wang, Kumar, and Hedges 1999; Nei, Xu,
and Glazko 2001). If these estimates are reliable, the gene
duplication must have occurred some time between 1,400
MYA and 2,700 MYA (fig. 6). Because yeast, Caenorhabditis
elegans, and Drosophila melanogaster all have
a small number of type I and type II genes (two type I
genes and two type II genes in yeast; one type I gene and
one type II gene in C. elegans and D. melanogaster), it is
likely that the early plants (possibly red and brown algae,
Cavalier-Smith 2002; note that the monophyly of plants
and these algae is still controversial) also had a small
number of type I and type II genes. This hypothesis may
be tested by examining the genomes of extant red and
brown algae. Because these early plants have quite
complex morphological characters and life cycles, this
would help us to understand the ancient function of
MADS-box genes during plant evolution. According to the
conservative estimates of divergence times of MADS-box
genes we present in table 2, a group of green algae which
are believed to have evolved 700–750 MYA (fig. 6) is
expected to have at most one gene that is ancestral to all
the floral MADS-box genes currently present in angio-
sperms and gymnosperms. However, if our estimates from
gamma distance are correct, green algae may have three
genes that are ancestral to the current T, B (and Bs), and E
(or A, C, F, G) classes of genes.
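The expected gene numbers in this argument follow from counting the duplication nodes in table 2 that predate the green algae. The Python sketch below performs this count for the linearized-tree estimates. The helper name `ancestral_lineages` is ours, and the use of 700 MYA (the younger bound of the 700–750 MYA range) as the threshold is an assumption; with 750 MYA the gamma-distance count would be two rather than three.

```python
# Node ages (MYA) from table 2, linearized tree method.
NODE_AGES = {
    "PC distance": {
        "a": 652, "b": 612, "c": 537, "d": 502, "e": 374, "f": 289, "g": 556,
    },
    "PC gamma distance": {
        "a": 816, "b": 743, "c": 612, "d": 566, "e": 388, "f": 282, "g": 646,
    },
}


def ancestral_lineages(ages, t):
    """Number of floral MADS-box lineages expected to exist at time t (MYA).

    Each duplication node older than t adds one lineage to the single
    ancestral gene.
    """
    return 1 + sum(1 for age in ages.values() if age > t)


for method, ages in NODE_AGES.items():
    print(method, ancestral_lineages(ages, t=700))
```

Under PC distance no node is older than 700 MYA, giving one ancestral gene, whereas under PC gamma distance nodes a and b are older, giving three lineages (T, B/Bs, and the ancestor of the remaining classes).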
Figure 6 shows several evolutionary events in both
animal and plant lineages. Molecular estimates of di-
vergence times of early metazoan animals are almost
always considerably earlier than paleontological estimates.
For example, molecular data have suggested that the
nematode lineage diverged from the vertebrate lineage
800–1,100 MYA (e.g., Feng, Cho, and Doolittle 1997;
Wang, Kumar, and Hedges 1999; Nei, Xu, and Glazko
2001), which is about twice as old as the
Cambrian explosion. The nematode C. elegans is known to
have one type I gene and one type II MADS-box gene
(Alvarez-Buylla et al. 2000b; our unpublished data). The
type I and type II MADS-box genes in animals have not
been studied very well, but the zebrafish has several type I
and type II genes (our unpublished results). These findings
suggest that MADS-box genes are very ancient and
evolved gradually in the long history of plants and animals.
Previously we indicated that the MADS-box gene
family is an important gene family comparable to the
animal homeobox gene family. In this regard, it is
interesting to note that the homeobox gene family also
exists in plants, animals, and fungi (Burglin 1997; Kappen
2000), and that there are at least two lineages of homeobox
genes that diverged before the plant/animal/fungal split. It
would be interesting to investigate how these two different
multigene families controlling development coevolved.
FIG. 6.—A scenario of the evolution of MADS-box genes in plant and animal lineages. Important events of plant and animal evolution (divergence
from the lineage leading to Arabidopsis or human) are presented with their estimated times. The references for these estimates are as follows: (1) time of
the oldest biomarkers of eukaryotes (Brocks et al. 1999), (2) oldest fossil record of eukaryotic algae (Han and Runnegar 1992), (3) fossil record of some
forms of red algae (Butterfield 2000), (4) trace fossil of animals (Seilacher, Bose, and Pfluger 1998; Rasmussen et al. 2002), (5) molecular time
estimates of the animal/plant split and nematode evolution (Feng, Cho, and Doolittle 1997; Wang, Kumar, and Hedges 1999; Nei, Xu, and Glazko
2001), (6) fossil record of green algae (Chen and Xiao 1991), (7) fossil record of jawless fish (Maisey 1996, pp. 52–55), and (8) fossil record of the bird/
mammal split (Benton 1993, pp. 717–771). The number of circles and squares does not represent the real gene number in each organism. The estimated
numbers of MADS-box genes in the species of available genome sequences are as follows: Arabidopsis (>70 genes), rice (>70 genes), human (5
genes), fly (2 genes), nematode (2 genes), and budding yeast (4 genes).

Gene Family Expansion or Birth-and-Death Evolution?
Figure 2 shows a pattern of functional diversification
of major groups of MADS-box genes. This figure suggests
that the number of genes of this multigene family has
steadily increased as the reproductive system became more
complex. However, although the gene number must have
increased from the time of early plants, this tree does not
give the entire picture of evolution of MADS-box genes,
because we did not include many genes that are not
directly related to flower formation. Our tree in figure 5 is
not very reliable, but if it represents a general pattern of
evolution of MADS-box genes, it is possible that different
floral MADS-box genes were derived from other floral
MADS-box genes, which have already been lost, or even
from other reproductive MADS-box genes. Furthermore,
the Arabidopsis genome is known to contain several
MADS-box pseudogenes or truncated genes (our un-
published data), indicating that some MADS-box genes
died out in the evolutionary process. These observations
suggest that the MADS-box gene family might have been
subjected to the birth-and-death model of evolution, in
which some genes generate duplicate genes with new
functions but others become nonfunctional or are deleted
from the genome (Nei, Gu, and Sitnikova 1997). If this is
the case, it is possible that the genome of gymnosperms or
ferns contains nearly as many MADS-box genes as the
angiosperm genomes and that the genes in these plants
merely exert the different functions required for the
different forms of reproduction. Of course, it is also
possible that the phylogenetic tree of current angiosperm
genes in figure 2 in large part reflects the history of the
increase of member genes of the MADS-box gene family
in gymnosperms and angiosperms. At present, we cannot
distinguish between the two alternative hypotheses, but
this could be done rather easily if the genomic sequences
of gymnosperms and ferns were determined. It is also
important to note that the two hypotheses are not mutually
exclusive and we are interested only in the relative
importance of the two possibilities.
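The birth-and-death model described above can be caricatured by a small stochastic simulation. The Python sketch below is only a toy illustration: in each step every functional gene duplicates with one probability and is lost (pseudogenized or deleted) with another. The function name, the rates, the number of steps, and the starting gene number are all hypothetical and are not estimated from MADS-box data.

```python
import random


def simulate_birth_and_death(n_start, birth, death, steps, seed=0):
    """Simulate gene counts under a simple birth-and-death process.

    In each step every functional gene duplicates with probability `birth`
    and is lost (pseudogenized or deleted) with probability `death`.
    All parameter values are hypothetical.
    """
    rng = random.Random(seed)  # fixed seed for a reproducible run
    functional, lost = n_start, 0
    history = [functional]
    for _ in range(steps):
        births = sum(rng.random() < birth for _ in range(functional))
        deaths = sum(rng.random() < death for _ in range(functional))
        functional += births - deaths
        lost += deaths
        history.append(functional)
    return history, lost


history, lost = simulate_birth_and_death(n_start=5, birth=0.06, death=0.04, steps=100)
print(f"functional genes: {history[0]} -> {history[-1]}; genes lost: {lost}")
```

Even when the family grows on average, many genes are lost along the way, so the genes surviving today record only part of the duplication history. This is the reason a tree of extant genes alone cannot distinguish the two hypotheses.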
Acknowledgments
We thank Takeshi Itoh and Yoshiyuki Suzuki for
valuable comments on an earlier version of this paper. We
also thank Mitsuyasu Hasebe, Doug Soltis, Pam Soltis,
and two anonymous reviewers for their useful comments.
This work was supported by research grants from the
National Institutes of Health to M.N. J.N. has a scholarship
from the Rotary Foundation.
Literature Cited
Adachi, J., and M. Hasegawa. 1996. MOLPHY, a computer
program package for molecular phylogenetics. Version 2.3.
The Institute of Statistical Mathematics, Tokyo.
Alvarez-Buylla, E. R., S. J. Liljegren, S. Pelaz, S. E. Gold, C.
Burgeff, G. S. Ditta, F. Vergara-Silva, and M. F. Yanofsky.
2000a. MADS gene evolution beyond flowers, expression in
pollen, endosperm, guard cells, roots, and trichomes. Plant J.
24:457–466.
Alvarez-Buylla, E. R., S. Pelaz, S. J. Liljegren, S. E. Gold, C.
Burgeff, G. S. Ditta, L. Ribas de Pouplana, L. Martinez-
Castilla, and M. F. Yanofsky. 2000b. An ancestral MADS-
box gene duplication occurred before the divergence of plants
and animals. Proc. Natl. Acad. Sci. USA 97:5328–5333.
Becker, A., K. Kaufmann, A. Freialdenhoven, C. Vincent, M. A.
Li, H. Saedler, and G. Theissen. 2002. A novel MADS-box
gene subfamily with a sister-group relationship to class B
floral homeotic genes. Mol. Genet. Genomics 266:942–950.
Becker, A., K. U. Winter, B. Meyer, H. Saedler, and G. Theissen.
2000. MADS gene diversity in seed plants 300 million years
ago. Mol. Biol. Evol. 17:1425–1434.
Benton, M. J. 1993. The fossil record 2. Chapman and Hall,
New York.
Brocks, J. J., G. A. Logan, R. Buick, and R. E. Summons. 1999.
Archean molecular fossils and the early rise of eukaryotes.
Science 285:1033–1036.
Burglin, T. R. 1997. Analysis of TALE superclass homeobox
genes (MEIS, PBC, KNOX, Iroquois, TGIF) reveals a novel
domain conserved between plants and animals. Nucleic Acids
Res. 25:4173–4180.
Butterfield, N. J. 2000. Bangiomorpha pubescens n. gen., n. sp.:
implications for the evolution of sex, multicellularity, and the
Mesoproterozoic/Neoproterozoic radiation of eukaryotes.
Paleobiology 26:386–404.
Cavalier-Smith, T. 2002. The phagotrophic origin of eukaryotes
and phylogenetic classification of Protozoa. Int. J. Syst. Evol.
Microbiol. 52:297–354.
Chen, M., and Z. Xiao. 1991. Discovery of the macrofossils in
the Upper Sinian Doushantuo Formation at Miaohe, eastern
Yangtze Gorges. Sci. Geol. Sinica 4:317–324.
Conway Morris, S. 2002. Ancient animals or something else
entirely? Science 298:57–58.
Dickerson, R. E. 1971. The structures of cytochrome c and the
rates of molecular evolution. J. Mol. Evol. 1:26–45.
Feng, D. F., G. Cho, and R. F. Doolittle. 1997. Determining
divergence times with a protein clock: update and reevalua-
tion. Proc. Natl. Acad. Sci. USA 94:13028–13033.
Ferrier, D. E., and P. W. Holland. 2001. Ancient origin of the
Hox gene cluster. Nat. Rev. Genet. 2:33–38.
Glazko, G. V., and M. Nei. 2003. Estimation of divergence times
for major lineages of primate species. Mol. Biol. Evol.
20:424–434.
Goremykin, V. V., S. Hansmann, and W. F. Martin. 1997.
Evolutionary analysis of 58 proteins encoded in six
completely sequenced chloroplast genomes: revised molecular
estimates of two seed plant divergence times. Plant Syst. Evol.
206:337–351.
Gu, X., and J. Zhang. 1997. A simple method for estimating the
parameter of substitution rate variation among sites. Mol.
Biol. Evol. 14:1106–1113.
Han, T. M., and B. Runnegar. 1992. Megascopic eukaryotic
algae from the 2.1-billion-year-old Negaunee Iron Formation,
Michigan. Science 257:232–235.
Hartmann, U., S. Hohmann, K. Nettesheim, E. Wisman, H.
Saedler, and P. Huijser. 2000. Molecular cloning of SVP,
a negative regulator of the floral transition in Arabidopsis.
Plant J. 21:351–360.
Hasebe, M., C. K. Wen, M. Kato, and J. A. Banks. 1998. Char-
acterization of MADS homeotic genes in the fern Ceratopteris
richardii. Proc. Natl. Acad. Sci. USA 95:6222–6227.
Hashimoto, T., Y. Nakamura, F. Nakamura, T. Shirakura, J.
Adachi, N. Goto, K. Okamoto, and M. Hasegawa. 1994. Protein
phylogeny gives a robust estimation for early divergences of
eukaryotes: phylogenetic place of a mitochondria-lacking
protozoan, Giardia lamblia. Mol. Biol. Evol. 11:65–71.
Henschel, K., R. Kofuji, M. Hasebe, H. Saedler, T. Münster, and
G. Theissen. 2002. Two ancient classes of MIKC-type
MADS-box genes are present in the moss Physcomitrella
patens. Mol. Biol. Evol. 19:801–814.
Hohe, A., S. A. Rensing, M. Mildner, and R. Reski. 2002. Day
length and temperature strongly influence sexual reproduction
and expression of a novel MADS-box gene in the moss
Physcomitrella patens. Plant Biol. 4:595–602.
Honma, T., and K. Goto. 2001. Complexes of MADS-box
proteins are sufficient to convert leaves into floral organs.
Nature 409:525–529.
Huang, H., M. Tudor, C. A. Weiss, Y. Hu, and H. Ma. 1995. The
Arabidopsis MADS-box gene AGL3 is widely expressed and
encodes a sequence-specific DNA-binding protein. Plant Mol.
Biol. 28:549–567.
Ji, Q., Z. X. Luo, C. X. Yuan, J. R. Wible, J. P. Zhang, and J. A.
Georgi. 2002. The earliest known eutherian mammal. Nature
416:816–822.
Kappen, C. 2000. Analysis of a complete homeobox gene
repertoire: implications for the evolution of diversity. Proc.
Natl. Acad. Sci. USA 97:4481–4486.
Kramer, E. M., R. L. Dorit, and V. F. Irish. 1998. Molecular
evolution of genes controlling petal and stamen development:
duplication and divergence within the APETALA3 and
PISTILLATA MADS-box gene lineages. Genetics 149:765–
783.
Kumar, S., K. Tamura, I. B. Jakobsen, and M. Nei. 2001.
MEGA2: molecular evolutionary genetics analysis software.
Bioinformatics 17:1244–1245.
Laroche, J., P. Li, and J. Bousquet. 1995. Mitochondrial DNA
and monocot-dicot divergence time. Mol. Biol. Evol.
12:1151–1156.
Lee, H., S. S. Suh, E. Park, E. Cho, J. H. Ahn, S. G. Kim, J. S.
Lee, Y. M. Kwon, and I. Lee. 2000. The AGAMOUS-LIKE
20 MADS domain protein integrates floral inductive pathways
in Arabidopsis. Genes Dev. 14:2366–2376.
Ma, H., and C. dePamphilis. 2000. The ABCs of floral evolution.
Cell 101:5–8.
Maisey, J. G. 1996. Discovering fossil fishes. Henry Holt and
Co., New York.
Meyerowitz, E. M. 2002. Plants compared to animals: the
broadest comparative study of development. Science
295:1482–1485.
Michaels, S. D., and R. M. Amasino. 1999. FLOWERING
LOCUS C encodes a novel MADS domain protein that acts as
a repressor of flowering. Plant Cell 11:949–956.
Michaels, S. D., G. Ditta, C. Gustafson-Brown, S. Pelaz, M. F.
Yanofsky, and R. M. Amasino. 2003. AGL24 acts as
a promoter of flowering in Arabidopsis and is positively
regulated by vernalization. Plant J. 33:867–874.
Moon, Y., J. S. Jeon, S. K. Sung, and G. An. 1999.
Determination of the motif responsible for interaction between
the rice APETALA1/AGAMOUS-LIKE9 family proteins
using a yeast two-hybrid system. Plant Physiol. 120:1193–
1204.
Münster, T., J. Pahnke, A. Di Rosa, J. T. Kim, W. Martin, H.
Saedler, and G. Theissen. 1997. Floral homeotic genes were
recruited from homologous MADS genes preexisting in the
common ancestor of ferns and seed plants. Proc. Natl. Acad.
Sci. USA 94:2415–2420.
Nei, M. 1987. Molecular evolutionary genetics. Columbia
University Press, New York.
Nei, M., X. Gu, and T. Sitnikova. 1997. Evolution by the birth-
and-death process in multigene families of the vertebrate
immune system. Proc. Natl. Acad. Sci. USA 94:7799–7806.
Nei, M., and S. Kumar. 2000. Molecular evolution and
phylogenetics. Oxford University Press, New York.
Nei, M., P. Xu, and G. Glazko. 2001. Estimation of divergence
times from multiprotein sequences for a few mammalian
species and several distantly related organisms. Proc. Natl.
Acad. Sci. USA 98:2497–2502.
Nesi, N., I. Debeaujon, C. Jond, A. J. Stewart, G. I. Jenkins, M.
Caboche, and L. Lepiniec. 2002. The TRANSPARENT
TESTA16 locus encodes the Arabidopsis Bsister MADS
domain protein and is required for proper development and
pigmentation of the seed coat. Plant Cell 14:2463–2479.
Purugganan, M. D. 1997. The MADS-box floral homeotic gene
lineages predate the origin of seed plants: phylogenetic and
molecular clock estimates. J. Mol. Evol. 45:392–396.
———. 1998. The molecular evolution of development.
Bioessays 20:700–711.
Rasmussen, B., S. Bengtson, I. R. Fletcher, and N. J.
McNaughton. 2002. Discoidal impressions and trace-like
fossils more than 1200 million years old. Science 296:1112–
1115.
Russo, C. A., N. Takezaki, and M. Nei. 1996. Efficiencies of
different genes and different tree-building methods in re-
covering a known vertebrate phylogeny. Mol. Biol. Evol.
13:525–536.
Sanderson, M. J. 2003. r8s: inferring absolute rates of molecular
evolution and divergence times in the absence of a molecular
clock. Bioinformatics 19:301–302.
Savard, L., P. Li, S. H. Strauss, M. W. Chase, M. Michaud, and J.
Bousquet. 1994. Chloroplast and nuclear gene sequences
indicate late Pennsylvanian time for the last common ancestor
of extant seed plants. Proc. Natl. Acad. Sci. USA 91:5163–
5167.
Seilacher, A., P. K. Bose, and F. Pfluger. 1998. Triploblastic
animals more than 1 billion years ago: trace fossil evidence
from India. Science 282:80–83.
Sheldon, C. C., P. P. Perez, J. Metzger, J. A. Edwards, W. J.
Peacock, and E. S. Dennis. 1999. The FLF MADS box gene:
a repressor of flowering in Arabidopsis regulated by
vernalization and methylation. Plant Cell 11:445–458.
Shore, P., and A. D. Sharrocks. 1995. The MADS-box family of
transcription factors. Eur. J. Biochem. 229:1–13.
Soltis, P. S., D. E. Soltis, V. Savolainen, P. R. Crane, and T. G.
Barraclough. 2002. Rate heterogeneity among lineages of
tracheophytes: integration of molecular and fossil data and
evidence for molecular living fossils. Proc. Natl. Acad. Sci.
USA 99:4430–4435.
Stewart, W. N., and G. W. Rothwell. 1993. Paleobotany and the
evolution of plants. Cambridge University Press, New York.
Svensson, M. E., and P. Engström. 2002. Closely related MADS-box
genes in club moss (Lycopodium) show broad expression
patterns and are structurally similar to, but phylogenetically
distinct from, typical seed plant MADS-box genes. New
Phytol. 154:439–450.
Swofford, D. L. 1998. PAUP*: phylogenetic analysis using
parsimony (*and other methods). Version 4. Sinauer Asso-
ciates, Sunderland, Mass.
Takezaki, N., A. Rzhetsky, and M. Nei. 1995. Phylogenetic test
of the molecular clock and linearized trees. Mol. Biol. Evol.
12:823–833.
The Arabidopsis Genome Initiative. 2000. Analysis of the
genome sequence of the flowering plant Arabidopsis thaliana.
Nature 408:796–815.
Theissen, G. 2001. Development of floral organ identity, stories
from the MADS house. Curr. Opin. Plant Biol. 4:75–85.
———. 2002. Secret life of genes. Nature 415:741.
Thompson, J. D., T. J. Gibson, F. Plewniak, F. Jeanmougin, and
D. G. Higgins. 1997. The ClustalX windows interface,
flexible strategies for multiple sequence alignment aided by
quality analysis tools. Nucleic Acids Res. 25:4876–4882.
Wang, D. Y., S. Kumar, and S. B. Hedges. 1999. Divergence
time estimates for the early history of animal phyla and the
origin of plants, animals, and fungi. Proc. R. Soc. Lond. Ser.
B. 266:163–171.
Weigel, D., and E. M. Meyerowitz. 1994. The ABCs of floral
homeotic genes. Cell 78:203–209.
Winter, K-U., A. Becker, T. Münster, J. T. Kim, H. Saedler, and
G. Theissen. 1999. MADS-box genes reveal that gnetophytes
are more closely related to conifers than to flowering plants.
Proc. Natl. Acad. Sci. USA 96:7342–7347.
Wolfe, K. H., M. Gouy, Y. W. Yang, P. M. Sharp, and W. H. Li.
1989. Date of the monocot-dicot divergence estimated from
chloroplast DNA sequence data. Proc. Natl. Acad. Sci. USA
86:6201–6205.
Xiao, S., Y. Zhang, and A. H. Knoll. 1998. Three-dimensional
preservation of algae and animal embryos in a Neoproterozoic
phosphorite. Nature 391:553–558.
Yang, Z. 2002. Phylogenetic analysis by maximum likelihood
(PAML). Version 3.13. University College London, London.
Yoder, A. D., and Z. Yang. 2000. Estimation of primate
speciation dates using local molecular clocks. Mol. Biol.
Evol. 17:1081–1090.
Yu, J., S. Hu, J. Wang et al. (100 co-authors). 2002. A draft
sequence of the rice genome (Oryza sativa L. ssp. indica).
Science 296:79–92.
Zhang, H., and B. G. Forde. 1998. An Arabidopsis MADS box
gene that controls nutrient-induced changes in root architec-
ture. Science 279:407–409.
Zhang, J., and M. Nei. 1996. Evolution of Antennapedia-class
homeobox genes. Genetics 142:295–303.
William Martin, Associate Editor
Accepted April 18, 2003
Evolution of the MADS-Box Genes in Plants 1447
... The A-class genes are involved in the development of sepals and petals. Our results indicate that A-class genes are specific to angiosperms, which is consistent with previous reports [49,66,67]. Additionally, in monocots, A-class genes seem to have undergone duplication, resulting in more paralogous genes than in dicots ( Fig. 3a and 3b, Table S5). ...
Article
Full-text available
Background Papaya exhibits three sex types: female (XX), male (XY), and hermaphrodite (XYh), making it an unusual trioecious model for studying sex determination. A critical aspect of papaya sex determination is the pistil abortion in male flowers. However, the regulatory networks that control the development of pistils and stamens in papaya remain incompletely understood. Results In this study, we identified three organ-specific clusters involved in papaya pistils and stamens development. We found that pistil development is primarily characterized by the significant expression of auxin-related genes, while the pistil abortion genes in males is mainly associated with cytokinin, gibberellin, and auxin pathways. Additionally, we constructed expression regulatory networks for the development of female pistils, aborted pistils and stamens in male flowers, revealing key regulatory genes and signaling pathways involved in papaya organ development. Furthermore, we systematically identified 65 members of the MADS-box gene family and 10 ABCDE subfamily MADS-box genes in papaya. By constructing a phylogenetic tree of the ABCDE subfamily, we uncovered gene contraction and expansion in papaya, providing an improved understanding of the developmental mechanisms and evolutionary history of papaya floral organs. Conclusions These findings provide a robust framework for identifying candidate sex-determining genes and constructing the sex determination regulatory network in papaya, providing insights and genomic resources for papaya breeding.
... Throughout evolutionary history, MADS-box genes have experienced multiple rounds of gene duplication and functional diversification. Research indicates that various lineages of MADS-box genes have diversified incrementally, closely linked to morphological evolution in plants [45]. For example, the ABC model proposes that A, B, and C class genes interact to regulate the development of the four distinct parts of flowers [46]. ...
Article
Full-text available
The MADS-box transcription factor (TF) gene family is pivotal in various aspects of plant biology, particularly in growth, development, and environmental adaptation. It comprises Type I and Type II categories, with the MIKC-type subgroups playing a crucial role in regulating genes essential for both the vegetative and reproductive stages of plant life. Notably, MADS-box proteins can influence processes such as flowering, fruit ripening, and stress tolerance. Here, we provide a comprehensive overview of the structural features, evolutionary lineage, multifaceted functions, and the role of MADS-box TFs in responding to biotic and abiotic stresses. We particularly emphasize their implications for crop enhancement, especially in light of recent advances in understanding the impact on sugarcane (Saccharum spp.), a vital tropical crop. By consolidating cutting-edge findings, we highlight potential avenues for expanding our knowledge base and enhancing the genetic traits of sugarcane through functional genomics and advanced breeding techniques. This review underscores the significance of MADS-box TFs in achieving improved yields and stress resilience in agricultural contexts, positioning them as promising targets for future research in crop science.
... In this study, we discovered that TaHY5 was highly expressed in the C-S1 vs. D-S1 and C-S2 vs. D-S2 groups (Table 2), speculating that this gene might indirectly affect wheat heading time by interacting with other factors. Many studies have demonstrated that MADS-box genes encoding a family of transcription factors control flowering time and diverse developmental processes in plants (Becker et al. 2003; Nam et al. 2003). In this research, we found that several known MADS-box genes, including TaFUL2, TaAGL6, TaSEP3, TaAG1, TaVRT2, and TaSEP1-2, were significantly upregulated (Table S12), suggesting that these genes play important roles in regulating heading time for 18-1-5. ...
Preprint
Full-text available
Developing early-heading wheat cultivars is an important breeding strategy for saving photo-thermal resources, and facilitating the multiple-cropping systems and annual grain yield. Psathyrostachys huashanica Keng (2n = 2x = 14, NsNs) is a potentially useful germplasm of early heading and maturation for wheat improvement. In this study, we found that a wheat–P. huashanica 7Ns disomic addition line, namely 18-1-5, showed earlier heading and earlier maturation than its wheat parents. Morphological observations of spike differentiation revealed that 18-1-5 developed distinctly faster than its wheat parents from the double ridge stage during spike development. To explore the potential molecular mechanisms underlying early heading, we performed transcriptome analysis at four different developmental stages of 18-1-5 and its wheat parents. A total of 10,043 differentially expressed genes (DEGs) were identified during spike development. Gene Ontology (GO) enrichment analysis showed that these DEGs were linked to carbohydrate metabolic process, photosynthesis, response to abscisic acid, and ethylene-activated signaling pathway. Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analysis showed that these DEGs were mainly involved in plant hormone signal transduction (ARF, AUX/IAA, SAUR, DELLA, BRI1, and ETR), starch and sucrose metabolism (SUS1 and TPP), photosynthesis-antenna proteins (Lhc), and circadian rhythm (PRR37, FT, Hd3a, COL, and CDF) pathways. In addition, several DEGs annotated as transcription factors (TFs), such as bHLH, bZIP, MADS-box, MYB, NAC, SBP, WRKY, and NF-Y, may be related to flowering time. Our results provide valuable information for further studies on the regulatory mechanism, candidate genes, and genetic resources of early-heading breeding in wheat.
... Transcription factors (TFs) play a crucial role in flower development. The MADS-box, MYB, bHLH, bZIP, NAC, and WRKY families are among the most prominent TF families involved in this process [29][30][31][32][33]. In this study, we identified a total of 20 differentially expressed TFs belonging to these six TF families (Figure 4). ...
Article
Full-text available
Stamens are vital reproductive organs in angiosperms, essential for plant growth, reproduction, and development. The genetic regulation and molecular mechanisms underlying stamen development are, however, complex and varied among different plant species. MSL-lncRNAs, a gene specific to the Y chromosome of Populus deltoides, is predominantly expressed in male flower buds. Heterologous expression of MSL-lncRNAs in Arabidopsis thaliana resulted in an increase in both stamen and anther count, without affecting pistil development or seed set. To reveal the molecular regulatory network influenced by MSL-lncRNAs on stamen development, we conducted transcriptome sequencing of flowers from both wild-type and MSL-lncRNAs-overexpressing Arabidopsis. A total of 678 differentially expressed genes were identified between wild-type and transgenic Arabidopsis. Among these, 20 were classified as transcription factors, suggesting a role for these regulatory proteins in stamen development. GO enrichment analysis revealed that the differentially expressed genes were significantly associated with processes such as pollen formation, polysaccharide catabolic processes, and secondary metabolism. KEGG pathway analysis indicated that MSL-lncRNAs might promote stamen development by upregulating genes involved in the phenylpropanoid biosynthesis pathway. The top three upregulated genes, all featuring the DUF295 domain, were found to harbor an F-box motif at their N-termini, which is implicated in stamen development. Additionally, in transgenic Arabidopsis flowers, genes implicated in tapetum formation and anther development were also observed to be upregulated, implying a potential role for MSL-lncRNAs in modulating pollen development through the positive regulation of these genes. The findings from this study establish a theoretical framework for elucidating the genetic control exerted by MSL-lncRNAs over stamen and pollen development.
... The PIA1 gene, a member of the MADS-box transcription factor family that controls flower development and floral organ identity determination [28,29], was identified as the main regulatory hub gene of the pink module recognized in the network, which shows the highest non-preservation score in the flowering co-expression network. As the most connected node of the pink module, PIA1 is a fascinating regulatory hub candidate. ...
... The functional roles of type II members in floral induction have been extensively studied, and there have been well-versed evolutionary and review studies on this topic. Some of them include Gramzow and Theissen's work [64] on both functional and evolutionary aspects of MADS box members, and simultaneous independent studies by Becker and Theissen [65] and Nam, et al. [66] on detailed dated evolutionary studies regarding MADS box gene origin and divergence. We briefly touched on this topic in our earlier review [67]. ...
Article
Full-text available
Most of the studied MADS box members are linked to flowering and fruit traits. However, higher volumes of studies on type II of the two types so far suggest that the florigenic effect of the gene members could just be the tip of the iceberg. In the current study, we used a systematic approach to obtain a general overview of the MADS box members’ cross-trait and multifactor associations, and their pleiotropic potentials, based on a manually curated local reference database. While doing so, we screened for the co-occurrence of terms of interest within the title or abstract of each reference, with a threshold of three hits. The analysis results showed that our approach can retrieve multi-faceted information on the subject of study (MADS box gene members in the current case), which could otherwise have been skewed depending on the authors’ expertise and/or volume of the literature reference base. Overall, our study discusses the roles of MADS box members in association with plant organs and trait-linked factors among plant species. Our assessment showed that plants with most of the MADS box member studies included tomato, apple, and rice after Arabidopsis. Furthermore, based on the degree of their multi-trait associations, FLC, SVP, and SOC1 are suggested to have relatively higher pleiotropic potential among others in plant growth, development, and flowering processes. The approach devised in this study is expected to be applicable for a basic understanding of any study subject of interest, regardless of the depth of prior knowledge.
... Functional roles of the type II members on floral induction have been extensively studied and there are well versed evolutionary as well as review studies on the topic. Some of them include Gramzow and Theissen [61] on both functional and evolutionary aspects of MADS-box members; simultaneous independent studies by Becker and Theissen [62] and Nam, et al. [63] on detailed dated evolutionary studies regarding MADS box gene origin and divergence. We briefly touched on the topic in our earlier review [64]. ...
Preprint
Full-text available
The majority of the studied MADS box members are linked with flowering and fruit traits. However, the higher volume of studies on type II of the two types so far suggests that the florigenic effect of the gene members could be just the tip of the iceberg. In the current study, we used a systematic approach to obtain a general overview of the MADS box members, their cross-trait and multifactor associations, and their pleiotropic potentials, based on a local reference database curated for MADS box members. While doing so, we screened for the co-occurrence of the terms of interest within the title or abstract of each reference, with a threshold of 3 hits. The analysis results showed that our approach can retrieve multi-faceted information on the subject of study (MADS box gene members in the current case), which could otherwise have been skewed depending on the authors’ expertise and/or the volume of the literature reference base. Overall, our study discusses the roles of MADS box members in association with plant organs and trait-linked factors among plant species. Our assessment showed that the plants with the majority of the MADS box member studies include tomato, apple, rice, etc., after Arabidopsis. Furthermore, based on the degree of their multi-trait associations, FLC, SVP, SOC1, etc. are suggested to have relatively higher pleiotropic potential among others in plant growth, development, and flowering processes. The approach devised in this study is expected to be applicable for gaining a basic understanding of any study subject of interest, regardless of the depth of prior knowledge.
Article
Bud dormancy, which serves as a survival mechanism during winter, is crucial for determining the timing and quality of flowering in many perennial woody plants, including tree peony. The gibberellin (GA) signalling pathway participates in breaking bud dormancy in tree peony. Specifically, PsRGL1, a key DELLA protein, is a negative regulator in this process. MADS-box family members participate in plant growth and development regulation. In this study, a MADS-domain transcription factor, AGAMOUS-LIKE 9 (PsAGL9), was identified as a candidate interaction protein of PsRGL1 using a pull-down assay coupled with liquid chromatography-tandem mass spectrometry. PsAGL9 expression was induced by chilling and exogenous GA3. Yeast two-hybrid (Y2H), pull-down, and luciferase complementation assays (LCAs) confirmed that PsAGL9 interacted with PsRGL1. PsAGL9 overexpression significantly promoted dormancy break and upregulated the expression of marker genes such as PsBG6, PsBG9, PsEBB1, PsEBB3, and PsCYCD, suggesting a potential regulatory function of PsAGL9. Classical and non-classical CArG motifs were identified in the promoter regions of PsCYCD and PsEBB3, respectively. Yeast one-hybrid, electrophoretic mobility shift, and dual luciferase assays confirmed that PsAGL9 directly bound to and activated PsCYCD and PsEBB3 expression, and PsRGL1 abolished the DNA-binding activity of PsAGL9. Furthermore, interaction proteins of PsAGL9 were screened, and MADS-box members PsAGL9, PsAGL6, and PsPI were identified. Y2H, LCA, and pull-down assays confirmed that PsAGL9 formed both homodimers and heterodimers, and heterodimers further promoted target gene expression. This study provides an in-depth exploration of the GA pathway and elucidates a novel pathway, PsRGL1-PsAGL9-PsCYCD, involved in regulating dormancy break in tree peony.
Article
Full-text available
Chickpea (Cicer arietinum L.)—an important legume crop cultivated in arid and semiarid regions—has limited genetic diversity. Efforts are being undertaken to broaden its diversity by utilizing its wild relatives, which remain largely unexplored. Here, we present the Cicer super-pangenome based on the de novo genome assemblies of eight annual Cicer wild species. We identified 24,827 gene families, including 14,748 core, 2,958 softcore, 6,212 dispensable and 909 species-specific gene families. The dispensable genome was enriched for genes related to key agronomic traits. Structural variations between cultivated and wild genomes were used to construct a graph-based genome, revealing variations in genes affecting traits such as flowering time, vernalization and disease resistance. These variations will facilitate the transfer of valuable traits from wild Cicer species into elite chickpea varieties through marker-assisted selection or gene-editing. This study offers valuable insights into the genetic diversity and potential avenues for crop improvement in chickpea.
Article
Full-text available
To estimate approximate divergence times of species or species groups with molecular data, we have developed a method of constructing a linearized tree under the assumption of a molecular clock. We present two tests of the molecular clock for a given topology: the two-cluster test and the branch-length test. The two-cluster test examines the hypothesis of the molecular clock for the two lineages created by an interior node of the tree, whereas the branch-length test examines the deviation of the branch length between the tree root and a tip from the average length. Sequences evolving excessively fast or slow at a high significance level may be eliminated. A linearized tree will then be constructed for a given topology for the remaining sequences under the assumption of rate constancy. We have used these methods to analyze hominoid mitochondrial DNA and drosophilid Adh gene sequences.
Article
Full-text available
Multicellular filaments from the ca. 1200-Ma Hunting Formation (Somerset Island, arctic Canada) are identified as bangiacean red algae on the basis of diagnostic cell-division patterns. As the oldest taxonomically resolved eukaryote on record Bangiomorpha pubescens n. gen. n. sp. provides a key datum point for constraining protistan phylogeny. Combined with an increasingly resolved record of other Proterozoic eukaryotes, these fossils mark the onset of a major protistan radiation near the Mesoproterozoic/Neoproterozoic boundary. Differential spore/gamete formation shows Bangiomorpha pubescens to have been sexually reproducing, the oldest reported occurrence in the fossil record. Sex was critical for the subsequent success of eukaryotes, not so much for the advantages of genetic recombination, but because it allowed for complex multicellularity. The selective advantages of complex multicellularity are considered sufficient for it to have arisen immediately following the appearance of sexual reproduction. As such, the most reliable proxy for the first appearance of sex will be the first stratigraphic occurrence of complex multicellularity. Bangiomorpha pubescens is the first occurrence of complex multicellularity in the fossil record. A differentiated basal holdfast structure allowed for positive substrate attachment and thus the selective advantages of vertical orientation; i.e., an early example of ecological tiering. More generally, eukaryotic multicellularity is the innovation that established organismal morphology as a significant factor in the evolutionary process. As complex eukaryotes modified, and created entirely novel, environments, their inherent capacity for reciprocal morphological adaptation, gave rise to the “biological environment” of directional evolution and “progress.” The evolution of sex, as a proximal cause of complex multicellularity, may thus account for the Mesoproterozoic/Neoproterozoic radiation of eukaryotes.
Article
Full-text available
The very late-flowering behavior of Arabidopsis winter-annual ecotypes is conferred mainly by two genes, FRIGIDA (FRI) and FLOWERING LOCUS C (FLC). A MADS-domain gene, AGAMOUS-LIKE 20 (AGL20), was identified as a dominant FRI suppressor in activation tagging mutagenesis. Overexpression of AGL20 suppresses not only the late flowering of plants that have functional FRI and FLC alleles but also the delayed phase transitions during the vegetative stages of plant development. Interestingly, AGL20 expression is positively regulated not only by the redundant vernalization and autonomous pathways of flowering but also by the photoperiod pathway. Our results indicate that AGL20 is an important integrator of three pathways controlling flowering in Arabidopsis. Keywords • Flowering • MADS domain protein • AGL20 • phase transition • activation tagging
Article
Summary MADS-box genes encode transcriptional regulators involved in diverse aspects of plant development. Here we describe the cloning and mRNA spatio-temporal expression patterns of five new MADS-box genes from Arabidopsis: AGL16, AGL18, AGL19, AGL27 and AGL31. These genes will probably become important molecular tools for both evolutionary and functional analyses of vegetative structures. We mapped our data and previous expression patterns onto a new MADS-box phylogeny. These analyses suggest that the evolution of the MADS-box family has involved a rapid and simultaneous functional diversification in vegetative as well as reproductive structures. The hypothetical ancestral genes had broader expression patterns than more derived ones, which have been co-opted for putative specialized functions as suggested by their expression patterns. AGL27 and AGL31, which are closely related to the recently described flowering-time gene FLC (previously AGL25), are expressed in most plant tissues. AGL19 is specifically expressed in the outer layers of the root meristem (lateral root cap and epidermis) and in the central cylinder cells of mature roots. AGL18, which is most similar in sequence to the embryo-expressed AGL15 gene, is expressed in the endosperm and in developing male and female gametophytes, suggesting a role for AGL18 that is distinct from previously characterized MADS-box genes. Finally, AGL16 RNA accumulates in leaf guard cells and trichomes. Our new phylogeny reveals seven new monophyletic clades of MADS-box sequences not specific to flowers, suggesting that complex regulatory networks involving several MADS-box genes, similar to those that control flower development, underlie development of vegetative structures.
Article
Discordant molecular clock estimates of the divergence time between monocots and dicots (angiosperms) have accumulated in recent years. The diverse estimates are more than 100 Myr apart and are derived from nuclear and chloroplast gene sequences. Using two landmark events and nucleotide sequences from 10 mitochondrial genes for which rate homogeneity was formally tested, we have estimated this date to be around 200 Myr. The estimates obtained with the two multiple-gene clocks are highly congruent and in line with previous estimates derived from protein-coding gene sequences from the chloroplast genome. The agreement of this date with recent findings from the fossil record is discussed briefly.
Article
Following introductory comments on the nature of plant fossils (e.g., how age is determined, systematic relationships, reconstruction and nomenclature), chapters trace the evolution of different forms from the Precambrian through to the 'living fossil', Ginkgo. The last 4 chapters provide overviews: discussion of the first coniferophytes; the diversification of conifers and taxads; the origin and early evolution of angiosperms; and major evolutionary events and trends - a summary. Throughout, description (including many line drawings) is linked with functional and evolutionary development. -P.J.Jarvis
Conference Paper
Eukaryotes and archaebacteria form the clade neomura and are sisters, as shown decisively by genes fragmented only in archaebacteria and by many sequence trees. This sisterhood refutes all theories that eukaryotes originated by merging an archaebacterium and an alpha-proteobacterium, which also fail to account for numerous features shared specifically by eukaryotes and actinobacteria. I revise the phagotrophy theory of eukaryote origins by arguing that the essentially autogenous origins of most eukaryotic cell properties (phagotrophy, endomembrane system including peroxisomes, cytoskeleton, nucleus, mitosis and sex) partially overlapped and were synergistic with the symbiogenetic origin of mitochondria from an alpha-proteobacterium. These radical innovations occurred in a derivative of the neomuran common ancestor, which itself had evolved immediately prior to the divergence of eukaryotes and archaebacteria by drastic alterations to its eubacterial ancestor, an actinobacterial posibacterium able to make sterols, by replacing murein peptidoglycan by N-linked glycoproteins and a multitude of other shared neomuran novelties. The conversion of the rigid neomuran wall into a flexible surface coat and the associated origin of phagotrophy were instrumental in the evolution of the endomembrane system, cytoskeleton, nuclear organization and division and sexual life-cycles. Cilia evolved not by symbiogenesis but by autogenous specialization of the cytoskeleton. I argue that the ancestral eukaryote was uniciliate with a single centriole (unikont) and a simple centrosomal cone of microtubules, as in the aerobic amoebozoan zooflagellate Phalansterium. I infer the root of the eukaryote tree at the divergence between opisthokonts (animals, Choanozoa, fungi) with a single posterior cilium and all other eukaryotes, designated 'anterokonts' because of the ancestral presence of an anterior cilium.
Anterokonts comprise the Amoebozoa, which may be ancestrally unikont, and a vast ancestrally biciliate clade, named 'bikonts'. The apparently conflicting rRNA and protein trees can be reconciled with each other and this ultrastructural interpretation if long-branch distortions, some mechanistically explicable, are allowed for. Bikonts comprise two groups: corticoflagellates, with a younger anterior cilium, no centrosomal cone and ancestrally a semi-rigid cell cortex with a microtubular band on either side of the posterior mature centriole; and Rhizaria [a new infrakingdom comprising Cercozoa (now including Ascetosporea classis nov.), Retaria phylum nov., Heliozoa and Apusozoa phylum nov.], having a centrosomal cone or radiating microtubules and two microtubular roots and a soft surface, frequently with reticulopodia. Corticoflagellates comprise photokaryotes (Plantae and chromalveolates, both ancestrally with cortical alveoli) and Excavata (a new protozoan infrakingdom comprising Loukozoa, Discicristata and Archezoa, ancestrally with three microtubular roots). All basal eukaryotic radiations were of mitochondrial aerobes; hydrogenosomes evolved polyphyletically from mitochondria long afterwards, the persistence of their double envelope long after their genomes disappeared being a striking instance of membrane heredity. I discuss the relationship between the 13 protozoan phyla recognized here and revise higher protozoan classification by updating as subkingdoms Lankester's 1878 division of Protozoa into Corticata (Excavata, Alveolata; with prominent cortical microtubules and ancestrally localized cytostome - the Parabasalia probably secondarily internalized the cytoskeleton) and Gymnomyxa [infrakingdoms Sarcomastigota (Choanozoa, Amoebozoa) and Rhizaria; both ancestrally with a non-cortical cytoskeleton of radiating singlet microtubules and a relatively soft cell surface with diffused feeding]. 
As the eukaryote root almost certainly lies within Gymnomyxa, probably among the Sarcomastigota, Corticata are derived. Following the single symbiogenetic origin of chloroplasts in a corticoflagellate host with cortical alveoli, this ancestral plant radiated rapidly into glaucophytes, green plants and red algae. Secondary symbiogeneses subsequently transferred plastids laterally into different hosts, making yet more complex cell chimaeras - probably only thrice: from a red alga to the corticoflagellate ancestor of chromalveolates (Chromista plus Alveolata), from green algae to a secondarily uniciliate cercozoan to form chlorarachneans and independently to a biciliate excavate to yield photosynthetic euglenoids. Tertiary symbiogenesis involving eukaryotic algal symbionts replaced peridinin-containing plastids in two or three dinoflagellate lineages, but yielded no major novel groups. The origin and well-resolved primary bifurcation of eukaryotes probably occurred in the Cryogenian Period, about 850 million years ago, much more recently than suggested by unwarranted backward extrapolations of molecular 'clocks' or dubious interpretations as 'eukaryotic' of earlier large microbial fossils or still more ancient steranes. The origin of chloroplasts and the symbiogenetic incorporation of a red alga into a corticoflagellate to create chromalveolates may both have occurred in a big bang after the Varangerian snowball Earth melted about 580 million years ago, thereby stimulating the ensuing Cambrian explosion of animals and protists in the form of simultaneous, poorly resolved opisthokont and anterokont radiations.
Article
Screening for seed pigmentation phenotypes in Arabidopsis led to the isolation of three allelic yellow-seeded mutants, which defined the novel TRANSPARENT TESTA16 (TT16) locus. Cloning of TT16 was performed by T-DNA tagging and confirmed by genetic complementation and sequencing of two mutant alleles. TT16 encodes the ARABIDOPSIS BSISTER (ABS) MADS domain protein. ABS belongs to the recently identified “B-sister” (BS) clade, which contains genes of unknown function that are expressed mainly in female organs. Phylogenetic analyses using a maximum parsimony approach confirmed that TT16/ABS and related proteins form a monophyletic group. TT16/ABS was expressed mainly in the ovule, as are the other members of the BS clade. TT16/ABS is necessary for BANYULS expression and proanthocyanidin accumulation in the endothelium of the seed coat, with the exception of the chalazal-micropylar area. In addition, mutant phenotype and ectopic expression analyses suggested that TT16/ABS also is involved in the specification of endothelial cells. Nevertheless, TT16/ABS apparently is not required for proper ovule function. We report the functional characterization of a member of the BS MADS box gene subfamily, demonstrating its involvement in endothelial cell specification as well as in the increasingly complex genetic control of flavonoid biosynthesis in the Arabidopsis seed coat.
Article
A MADS box gene, FLF (for FLOWERING LOCUS F), isolated from a late-flowering, T-DNA–tagged Arabidopsis mutant, is a semidominant gene encoding a repressor of flowering. The FLF gene appears to integrate the vernalization-dependent and autonomous flowering pathways because its expression is regulated by genes in both pathways. The level of FLF mRNA is downregulated by vernalization and by a decrease in genomic DNA methylation, which is consistent with our previous suggestion that vernalization acts to induce flowering through changes in gene activity that are mediated through a reduction in DNA methylation. The flf-1 mutant requires a greater than normal amount of an exogenous gibberellin (GA3) to decrease flowering time compared with the wild type or with vernalization-responsive late-flowering mutants, suggesting that the FLF gene product may block the promotion of flowering by GAs. FLF maps to a region on chromosome 5 near the FLOWERING LOCUS C gene, which is a semidominant repressor of flowering in late-flowering ecotypes of Arabidopsis.