Current Biology 17, 1241–1247, July 17, 2007 ©2007 Elsevier Ltd All rights reserved DOI 10.1016/j.cub.2007.06.036
C4 Photosynthesis Evolved in Grasses
via Parallel Adaptive Genetic Changes
Pascal-Antoine Christin,1,* Nicolas Salamin,1
Vincent Savolainen,2 Melvin R. Duvall,3
and Guillaume Besnard1,*
1Department of Ecology and Evolution
University of Lausanne
Royal Botanic Gardens, Kew
TW9 3DS Surrey
3Department of Biological Sciences
Northern Illinois University
DeKalb, Illinois 60115
Phenotypic convergence is a widespread and well-recognized
evolutionary phenomenon. However, the
responsible molecular mechanisms often remain unknown,
mainly because the genes involved are not
vergence is the C4 photosynthetic pathway, which
evolved independently more than 45 times. Here,
we address the question of the molecular bases of the
C4 convergent phenotypes in grasses (Poaceae) by
reconstructing the evolutionary history of genes encoding
a key C4 enzyme, phosphoenolpyruvate
carboxylase (PEPC). PEPC genes belong to a multigene
family encoding distinct isoforms of which only
logenetic analyses, we showed that grass C4 PEPCs
appeared at least eight times independently from the
same non-C4 PEPC. Twenty-one amino acids evolved
under positive selection and converged to similar or
identical amino acids in most of the grass C4 PEPC
lineages. This is the first record of such a high level
of molecular convergent evolution, illustrating the
repeatability of evolution. These amino acids were
responsible for a strong phylogenetic bias grouping
all C4 PEPCs together. The C4-specific amino acids
detected must be essential for C4 PEPC enzymatic
characteristics, and their identification opens new
avenues for the engineering of the C4 pathway in crops.
Results and Discussion
Congruence between Gene and Species Trees
Recovered Only with Nearly Neutral Sites
We assembled a data set of 169 PEPC-encoding genes,
of which 127 were sequenced in this study. In the phylogenetic
tree inferred from PEPC coding sequences
(Figure 1), all grass genes encoding C4 PEPC (ppc-C4),
except that of Centropodia, cluster together, whereas their
closest non-C4 genes (referred to as ppc-B2) form a paraphyletic
group. The same pattern occurred whether
based on amino acid or nucleotide sequences and regardless
of the phylogenetic method used. The species
relationships deduced from ppc-C4 as well as from ppc-B2
were highly incongruent with the species tree inferred
from other markers [3–5]. The obtained gene tree
can be explained only by postulating a very high number
of gene duplications and losses or horizontal gene
transfers, making this topology very unlikely. Because
the C4 trait evolved several times independently in the
grass family [1, 3], a single origin of the C4 PEPC was
A potential source of bias, which could be responsible
for the ppc-C4 grouping found in our analyses, is the
evolutionary forces driving C4 PEPC evolution. Because
of the crucial role played by this PEPC isoform in the C4
photosynthetic pathway, it could have been the
target of strong selective pressures that would have
drastically altered the amino acid sequences. If a high
enough number of identical amino acids appeared independently
in the different C4 lineages, the codon positions
that determine the transition between non-C4 and
C4-characteristic amino acids would tend to group the
ppc-C4 together and thus be misleading in a phylogenetic
context. This hypothesis would then predict that
the trees constructed with sites less affected by selection,
for example the third positions of the codons or
the intron sequences, would have a different topology
reflecting the species relationships. This prediction
was verified with grass PEPC: species relationships de-
were congruent with accepted species trees [3–5]. In
The species tree is thus recovered when nearly neutral
sites are used. This pattern could not be attributed to codon
usage bias between the different lineages because
codon frequencies were approximately constant across
the phylogenetic tree (data not shown).
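The extraction of nearly neutral sites described above can be sketched in a few lines of Python. This is only an illustration, assuming an in-frame coding alignment stored as plain strings; the function name `third_positions` and the toy sequence are hypothetical and are not part of the original analysis.

```python
def third_positions(coding_seq: str) -> str:
    """Return the third nucleotide of every codon of an in-frame sequence."""
    if len(coding_seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    # Slicing from index 2 with a step of 3 keeps codon position 3 only.
    return coding_seq[2::3]


# Hypothetical toy alignment row (alignment gaps kept as '-'), not real PEPC data.
toy_row = "ATGGCCTCA---GGT"
print(third_positions(toy_row))
```

Applied to a coding alignment of 1326 bp, such a function would return 442 characters per sequence, one per codon.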
The bias observed in the topology obtained with all
positions suggests that a proportion of amino acids
essential for the C4 function converged between the
different ppc-C4 lineages, as confirmed by the positive
selection analyses (see below). When the 21 codons under
positive selection were removed from the phylogenetic
analyses, a topology congruent with the species tree
tal Data available online). This result confirmed that the
clustering of ppc-C4 was in large part due to these
few codons under positive selection.
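Removing selected codons before tree inference, as described above, amounts to a simple filtering step. The sketch below assumes 0-based codon indices and an in-frame sequence; `drop_codons` and the toy data are hypothetical names chosen for illustration.

```python
def drop_codons(coding_seq: str, codons_to_drop: set[int]) -> str:
    """Remove the listed codons (0-based codon indices) from an in-frame sequence."""
    if len(coding_seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    codons = [coding_seq[i:i + 3] for i in range(0, len(coding_seq), 3)]
    return "".join(c for i, c in enumerate(codons) if i not in codons_to_drop)


# Hypothetical example: drop the second and third codons of a toy sequence.
print(drop_codons("ATGGCCTCAGGT", {1, 2}))
```

Dropping 21 of 442 codons would leave 421 codons (1263 bp) per sequence.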
A phylogenetic bias resulting from molecular convergence
has already been proposed for other genes in other
organisms [6, 7]. However, these studies were not able
to recover the species tree from the coding sequence.
Thus, the phylogenetic bias they observed could be due
to complex gene evolutionary history (e.g., exon or gene
transfer). Our study is the first to clearly show that phylogenetic
reconstruction methods can be misleading
because of a small proportion of convergent codons. It
also highlights the importance of understanding how
different parts of the data influence phylogenetic recon-
ber of characters and thus the accuracy of the tree con-
structions, third positions and introns can, under certain
of the genes.
C4 PEPC Evolved through Parallel Changes
It was previously shown that an alanine residue conserved
in all the known non-C4 PEPCs changed to a ser-
, representing a strong example of parallel changes.
Other modifications of the coding sequence are expected
because C4 PEPCs present catalytic properties
and sensitivities toward repressors different from non-C4
isoforms [10, 11]. In order to identify sites that underwent
adaptive changes during C4 evolution in grasses,
we performed positive selection tests that use an ω (dN/dS
ratio) greater than 1 as evidence of past positive selection
[12, 13]. Different codon models were optimized
on the topology obtained from third positions and introns and compared
with likelihood ratio tests. The model allowing a proportion
of codons to evolve under positive selection in
branches defined a priori as the foreground branches
(in this case, branches leading to ppc-C4, identified by
an alanine-to-serine transition at position 780) was significantly
better than the null models (A versus M1a,
df = 2, p value < 0.0001; A versus A0, df = 1, p value =
0.066; see Experimental Procedures for further details
on the models). This result shows that ppc-C4 evolved
under adaptive molecular evolution. To take into account
the uncertainty in topology, the three codon
models were run again with 11 alternative topologies
sampled during the Bayesian search. Twenty-one of the 442
codons considered (4.8%) were identified as having
evolved under positive selection in branches leading to
ppc-C4 with a posterior probability greater than 0.95 in
all analyses (Figure 3). When the posterior probability
threshold is raised to 0.99 and 0.999, the number of codons
under positive selection is reduced to 15 (3.4%) and 12
(2.7%), respectively (Figure 3).
These results show that the same positions evolved
under positive selection in the different grass ppc-C4 lineages.
Many of these sites were mutated recurrently to
an identical amino acid (Figure 3). In addition, some of
the amino acids under positive selection identified in
grasses underwent the same transitions between non-
families such as Asteraceae or Amaranthaceae (Fig-
ure 3). In addition to the alanine-to-serine transition at
position 780, amino acids at positions 517, 577, 665,
Figure 1. Maximum Likelihood Tree Containing All Grass and Main Monocot and Dicot Genes Encoding PEPC
This tree was constructed on nucleotide coding sequence with PhyML under a GTR+I+G model. Genes belonging to the different grass gene
lineages and the main dicot clades are compressed. Uncompressed tree is available in Supplemental Data. Support values of 100 bootstrap
replicates are indicated above branches when greater than 50%. The position of Centropodia forskalii gene with a serine at position 780 is
indicated by an asterisk. The logarithm of the likelihood for this tree was −56730.88.
Figure 2. Bayesian Tree Constructed with MrBayes on Third Positions and Introns Combined, Including ppc-B2 and ppc-C4 Genes
Branches leading to ppc-C4, determined by the presence of a serine at position 780, are in bold. Capital letters identify branches used in the
positive selection tests. Bayesian support values greater than 0.5 are indicated for the principal branches. Support values for all branches
are available in Supplemental Data. Subfamilies are indicated on the left of the tree. The three main Panicoideae tribes, Andropogoneae, Pan-
iceae, and Centotheceae, are indicated. Aristi, Aristidoideae; Mi, Micrairoideae; Ar+Da, Arundinoideae + Danthonioideae; Cent, Centotheceae.
In some Chloridoideae, both ppc-B2 and ppc-C4 are present, suggesting an ancestral gene-duplication event. On the right, the most frequent
amino acid of each clade is shown for the 12 sites (positions indicated correspond to Zea mays PEPC; CAA33317) with a posterior probability
lineages. Residues with similar biochemical properties are identically colored. For visual clarity, C3-specific amino acids are brightened.
Convergent C4Photosynthesis Molecular Evolution
and 761 recurrently changed from the same C3 residue
to an identical C4-specific amino acid in grasses and other
families (Figure 3). The evolution of a C4-specific PEPC
proceeded through many parallel changes in a high
number of independent C4 lineages, highlighting the
repeatability of some evolutionary processes.
Phenotypic convergence between distant lineages is
a widespread feature and concerns morphological as
well as physiological traits. The recurrent appearance of
lution has already been demonstrated [14–20]. Some
studies traced the convergence to different modifica-
tions of the same gene [15–17, 19] or to the same muta-
tions taking place independently in different lineages
[14, 18, 20]. However, these cases concerned only a
small proportion of sites in a restricted number of lineages.
Our study reports the first case of such a high level
of molecular convergent evolution in up to eight distinct
lineages. The observed amino acid transitions between
non-C4 and C4 PEPC enzymes are all due to a single nucleotide
change. This increases the probability of these
mutations occurring by chance. The mutations that improve
the encoded enzyme can later be fixed by natural
selection. The presence of a non-C4 PEPC gene (i.e.,
ppc-B2) with a nucleotide sequence allowing the acquisition
of C4-advantageous amino acids through simple
single nucleotide changes likely favored recurrent evolution
of the C4 pathway by allowing a rapid and efficient
The sites under selection show different degrees of
parallelism. For instance, residues at positions 531,
in six to eight grass C4 PEPC lineages (parallel changes
sensu stricto; Figures 2 and 3). In contrast, residues at
acid (Figures 2 and 3), suggesting that the C4 characteristics
are conferred by the absence of the non-C4 amino
acid at these positions rather than the presence of a C4-specific
amino acid. Although the latter does not match
the strict definition of parallel change, it corresponds to
parallel genotypic adaptation because the same locus
(i.e., ppc-B2) evolved independently through similar
ation in mesophyll cells). Unfortunately, the effects of
these different changes are difficult to predict because
the described active sites and regulation targets of the
PEPC [22, 23] are not affected. The alanine-to-serine
transition (position 780, Figure 3) has been shown to
alter the catalytic properties of the encoded enzyme
[9, 11]. The histidine-to-asparagine transition that occurred
at position 665 in C4 grasses as well as in several
C4 dicots (Figure 3) could have an important effect on
protein folding because it creates a putative N-glycosylation
site (positions 665–668) that is absent from
non-C4 PEPCs. Serine at position 761 is part of
Figure 3. Amino Acids Detected as Evolving under Positive Selection in Branches Leading to Genes Encoding C4 PEPC in Grasses
These sites were detected with a posterior probability (PP) greater than 0.999, 0.99, or 0.95. The amino acids are shown for the different C4 and
non-C4 PEPC gene lineages (capital letters identify independent grass ppc-C4 lineages as identified on Figure 2). When one lineage exhibited
different amino acids, the most abundant is written first. For grasses, the number of sequences included in each lineage is indicated (n), as is
the photosynthetic type for nongrasses. Amino acids that differ between ppc-B2 and ppc-C4 are highlighted in blue. Amino acids that underwent
the same changes in non-grass C4 PEPCs are in green.
a predicted casein kinase II phosphorylation site (positions
761–763) that disappears once this serine is
mutated to an alanine, which is the case in C4 PEPCs.
Breaking this phosphorylation site could have helped
the acquisition of the C4-specific regulation pattern of
the PEPC. This amino acid is also part of a putative N-myristylation
site (positions 757–762), which works
with either an alanine or a serine. Thus, the only single-
tion site without altering the myristylation site was precisely
a serine-to-alanine substitution (serine in non-C4
PEPC is encoded by a UCN codon). The effect of the
other mutations is still unpredictable. The use of
3D structure predictions could help evaluate whether
some C4-specific amino acids can putatively alter the
enzyme structure and thus its catalytic properties.
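The motif reasoning above can be illustrated with a short pattern scan. The sketch below is a minimal example, assuming the PROSITE-style patterns N-{P}-[ST]-{P} (N-glycosylation), [ST]-x(2)-[DE] (casein kinase II phosphorylation), and G-{EDRKHPFYW}-x(2)-[STAGCN]-{P} (N-myristoylation), rewritten as regular expressions; these should be checked against the PROSITE database before any real use. The peptides are hypothetical and are not PEPC sequences.

```python
import re

# PROSITE-style patterns translated to regular expressions (assumed forms,
# to be verified against the PROSITE database).
N_GLYCOSYLATION = re.compile(r"N[^P][ST][^P]")                 # N-{P}-[ST]-{P}
CK2_PHOSPHORYLATION = re.compile(r"[ST]..[DE]")                # [ST]-x(2)-[DE]
N_MYRISTOYLATION = re.compile(r"G[^EDRKHPFYW]..[STAGCN][^P]")  # G-{EDRKHPFYW}-x(2)-[STAGCN]-{P}


def has_motif(pattern: re.Pattern, peptide: str) -> bool:
    """Return True if the pattern occurs anywhere in the peptide."""
    return pattern.search(peptide) is not None


# Hypothetical peptides differing by one serine-to-alanine substitution.
with_serine = "GAVLSLAE"
with_alanine = "GAVLALAE"

for name, peptide in (("serine", with_serine), ("alanine", with_alanine)):
    print(name,
          "CK2:", has_motif(CK2_PHOSPHORYLATION, peptide),
          "myristoylation:", has_motif(N_MYRISTOYLATION, peptide))

# Hypothetical histidine-to-asparagine change creating an N-glycosylation motif.
print(has_motif(N_GLYCOSYLATION, "HVTL"), has_motif(N_GLYCOSYLATION, "NVTL"))
```

In this toy case the serine-to-alanine change removes the kinase motif while the myristoylation motif still matches, which mirrors the argument made in the text for position 761.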
Implications for Bioengineering
Twenty-one amino acids were detected, with high probability, to
have undergone positive selection along the branches
in other branches. These changes are thus likely to be
important for the C4 function of the encoded enzyme.
Their recurrent evolution in different lineages strongly
supports their high adaptive significance, a conclusion that is
reinforced by the similar or identical changes occurring
at the same residues in very distant plant families (Figure 3).
Knowledge of these putative C4 determinants
opens promising opportunities for the molecular engineering
of grass C3 crops, such as rice, barley, and
wheat. This is especially relevant for the biotechnological
efforts to incorporate some C4 characteristics in C3
crops [25–27]. Identification of the major C4 determinants
has been performed in Flaveria through expres-
properties. This approach allowed the detection of
the alanine-to-serine transition (position 780 in maize).
However, such a procedure can identify only changes
having a detectable effect on the phenotype. The
changes evidenced in our study certainly have minor independent
effects, but, taken together, would help the
is not feasible in an experimental framework. The use of
phylogenetic inference to detect potential residues
important for the function of an enzyme is thus a feasible
and powerful alternative approach that should be extended
to other important enzymes.
Amplification of PEPC Genes
Samples from 111 grass species were taken, focusing on the PACCAD
clade that contains all C4 grass species [1, 4]. Panicoideae,
which contains several putatively independent C4 lineages, was especially
densely sampled. DNAs (listed in Supplemental Data) were
obtained either from aliquots provided by other teams or extracted
from leaves dried in silica gel via the CTAB method. The photosyn-
merase chain reaction (PCR). The primers were designed to amplify
a segment of ppc-C4 genes as well as the ppc-B1 gene previously detected
in Oryza sativa. Because of the length of the complete
gene (more than 6000 bp in Zea mays, X15239), we focused on a
segment from exon 8 (PEPC-1362-For: 5′-CATCCGGCAGGAGTCGGAGCG-3′)
to exon 10 (PEPC-2701-Rev: 5′-TGTASGCCTGGMACACGTTCAG-3′)
that carries major C4 determinants. The PCR
reaction mixture contained ~100 ng of genomic DNA template,
5 µl of 10X AccuPrime PCR Buffer II, 200 pmol of each dNTP, 20
pmol of each primer, 3 mmol of MgSO4, 2.5 µl (5% vol) of DMSO,
and 1 unit of a proof-reading Taq polymerase (AccuPrime Taq
DNA Polymerase High Fidelity, Invitrogen) in a total volume of 50
µl. The samples were incubated for 2 min at 94°C, followed by 35 cycles
consisting of 30 s at 94°C, 30 s at 57°C (annealing temperature),
and 2 min at 68°C. The last cycle was followed by a 20 min extension
at 68°C. Total PCR products were purified with the QIAquick Gel Extraction
Kit (QIAGEN). To separate the different genes (or alleles) putatively
amplified, purified PCR products were cloned into the
pTZ57R/T vector with the InsT/Aclone PCR Product Cloning Kit (Fermentas)
and PCR amplified with the M13 primers. Between 8 and
20 positive clones were then digested with TaqI restriction enzyme
(Invitrogen). The degree of polymorphism for TaqI digestion products
was high, allowing an unambiguous distinction of the different
ppc gene lineages. For each species, inserts of each clone presenting
a different restriction pattern were sequenced with the M13
primers with the Big Dye 3.1 Terminator cycle sequencing kit (Applied
Biosystems), according to the provider's instructions, and separated
on an ABI Prism 3100 genetic analyzer (Applied Biosystems).
A segment of about 1500 bp, including ~40% of the total coding sequence
and two introns, was sequenced. All sequences have been
deposited in the EMBL database (accession numbers in Supplemental Data).
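The reverse primer given above contains two degenerate IUPAC codes (S and M), so it stands for a small pool of oligonucleotides. A minimal sketch of how such a primer can be expanded is shown below; the function name `expand_primer` is hypothetical, and the lookup table follows the standard IUPAC nucleotide codes.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_primer(primer: str) -> list[str]:
    """List every unambiguous sequence represented by a degenerate primer."""
    return ["".join(bases) for bases in product(*(IUPAC[b] for b in primer))]


PEPC_1362_FOR = "CATCCGGCAGGAGTCGGAGCG"   # no degenerate positions
PEPC_2701_REV = "TGTASGCCTGGMACACGTTCAG"  # S = C/G, M = A/C

for variant in expand_primer(PEPC_2701_REV):
    print(variant)
```

The forward primer expands to itself, whereas the reverse primer corresponds to four distinct sequences.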
DNA Sequence Analyses
For PEPC-gene segments isolated from genomic DNA, exons were
identified by homology with Zea mays, Sorghum bicolor, and Oryza
sativa genes (X15239, X63756, and AK101274, respectively) and,
when possible, according to the GT-AG rule. Coding sequences were
then translated into amino acids and aligned with ClustalW.
Once back-translated into nucleotides, the alignment was checked visually.
Nineteen grass and 23 nongrass ppc genes available on GenBank were
added to the data set (Supplemental Data). A phylogenetic tree was
inferred both by maximum likelihood via PhyML, DNAML,
and PAUP* (NNI branch swapping on 151 trees found during
a first round of tree selection with 1000 random addition sequences
with TBR branch swapping under the parsimony criterion; this was
needed to reduce computational time) and by Bayesian inference
via MrBayes (two runs of 10,000,000 generations with four
chains, burn-in period of 2,000,000) under a GTR model with base
frequencies, gamma shape parameter, and proportion of invariants
estimated from the data (hereafter referred to as GTR+I+G). PhyML
and ProML were further used to compute a phylogenetic
tree based on the amino acid sequences under a JTT substitution
model with a gamma shape parameter. For DNAML and ProML, the
gamma shape parameter was fixed to the value estimated by
PhyML. These analyses allowed the identification of the number of
gene lineages present in grasses and their relationships to each
Further analyses included only the ppc-C4 lineage and its closest
non-C4 ancestor (hereafter named ppc-B2, Figure 1). More distantly
related sequences were omitted to avoid saturation of fast-evolving
nucleotides such as introns and third positions. To distinguish the
phylogenetic information provided by the different parts of the
sequences, two data sets were created. First, all coding positions
were considered for a total of 1326 bp. Second, the third codon po-
extracted and aligned with ClustalW with gap opening and gap extension
penalties set to 15 and 6.6 for the pairwise and multiple
alignments. To avoid subjectivity, intron alignments were checked
visually but not manually edited. Because of their fast evolutionary
rate, introns are of little use for resolving basal nodes but give a strong
signal for inferring the top nodes. Their use in combination with unequivocally
aligned third positions appeared to be the best way to obtain a
supported tree only weakly affected by selective pressures. The
substitution model used for the introns was the HKY model. All coding
positions and third positions of codons were analyzed under
a GTR+I+G model. Best-fit substitution models were determined
with hierarchical likelihood ratio tests (LRT). Both data sets were
to 1000 generations. Prior distributions were left to their default
all coding positions analysis to six for introns and third position analysis
because of convergence problems. Base frequency, which was
the only parameter common to the substitution models of these two
data sets, was optimized separately for each partition (option unlink
To test for the action of positive selection at particular sites of the
nucleotide sequence of the ppc genes along branches leading to
ppc-C4, three different codon models [12, 13] were optimized on
the topology obtained by combining third codon positions and
introns via codeml. The neutral model M1a allows ω (the dN/dS
ratio) to vary among codons. This parameter is constant among
the branches of the tree and its value is allowed to be either 1 (neutral)
or smaller than 1 (purifying selection). The alternative model,
model A, allows ω to vary among both sites and branches. It requires
the specification of two branch types, the background branches in
ground branches under positive selection (ω > 1). The last model A0
1. It is therefore identical to model A except that the ω value in fore-
and background branches. Models were compared with LRT. Test 1
compares model M1a and A and thus tests for the occurrence of
different selective pressures on the foreground branches. Test
2, which compares models A0 and A, is more conservative and
specifically tests the significance of an ω value greater than 1 on
the foreground branches.
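The model comparisons described above rest on the usual likelihood ratio statistic, twice the difference in log-likelihood between the alternative and null models, compared with a chi-square distribution. The sketch below is a minimal illustration restricted to the two degrees of freedom quoted in the text (df = 1 and df = 2), for which the chi-square tail has a closed form; the function name and the log-likelihood values are hypothetical and are not taken from the study.

```python
import math


def lrt_p_value(lnl_null: float, lnl_alt: float, df: int) -> tuple[float, float]:
    """Return (test statistic, p value) for a likelihood ratio test.

    Only df = 1 and df = 2 are handled, using closed-form chi-square tails.
    """
    stat = max(2.0 * (lnl_alt - lnl_null), 0.0)
    if df == 1:
        p = math.erfc(math.sqrt(stat / 2.0))   # tail of chi-square, 1 df
    elif df == 2:
        p = math.exp(-stat / 2.0)              # tail of chi-square, 2 df
    else:
        raise ValueError("this sketch only supports df = 1 or df = 2")
    return stat, p


# Hypothetical log-likelihoods, for illustration only.
stat, p = lrt_p_value(lnl_null=-1003.0, lnl_alt=-1000.0, df=2)
print(f"2*deltaLnL = {stat:.2f}, p = {p:.4f}")
```

For other degrees of freedom, a statistics library providing the chi-square survival function would be the natural replacement for the closed forms used here.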
Models A0 and A require an a priori identification of the foreground
branches. All the branches leading to full C4 PEPC groups (identified
by a serine at position 780, see Figure 2) were used simultaneously
as foreground branches. To ensure that the results were not due to
a bias in the tree used, the same procedure was repeated with topologies
sampled during the Bayesian search. Trees were taken every
500,000 generations between 5,000,000 and 10,000,000 for a total
of 11 additional topologies. With the Bayes Empirical Bayes approach,
only codons with posterior probability of being under positive
selection greater than a given threshold (i.e., 0.95, 0.99, or 0.999)
in all 12 analyses were considered as having evolved under positive
selection during C4 evolution.
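The filtering rule described above (a codon is retained only if it exceeds the threshold in every analysis) can be sketched as a set intersection. The example below is illustrative only: `consistent_sites` is a hypothetical helper, and the posterior probabilities are invented values, not results from the study.

```python
def consistent_sites(analyses: list[dict[int, float]], threshold: float) -> list[int]:
    """Return the sites whose posterior probability exceeds the threshold in every analysis."""
    kept = None
    for posterior in analyses:
        passing = {site for site, pp in posterior.items() if pp > threshold}
        kept = passing if kept is None else kept & passing
    return sorted(kept or [])


# Generations at which topologies were sampled, as described in the text.
sampled_generations = list(range(5_000_000, 10_000_001, 500_000))
print(len(sampled_generations))  # number of additional topologies

# Hypothetical posterior probabilities for two analyses (site -> PP).
analysis_1 = {780: 0.9995, 665: 0.97, 100: 0.50}
analysis_2 = {780: 0.9999, 665: 0.96, 100: 0.99}
print(consistent_sites([analysis_1, analysis_2], 0.95))
print(consistent_sites([analysis_1, analysis_2], 0.999))
```

A site missing from any one analysis is treated as failing, which keeps the rule conservative.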
The most likely ancestral residue at position 780 was determined
with codeml under a F3x4 model of codon substitution. The deduced
amino acid was used to trace the ppc-C4 evolution events
on the phylogenetic trees.
Subsequent to these analyses, sequences of nongrass ppc-C4
and their related non-C4 PEPC genes available in GenBank were
aligned to the grass DNA sequences. The amino acids corresponding
to sites under positive selection in grasses were reported.
and two tables are available at http://www.
This work was funded by Swiss NSF grant 3100AO-105886/1. N.S.
and V.S. were funded by the European Commission (Marie Curie
EST ‘‘HOTSPOTS,’’ contract MEST-CT-2005-020561). We thank
the Swiss Institute of Bioinformatics for access to the Vital-IT clus-
ter. The authors are especially thankful to F. Anthelme, Y. Bouche-
nak-Khelladi, V.R. Clark, M. Gonzalez, T.R. Hodkinson, J. Kissling,
C. Lavergne, A. Persico, T. Renaud, P. Rondeau, S. Sunkkaew,
A. Teerawatanakon, and Y. Wang who provided either DNA aliquots
or grass samples. N. Fumeaux at the herbarium of the botanical
garden of Geneva helped with grass identification. Finally, O. Brönnimann,
L. Büchi, M. Chapuisat, P.B. Pearman, E. Samaritani, and I.
Sanders made useful comments on the earlier versions of the man-
uscript. We thank two anonymous reviewers for useful comments.
Received: May 3, 2007
Revised: June 4, 2007
Accepted: June 12, 2007
Published online: July 5, 2007
1. Sage, R.F. (2004). The evolution of C4 photosynthesis. New
Phytol. 161, 341–370.
2. Lepiniec, L., Vidal, J., Chollet, R., Gadal, P., and Crétin, C. (1994).
Phosphoenolpyruvate carboxylase—structure, regulation and
evolution. Plant Sci. 99, 111–124.
3. Giussani, L., Cota-Sánchez, J.H., Zuloaga, F., and Kellogg, E.A.
(2001). A molecular phylogeny of the grass subfamily Panicoideae
(Poaceae) shows multiple origins of C4 photosynthesis.
Am. J. Bot. 88, 1993–2012.
4. GPWG-Grass Phylogeny Working Group (2001). Phylogeny and
subfamilial classification of the grasses (Poaceae). Ann. Mo.
Bot. Gard. 88, 373–457.
5. Sanchez-Ken, J.G., Clark, L.G., Kellogg, E.A., and Kay, E.E.
(2007). Reinstatement and emendation of subfamily Micrairoi-
deae (Poaceae). Syst. Bot. 32, 71–80.
6. Stewart, C.B., Schilling, J.W., and Wilson, A.C. (1987). Adaptive
evolution in the stomach lysozymes of foregut fermenters.
Nature 330, 401–404.
7. Kriener, K., O’hUigin, C., Tichy, H., and Klein, J. (2000). Conver-
gent evolution of major histocompatibility complex molecules
in humans and New World monkeys. Immunogenetics 51, 169–
8. Savolainen, V., Chase, M.W., Salamin, N., Soltis, D.E., Soltis,
P.E., Lopez, A.J., Fedrigo, O., and Naylor, G.J.P. (2002). Phylog-
eny reconstruction and functional constraints in organellar
genomes: plastid atpB and rbcL sequences versus animal mito-
chondrion. Syst. Biol. 51, 638–647.
9. Bläsing, O.E., Westhoff, P., and Svensson, P. (2000). Evolution of
C4 phosphoenolpyruvate carboxylase in Flaveria, a conserved
serine residue in the carboxyl-terminal part of the enzyme is
a major determinant for C4-specific characteristics. J. Biol.
Chem. 275, 27917–27923.
10. Dong, L.Y., Masuda, T., Kawamura, T., Hata, S., and Izui, K.
(1998). Cloning, expression, and characterization of a root-form
phosphoenolpyruvate carboxylase from Zea mays: comparison
with the C4-form enzyme. Plant Cell Physiol. 39, 865–873.
11. Svensson, P., Bläsing, O.E., and Westhoff, P. (2003). Evolution of
C4 phosphoenolpyruvate carboxylase. Arch. Biochem. Biophys.
12. Yang, Z.H., and Nielsen, R. (2002). Codon-substitution models
for detecting molecular adaptation at individual sites along spe-
cific lineages. Mol. Biol. Evol. 19, 908–917.
13. Zhang, J.Z., Nielsen, R., and Yang, Z.H. (2005). Evaluation of an
improved branch-site likelihood method for detecting positive
selection at the molecular level. Mol. Biol. Evol. 22, 2472–2479.
14. Andreev, D., Kreitman, M., Phillips, T.W., Beeman, R.W., and
Ffrench-Constant, R.H. (1999). Multiple origins of cyclodiene
insecticide resistance in Tribolium castaneum (Coleoptera:
Tenebrionidae). J. Mol. Evol. 48, 615–624.
15. Mundy, N.I., Badcock, N.S., Hart, T., Scribner, K., Janssen, K.,
and Nadeau, N.J. (2004). Conserved genetic basis of a quantita-
tive plumage trait involved in mate choice. Science 303, 1870–
16. Mundy, N.I. (2005). A window on the genetics of evolution: MC1R
and plumage colouration in birds. Proc. R. Soc. Lond. B. Biol.
Sci. 272, 1633–1640.
17. Protas, M.E., Hersey, C., Kochanek, D., Zhou, Y., Wilkens, H.,
Jeffery, W.R., Zon, L.I., Borowsky, R., and Tabin, C.J. (2006).
Genetic analysis of cavefish reveals molecular convergence in
the evolution of albinism. Nat. Genet. 38, 107–111.
18. Yokoyama, R., and Yokoyama, S. (1990). Convergent evolution
of the red- and green-like visual pigment genes in fish, Astyanax
fasciatus, and human. Proc. Natl. Acad. Sci. USA 87, 9315–9318.
19. Zakon, H.H., Lu, Y., Zwickl, D.J., and Hillis, D.M. (2006). Sodium
channel genes and the evolution of diversity in communication
signals of electric fishes: convergent molecular evolution.
Proc. Natl. Acad. Sci. USA 103, 3675–3680.
20. Zhang, J.Z. (2006). Parallel adaptive origins of digestive RNases
in Asian and African leaf monkeys. Nat. Genet. 38, 819–823.
21. Wood, T.E., Burke, J.M., and Rieseberg, L.H. (2005). Parallel ge-
notypic adaptation: when evolution repeats itself. Genetica 123,
22. Kai, Y., Matsumura, H., Inoue, T., Terada, K., Nagara, Y., Yoshi-
naga, T., Kihara, A., Tsumura, K., and Izui, K. (1999). Three-
dimensional structure of phosphoenolpyruvate carboxylase: a
proposed mechanism for allosteric inhibition. Proc. Natl. Acad.
Sci. USA 96, 823–828.
carboxylase: three-dimensional structure and molecular mech-
anisms. Arch. Biochem. Biophys. 414, 170–179.
24. Hulo, N., Bairoch, A., Bulliard, V., Cerutti, L., De Castro, E., Lan-
gendijk-Genevaux, P.S., Pagni, M., and Sigrist, C.J.A. (2006).
The PROSITE database. Nucleic Acids Res. 34, D227–D230.
25. Matsuoka, M., Furbank, R.T., Fukayama, H., and Miyao, M.
(2001). Molecular engineering of C4 photosynthesis. Annu.
Rev. Plant Biol. 52, 297–314.
C4 photosynthetic enzymes. J. Exp. Bot. 54, 179–189.
27. Raines, C.A. (2006). Transgenic approaches to manipulate the
environmental responses of the C3 carbon fixation cycle. Plant
Cell Environ. 29, 331–339.
28. Christin, P.A., Salamin, N., Savolainen, V., and Besnard, G.
(2007). A phylogenetic study of the phosphoenolpyruvate
carboxylase multigene family in Poaceae: understanding the
molecular changes linked to C4 photosynthesis evolution. Kew
Bull. 62, in press.
29. Thompson, J.D., Higgins, D.G., and Gibson, T.J. (1994). Clus-
talW: improving the sensitivity of progressive multiple sequence
alignment through sequence weighting, position specific gap
penalties and matrix choice. Nucleic Acids Res. 22, 4673–4680.
30. Guindon, S., and Gascuel, O. (2003). A simple, fast, and accurate
algorithm to estimate large phylogenies by maximum likelihood.
Syst. Biol. 52, 696–704.
31. Felsenstein, J. (2005). PHYLIP (Phylogeny Inference Package)
version 3.6 (Seattle, WA: Department of Genome Sciences,
University of Washington).
32. Swofford, D.L. (2002). PAUP*: Phylogenetic Analysis Using
Parsimony (* and other methods), version 4.0b8 (Sunderland,
MA: Sinauer Associates).
33. Ronquist, F., and Huelsenbeck, J.P. (2003). MrBayes 3: Bayes-
ian phylogenetic inference under mixed models. Bioinformatics
34. Yang, Z.H. (1997). PAML: a program package for phylogenetic
analysis by maximum likelihood. Comput. Appl. Biosci. 13,
35. Yang, Z.H., Wong, W.S.W., and Nielsen, R. (2005). Bayes empir-
ical Bayes inference of amino acids sites under positive selec-
tion. Mol. Biol. Evol. 22, 1107–1118.
The accession numbers assigned to the sequences we submitted
to GenBank are from AM689877 to AM689901 and from AM690209