Genome-wide Non-additive Gene Regulation in Arabidopsis Allotetraploids
Jianlin Wang*, Lu Tian*, Hyeon-Se Lee*, Ning E. Wei*†, Hongmei Jiang‡, Brian Watson§,
Andreas Madlung§∏, Thomas C. Osborn¶, R.W. Doerge‡, Luca Comai§, and Z. Jeffrey
*Genetics Program and Department of Soil and Crop Sciences, Texas A&M University, College
Station, TX 77843-2474, USA
†Department of Computer Science, Texas A&M University, College Station, TX 77843, USA
‡Department of Statistics, Purdue University, West Lafayette, IN 47906, USA
§Department of Biology, University of Washington, Seattle, WA 98195-5325, USA
¶ Department of Agronomy, University of Wisconsin, Madison, WI 53706, USA
∏ Present Address: Department of Biology, University of Puget Sound, Tacoma, WA 98416
** Present address and author for correspondence:
Institute for Cellular and Molecular Biology
University of Texas at Austin
1 University Station, A-4800
Austin, TX 78714-0159
Genetics: Published Articles Ahead of Print, published on September 19, 2005 as 10.1534/genetics.105.047894
Polyploidy has occurred throughout the evolutionary history of all eukaryotes and is
extremely common in plants. Reunification of the evolutionarily-divergent genomes in
allopolyploids creates regulatory incompatibilities that must be reconciled. Here we report
genome-wide gene expression analysis of Arabidopsis synthetic allotetraploids using spotted
70-mer oligo-gene microarrays. We detected >15% transcriptome divergence between the
progenitors, and 2,105 and 1,818 genes were highly expressed in Arabidopsis thaliana and
A. arenosa, respectively. Approximately 5.2% (1,362) and 5.6% (1,469) genes displayed
expression divergence from the mid-parent value in two independently-derived synthetic
allotetraploids, suggesting non-additive gene regulation following interspecific
hybridization. Remarkably, the majority of non-additively expressed genes in the
allotetraploids also display expression changes between the parents, indicating that
transcriptome divergence is reconciled during allopolyploid formation. Moreover, >65% of
the non-additively expressed genes in the allotetraploids are repressed, and >94% of the
repressed genes in the allotetraploids match the genes that are expressed at higher levels in
A. thaliana than in A. arenosa, consistent with the silencing of A. thaliana rRNA genes subjected to
nucleolar dominance and overall suppression of the A. thaliana phenotype in the synthetic
allotetraploids and natural A. suecica. The non-additive gene regulation is involved in
various biological pathways, and the changes in gene expression are developmentally
regulated. In contrast to the small effects of genome doubling on gene regulation in
autotetraploids, the combination of two divergent genomes in allotetraploids by
interspecific hybridization induces genome-wide non-additive gene regulation, providing a
molecular basis for de novo variation and allopolyploid evolution.
Whole genome duplication may occur via autopolyploidization by multiplying a single genome
or allopolyploidization by combining two or more divergent genomes (GRANT 1971; STEBBINS
1971). The common occurrence of allopolyploidy in many plant (MASTERSON 1994; STEBBINS
1971) and some animal (BECAK and KOBASHI 2004) species in nature suggests an evolutionary
advantage of allopolyploids over their progenitors, and implicates allopolyploidy as a rapid
speciation process (SOLTIS and SOLTIS 2000; WENDEL 2000). The combination of homoeologous
chromosomes from divergent species not only promotes functional divergence of duplicate genes
(ADAMS et al. 2003; BLANC and WOLFE 2004), but also generates heterozygosity and novel
interactions leading to genetic and phenotypic variability and heterosis (OSBORN et al. 2003;
RAMSEY and SCHEMSKE 1998; SOLTIS and SOLTIS 2000; WENDEL 2000) that are stably
maintained in the disomic allopolyploids. The data document rapid changes, such as de novo
phenotypic variation, transposon activation, nucleolar dominance, gene loss and silencing, and
subfunctionalization (ADAMS et al. 2003; ADAMS et al. 2004; CHEN and PIKAARD 1997a; COMAI
et al. 2000; HE et al. 2003; KASHKUSH et al. 2002; KASHKUSH et al. 2003; OSBORN et al. 2003;
OZKAN et al. 2001; PIKAARD 1999; SONG et al. 1995; WANG et al. 2004; WENDEL 2000) in
allopolyploids, which are caused by mechanisms involving dosage compensation, regulatory
incompatibility, genetic alteration, and epigenetic modifications (OSBORN et al. 2003). Evidently,
polyploidy is a prominent and pervasive force in plant evolution (SOLTIS and SOLTIS 2000;
WENDEL 2000), in contrast to the notion that polyploidy has contributed little to progressive
evolution (STEBBINS 1971).
Despite the general importance of, and increased interest in, understanding the mechanisms
and evolution of polyploidy (OSBORN et al. 2003; SOLTIS and SOLTIS 2000; WENDEL 2000;
WOLFE 2001), little is known about genome-wide effects on the expression of progenitors’ genes
between the diverged genomes in nascent allopolyploids. We have produced Arabidopsis
allotetraploids using interspecific hybridization between two tetraploid species, A. thaliana (Ler)
and A. arenosa (CHEN et al. 2004; COMAI et al. 2000; WANG et al. 2004) and tested the
consequences of interspecific hybridization on gene expression during early stages of
allotetraploid formation. We report the first comprehensive analysis of transcriptome divergence
between the progenitors and their allotetraploid lineages. Approximately 3,900 genes (~15%)
were differentially expressed between A. thaliana and A. arenosa. The majority of non-
additively expressed genes in the synthetic allotetraploids displayed expression divergence
between the parents and were involved in various biological pathways, which may provide a
molecular basis of de novo variation for the selection and adaptation of new allopolyploid
Materials and Methods
Plant materials and RNA samples
Plant materials included A. thaliana isogenic autotetraploids (At4, accession#CS3900)
and diploid (At2, Ler), A. arenosa (Aa, accession#CS3901), and synthetic allotetraploid lines
(Allo733 and 738) (accession#CS3895-96). These plant materials were generated as previously
described (COMAI et al. 2000; WANG et al. 2004). All plants were grown in a growth chamber at
22°C under 16 hours of light per day at the University of Washington with two biological
replications. Leaves were collected from 20 plants in each biological replication prior to bolting
(with 7-8 rosette leaves) in each line to minimize developmental variation among species and
bulked for DNA and RNA analyses (MADLUNG et al. 2002; WANG et al. 2004). Another 20
plants were grown until flowering, and flower buds were harvested when the first flower
bloomed in individual plants.
Total RNA was isolated using Trizol reagent (Invitrogen) according to the
manufacturer’s recommendations. Each RNA sample was quantified by measuring 260/280
ratios using a UV-spectrometer (GeneQuant pro, Amersham Biosciences) and by agarose-
formaldehyde gel electrophoresis. Total RNA was subjected to mRNA isolation using Micro-
FastTrack 2.0 mRNA isolation kit (Invitrogen). An equal amount of mRNA from A. thaliana and
A. arenosa was mixed as mid-parent value to detect non-additive gene expression in the
Fluorescence in situ hybridization (FISH)
FISH in anther meiocytes was performed using A. thaliana- or A. arenosa-specific 180-
bp centromeric repeats as probes (COMAI et al. 2003). The chromosomal images in meiotic cells
were analyzed using a Zeiss Axiovert microscope.
Analysis of spotted oligo-gene microarrays
Spotted Arabidopsis 70-mer oligo-gene microarrays using 26,090 annotated genes were
cooperatively developed with Qiagen/Operon (LEE et al. 2004; TIAN et al. 2005; WANG et al.
2005). The 70-mer oligo was designed from the 3’-end of each annotated gene. Every feature
was spotted once on each slide. We used 500 ng of mRNA in each labeling reaction using Cy3-
or Cy5-dCTP (Amersham Biosciences). The Cy3-dCTP reaction was mixed with the Cy5-dCTP
reaction for one hybridization, and equal amounts of the same RNA samples were labeled with the
dyes reversed for another hybridization (supporting data Fig. 1). Thus, two “identical” samples, each
containing equal amounts of Cy3- and Cy5-labeled cDNAs, were hybridized to two slides,
which constitutes one dye-swap. The dye-swap experiment was replicated using an
independently isolated RNA sample. Each experiment contains four dye-swaps (8 slide
hybridizations) or two dye-swaps per biological replication (supporting data Tables 1-2) (CHEN
et al. 2004; TIAN et al. 2005).
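The reasoning behind the dye-swap design described above can be illustrated with a minimal sketch. It assumes a purely multiplicative dye bias; the function name and the intensity values are hypothetical and are not taken from the study.

```python
import math


def dye_swap_log_ratio(cy5_a, cy3_b, cy5_b, cy3_a):
    """Average log2(A/B) over the two slides of one dye-swap.

    Slide 1: sample A labeled with Cy5, sample B labeled with Cy3.
    Slide 2: labels reversed (B with Cy5, A with Cy3).
    """
    m1 = math.log2(cy5_a / cy3_b)  # slide 1: log2(A/B) plus dye bias
    m2 = math.log2(cy5_b / cy3_a)  # slide 2: log2(B/A) plus dye bias
    return (m1 - m2) / 2.0         # the dye bias cancels in the difference


# Hypothetical intensities: the true A/B ratio is 2, and Cy5 reads 1.5x too bright.
true_a, true_b, cy5_gain = 2000.0, 1000.0, 1.5
estimate = dye_swap_log_ratio(
    cy5_a=true_a * cy5_gain, cy3_b=true_b,
    cy5_b=true_b * cy5_gain, cy3_a=true_a,
)
print(round(estimate, 6))
```

Under this assumption the averaged value recovers log2(2) = 1 even though each slide on its own is biased by the dye.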
A total of 48 slide hybridizations were performed for 6 experimental comparisons to
determine changes in expression between the progenitors (At4 and Aa), Allo733 and mid-parent
value (MPV, leaves), Allo738 and MPV (leaves), Allo733 and MPV (flower buds), Allo738 and
MPV (flower buds), and At2 and At4 (leaves) (supporting data Table 2). Probe labeling, slide
hybridization and washing were performed as previously described (TIAN et al. 2005). Raw data
were collected using Genepix Pro4.1 after the slides were scanned using Genepix 4000B. The
data were processed using a lowess function to remove non-linear components and analyzed
using a linear model (LEE et al. 2004). This linear model was employed to partition variation in
the observed data relative to technical and biological variation. Given that each feature is
represented once on an array, the linear model is
$$Y_{ijklm} = \mu + A_i + D_j + T_k + G_l + AG_{il} + DG_{jl} + TG_{kl} + TDG_{jkl} + \varepsilon_{ijklm}$$
where $\mu$ represents the overall mean effect; A, D, T, and G represent main fixed effects
from the array, dye, treatment (e.g., RNA from two species), and gene, respectively; and i =
1, ..., 8, j = 1, 2, k = 1, 2, and l = 1, ..., 26,090. The interaction terms AG, DG, TG, and TDG
represent array by gene, dye by gene, treatment by gene, and treatment by dye by gene
interactions, and $\varepsilon_{ijklm}$ denotes the random error and is used to test for significance of main and
interaction effects in the model. Due to confounding and/or aliasing issues involving the array,
dye, and treatment terms, not all two-way interactions are included in the model. The model
residuals are assumed to be normally distributed with a common variance (i.e., $\varepsilon_{ijklm}$ i.i.d. $N(0, \sigma^2)$),
unless evidence of variance non-constancy is observed. In such a case, a per-gene variance is
assumed (i.e., $\varepsilon_{ijklm}$ independent $N(0, \sigma_l^2)$).
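As a rough illustration of how the model separates the treatment signal from array and dye effects, the sketch below simulates log2 intensities for a single gene across eight arrays arranged as dye-swaps and recovers the treatment contrast from within-array differences. It is a simplified sketch and not the analysis code used in the study: the function names, effect sizes, and the omission of the three-way TDG term are all assumptions made for illustration.

```python
import random


def simulate_gene(n_arrays=8, mu=10.0, gene=1.5, treat_gene=(0.6, -0.6),
                  dye_gene=(0.2, -0.2), sigma=0.0, seed=0):
    """Simulate log2 intensities for one gene (hypothetical effect sizes).

    Even-numbered arrays put treatment 0 on dye 0; odd-numbered arrays swap
    the dyes. Returns a list of (array, dye, treatment, y) records.
    """
    rng = random.Random(seed)
    records = []
    for i in range(n_arrays):
        array_gene = rng.gauss(0.0, 0.5)  # A_i + AG_il, shared by both channels
        for dye in (0, 1):
            treatment = dye if i % 2 == 0 else 1 - dye
            y = (mu + gene + array_gene + dye_gene[dye]
                 + treat_gene[treatment] + rng.gauss(0.0, sigma))
            records.append((i, dye, treatment, y))
    return records


def treatment_contrast(records):
    """Estimate (T_0 + TG_0g) - (T_1 + TG_1g) from within-array differences."""
    by_array = {}
    for i, _dye, treatment, y in records:
        by_array.setdefault(i, {})[treatment] = y
    diffs = [channels[0] - channels[1] for channels in by_array.values()]
    return sum(diffs) / len(diffs)


# Noise-free run: the estimate should be close to 0.6 - (-0.6) = 1.2.
estimate = treatment_contrast(simulate_gene(sigma=0.0))
print(round(estimate, 6))
```

Taking differences within an array removes the overall mean, gene, and array terms, and averaging over arrays with the dyes swapped removes the dye-by-gene term, leaving only the treatment contrast tested in the hypotheses.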
We tested differential expression using significant differences in T + TG terms for a
particular gene (BLACK 2002) because we are interested in changes in expression beyond the
average treatment effect. The following hypotheses test whether a gene, g, has undergone
differential expression between treatments t and t' (e.g., A. thaliana and A. arenosa):
$$H_0: T_t + TG_{tg} = T_{t'} + TG_{t'g}$$
$$H_1: T_t + TG_{tg} \neq T_{t'} + TG_{t'g}$$
A standard t-test statistic was used for this comparison, based on the normality assumption
for the residuals. To control for multiple testing errors, the false discovery rate (FDR) of
Benjamini and Hochberg (BENJAMINI and HOCHBERG 1995) was employed, as it provides weak
control of the family-wise error rate (FWER) and controls the FDR below level α. The FDR is
defined as the expected proportion of incorrect rejections of H0, relative to the total number of
rejections. The significance level α = 0.05 was chosen for these investigations. All analyses of
variance models were fit using standard statistical packages (SAS, R and Matlab) (IHAKA and
GENTLEMAN 1996; MOSER et al. 1988).
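The Benjamini and Hochberg step-up procedure referred to above can be sketched in a few lines. This is a generic illustration, not the SAS, R, or Matlab code used for the analyses; the function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans: True where H0 is rejected at FDR level alpha."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= (k / m) * alpha.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            cutoff_rank = rank
    # Reject every hypothesis ranked at or below the cutoff (step-up rule).
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            rejected[idx] = True
    return rejected


# Hypothetical p-values for eight genes.
p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.074, 0.205]
decisions = benjamini_hochberg(p, alpha=0.05)
print(decisions)
```

Because the rule is step-up, a gene whose p-value misses its own rank threshold is still rejected if a higher-ranked p-value meets its threshold.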
As mentioned previously, we tested for significant changes in gene expression between the
two treatments under both a common variance for all genes and per-gene variances. The genes that
were differentially expressed at a statistically significant level (FDR, α = 0.05) using a common
variance had large fold changes, some of which also had high standard deviations, whereas the
genes that were significantly differentially expressed using a per-gene variance included those with
small fold changes, which may be difficult to verify. We used a conservative approach and selected
the genes that were significantly differentially expressed under both statistical tests.
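The effect of the two variance assumptions, and of keeping only genes that pass both tests, can be illustrated with a toy example. The sketch below is an assumption-laden simplification: the gene names and log-ratios are invented, the common variance is taken as the plain average of per-gene variances, and a fixed cutoff `t_crit` stands in for the FDR-based threshold.

```python
import math
import statistics


def select_genes(log_ratios, t_crit=4.0):
    """Classify genes under common-variance and per-gene-variance tests.

    log_ratios maps a gene name to its replicate log2 ratios. Returns three
    sets: genes passing with a common variance, genes passing with a per-gene
    variance, and genes passing both. t_crit is a hypothetical fixed cutoff.
    """
    per_gene_var = {g: statistics.variance(v) for g, v in log_ratios.items()}
    common_var = statistics.mean(list(per_gene_var.values()))  # equal replicate counts assumed
    common, per_gene = set(), set()
    for g, values in log_ratios.items():
        n, mean = len(values), statistics.mean(values)
        if abs(mean) / math.sqrt(common_var / n) > t_crit:
            common.add(g)
        if abs(mean) / math.sqrt(per_gene_var[g] / n) > t_crit:
            per_gene.add(g)
    return common, per_gene, common & per_gene


# Hypothetical replicate log2 ratios for four genes.
data = {
    "big_noisy": [2.0, 3.0, 1.0, 4.0],      # large change, high variance
    "small_tight": [0.30, 0.31, 0.29, 0.30],  # small change, low variance
    "big_tight": [2.0, 2.1, 1.9, 2.0],      # large change, low variance
    "null": [0.1, -0.1, 0.05, -0.05],       # no change
}
common, per_gene, both = select_genes(data)
print(sorted(common), sorted(per_gene), sorted(both))
```

In this toy data the common-variance test admits the large but noisy change, the per-gene test admits the small but tightly replicated change, and only the large, tightly replicated change survives the intersection.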
Functional categories of up- and down-regulated genes were classified using PENDANT
(http://mips.gsf.de/proj/thal/db/index.html) and compared using Venn diagrams. The non-
additively expressed genes (identities) were mapped to oligonucleotide sequences using Perl
scripts. The oligos were mapped to genomic coordinates using high (red) and low (blue)
gradients corresponding to gene densities. Vertical lines above and below the chromosomes
showed up- and down-regulation, respectively, and the length was proportional to the logarithm-
fold changes in differential gene expression.
RT-PCR, qRT-PCR, and SSCP analyses
Approximately 5 µg of total RNA was treated with DNase I, and first-strand cDNA was
synthesized using reverse transcriptase (RT) Superscript II (Invitrogen) according to the
manufacturer’s recommendations. An aliquot (1/100) of cDNA was used as template in
quantitative (or real-time) RT-PCR (qRT-PCR), single strand conformation polymorphism
(SSCP) (ADAMS et al. 2003), and cleaved amplified polymorphism sequence (CAPS) analyses
(WANG et al. 2004). qRT-PCR was performed in an ABI7500 machine (ABI Biosystems) using
the primers (supporting data Table 5) and the SYBR Green dye method as previously described
(LEE et al. 2004), except that ACT2 was used as a control to estimate the relative expression
levels of the genes tested. The expression levels were converted to log-ratios (supporting data
Table 4) in comparison with the microarray data. For SSCP and CAPS analyses, the primers
were from A. thaliana loci and used to amplify both A. thaliana and A. arenosa loci. The PCR
reactions were performed using one cycle of 94°C for 2 min followed by 25-30 cycles of
amplification at 94°C for 30 sec, 53°C for 30 sec, and 72°C for 90 sec. The amplified products
were digested by a restriction enzyme and subjected to agarose gel electrophoresis (CAPS
analysis) or denatured in a loading buffer and resolved in a 0.5X mutation detection
enhancement (MDE) gel (SSCP analysis). The images were captured, and band-intensities
quantified using a Fujifilm Phosphorimager.
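One common way to convert qRT-PCR measurements into log-ratios relative to a control gene such as ACT2 is the comparative threshold-cycle calculation sketched below. The text does not state which calculation was used, so this is only an assumed illustration; it also assumes perfect doubling per cycle, and the Ct values are hypothetical.

```python
def log2_ratio_from_ct(ct_target_a, ct_ref_a, ct_target_b, ct_ref_b):
    """Return log2(expression in sample A / expression in sample B).

    Each target Ct is first normalized to the reference gene (e.g., ACT2)
    measured in the same sample. Assumes the product doubles every cycle.
    """
    delta_a = ct_target_a - ct_ref_a  # normalized Ct in sample A
    delta_b = ct_target_b - ct_ref_b  # normalized Ct in sample B
    return -(delta_a - delta_b)       # a lower Ct means more transcript


# Hypothetical Ct values: allotetraploid (A) versus mid-parent mixture (B).
ratio = log2_ratio_from_ct(ct_target_a=26.0, ct_ref_a=18.0,
                           ct_target_b=24.5, ct_ref_b=18.0)
print(ratio)
```

A negative value indicates lower expression in the allotetraploid than in the mid-parent mixture, which is the same sign convention as a log-ratio from the microarray comparison.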
Genetically stable allotetraploids resembled the A. arenosa parent
Allotetraploids can be formed through combination of unreduced gametes or interspecific
hybridization between diploid species followed by chromosome doubling (GRANT 1971;
STEBBINS 1971). To study early events of gene regulation in synthetic allopolyploids, we created
independent allotetraploid lineages (Allo733 and Allo738) by pollinating Arabidopsis thaliana
autotetraploid (At4) with A. arenosa tetraploid (Aa) (COMAI et al. 2000; WANG et al. 2004) (Fig.
1), which appear to have the same number of genes at the same ploidy levels (COMAI et al.
2000). Heterozygosity in the allotetraploid progeny was minimized by self-pollination for five
generations. The chromosome numbers and parental origins were verified in the first and fourth
generations using fluorescence in situ hybridization (FISH) (COMAI et al. 2003) and informative
microsatellite markers (data not shown). Without exception, each line possessed 5 pairs of
chromosomes from A. thaliana and 8 pairs from A. arenosa (Fig. 1) (COMAI et al. 2003; COMAI
et al. 2000). The morphology of these plants varied between lineages, coincident with rapid
genetic and epigenetic changes observed in new allopolyploids (COMAI et al. 2000; KASHKUSH
et al. 2002; MADLUNG et al. 2002; WANG et al. 2004). Many allotetraploid lineages resembled
the A. arenosa parent and A. suecica, a natural allotetraploid (COMAI et al. 2000; MADLUNG et
al. 2002; PIKAARD 1999). These morphological characteristics include long leaves, tall stature,
many branches, deeply serrated rosette leaves, and large rosettes and flowers. The data indicate
that A. arenosa appears to be morphologically dominant over A. thaliana in the allotetraploids.
The flower colors varied from pink (like A. arenosa) in the early generation (S1) to a mixture of
pink and white flowers in the intermediate generations (S2-4) and white colors in the late
generation (S5), suggesting the notion of stochastic and rapid changes in gene expression
(COMAI et al. 2000; KASHKUSH et al. 2002; WANG et al. 2004).
Transcriptome divergence between the progenitors
To determine the molecular basis of phenotypic differences, we analyzed transcriptome
changes in the progenitors using spotted oligo-gene microarrays designed from A. thaliana
annotated genes (TIAN et al. 2005). Microarray data from four dye-swap experiments (i.e., two
dye-swaps per biological replication) (supporting data Fig. 1) were analyzed using a linear model
(LEE et al. 2004) and the results were adjusted for multiple comparisons (TIAN et al. 2005).
Unless otherwise noted, we selected the differentially expressed genes that were statistically
significant under both common and per-gene variances (Fig. 2, Table 1).
We characterized transcriptome differences between A. thaliana and A. arenosa that
diverged ~5.8 Mya. We found that 3,923 (~15%) genes were differentially expressed between
the progenitors, of which 2,105 (~8%) and 1,818 (~7%) were expressed at significantly higher
levels in A. thaliana and A. arenosa, respectively (Fig. 3A, Table 1). The differentially expressed
genes represented as much as ~43% of the transcriptome using a per-gene variance analysis
(Table 1), indicating a wide range of gene expression differences between the two species, which is
reminiscent of the >50% transcriptome changes in Drosophila species that diverged ~2.5 Mya
(RANZ et al. 2003). Among the 11,199 differentially expressed genes, 5,232 (47%) were
expressed at higher levels in A. thaliana than in A. arenosa, whereas 5,967 (53%) were
expressed at higher levels in A. arenosa than in A. thaliana. In a separate study using Affymetrix chips,
Schmid et al. detected several hundred genes whose expression differed more than two-fold
between A. thaliana Col and Ler ecotypes (SCHMID et al. 2003). Although the two arrays employ
different analytical tools, it appears that the gene expression differences detected between
species are much greater than between ecotypes.
Genome-wide non-additive gene regulation in the allotetraploids
To determine how transcriptome divergence contributes to genetic and morphological
variation in allotetraploids, we compared mRNA abundance in an allotetraploid with the mid-
parent value (MPV) (an equal mixture of RNAs from two parents) (supporting data Fig. 1).
Violating the null hypothesis for no gene expression difference between the allotetraploid and
mid-parent value suggests that a gene(s) is non-additively expressed; however, we cannot detect
the situation where silencing of a locus is compensated by increased expression of its
homoeologous locus. Thus, microarray analysis may underestimate the number of genes that are
differentially expressed between an allotetraploid line and the parents. We discovered that 1,362
(~5.2%) and 1,469 (~5.6%) genes were expressed non-additively in Allo733 and Allo738,
respectively (Table 1). When a per-gene variance was used, the non-additively expressed genes
accounted for ~32% (8,377, Allo733) and ~38% (9,875, Allo738) of the transcriptome. The data
suggest that orthologous genes in allopolyploids are frequently expressed in a non-additive
If the regulatory changes inherited from the parents determine species divergence, the
genes displaying species-specific expression patterns may be modulated in the allotetraploids.
Indeed, among the 2,011 genes that were non-additively regulated in two allotetraploids, 1,377
(~68%) genes were included in those that were differentially expressed between the parents (Fig.
3B), which is significantly different from a random distribution of non-additively expressed
genes (~15%, χ² = 1180.5, p ≤ 0.00001). Among them, 820 (~41%) genes were common to both
allotetraploids (Allo733 and Allo738), whereas 649 (~32%) and 542 (~27%) genes were unique
to Allo733 and Allo738, respectively, indicating general and specific effects of allopolyploid
formation on gene regulation in the independently-derived allotetraploids. The 820 non-
additively expressed genes in both allotetraploids (Allos) were randomly distributed across the
genome and displayed no obvious chromosomal regions susceptible to allopolyploidy-dependent
gene regulation (Fig. 3C).
Progenitor-biased gene repression in the allotetraploids
We analyzed direction of change and parental origin of non-additively expressed genes.
Among them, 1,038 (~76%) and 952 (~65%) genes were down-regulated in Allo733 and
Allo738, respectively (Fig. 4A), suggesting that repression is a mode of non-additive gene
regulation in synthetic allotetraploids. We divided the repressed genes into three categories based
on their expression patterns in the parents. First, 838 (~99%) and 611 (~94%) genes that showed
higher levels of expression in A. thaliana than in A. arenosa were repressed in Allo733 and
Allo738, respectively (Fig. 4B), which coincides with the silencing of A. thaliana but not A.
arenosa rRNA genes (CHEN et al. 1998; PIKAARD 1999) and the overall suppression of A.
thaliana phenotype in new allotetraploids and natural A. suecica. Second, 90 (~35%) and 159
(~50%) genes that were expressed at higher levels in A. arenosa than in A. thaliana were
down-regulated in Allo733 and Allo738, respectively (Fig. 4C). Third, 110 (~42%) and 182 (~36%) genes
that were equally expressed in A. thaliana autotetraploid and A. arenosa were repressed in
Allo733 and Allo738, respectively (Fig. 4D). There was no bias towards gene repression in the
last two categories. The data demonstrate that the genes more highly expressed in A. thaliana
autotetraploids than in A. arenosa are subject to orchestrated repression in the synthetic
Non-additive gene regulation in various biological pathways
According to 15 functional classifications of 820 non-additively expressed genes detected
in both allotetraploids (Fig. 5A), the percentages of genes in the hormonal regulation and cell
defense and aging were 150-175% of those in the same categories classified using all annotated
genes in Arabidopsis (Fig. 5B), suggesting that these genes are particularly susceptible to
expression changes in response to the perturbation resulting from inter-genomic interactions in
the allotetraploids. Many genes involved in the ethylene biosynthesis pathway were repressed in
one or two allotetraploids (Fig. 6A and supporting data Table 3), which may induce expression
changes in ethylene-responsive genes involved in a wide range of developmental processes and
fitness responses, including seed germination, leaf and flower senescence, fruit ripening,
programmed cell death, and biotic and abiotic stress responses (GUO and ECKER 2004). Of the 97
HSPs in Arabidopsis (ARABIDOPSIS GENOME INITIATIVE 2000), 33 displayed expression
differences from the mid-parent value (Fig. 6B). Thirty-one HSPs that were highly expressed in
A. thaliana were repressed, which may reflect “buffering” effects (QUEITSCH et al. 2002) on
pathway redundancy. Notably, fewer than expected transposons altered expression in the
allotetraploids (Fig. 5B), although some might be included in the unclassified category. This
appears to be inconsistent with B. McClintock’s notion of “genomic shock” (MCCLINTOCK
1984) but we note that these allopolyploid lineages represent the survivors among the original F1
products (COMAI et al. 2000) and in the late generation (S5).
Developmental and parental contributions to non-additive gene regulation
We tested whether non-additive gene regulation in synthetic allotetraploids is sensitive to
developmental changes by comparing the gene expression divergence detected in leaves and
flowers. Allo733 displayed 1,355 genes non-additively expressed in flower buds, of which 175
(~7%) were also detected in the leaves (Fig. 7A). Little overlap of the genes detected between
leaves and flowers suggests a developmental role in non-additive gene regulation in the
allopolyploids in a manner reminiscent of developmental de-repression of silenced rRNA genes
(CHEN and PIKAARD 1997b) and subfunctionalization of some duplicate genes (ADAMS et al.
2003). It is notable that gene expression changes may occur during the transition from vegetative
to reproductive development, but appear to be consistent within a tissue-type (e.g., rosette
leaves) (CHEN and PIKAARD 1997b). Compared to Allo738, fewer non-additively expressed
genes were detected in the flower buds in Allo733 (supporting data Table 1), which may reflect
developmental variation among allotetraploid lineages (COMAI et al. 2000; MADLUNG et al.
We verified expression patterns of 11 non-additively expressed genes using qRT-PCR
analysis (Fig. 7B). Six were repressed and five were upregulated in the allotetraploids, consistent
with the microarray data (supporting data Table 5). Five genes (WRKY, BCB, HSP90, PDF, and
LRR) that were expressed at higher levels in A. thaliana than in A. arenosa (At4>Aa) were
repressed in the allotetraploids. Three of four genes (FLC, PORa and PORb) that displayed
higher expression levels in A. arenosa than in A. thaliana (Aa>At4) were upregulated, and one
(CYC) was repressed in the allotetraploids. Two genes (CHI and SPP) that were equally
expressed in the parents (At4=Aa) were upregulated in the allotetraploids. Using locus-specific
SSCP or CAPS assays, we analyzed the contribution of A. thaliana and A. arenosa loci to the
non-additive gene regulation (Fig. 7C). For WRKY, BCB, and COL2, both A. thaliana and A.
arenosa loci were repressed in the allotetraploids, whereas HSP17.6b repression was due to the A.
thaliana locus. The repression of several COLs and upregulation of FLC may correlate with late
flowering in the allotetraploids. Similarly, upregulation of PORb and SPP in the allotetraploids
was related to A. thaliana loci, whereas upregulation of CHI was caused by the A. arenosa locus.
The data suggest both cis-regulatory and trans-acting effects (WITTKOPP et al. 2004) on non-
additive gene regulation in the allotetraploids. Furthermore, upregulation of SPP encoding starch
phosphorylase and of PORa and PORb encoding protochlorophyllide oxidoreductases in the
photosynthetic pathway may lead to vigorous growth in the allotetraploids.
Autopolyploidization does not induce genome-wide non-additive gene regulation
To determine whether non-additive gene regulation is affected by genome dosage, we
analyzed transcriptome differences between A. thaliana diploid (At2) and isogenic autotetraploid
(At4) (supporting data Table 1). Only 88 genes were expressed significantly differently between
the diploid and autotetraploid, which is reminiscent of the dosage-dependent regulation of a
dozen genes as observed in yeast auto-ploids (GALITSKI et al. 1999). The results suggest that
doubling the same genome in autopolyploids has much smaller effects on gene regulation than
combining the divergent genomes in allopolyploids. However, allopolyploidy effects may not be
as simple as the sum of “hybridization” and “genome doubling”.
Effects of autopolyploidization and allopolyploidization on gene regulation
Polyploidy effects on gene regulation may be caused by genome-doubling and/or inter-
genomic interactions. Autopolyploidization induces gene expression changes in response to the
increase in genome dosage (BIRCHLER 2001). Only 12 and 88 genes, respectively, respond to
autoploidy changes in yeast (GALITSKI et al. 1999) and Arabidopsis, suggesting that increasing
genome dosage affects a small subset of genes. During autopolyploidization, mechanisms such
as dosage compensation (BIRCHLER 2001) are responsible for maintaining expression patterns of
the genes except those associated with the large size of polyploid cells (GALITSKI et al. 1999).
For the majority of genes studied in maize, their expression levels are dependent on the dosage
of chromosomes or chromosome arms (AUGER et al. 2005; GUO et al. 1996).
The dramatic changes in non-additive gene regulation observed in the allotetraploids may
be induced by interspecific hybridization. Our data suggest that ~15% of the transcriptome has diverged
between A. thaliana and A. arenosa, which accounted for 68% of non-additively expressed genes
(2,011) in the synthetic allotetraploids. In addition to 820 genes that changed expression in both
allotetraploids, 649 and 542 genes were unique to Allo733 and Allo738, respectively. These
genes may correlate with specific changes in individual allotetraploids and facilitate selection
and adaptation of new allopolyploid species in response to environmental cues and
developmental changes. Indeed, non-additive gene regulation is developmentally regulated,
which may lead to subfunctionalization of duplicate genes (ADAMS et al. 2003; LYNCH and
FORCE 2000) in different organs or tissues (CHEN and PIKAARD 1997b). Transposons are
underrepresented in the genes that display expression changes in the allotetraploids. It is likely
that many transposons are not included in the annotated genes for microarray analysis.
Alternatively, the effects of genomic shock (MCCLINTOCK 1984) may be “settled” in the selfing
progeny (S5). Finally, dosage-dependent gene regulation (AUGER et al. 2005; BIRCHLER 2001)
may account for part of the gene expression changes in the allotetraploids. Indeed, 51% (45/88)
and 32% (28/88) of the genes that display expression divergence between A. thaliana diploids
and isogenic autotetraploids were also expressed non-additively in two allotetraploids.
There is a possibility that 70-mer oligos designed from A. thaliana may not hybridize
well to the A. arenosa genes, although we have shown that 192 A. thaliana oligos hybridized
equally well to A. arenosa and Brassica genes (LEE et al. 2004) probably because of >95% genic
sequence identity between A. thaliana and A. arenosa (LEE and CHEN 2001) and over 85%
between A. thaliana and B. oleracea (CAVELL et al. 1998). As a result, the sequence divergence
may also contribute to the difference in gene expression detected between A. thaliana and A.
Insights into non-additive gene regulation in the synthetic allotetraploids
In Arabidopsis allotetraploids, the progenitor-dependent gene regulation is not restricted
to rDNA loci subjected to nucleolar dominance (PIKAARD 1999) but occurs at a genome-wide
scale in various biological pathways. The available data suggest that the expression of
orthologous genes during evolution and speciation is not purely neutral. Selection and adaptation
over evolutionary time may promote divergence of regulatory elements and/or transcription
factors and regulatory proteins. The competition between the diverged regulatory pathways may
determine non-additive gene regulation in allopolyploids of Arabidopsis (WANG et al. 2004),
cotton (ADAMS et al. 2004), Senecio (HEGARTY et al. 2005), and wheat (HE et al. 2003;
KASHKUSH et al. 2002), interspecific hybrids (WITTKOPP et al. 2004) in Drosophila, intraspecific
hybrids in maize (GUO et al. 2004), and sex-dependent gene regulation in Drosophila (GIBSON et
al. 2004; RANZ et al. 2003). It is notable that outcrossing in A. arenosa and inbreeding in A.
thaliana may accelerate their divergence during evolution. Each progenitor might have evolved
specific regulatory systems affecting rDNA and other loci, perhaps via concerted evolution
(COEN et al. 1982), and the interactions between these diverged regulatory systems in
allopolyploids may trigger repression of the A. thaliana-“specific” genes and of the rDNA loci (CHEN
and PIKAARD 1997a; PIKAARD 1999; WANG et al. 2004). Although the underlying mechanisms
for preferential repression of A. thaliana genes are yet to be determined, sudden reunification of
divergent genomes may induce genome instability (MADLUNG et al. 2002; WANG et al. 2004)
and changes in chromatin structure and RNA-mediated processes (OSBORN et al. 2003). Hybrid-
or allopolyploidy-induced incompatibilities may be overcome by gene expression modulation
through chromatin modifications, transcription factors such as Myb (BARBASH et al. 2003),
and/or RNA interference. Interestingly, non-additive gene regulation in the allotetraploids largely
depends on expression divergence between the parents. Thus, hybridization between distantly
related species may induce a high level of non-additive gene expression changes,
providing molecular bases of hybrid vigor (BIRCHLER et al. 2003) and of novel variation in the
allotetraploid progeny. Furthermore, the stochastic establishment of non-additive gene regulation
in newly synthesized allotetraploids (WANG et al. 2004) may increase the potential for fitness
and selective adaptation. In contrast to the lethality and sterility observed in interspecific hybrids
(BARBASH et al. 2003), non-additive gene expression changes may be maintained and
transmitted in meiotically stable allopolyploids, providing a mechanism for de novo variation
and evolutionary opportunities for selection and adaptation of new allopolyploid species.
We thank James A. Birchler, Gary E. Hart, Robert A. Martienssen, J. Chris Pires, Douglas E.
Soltis, Jennifer Tate, and Jonathan F. Wendel for critical suggestions. The work was supported
by a grant from the National Science Foundation Plant Genome Research Program
(DBI0077774). Work in the Chen lab is supported in part by a grant from the National Institutes
of Health (GM067015). The authors declare that they have no financial conflict of interest.
ADAMS, K. L., R. CRONN, R. PERCIFIELD and J. F. WENDEL, 2003 Genes duplicated by polyploidy show
unequal contributions to the transcriptome and organ-specific reciprocal silencing. Proc Natl
Acad Sci U S A 100: 4649-4654.
ADAMS, K. L., R. PERCIFIELD and J. F. WENDEL, 2004 Organ-specific silencing of duplicated genes in a
newly synthesized cotton allotetraploid. Genetics 168: 2217-2226.
ARABIDOPSIS GENOME INITIATIVE, 2000 Analysis of the genome sequence of the flowering plant
Arabidopsis thaliana. Nature 408: 796-815.
AUGER, D. L., A. D. GRAY, T. S. REAM, A. KATO, E. H. COE, JR. et al., 2005 Nonadditive gene
expression in diploid and triploid hybrids of maize. Genetics 169: 389-397.
BARBASH, D. A., D. F. SIINO, A. M. TARONE and J. ROOTE, 2003 A rapidly evolving MYB-related
protein causes species isolation in Drosophila. Proc Natl Acad Sci U S A 100: 5302-5307.
BECAK, M. L., and L. S. KOBASHI, 2004 Evolution by polyploidy and gene regulation in Anura. Genet
Mol Res 3: 195-212.
BENJAMINI, Y., and Y. HOCHBERG, 1995 Controlling the false discovery rate: A practical and powerful
approach to multiple testing. J Royal Stat Soc (Series B) 57: 289-300.
BIRCHLER, J. A., 2001 Dosage-dependent gene regulation in multicellular eukaryotes: Implications for
dosage compensation, aneuploid syndromes, and quantitative traits. Dev Biol 234: 275-288.
BIRCHLER, J. A., D. L. AUGER and N. C. RIDDLE, 2003 In search of the molecular basis of heterosis.
Plant Cell 15: 2236-2239.
BLACK, M. A., 2002 Statistical Issues in the Design and Analysis of Spotted Microarray Experiments.
Ph.D. Dissertation, Department of Statistics, Purdue University, West Lafayette, IN.
BLANC, G., and K. H. WOLFE, 2004 Functional divergence of duplicated genes formed by polyploidy
during Arabidopsis evolution. Plant Cell 16: 1679-1691.
CAVELL, A. C., D. J. LYDIATE, I. A. PARKIN, C. DEAN and M. TRICK, 1998 Collinearity between a 30-
centimorgan segment of Arabidopsis thaliana chromosome 4 and duplicated regions within the
Brassica napus genome. Genome 41: 62-69.
CHEN, Z. J., L. COMAI and C. S. PIKAARD, 1998 Gene dosage and stochastic effects determine the
severity and direction of uniparental ribosomal RNA gene silencing (nucleolar dominance) in
Arabidopsis allopolyploids. Proc Natl Acad Sci U S A 95: 14891-14896.
CHEN, Z. J., and C. S. PIKAARD, 1997a Epigenetic silencing of RNA polymerase I transcription: a role
for DNA methylation and histone modification in nucleolar dominance. Genes Dev 11: 2124-
CHEN, Z. J., and C. S. PIKAARD, 1997b Transcriptional analysis of nucleolar dominance in polyploid
plants: biased expression/silencing of progenitor rRNA genes is developmentally regulated in
Brassica. Proc Natl Acad Sci U S A 94: 3442-3447.
CHEN, Z. J., J. WANG, L. TIAN, H.-S. LEE, J. J. WANG et al., 2004 The development of an Arabidopsis
model system for genome-wide analysis of polyploidy effects. Biol J Linn Soc 82: 689-700.
COEN, E. S., J. M. THODAY and G. DOVER, 1982 Rate of turnover of structural variants in the rDNA
gene family of Drosophila melanogaster. Nature 295: 564-568.
COMAI, L., A. P. TYAGI and M. A. LYSAK, 2003 FISH analysis of meiosis in Arabidopsis allopolyploids.
Chromosome Res 11: 217-226.
COMAI, L., A. P. TYAGI, K. WINTER, R. HOLMES-DAVIS, S. H. REYNOLDS et al., 2000 Phenotypic
instability and rapid gene silencing in newly formed Arabidopsis allotetraploids. Plant Cell 12:
GALITSKI, T., A. J. SALDANHA, C. A. STYLES, E. S. LANDER and G. R. FINK, 1999 Ploidy regulation of
gene expression. Science 285: 251-254.
GIBSON, G., R. RILEY-BERGER, L. HARSHMAN, A. KOPP, S. VACHA et al., 2004 Extensive sex-specific
nonadditivity of gene expression in Drosophila melanogaster. Genetics 167: 1791-1799.
GRANT, V., 1971 Plant Speciation. Columbia University Press, New York.
GUO, H., and J. R. ECKER, 2004 The ethylene signaling pathway: new insights. Curr Opin Plant Biol 7:
GUO, M., D. DAVIS and J. A. BIRCHLER, 1996 Dosage effects on gene expression in a maize ploidy
series. Genetics 142: 1349-1355.
GUO, M., M. A. RUPE, C. ZINSELMEIER, J. HABBEN, B. A. BOWEN et al., 2004 Allelic variation of gene
expression in maize hybrids. Plant Cell 16: 1707-1716.
HE, P., B. R. FRIEBE, B. S. GILL and J. M. ZHOU, 2003 Allopolyploidy alters gene expression in the
highly stable hexaploid wheat. Plant Mol Biol 52: 401-414.
HEGARTY, M. J., J. M. JONES, I. D. WILSON, G. L. BARKER, J. A. COGHILL et al., 2005 Development of
anonymous cDNA microarrays to study changes to the Senecio floral transcriptome during
hybrid speciation. Mol Ecol 14: 2493-2510.
IHAKA, R., and R. GENTLEMAN, 1996 A language for data analysis and graphics. J Comput Graphical
Statist 5: 299-314.
KASHKUSH, K., M. FELDMAN and A. A. LEVY, 2002 Gene loss, silencing and activation in a newly
synthesized wheat allotetraploid. Genetics 160: 1651-1659.
KASHKUSH, K., M. FELDMAN and A. A. LEVY, 2003 Transcriptional activation of retrotransposons alters
the expression of adjacent genes in wheat. Nat Genet 33: 102-106.
LEE, H. S., and Z. J. CHEN, 2001 Protein-coding genes are epigenetically regulated in Arabidopsis
polyploids. Proc Natl Acad Sci U S A 98: 6753-6758.
LEE, H. S., J. WANG, L. TIAN, H. JIANG, M. A. BLACK et al., 2004 Sensitivity of 70-mer
oligonucleotides and cDNAs for microarray analysis of gene expression in Arabidopsis and
Brassica. Plant Biotechnol J 2: 45-57.
LYNCH, M., and A. FORCE, 2000 The probability of duplicate gene preservation by subfunctionalization.
Genetics 154: 459-473.
MADLUNG, A., R. W. MASUELLI, B. WATSON, S. H. REYNOLDS, J. DAVISON et al., 2002 Remodeling of
DNA methylation and phenotypic and transcriptional changes in synthetic Arabidopsis
allotetraploids. Plant Physiol 129: 733-746.
MASTERSON, J., 1994 Stomatal size in fossil plants: evidence for polyploidy in majority of angiosperms.
Science 264: 421-424.
MCCLINTOCK, B., 1984 The significance of responses of the genome to challenge. Science 226: 792-
MOSER, E. B., A. M. SAXTON and J. P. GEAGHAN, 1988 Biological applications of the SAS system: an
overview. Comput Appl Biosci 4: 233-238.
OSBORN, T. C., J. C. PIRES, J. A. BIRCHLER, D. L. AUGER, Z. J. CHEN et al., 2003 Understanding
mechanisms of novel gene expression in polyploids. Trends Genet 19: 141-147.
OZKAN, H., A. A. LEVY and M. FELDMAN, 2001 Allopolyploidy-induced rapid genome evolution in the
wheat (Aegilops-Triticum) group. Plant Cell 13: 1735-1747.
PIKAARD, C. S., 1999 Nucleolar dominance and silencing of transcription. Trends Plant Sci 4: 478-483.
QUEITSCH, C., T. A. SANGSTER and S. LINDQUIST, 2002 Hsp90 as a capacitor of phenotypic variation.
Nature 417: 618-624.
RAMSEY, J., and D. W. SCHEMSKE, 1998 Pathways, mechanisms, and rates of polyploid formation in
flowering plants. Ann Rev Ecol Syst 29: 467-501.
RANZ, J. M., C. I. CASTILLO-DAVIS, C. D. MEIKLEJOHN and D. L. HARTL, 2003 Sex-dependent gene
expression and evolution of the Drosophila transcriptome. Science 300: 1742-1745.
SCHMID, M., N. H. UHLENHAUT, F. GODARD, M. DEMAR, R. BRESSAN et al., 2003 Dissection of floral
induction pathways using global expression analysis. Development 130: 6001-6012.
SOLTIS, P. S., and D. E. SOLTIS, 2000 The role of genetic and genomic attributes in the success of
polyploids. Proc Natl Acad Sci U S A 97: 7051-7057.
SONG, K., P. LU, K. TANG and T. C. OSBORN, 1995 Rapid genome change in synthetic polyploids of
Brassica and its implications for polyploid evolution. Proc Natl Acad Sci U S A 92: 7719-7723.
STEBBINS, G. L., 1971 Chromosomal Evolution in Higher Plants. Edward Arnold, London.
TIAN, L., M. P. FONG, J. J. WANG, N. E. WEI, H. JIANG et al., 2005 Reversible histone acetylation and
deacetylation mediate genome-wide, promoter-dependent and locus-specific changes in gene
expression during plant development. Genetics 169: 337-345.
WANG, J., J. J. LEE, L. TIAN, H. S. LEE, M. CHEN et al., 2005 Methods for genome-wide analysis of
gene expression changes in polyploids. Methods Enzymol 395: 570-596.
WANG, J., L. TIAN, A. MADLUNG, H. S. LEE, M. CHEN et al., 2004 Stochastic and epigenetic changes of
gene expression in Arabidopsis polyploids. Genetics 167: 1961-1973.
WENDEL, J. F., 2000 Genome evolution in polyploids. Plant Mol Biol 42: 225-249.
WITTKOPP, P. J., B. K. HAERUM and A. G. CLARK, 2004 Evolutionary changes in cis and trans gene
regulation. Nature 430: 85-88.
WOLFE, K. H., 2001 Yesterday's polyploidization and the mystery of diploidization. Nat Rev Genet 2:
Figure 1. A. Production of stable synthetic allotetraploids (Allo733 and 738). A self-fertile A.
thaliana autotetraploid (Ler, At4) was pollinated with a natural A. arenosa tetraploid (Aa).
Multiple independent allotetraploids in S1 were self-pollinated by single-seed descent to the S5
generation. Allo733 and Allo738 resembled A. arenosa and natural A. suecica (COMAI et al.
2000). Fluorescence in situ hybridization (FISH) analysis indicates that two sets of centromeres
in Allo733 are derived from A. thaliana (At4) and A. arenosa (Aa), respectively. The bar is equal
to 5 mm. Allo: allotetraploid.
Figure 2. Logarithm fold-change versus per-gene standard deviation in a replicated experiment
containing four dye-swaps. The hybridization probes were cDNAs from Allo733 and two
parents, A. thaliana tetraploid and A. arenosa. The data were analyzed using a linear model as
previously described (TIAN et al. 2005). Green, black, and red dots indicate the pools of
significant genes selected by multiple comparison tests (false discovery rate, FDR, α = 0.05)
using a per-gene variance, a common variance, and the intersection of the two, respectively.
Statistical significance for extremely small-fold changes was detected for two features replicated
6 and 49 times within each microarray slide, indicating the power of replication in microarray
Figure 3. Transcriptome divergence and non-additive gene expression between allotetraploids
and their progenitors. A. The proportion of the transcriptome that was highly expressed in A.
thaliana (At4), A. arenosa (Aa), or equally expressed (both). B. Venn diagram showing the
number of genes with expression divergence between the progenitors (blue) and between
Allo733 (red) or Allo738 (green) and the mid-parent value (MPV, supporting data Fig. 1). C.
Chromosomal distribution of the 820 genes displaying non-additive expression in both
allotetraploids (see text).
Figure 4. Down-regulation of A. thaliana genes in the synthetic allotetraploids. A. Distribution
of non-additively expressed genes detected in each allotetraploid (Allo733 or Allo738) or both
allotetraploids (Allos). B. The non-additively expressed genes in each allotetraploid matched the
genes that were highly expressed in A. thaliana autotetraploid. C. The non-additively expressed
genes in each allotetraploid matched the genes that were highly expressed in A. arenosa. D. The
non-additively expressed genes matched the genes that were equally expressed in both parents.
The percentages of down-regulated genes are indicated above the columns in each histogram.
Figure 5. Classification of non-additively expressed genes detected in synthetic Arabidopsis
allotetraploids. A. The 820 genes detected in both Allo733 and Allo738 lines (Allos) were
classified into 15 functional categories using the PEDANT analysis system
(http://mips.gsf.de/proj/thal/db/index.html) (ARABIDOPSIS GENOME INITIATIVE 2000). B. The
percentages of the genes in each functional category detected in Allo733, Allo738, or both
(Allos). The relative ratios on the Y-axis were estimated by dividing the percentage of genes detected in
each functional category in an Allo line by the percentage of all ~26,000 annotated genes
in the Arabidopsis genome that fall in that category (ARABIDOPSIS GENOME INITIATIVE 2000). A ratio of
100% indicates that the percentage of the genes detected in the Allo line equals that of all genes in the whole genome.
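The relative ratio plotted in Figure 5B is simple arithmetic and can be sketched as follows. This is a minimal illustration rather than the authors' code; the function name and the gene counts in the example are hypothetical and serve only to show the calculation.

```python
def relative_ratio(detected_in_category, detected_total,
                   genome_in_category, genome_total):
    """Share of detected genes in a category relative to the genome-wide share.

    Returns a percentage: 100 means the category is neither over- nor
    under-represented among the detected genes.
    """
    pct_detected = detected_in_category / detected_total
    pct_genome = genome_in_category / genome_total
    return 100.0 * pct_detected / pct_genome


# Hypothetical counts, chosen only to illustrate the arithmetic:
# 82 of 820 detected genes (10%) vs. 1,300 of 26,000 annotated genes (5%).
over_represented = relative_ratio(82, 820, 1300, 26000)
# 50 of 1,000 detected genes (5%) matches the genome-wide 5%.
as_expected = relative_ratio(50, 1000, 1300, 26000)
print(over_represented, as_expected)
```

A value above 100% marks a functional category that is over-represented among the non-additively expressed genes, and a value below 100% marks one that is under-represented.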
Figure 6. Non-additive gene regulation occurs in various pathways. A. Progenitor-dependent
repression of the genes involved in the ethylene biosynthesis pathway in Arabidopsis
allotetraploids. Each number in parenthesis below an enzyme or molecule in the pathways
indicates the fold-change for the expression of a gene, homolog (h), or putative (p) homolog
detected by microarray analysis. Red, green, blue, and purple colors indicate that gene
expression differences are detected in both allotetraploids (Allo733 and 738), in Allo738, in
Allo733, and between the two parents, respectively. B. Repression of heat shock protein (HSP)
genes in Arabidopsis allotetraploids. Thirty-one out of 33 HSPs were repressed in each
allotetraploid (Allo733 or 738). The lengths and directions of the vertical bars indicate logarithm
fold-changes in up- (above the line) or down-regulation (below the line) of the HSPs relative to
the mid-parent in the allotetraploid lines.
Figure 7. Developmental and parental contributions to non-additive gene expression. A. Venn
diagrams of the genes that displayed non-additive expression in leaves and flower buds in
Allo733. Only 175 of 2,542 genes showed overlap between leaves and flower buds. B.
Verification of 11 genes detected in microarrays by qRT-PCR. The gene expression levels in the
parents were higher in A. thaliana (At>Aa), higher in A. arenosa (Aa>At4), and equal (At4=Aa).
C. SSCP and CAPS analyses showing parental contributions to non-additive gene regulation in
the allotetraploids. The genes studied in B and C are in the functional classifications of stress
(HSP or HSP90 and HSP17.6b), cell cycle, defense and aging (CYC, CHI, LRR, PDF and
WRKY), metabolism and energy (BCB, PORa, PORb, and SPP), and flower development (COL2
and FLC). The restriction enzymes used in CAPS analysis are indicated in parentheses.
Table 1. The number of differentially expressed genes detected using a common variance and/or
a per-gene variance
                   False Discovery Rate (FDR) (α = 0.05)
Comparison         Common variance   Per-gene variance   Shared   ±1.5 fold (from per-gene variance)
At4 vs. Aa         4,363             11,199              3,923    4,476
Allo733 vs. MPV    1,708             8,377               1,362    1,792
Allo738 vs. MPV    1,856             9,875               1,469    2,358
At4: A. thaliana autotetraploid; Aa: A. arenosa tetraploid; MPV: mid-parent value; Shared:
shared data sets of the statistically significant genes detected using both common variance and
per-gene variance. The last column indicates the number of significant genes that passed an arbitrary
fold-change cutoff (±1.5) among the genes selected based on per-gene variance.
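The selection steps summarized in Table 1 can be sketched in a few lines. The sketch below assumes that the FDR control is the step-up procedure of BENJAMINI and HOCHBERG (1995), which the reference list suggests but the table itself does not state. All p-values, fold-changes, and function names are hypothetical.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return the set of indices rejected by the Benjamini-Hochberg step-up rule."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    last_passing_rank = 0
    for rank, idx in enumerate(order, start=1):
        # Keep the largest rank whose p-value is under its own threshold.
        if p_values[idx] <= rank * alpha / m:
            last_passing_rank = rank
    return set(order[:last_passing_rank])


def passes_fold_cutoff(ratio, cutoff=1.5):
    """True when an expression ratio is at least `cutoff`-fold up or down."""
    return ratio >= cutoff or ratio <= 1.0 / cutoff


# Hypothetical p-values for six genes under the two variance models.
p_common = [0.001, 0.008, 0.039, 0.041, 0.040, 0.600]
p_per_gene = [0.002, 0.300, 0.010, 0.020, 0.015, 0.700]
ratios = [2.10, 1.05, 0.60, 1.20, 1.70, 1.00]  # hypothetical fold-changes

sig_common = benjamini_hochberg(p_common)      # "Common variance" column
sig_per_gene = benjamini_hochberg(p_per_gene)  # "Per-gene variance" column
shared = sig_common & sig_per_gene             # "Shared" column
fold_filtered = {i for i in sig_per_gene if passes_fold_cutoff(ratios[i])}
print(sorted(sig_common), sorted(sig_per_gene), sorted(shared), sorted(fold_filtered))
```

The step-up rule rejects every hypothesis ranked at or below the largest rank that passes its threshold, so a gene whose own p-value misses its threshold can still be selected.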
Supporting Data and Online Materials
1. Microarray data are available at the website
2. Supporting data Figure 1
3. Supporting data Tables 1, 2, 3, 4, and 5
Supporting data Figure 1. Dye-swap microarray experimental design. A. RNA samples
prepared from two treatments (genotypes) were reverse-transcribed and labeled using Cy3-dCTP
and Cy5-dCTP. In each dye-swap experiment, the labeled cDNAs were divided equally and
mixed reciprocally to hybridize two microarray slides containing 70-mer oligos designed from
~26,090 annotated genes (TIAN et al. 2005). The experiment was repeated to generate four dye-
swaps using 8 microarray slides. B. The same dye-swap experimental design was performed for
analyzing gene expression changes in each allotetraploid lineage. Two labeled-cDNA samples
were prepared from (1) an allotetraploid (Allo733 or 738) and (2) the mid-parent using a mixture
of total RNAs containing an equal aliquot of RNA from the two parents. In each dye-swap
experiment, the labeled cDNAs were divided equally and mixed reciprocally to hybridize two
microarray slides and repeated 4 times using a total of 8 slides.
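The rationale for the reciprocal labeling can be illustrated with a short calculation. The sketch below is a simplified stand-in for the linear model used in the study: it assumes a purely multiplicative dye bias and estimates the log ratio from a single dye-swap pair. The intensities and the function name are hypothetical.

```python
import math


def dye_swap_log_ratio(slide1_cy3, slide1_cy5, slide2_cy3, slide2_cy5):
    """Estimate log2(sample B / sample A) from one dye-swap pair.

    Slide 1: sample A labeled with Cy3, sample B with Cy5.
    Slide 2: labels swapped (sample B with Cy3, sample A with Cy5).
    A multiplicative dye bias adds the same offset to both slide-level
    log ratios, so it cancels in their difference.
    """
    m1 = math.log2(slide1_cy5 / slide1_cy3)  # log2(B/A) + dye bias
    m2 = math.log2(slide2_cy5 / slide2_cy3)  # log2(A/B) + dye bias
    return (m1 - m2) / 2.0


# Hypothetical intensities: B is expressed 2-fold higher than A, and the
# Cy5 channel reads 1.3-fold brighter than Cy3 for the same amount of cDNA.
a, b, cy5_gain = 100.0, 200.0, 1.3
estimate = dye_swap_log_ratio(a, b * cy5_gain, b, a * cy5_gain)
naive = math.log2((b * cy5_gain) / a)  # slide 1 alone, dye bias included
print(round(estimate, 6), round(naive, 6))
```

With a single slide, the dye bias is confounded with the expression difference. The swapped pair separates the two, which is why each comparison uses paired slides.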
Supporting data Table 1. Microarray data analysis using a linear model approach (LEE et al.
2004). Each experiment consisted of four dye-swaps and eight slides as described in the Methods
and shown below. The analyzed data can be viewed in a “scatter plot” or “text” at the following
Experiment 1   At4 vs. Aa4 (leaves)            (Plot, Text)
Experiment 2   Allo733 vs. MPV (leaves)        (Plot, Text)
Experiment 3   Allo738 vs. MPV (leaves)        (Plot, Text)
Experiment 4   Allo733 vs. MPV (flower buds)   (Plot, Text)
Experiment 5   Allo738 vs. MPV (flower buds)   (Plot, Text)
Experiment 6   At2 vs. At4 (leaves)            (Plot, Text)
MPV: mid-parent value.
1. In each experiment, a text-delineated table of the differentially expressed genes detected is
displayed. For example, in experiment 1 for the comparison of gene expression between A.
thaliana and A. arenosa in leaves, the list included 4,363 and 11,199 significant genes using
common and per-gene variance, respectively. The list was tabulated using locus IDs and
2. The microarray data were generated following the MIAME guidelines
(http://www.mged.org/Workgroups/MIAME/miame.html). The data were obtained from six
experiments as shown in tables S1 and S2. The detailed procedures for microarray and
experimental design, slide printing, hybridization, targets (cDNA probes), data collection, and
analysis were described in a previous paper (LEE et al. 2004). The data can be downloaded for
re-analysis or for verification of the data analysis using other statistical packages or commercial
software in addition to the linear model used in this study.
3. Raw data (spot quantitation matrix) were generated using a GenePix 4000B scanner and
GenePix Pro 4.1 software (Axon Instruments). The data obtained from each slide will be
displayed after clicking the slide number.
4. The raw data (hybridization intensities obtained in Cy3 and Cy5 channels) were converted
using the logarithm and lowess functions and subjected to analysis of variance (ANOVA) as
described in the Methods section. No additional step of data processing was used.
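The difference between testing with a per-gene variance and with a common variance can be illustrated on log-transformed ratios. The sketch below is not the ANOVA used in the study. It substitutes a one-sample statistic (mean log ratio divided by its standard error) and omits the lowess step, and the gene names and log ratios are hypothetical.

```python
import math
import statistics


def per_gene_statistic(log_ratios):
    """One-sample t statistic using the gene's own standard deviation."""
    n = len(log_ratios)
    return statistics.mean(log_ratios) / (statistics.stdev(log_ratios) / math.sqrt(n))


def common_variance_statistics(genes):
    """Test statistics that share one pooled variance across all genes.

    Assumes every gene has the same number of replicate log ratios, so the
    pooled variance is the plain average of the per-gene sample variances.
    """
    pooled_var = statistics.mean([statistics.variance(v) for v in genes.values()])
    return {
        name: statistics.mean(v) / math.sqrt(pooled_var / len(v))
        for name, v in genes.items()
    }


# Hypothetical replicate log ratios for two genes.
genes = {
    "small_consistent": [0.10, 0.11, 0.09, 0.10],  # tiny but reproducible change
    "large_variable": [1.2, 0.4, 1.6, 0.8],        # large but noisy change
}
per_gene = {name: per_gene_statistic(v) for name, v in genes.items()}
common = common_variance_statistics(genes)
print(per_gene, common)
```

In this toy example the small but reproducible change scores highest under the per-gene variance and lowest under the common variance, which illustrates why intersecting the two gene lists is a conservative choice.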
Supporting data Table 2. Microarray experimental design
Supporting data Table 2a. Microarray analysis of gene expression in leaves between two parents,
A. thaliana (At4) and A. arenosa (Aa4).
Slide No. Leaf RNA Cy3
1 RNA1 At4
2 RNA1 Aa
3 RNA1 At4
4 RNA1 Aa
5 RNA2 At4
6 RNA2 Aa
7 RNA2 At4
8 RNA2 Aa4
Supporting data Table 2b. Microarray analysis of gene expression in leaves between Allo733 and
mid-parent value (MPV).
Slide No. Leaf RNA Cy3
9 RNA1 Allo733
10 RNA1 MPV
11 RNA1 Allo733
12 RNA1 MPV
13 RNA2 Allo733
14 RNA2 MPV
15 RNA2 Allo733
16 RNA2 MPV
Supporting data Table 2c. Microarray analysis of gene expression in leaves between Allo738 and
mid-parent value (MPV).
Slide No. Leaf RNA Cy3
17 RNA1 Allo738
18 RNA1 MPV
19 RNA1 Allo738
20 RNA1 MPV
21 RNA2 Allo738
22 RNA2 MPV
23 RNA2 Allo738
24 RNA2 MPV
Supporting data Table 2d. Microarray analysis of gene expression in flowers between Allo733
and mid-parent (MP).
Slide No. Flower bud RNA Cy3
25 RNA1 Allo733
26 RNA1 MPV
27 RNA1 Allo733
28 RNA1 MPV
29 RNA2 Allo733
Supporting data Table 2e. Microarray analysis of gene expression in flowers between Allo738
and mid-parent (MP).
Slide No. Flower bud RNA Cy3
33 RNA1 Allo738
34 RNA1 MPV
35 RNA1 Allo738
36 RNA1 MPV
37 RNA2 Allo738
38 RNA2 MPV
39 RNA2 Allo738
40 RNA2 MPV
Supporting data Table 2f. Microarray analysis of gene expression in leaves between A. thaliana
diploid (At2) and isogenic autotetraploid (At4).
Slide No. Leaf RNA Cy3
41 RNA1 At2
42 RNA1 At4
43 RNA1 At2
44 RNA1 At4
45 RNA2 At2
46 RNA2 At4
47 RNA2 At2
48 RNA2 At4
Supporting data Table 3. List of the genes involved in ethylene biosynthesis and signal pathways that display
differential expression in Arabidopsis progenitors and their allotetraploids.
Oligo ID Locus Symbol TAIR Description P-values Fold-change
A002070_01 At1g02500 SAMS s-adenosylmethionine synthetase 8.96E-06 0.58
A007898_01 At2g36880 SAMSh1 s-adenosylmethionine synthetase -related 8.56E-08 0.55
A025959_01 At4g01850 SAMSh2 S-adenosylmethionine synthase 2 0.0001 0.56
A014348_01 At4g11280 ACS 1-aminocyclopropane-1-carboxylate synthase 6 (ACC synthase 6) (ACS6) 0.0008 0.55
A005743_01 At1g05010 ACC 1-aminocyclopropane-1-carboxylate oxidase (ACC oxidase) (ethylene-forming enzyme) (EFE) 0.0023 0.56
A002643_01 At1g12010 ACCp 1-aminocyclopropane-1-carboxylate oxidase (ACC oxidase), putative 0.0009 2.23
A020429_01 ACT S-adenosyl-L-methionine:trans-caffeoyl-Coenzyme A mRNA, complete cds 0.0011 0.32
A020880_01 At4g34050 ACTh1 caffeoyl-CoA 3-O-methyltransferase 4.1004E-07 0.45
A023983_01 At3g11480 ACTh2 S-adenosyl-L-methionine:carboxyl methyltransferase family 0.0119 0.28
A007990_01 At2g14060 ACTh3 S-adenosyl-L-methionine:carboxyl methyltransferase family 0.0003 0.57
A012784_01 At3g25570 ACTh4 S-adenosylmethionine decarboxylase -related 2.7863E-05 0.55
A019988_01 At1g19640 ACTh5 S-adenosyl-L-methionine:jasmonic acid carboxyl methyltransferase (JMT) 4.89E-06 0.28
A005709_01 At4g17500 ERF1 ethylene responsive element binding factor 1 8.5999E-05 0.55
A005935_01 At3g15210 ERF4 ethylene responsive element binding factor 4 (ERF4) 6.0962E-06 0.36
A003712_01 At1g28370 ERF11 ethylene responsive element binding factor 11, putative (EREBP11) (ERF11) 1.25E-05 0.29
A006840_01 At2g31230 ERFp1 ethylene response factor, putative 0.0002 0.61
A009272_01 At3g24500 ERFp2 ethylene-responsive transcriptional coactivator -related 0.0005 0.50
A010702_01 At3g16050 ERFp3 ethylene-inducible protein -related 0.0003 0.57
A001728_01 At1g73730 EIL3 ethylene-insensitive3-related3 (EIL3) 1.2543E-05 1.69
ERFp4 Arabidopsis thaliana mRNA for ethylene-responsive element binding protein
ERS1 ethylene response sensor (ERS)
A005945_01 ERSp1 EST, Moderately similar to T00758 ethylene response sensor T20B5.14 0.0018 1.86
A019499_01 At1g50640 ERF3 ethylene responsive element binding factor 3 (ERF3) 2.5894E-05 2.01
A012996_01 At4g18450 ERFp5 ethylene response factor, putative 3.4369E-05 0.60
A011007_01 At3g11930 ERFp6 ethylene-responsive protein -related 5.63E-07 4.01
A016278_01 At5g61590 ERFp7 ethylene responsive element binding factor, putative 0.0037 1.59
A019601_01 At1g53170 ERF8 ethylene responsive element binding factor 8 0.0001 1.69
A021347_01 At5g07580 EREBP ethylene responsive element binding factor family (EREBP) 1.8811E-06 2.52
A019772_01 At2g27050 EIL1 ethylene-insensitive3-related1 (EIL1) 0.0003 1.84
A020587_01 At1g66340 ETR1 ethylene-response protein, ETR1 1.5876E-05 0.62
Red, green, blue, and purple colors indicate that differential gene expression was detected in both allotetraploid lines
(Allo733 and 738), Allo738, Allo733, and between the progenitors (A. thaliana and A. arenosa), respectively.
Fold-changes are shown as ratios of gene expression between Allo733 or Allo738 and parental mix or between
A. thaliana and A. arenosa autotetraploids (see microarray experimental design in Fig. 1B and supporting data
Fig. 1). Subscript “p” in the gene symbol followed by numerical numbers represents different “putative”
Supporting data Table 4. RT-PCR analysis of the differentially expressed genes detected by microarray
Locus TAIR description Symbol RNA
(UW) (TAMU) (At4/Aa4)
RNA Ratio Ratio
At5g56030 heat shock protein
At5g12020 heat shock protein
At1g80840 WRKY family
At1g19610 plant defensin
At1g75830 plant defensin
At2g43590 glycosyl hydrolase
At3g15210 ethylene responsive
factor 4 (AtERF4)
At5g20230 blue copper binding
At3g02380 CONSTANS-like 2
At5g10140 MADS box protein
LOCUS F (FLF)
At5g09810 Actin 2
HSP90 + + 0.90406 F: 5'-TGTCTCTGCAACCAAGGAAGGTC-3'
HSP17.6b + - 1.4619 -1.02 -1.8317 494
WRKY40 + + 1.9652 -2.5912 -2.4936 465
PDF1.4 + + 1.0466 -0.60814 -0.57314 360
PDF1.1 + + -2.1055 0.83476 0.90425 322 F: 5'-CGCTGCTCTTGTTTTCTTTGCT-3'
CHI + + n.s. 0.7796 0.89841 471 F: 5'-CGTAACTACTGCCAGAGCAGCAA-3'
ATERF4 + + 0.97404 -1.021 -1.079 433 F: 5'-GACCCACAATAATGCCAAGGA-3'
BCB + - 2.7399 -3.1919 -2.2944 599 F: 5'-GAAAAGGGGGTGACCTGAGTTCT-3'
PORA - - -0.9472 1.7914 0.78946 455
PORB + + -1.2321 2.0486 2.189 433 F: 5'-ACCAAATCAAATCCGAACATGG-3'
SPP + + n.s. 1.0583 1.5007 499
COL2 + + 0.77842 -0.71412 -0.73129 436 F: 5'-ACCACCTGTGATGCTCGAGTT-3'
FLC + + -0.6118 0.78811 0.55986 617
Act2 + + n.s. n.s. n.s. 656 F: 5'-CTCATGAAGATTCTCACTGAG-3'
Note: The genes were randomly selected from different functional categories and across five chromosomes; “+”
matched microarray data; “-” did not match microarray data; “n.s.”: not significant; P-values associated with
each gene are omitted in this table but displayed in table S1; Ratios shown are logarithm-fold changes in
Supporting data Table 5. Expression verification of eleven genes detected in microarrays using qRT-PCR
WRKY family transcription factor 40
blue copper binding protein
heat shock protein HSP81.2
plant defensin protein PDF1.4
Microarray data (log ratios)
Quantitative real-time RT-PCR (log ratios)
leucine-rich repeat protein kinase
FLOWERING LOCUS C
protochlorophyllide reductase precursor
glycosyl hydrolase (chitinase)
ND: Not determined due to no or low amplification from Aa
NS: Not significant in microarray analysis
MP: Mid-parent value.
[Figure 1 image labels: At Cen; Aa Cen; Composite; 2n = 4x = 20; 2n = 4x = 32; 2n = 4x = 26]
[Figure 2 axis label: Per-gene standard deviation]
[Figure 3 image labels: chromosome position (5 to 30 Mb); 40 oligos; 1 oligo/100 Kb; 1 log-fold change; Allo738 vs. MPV (1/2 At4 + 1/2 Aa); Allo733 vs. MPV (1/2 At4 + 1/2 Aa); At4 vs. Aa]
[Figure 4 image labels: Allo733; Allo738; Allos]
[Figure 5 image labels: Cell growth, division & DNA synthesis; Cellular communication & signal transduction; Plant hormonal regulation; Cell rescue, defense, cell death & aging; Transposons & viral proteins]
[Figure 6 image labels: carboxylic acid (ACC); HCN + H2O; ATP; PPi + Pi; Cell signaling, defense, and development; HSP numbering 3 to 33]
[Figure 7 image labels: At4; Aa; 733; 738; Leaves; Flower buds; BCB; WRKY; HSP; PDF; LRR; FLC; CYC; CHI; SPP; Relative Expression Levels]
[Supporting data Figure 1 image labels: RNA; Arabidopsis oligo-gene microarrays containing all ~26,090 annotated genes (4 dye swaps x 2 = 8 slides); Allo; Aa; At4; 50% + 50%]
[Unassigned image label: ~5.8 Mya]