SHORT COMMUNICATION
Gene discovery in the developing xylem tissue of a tropical timber
tree species: Neolamarckia cadamba (Roxb.) Bosser (kelampayan)
Shek Ling Pang¹ & Wei Seng Ho² & M. N. Mat-Isa³ & Julaihi Abdullah¹
Received: 8 January 2014 / Revised: 27 March 2015 / Accepted: 15 April 2015
© Springer-Verlag Berlin Heidelberg 2015
Abstract A complementary DNA (cDNA) library was con-
structed from the developing xylem tissues of Neolamarckia
cadamba. A total of 10,368 single-pass sequences were generated
through high-throughput 5′-expressed sequence tag
(EST) sequencing of the cDNA clones, and 6622 high-
quality ESTs were obtained after removing the low-quality
sequences; this gave approximately 3.17 Mb of data.
Clustering of the high-quality ESTs revealed 4728 unigenes,
consisting of 2100 consensus sequences and 2628 singletons. A total of
2405 ESTs were successfully annotated with 7753 gene ontology (GO)
terms that were distributed among three main GO categories, namely
biological process (2333), molecular function (3056) and cellular
component (2364). Simple sequence repeat (SSR) mining revealed that
the frequency of SSRs in the N. cadamba EST database (NcdbEST) was
3.3 %, with the GCT/AGC motif being the most abundant
repeat motif. The most abundant transcript with known func-
tion found in this database was 60S ribosomal protein follow-
ed by 40S ribosomal protein. Some of the important genes
involved in xylogenesis and lignin biosynthesis were found
in NcdbEST; these include tubulin genes, cellulose synthase
(CesA), xyloglucan endotransglycosylase (XET),
arabinogalactan, cinnamate 4-hydroxylase (C4H), caffeoyl-
coenzyme A O-methyltransferase (CCoAOMT) and peroxi-
dase. The data obtained from this study will provide a power-
ful means for identifying mechanisms controlling wood for-
mation pathways of kelampayan and supply many new cloned
genes for future endeavours to modify wood and fibre
properties.
Keywords Expressed sequence tags (ESTs) · Forest plantation ·
Lignin biosynthesis · Neolamarckia cadamba · Wood formation
Introduction
In Malaysia, the State Government of Sarawak has set a target
for 1 million ha of forest land degraded by shifting cultivation
to be planted with fast-growing timber tree species by year
2020 in order to meet the increasing demand for wood and
wood fibre. To meet this target, an estimated 42 million
quality seedlings are needed annually for planting or reforestation.
Neolamarckia cadamba (Roxb.) Bosser, locally known
as kelampayan, has been identified as one of the potential
indigenous fast-growing timber tree species for forest planta-
tion in Sarawak (Tchin et al. 2012; Lai et al. 2013; Tiong et al.
2014). It is a large, deciduous and fast-growing tree that gives
early economic returns within 8–10 years. Under normal con-
ditions, it attains a height of 17.67 m and diameter of 25.3 cm
at breast height within 9 years. Kelampayan not only serves as
one of the best raw materials for plywood and veneer but is
also suitable for pulp and paper production, light construction
and furniture (Joker 2000).
Communicated by W. Ratnam
This article is part of the Topical Collection on Genome Biology
Electronic supplementary material The online version of this article
(doi:10.1007/s11295-015-0873-y) contains supplementary material,
which is available to authorized users.
* Shek Ling Pang
shekling@gmail.com; slpang@sarawakforestry.com
1 Applied Forest Science and Industry Development, Tree Breeding
Unit, SARAWAK FORESTRY, 93250 Kuching, Sarawak, Malaysia
2 Forest Genomics and Informatics Laboratory, Department of
Molecular Biology, Faculty of Resource Science and Technology,
Universiti Malaysia Sarawak, 94300 Kota Samarahan, Sarawak,
Malaysia
3 Malaysia Genome Institute, 43600 Bangi, Selangor Darul Ehsan,
Malaysia
Tree Genetics & Genomes (2015) 11:47
DOI 10.1007/s11295-015-0873-y
Most of the current forest genomic projects are interested in
understanding major events of wood formation as it is likely to
be the key factor in determining the wood properties that will
influence its industrial performance and value. Wood is essen-
tially composed of cellulose, hemicellulose and lignin. These
components are important in the plant defence system, the
transportation of nutrients and the provision of mechanical
strength. Wood formation or xylogenesis involves an extremely
subtle and sophisticated network of spatial and temporal
gene regulation events, in order to coordinate the expression
of several hundreds of genes involved in cell division, cell
expansion and elongation, secondary cell wall formation (in-
cluding cellulose, hemicellulose and lignin synthesis) and pro-
grammed cell death (Bhalerao et al. 2003). These processes
are strongly interlinked and modulation of any one aspect of
wood formation may affect many other aspects. For example,
downregulation of a lignin gene, 4-coumarate: coenzyme A
ligase (4CL), in aspen resulted in a reduction of 45 % in lignin
content but this was compensated for by an increase of 15 % in
cellulose content of the transgenic trees (Hu et al. 1999).
Therefore, an understanding of the process of wood
formation is important as it could help in the improve-
ment of the wood quality trait through marker-assisted
selection and the assessment of the genetic diversity of
a breeding programme.
Genomics approaches are now widely used to explore the
molecular basis of xylogenesis particularly in economically
important tree species. Expressed sequence tags (ESTs) are
fragments of mRNA generated through single-pass sequencing
of the 5′ and/or 3′ ends of randomly selected complementary
DNA (cDNA) clones. High-throughput EST sequencing
provides rapid access to a significant portion of the expressed
genes in one organism, and this valuable source of sequence
information can serve as a foundation for initiating a genome
sequencing project (van der Hoevan et al. 2002). A large
number of ESTs related to wood formation have been generated from
studies in pine (Li et al. 2009; Whetten et al. 2001), poplar
(Sterky et al. 1998; Hertzberg et al. 2001) and eucalyptus
(Paux et al. 2004). Despite the high economic value of tropical
wood, the mechanism controlling wood formation in
tropical tree species still remains poorly understood.
Therefore, this project is directed at the understanding
of wood formation of kelampayan based on sequencing
analysis of ESTs derived from the developing xylem
cDNA library. Here, we present the first EST database
for N. cadamba generated through high-throughput 5′
single-pass sequencing. We believe the genomic infor-
mation from our study will give some insights into the wood
formation in tropical wood in general and kelampayan in
particular.
Materials and methods
Sample preparation and cDNA library construction
The bark of a healthy 2-year-old kelampayan tree was peeled
off, and developing xylem tissues were collected from the
exposed surface. The tissues were collected in a clean plastic
bag and placed directly into liquid nitrogen. The samples were kept
at −80 °C until needed. About 0.5 g of developing xylem tissue
was used for total RNA isolation using the RNeasy Midi Kit
(Qiagen, Germany) with modifications. The cDNA library
was constructed using the CloneMiner™ cDNA Library
Construction Kit (Invitrogen, USA) according to the
manufacturer’s protocol. cDNA clones were manually picked and
cultured overnight in 96-well culture blocks. The glycerol
stocks for 10,368 clones were then sent on dry ice for
high-throughput plasmid extraction and 5′ end cDNA sequencing at
the Malaysia Genome Institute (MGI).
Sequence processing, clustering, BLAST search
and annotation
Raw ABI-formatted chromatogram reads were base-called
using Phred (Ewing and Green 1998) with a quality threshold
of 20. Vector sequences were masked using
Cross-Match. Vectors and low-quality nucleotides were then
trimmed and removed using customized
Perl scripts. High-quality ESTs with a minimum of 100
bases and fewer than 4 % Ns were retained. StackPACK
(Miller et al. 1999) was used for multiple sequence
alignment, clustering, assembly and the generation of
consensus sequences. The sequences were grouped together
by d2_cluster (Burke et al. 1999) if there was at
least 96 % sequence similarity in any window of 150
bases. The loose clusters were then aligned using PHRAP
(Laboratory of Phil Green) and subsequently
CRAW (Chou and Burke 1999). Homology searches of the
unigenes, annotation and gene ontology assignment of the ESTs
were done using Blast2GO (Conesa et al. 2005) with
default settings.
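The length and ambiguity filter described above can be sketched in Python. This is only an illustration and not the customized Perl scripts used in this study: the function names and clone names are hypothetical, and treating the 4 % limit as a strict inequality is an assumption based on the wording "fewer than 4 % Ns".

```python
def is_high_quality(seq: str, min_len: int = 100, max_n_frac: float = 0.04) -> bool:
    """Return True if a trimmed EST passes the length and ambiguity filter.

    Assumed rule: at least `min_len` bases and strictly fewer than
    `max_n_frac` ambiguous (N) base calls.
    """
    if len(seq) < min_len:
        return False
    n_count = seq.upper().count("N")
    return n_count / len(seq) < max_n_frac


def filter_ests(ests: dict) -> dict:
    """Keep only the ESTs that pass is_high_quality()."""
    return {name: seq for name, seq in ests.items() if is_high_quality(seq)}


# Hypothetical trimmed reads used purely for illustration.
reads = {
    "clone_a": "ACGT" * 30,            # 120 bases, no Ns
    "clone_b": "ACGT" * 20,            # 80 bases, too short
    "clone_c": "ACGT" * 24 + "NNNN",   # 100 bases, exactly 4 % Ns
    "clone_d": "ACGT" * 25 + "NNN",    # 103 bases, about 2.9 % Ns
}
kept = filter_ests(reads)
print(sorted(kept))
```

Under these assumptions only clone_a and clone_d are retained; clone_c is rejected because exactly 4 % Ns is not "fewer than 4 %".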
In silico identification of SSR
Unigenes were mined for simple sequence repeat (SSR)
markers using MISA (Thiel et al. 2003). The polyA
and polyT sequences in a 50-bp window at the terminal
regions were removed. The minimum numbers of re-
peats for SSR detection used in MISA were six for
di-nucleotides, five for tri-nucleotides, four for tetra-nu-
cleotides, three for penta-nucleotides and three for
hexa-nucleotides.
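The repeat thresholds listed above can be illustrated with a small regular-expression search in Python. This sketch is not MISA and does not reproduce its handling of compound or interrupted repeats; the names `MIN_REPEATS`, `is_primitive` and `find_ssrs` and the test sequence are hypothetical.

```python
import re

# Minimum number of repeat units per motif length, as stated in the text.
MIN_REPEATS = {2: 6, 3: 5, 4: 4, 5: 3, 6: 3}


def is_primitive(motif: str) -> bool:
    """True if the motif is not itself a repetition of a shorter unit
    (rejects e.g. 'AA' or 'ATAT', which belong to shorter motif classes)."""
    return motif not in (motif + motif)[1:-1]


def find_ssrs(seq: str):
    """Return a sorted list of (start, motif, repeat_count) for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in MIN_REPEATS.items():
        # A captured unit followed by at least (min_rep - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if is_primitive(motif):
                repeats = len(match.group(0)) // unit_len
                hits.append((match.start(), motif, repeats))
    return sorted(hits)


# Hypothetical sequence with a (GCT)5 and an (AT)6 repeat.
example = "TTGACC" + "GCT" * 5 + "AACGTCA" + "AT" * 6 + "GGC"
print(find_ssrs(example))
```

The primitive-motif check keeps a long di-nucleotide run such as (AT)12 from being counted again as a tetra- or hexa-nucleotide repeat.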
Results
Clustering of ESTs
The EST database for N. cadamba developing xylem
(NcdbEST) consisted of 10,368 5′ end reads. The removal
of low-quality ESTs gave rise to 6622 high-quality sequences.
Cluster analysis of these 6622 ESTs revealed 4728 unigenes,
which consisted of 2100 consensus sequences and 2628 singletons.
These ESTs were submitted to dbEST at the NCBI
with the library accession number LIBEST_028358. The assembly
of the ESTs revealed 28.6 % redundancy in the
NcdbEST. BLAST analysis of the unigenes was done using
Blast2GO with the default settings. A total of 2913 unigenes showed
significant similarity to known sequences in the GenBank
non-redundant protein database. The remaining 1815 either returned
no BLAST hit or showed no significant similarity to any known
sequence in the database. Only 2050 of the
2913 sequences that showed significant similarity to the database
were assigned known functions. The remaining 863
were annotated as conserved hypothetical protein, hypothetical
protein, predicted hypothetical protein, predicted protein or
unnamed.
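The relationship between ESTs, consensus sequences, singletons and redundancy can be made concrete with a short Python sketch. The text does not define redundancy explicitly; the formula used here, (ESTs − unigenes) / ESTs, is an assumption that happens to reproduce the reported 28.6 %. The cluster table and function names are hypothetical.

```python
def redundancy_pct(total_ests: int, unigenes: int) -> float:
    """Assumed definition: share of ESTs that do not add a new unigene."""
    return round(100.0 * (total_ests - unigenes) / total_ests, 1)


def summarize_clusters(clusters: dict) -> dict:
    """clusters maps a unigene id to the list of its member EST ids."""
    sizes = [len(members) for members in clusters.values()]
    consensus = sum(1 for size in sizes if size >= 2)   # assembled from >= 2 ESTs
    singletons = sum(1 for size in sizes if size == 1)  # unclustered ESTs
    unigenes = consensus + singletons
    return {
        "ests": sum(sizes),
        "consensus": consensus,
        "singletons": singletons,
        "unigenes": unigenes,
        "redundancy_pct": redundancy_pct(sum(sizes), unigenes),
    }


# Hypothetical toy clustering: 7 ESTs collapse into 4 unigenes.
toy = {"u1": ["e1", "e2", "e3"], "u2": ["e4"], "u3": ["e5", "e6"], "u4": ["e7"]}
toy_summary = summarize_clusters(toy)

# Figures reported for NcdbEST.
reported_unigenes = 2100 + 2628
reported_redundancy = redundancy_pct(6622, reported_unigenes)
print(toy_summary, reported_unigenes, reported_redundancy)
```

With the reported counts, 2100 consensus sequences plus 2628 singletons give 4728 unigenes, and the assumed formula returns 28.6 %.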
The most abundant transcript with a known function detected
in the NcdbEST was putatively identified as 60S ribosomal
protein (49 ESTs), followed by 40S ribosomal protein
(33 ESTs), whereas tubulin genes (29 ESTs) were the most
abundant cell wall-related genes identified in this database,
followed by arabinogalactan (19 ESTs) and
S-adenosylmethionine synthetase (15 ESTs). Some cell wall-related
genes were moderately abundant in the developing
xylem tissue of kelampayan (3–7 ESTs). These include cellulose
synthase, endo-1,4-β-glucanase (cellulase), sucrose synthase
(SuSY), expansin, glucan endo-1,3-β-D-glucosidase,
xyloglucan endotransglycosylase-hydrolase (XTH) and
xyloglucan endotransglycosylase (XET). All genes involved
in the lignin biosynthesis pathway were identified in the
kelampayan cDNA library. Genes involved in biotic and abiotic
stress responses were also found in the NcdbEST. Heat shock
protein was the most abundant stress-related gene found in
NcdbEST (20 ESTs), followed by ethylene (8 ESTs),
14-3-3 protein (8 ESTs), defensin protein (6 ESTs), disease
resistance protein (6 ESTs) and other low-expression genes.
Gene ontology annotation
The gene ontology (GO) annotation of the 4728 unigenes in
NcdbEST was performed using Blast2GO (Conesa et al.
2005). A total of 2405 ESTs were successfully annotated with
GO terms: 967 consensus sequences and 1438 singletons. The
2405 ESTs were annotated with 7753 GO terms distributed
among the three main GO categories, namely biological
process (2333), molecular function (3056) and cellular
component (2364). The annotation showed that 49.13 % of
the unique genes were not annotated. As a result, the number
of ESTs represented by GO terms is probably
underestimated. A total of 1400 of the 1815 sequences
with no hit could not be annotated. In addition,
397 sequences with a BLAST hit could not be
annotated with a GO term, as most of these sequences matched
entries described only as protein, hypothetical protein or unknown.
The sequence distribution and percentage in each GO term and their
respective subcategories (cutoff point = 100.0) were calculated.
A percentage of 100 is defined as the total number of ESTs
that are assigned GO terms. It must be noted that the
percentages of the subcategories do not add up to 100 % because
many of the ESTs are involved in different classes of function
and are annotated with multiple GO terms. In the functional
classification with GO terms, 73.7 % of the 2405 unigenes with
assigned GO terms were assigned to cellular component, 64.1 % to
molecular function and 54.5 % to biological
process. In the cellular component category with 1772
unigenes, 12.18 % were assigned to intracellular organelle
part and 10.06 % to mitochondrion. A total of 1542 ESTs
were involved in molecular function, with 10.35 % showing
transferase activity and 9.98 % showing protein binding
activity. For biological process with 1311 ESTs, 11.6 % were
involved in transport while 6.57 % were involved in
regulation of cellular process (Supplementary 1).
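Why the category percentages exceed 100 % in total can be shown with a small multi-label count in Python. The toy annotation table, the category abbreviations (CC, MF, BP) and the function name are hypothetical; only the counts 1772, 1542, 1311 and 2405 come from the text.

```python
from collections import Counter


def category_percentages(annotations: dict) -> dict:
    """annotations maps a sequence id to the set of GO categories assigned to it.

    Percentages use the number of annotated sequences as 100 %, so a
    sequence carrying several categories is counted once in each of them.
    """
    annotated = {seq: cats for seq, cats in annotations.items() if cats}
    counts = Counter(cat for cats in annotated.values() for cat in cats)
    return {cat: round(100.0 * n / len(annotated), 1) for cat, n in counts.items()}


# Hypothetical toy data: est5 has no GO term and is excluded from the base.
toy = {
    "est1": {"CC", "MF", "BP"},
    "est2": {"CC", "MF"},
    "est3": {"CC"},
    "est4": {"BP"},
    "est5": set(),
}
toy_pct = category_percentages(toy)

# The same calculation applied to the counts reported for NcdbEST.
annotated_total = 2405
reported_counts = {"CC": 1772, "MF": 1542, "BP": 1311}
reported_pct = {
    cat: round(100.0 * n / annotated_total, 1) for cat, n in reported_counts.items()
}
print(toy_pct, reported_pct)
```

In both cases the percentages sum to well over 100 because sequences with terms in more than one category are counted in each category.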
EST-derived SSR marker
Data mining of the 4728 kelampayan unigenes identified 178
SSRs in 157 unigenes, of which 11 sequences
contained more than one SSR. The frequency of ESTs containing
SSRs in the NcdbEST was 3.3 %, with an overall SSR
density of one in 13.88 kb. The EST-derived SSRs were represented
by di-, tri- and tetra-nucleotide repeat motifs (Supplementary 2).
Among all the SSR motifs, GCT/AGC was the
most abundant repeat motif in the NcdbEST. Of the 178 SSRs, 113
(63.5 %) were tri-nucleotide repeats, 53
(29.8 %) were di-nucleotide repeats and 12 (6.7 %)
were tetra-nucleotide repeats. Four SSRs were linked
to cell wall formation genes (cellulase, expansin, α-tubulin and
XTH), three to lignin biosynthesis genes (F5H,
HCT and CCR) and one to a disease resistance protein gene.
These SSRs will be useful in the tree improvement programme
of kelampayan.
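The SSR frequency, density and motif-class shares can be recomputed from the reported counts, as in the Python sketch below. It assumes that frequency is the share of unigenes carrying at least one SSR and that density is the total unigene length divided by the number of SSRs; the 2.47 Mb unigene length is the value given in the Discussion, and the variable names are hypothetical.

```python
unigenes_total = 4728
unigenes_with_ssr = 157
ssr_total = 178
unigene_length_kb = 2470.0  # 2.47 Mb of unigene sequence (from the Discussion)

# Assumed definitions, chosen because they reproduce the reported values.
frequency_pct = round(100.0 * unigenes_with_ssr / unigenes_total, 1)
density_kb_per_ssr = round(unigene_length_kb / ssr_total, 2)

# SSR counts per motif class as reported in the Results.
class_counts = {"di": 53, "tri": 113, "tetra": 12}
class_pct = {k: round(100.0 * v / ssr_total, 1) for k, v in class_counts.items()}

print(frequency_pct, density_kb_per_ssr, class_pct)
```

The class counts sum to 178, which indicates that the reported percentages are shares of SSRs rather than of unigenes.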
Discussion
Analysis of the 10,368 ESTs from NcdbEST gave rise to 6622
high-quality sequences with approximately 3.17 Mb of data. A
total of 4728 unigenes (2.47 Mb) were identified, with 2628
being singletons and 2100 being contigs of at least
two ESTs, and this resulted in a library redundancy of 28.6 %.
Blast2GO analysis revealed that 2913 (61.6 %) of the 4728
unigenes showed significant similarity with protein sequences
from other organisms in the GenBank non-redundant protein
database. The remaining 1815 (38.4 %) did not show
any BLAST hit. The average size of the sequences not showing
significant similarity to the sequences in GenBank was 307 bp,
compared to 657 bp for those with significant similarity.
Sequence length might influence the BLAST result, with the
probability of no significant match to the protein database
increasing as sequence length decreases.
According to Bausher et al. (2003), up to 60 % of sequences
150 to 250 bp long can fail to return significant
matches. In the NcdbEST, the average lengths of the ESTs
with a poor match (E-value > 10⁻¹⁰) for the contigs and
singletons were 529 and 538 bp, respectively. Therefore, the
absence of a BLAST hit or a poor match was probably not
caused mainly by the length of the
EST. It is more probably caused by the lack of sequence
information in the public database. This also indicates that these
ESTs might have some specific roles in kelampayan which are
yet to be identified.
The most abundant transcript found in NcdbEST was 60S
ribosomal protein followed by 40S ribosomal protein. The
high proportion of ribosomal protein transcripts was expected, as
ribosomal proteins play a significant role in living systems and
function as intermediaries in protein translation. Among all the
plant cell wall-related genes, tubulin genes were the most
frequently detected in this database. There were 14 ESTs for
α-tubulin genes, 13 ESTs for β-tubulin genes and 2 for
γ-tubulin genes. The α- and β-tubulins are the major
constituents of microtubules, which are essential intracellular
structures and play an important role in fundamental mechanisms
such as cell division, vesicular transport, cell wall deposition
and signal propagation (Nogales 2000), while γ-tubulin is
required for microtubule nucleation at microtubule-organizing
centres (Horio and Oakley 2003). In Arabidopsis thaliana,
γ-tubulin has been shown to be essential for the formation of
spindle, phragmoplast and cortical microtubule arrays
(Pastuglia et al. 2006).
The high abundance of tubulin genes in NcdbEST possibly
implies the importance of these genes in the wood formation of
kelampayan. Other cell wall-related genes were
arabinogalactan (19 ESTs) and S-adenosylmethionine synthetase (15
ESTs).
Most of the genes in the monolignol/lignin biosynthesis
pathway are represented with relatively moderate or low abun-
dance in kelampayan unigenes. Among all the lignin genes
discovered in this database, C4H (12 ESTs) is the most abun-
dant EST. C4H is a member of the cytochrome P450
monooxygenase superfamily. Together with phenylalanine
ammonia-lyase (PAL) and 4-coumarate: coenzyme A ligase
(4CL), this enzyme directs the carbon flux to an array of im-
portant phenolic compounds in plants (Chapple 1998). C4H
catalyzes the first oxygenation step in phenylpropanoid bio-
synthesis, and the phenylpropanoid branch pathways lead to a
wide array of secondary products essential for UV protection,
differentiation of tissues and defence system, including lig-
nins, flavonoids, hydroxycinnamic esters and coumarins
(Whitbred and Schuler 2000). Ye (1997) stated the importance
of S-adenosyl methionine (SAM) in the methylation of lignin
precursors. In NcdbEST, 15 ESTs were found to encode
S-adenosylmethionine synthetase. The presence of all lignin genes in the
NcdbEST suggests active secondary cell wall biosynthesis in
the developing xylem tissues sampled.
Genes involved in the reactive oxygen species (ROS) gene
network were identified in this database, and these include
superoxide dismutase, monodehydroascorbate reductase,
glutathione reductase, catalase, glutathione peroxidase, NADPH
oxidase and peroxiredoxin. ROS production has been reported
in the pathogen defence response (Huang et al.
2011) and under most abiotic stresses, including salinity
(Abogadallah 2010), ozone exposure (Kangasjärvi et al.
2005), heat (Kolupaev et al. 2008) and osmotic stress (Xiong
et al. 2002). Other biotic and abiotic stress-related
proteins were also found in NcdbEST, for example abscisic
acid- and salicylic acid-related proteins, drought-induced protein,
defence-related protein and disease resistance protein. This
information is valuable for kelampayan tree improvement, as
this species is very site specific and suffers from defoliator and
stem borer attack at the early stage of growth.
The 4728 non-redundant unigenes were further mined for
the identification of EST-SSR markers. A total of 178 SSRs
were identified in 157 unigenes. The overall density of SSRs
in NcdbEST was one SSR in 13.88 kb, nearly one in 30
unigenes (3.3 %). This SSR frequency more accurately re-
flects the density of SSRs in the transcript regions of the ge-
nome. The overall EST-SSR density of kelampayan was com-
parable to that of poplar (1/14 kb) and Arabidopsis
(1/13.83 kb) (Cardle et al. 2000), and higher as compared to
that of loblolly pine (1/49.8 kb, Bérubé et al. 2006). The SSR
density in plants is believed to be negatively correlated with
the genome size (Varshney et al. 2005). Qiu et al. (2010)
assumed that the high frequency of SSR (1/1.77 kb) in
Ricinus communis L. (castor bean) EST sequences might be
related to its small genome size. Based on our unpublished
data, the genome size of kelampayan was approximately
800 Mb and this might explain the relatively lower SSR den-
sity in this species. Tri-nucleotide repeat motifs were the most
prevalent (63.5 %) class of SSR followed by di-nucleotide
(29.8 %) and tetra-nucleotide (6.7 %), and this was in
agreement with cereal species (Varshney et al. 2005). EST
sequences represent expressed genes and consist of exonic
regions, which are under heavy selection against frameshift
mutations because they are translated into proteins. As codons are
functional units of three nucleotides, indel mutations that add or
remove nucleotides in multiples of three do not perturb the
reading frame of a gene (Metzgar et al. 2000). Thus,
tri-nucleotide repeat motifs are expected to be the most
abundant class of SSR in ESTs (Bérubé et al. 2006), and the
abundance of these repeat motifs has been reported in cacao (Riju
et al. 2009) and castor bean (Qiu et al. 2010). Among all the
repeat motifs found in the NcdbEST, GCT/AGC was the most
abundant repeat motif, followed by the AT motif. Previous
findings had shown that AT is the most dominant repeat motif
in plants (Temnykh et al. 2001).
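The reading-frame argument above can be checked with a toy example in Python: gaining one unit of a tri-nucleotide repeat leaves the downstream codons unchanged, whereas gaining one unit of a di-nucleotide repeat shifts them. The coding sequences and function names below are hypothetical and serve only to illustrate the reasoning.

```python
def codons(seq: str) -> list:
    """Split a sequence into complete codons, reading from position 0."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def expand_repeat(seq: str, motif: str, extra_units: int = 1) -> str:
    """Insert extra copies of `motif` at its first occurrence (repeat gain)."""
    pos = seq.index(motif)
    return seq[:pos] + motif * extra_units + seq[pos:]


downstream = "GAAACCTGGTAA"  # hypothetical codons GAA ACC TGG TAA

# Coding sequence carrying a tri-nucleotide repeat, (GCT)5.
tri_cds = "ATG" + "GCT" * 5 + downstream
tri_expanded = expand_repeat(tri_cds, "GCT")

# Coding sequence carrying a di-nucleotide repeat, (AG)6.
di_cds = "ATG" + "AG" * 6 + downstream
di_expanded = expand_repeat(di_cds, "AG")

tri_frame_kept = codons(tri_expanded)[-4:] == codons(tri_cds)[-4:]
di_frame_kept = codons(di_expanded)[-4:] == codons(di_cds)[-4:]
print(tri_frame_kept, di_frame_kept)
```

Here the last four codons are preserved after the tri-nucleotide expansion but not after the di-nucleotide expansion, which is the kind of frameshift that selection in coding regions acts against.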
Conclusion
The NcdbEST is currently used as baseline information for the
characterization of full-length cDNA and the genomic se-
quence of candidate genes, EST-SSR development, SNP dis-
covery in candidate genes and association mapping of
kelampayan. We are also using this information as a reference
for the next-generation sequencing study of this species. With
these data, not only are we able to better understand the mech-
anism controlling wood formation in a tropical timber tree
species but we are also able to develop genetic markers for
the trait of interest for the high-throughput marker-assisted
breeding of kelampayan.
Acknowledgments This work is part of the joint Industry-University
Partnership Programme, a research programme funded by the Sarawak
Forestry Corporation (SFC), Sarawak Timber Association (STA) and
Universiti Malaysia Sarawak (UNIMAS) under grant no. RACE/a(2)/
884/2012(02) and GL(F07)/ 06/2013/STA-UNIMAS(06).
References
Abogadallah GM (2010) Insights into the significance of antioxidative
defense under salt stress. Plant Signal Behav 5(4):369–374.
doi:10.4161/psb.5.4.10873
Bausher M, Shatters R, Chaparro J, Dang P, Hunter W, Niedz R (2003)
An expressed sequence tag (EST) set from Citrus sinensis L. Osbeck
whole seedlings and the implications of further perennial source
investigations. Plant Sci 165:415–422. doi:10.1016/S0168-
9452(03)00202-4
Bérubé Y, Jun Zhuang J, Rungis D, Ralph S, Bohlmann J, Ritland K
(2006) Characterization of EST-SSRs in loblolly pine and spruce.
Tree Genet Genom. doi:10.1007/s11295-006-0061-1
Bhalerao R, Nilsson O, Sandberg G (2003) Out of the woods: forest
biotechnology enters the genomic era. Curr Opin Biotechnol 14:
206–213. doi:10.1016/S0958-1669(03)00029-6
Burke J, Davison D, Hide W (1999) d2_cluster: a validated method for
clustering EST and full-length cDNA sequences. Genome Res 9:
1135–1142. doi:10.1101/gr.9.11.1135
Cardle L, Ramsay L, Milbourne D, Macaulay M, Marshall D, Waugh R
(2000) Computational and experimental characterization of physi-
cally clustered simple sequence repeats in plants. Genetics 156:847–
854
Chapple C (1998) Molecular genetic analysis of plant cytochrome P450-
dependent monooxygenases. Annu Rev Plant Physiol Plant Mol
Biol 49:311–343. doi:10.1146/annurev.arplant.49.1.311
Chou A, Burke J (1999) CRAWview: for viewing splicing variation, gene
families and polymorphism in clusters of ESTs and full-length se-
quences. Bioinformatics 15:376–381. doi:10.1093/bioinformatics/
15.5.376
Conesa A, Götz S, García-Gómez JM, Terol J, Talón M, Robles M (2005)
Blast2Go: a universal tool for annotation, visualization and analysis
in functional genomics research. Bioinformatics 21:3674–3676. doi:
10.1093/bioinformatics/bti610
Ewing B, Green P (1998) Base calling of automated sequencer traces
using Phred. II. error probabilities. Genome Res 8:186–194. doi:
10.1101/gr.8.3.175
Hertzberg M, Aspeborg H, Schrader J, Andersson A, Erlandsson R,
Blomqvist K, Bhalerao R, Uhlén M, Teeri TT, Lundeberg J,
Sundberg B, Nilsson P, Sandberg G (2001) A transcriptional roadmap
to wood formation. Proc Natl Acad Sci U S A 98:14732–14737. doi:
10.1073/pnas.261293398
Horio T, Oakley BR (2003) Expression of Arabidopsis gamma-tubulin in
fission yeast reveals conserved and novel functions of gamma-tubulin.
Plant Physiol 133:1926–1934. doi:10.1104/pp.103.027367
Hu WJ, Harding SA, Lung J, Popko JL, Ralph J (1999) Repression of
lignin biosynthesis promotes cellulose accumulation and growth in
transgenic trees. Nat Biotechnol 17:808–812
Huang J, Czymmek KJ, Caplan JL, Sweigard JA, Donofrio NM (2011)
HYR-1 mediated detoxification of reactive oxygen species is re-
quired for full virulence in the rice blast fungus. PLoS Pathog
7(4):e1001335. doi:10.1371/journal.ppat.1001335
Joker D (2000) SEED LEAFLET Neolamarckia cadamba (Roxb.)
Bosser (Anthocephalus chinensis (Lam.) A. Rich. ex Walp.)
(http://curis.ku.dk/portal-life/files/20648324/neolamarckia_cadamba_int.pdf)
Kangasjärvi J, Jaspers P, Kollist H (2005) Signalling and cell death in
ozone-exposed plants. Plant, Cell Environ 28:1021–1036. doi:10.
1111/j.1365-3040.2005.01325.x
Kolupaev YY, Karpets YV, Kosakovska IV (2008) The importance of
reactive oxygen species in the induction of plant resistance to heat
stress. Gen Appl Physiol Special Issue 34(3–4):251–266
Lai PS, Ho WS, Pang SL (2013) Development, characterization and
cross-species transferability of expressed sequence tag-simple se-
quence repeat (EST-SSR) markers derived from kelampayan tree
transcriptome. Biotechnology 12(6):225–235
Li XG, Wu HX, Dillon SK, Southerton SG (2009) Generation and anal-
ysis of expressed sequence tags from six developing xylem libraries
in Pinus radiata D. Don. BMC Genom 10:41. doi:10.1186/1471-2164-10-41
Metzgar D, Bytof J, Wills C (2000) Selection against frameshift muta-
tions limits microsatellite expansion in coding DNA. Genome Res
10:72–80
Miller RT, Christoffels AG, Gopalakrishnan C, Burke J, Ptitsyn AA,
Broveak TR, Hide WA (1999) A comprehensive approach to clus-
tering of expressed human gene sequence: the sequence tag align-
ment and consensus knowledge base. Genome Res 9:1143–1155.
doi:10.1101/gr.9.11.1143
Nogales E (2000) Structural insights into microtubule function. Annu
Rev Biochem 69:277–302
Pastuglia M, Azimzadeh J, Goussot M, Camilleri C, Belcram K, Evrard
JL, Schmit AC, Guerche P, Bouchez D (2006) γ-Tubulin is essential
for microtubule organization and development in Arabidopsis. Plant
Cell 18:1412–1425. doi:10.1105/tpc.105.039644
Paux E, Tamasloukht MB, Ladouce N, Sivadon P, Grima-Pettenati J
(2004) Identification of genes preferentially expressed during wood
formation in Eucalyptus. Plant Mol Biol 55:263–280.
doi:10.1007/s11103-004-0621-4
Qiu LQ, Yang C, Tian B, Yang JB, Liu AZ (2010) Exploiting EST
databases for the development and characterization of EST-SSR
markers in castor bean (Ricinus communis L.). BMC Plant Biol
10:278. doi:10.1186/1471-2229-10-278
Riju A, Rajesh MK, Sherin PTPF, Chandrasekar A, Apshara SE,
Arunachalam V (2009) Mining of expressed sequence tag
libraries of cacao for microsatellite markers using five com-
putational tools. J Genet 88(2):217–225. doi:10.1007/s12041-
009-0030-1
Sterky F, Regan S, Karlsson J, Hertzberg M, Rohde A, Holmberg A,
Amini B, Bhalerao R, Larsson M, Villarroel R, Van Montagu M,
Sandberg G, Olsson O, Teeri TT, Boerjan W, Gustafsson P, Uhlén
M, Sundberg B, Lundeberg J (1998) Gene discovery in the wood-
forming tissues of poplar: analysis of 5692 expressed sequence tags.
Proc Natl Acad Sci U S A 95:13330–13335
Tchin BL, Ho WS, Pang SL, Ismail J (2012) Association genetics of the
cinnamyl alcohol dehydrogenase (CAD) and cinnamate 4-
hydroxylase (C4H) genes with basic wood density in
Neolamarckia cadamba. Biotechnology 11(6):307–317
Temnykh S, DeClerck G, Lukashova A, Lipovich L, Cartinhour S,
McCouch S (2001) Computational and experimental analysis of
microsatellites in rice (Oryza sativa L.): frequency, length variation,
transposon associations and genetic marker potential. Genome Res
11:1441–1452. doi:10.1101/gr.184001
Thiel T, Michalek V, Graner A (2003) Exploiting EST databases for the
development and characterization of gene-derived SSR-markers in
barley (Hordeum vulgare L.). Theor Appl Genet 106:411–422
Tiong SY, Ho WS, Pang SL, Ismail J (2014) Nucleotide diversity and
association genetics of xyloglucan endotransglycosylase/hydrolase
(XTH) and cellulose synthase (CesA) genes in Neolamarckia
cadamba. J Biol Sci 14(4):267–375
van der Hoevan R, Ronning C, Giovannoni J, Martin G, Tanksley S
(2002) Deductions about the number, organization and evolution
of genes in the tomato genome based on analysis of a large
expressed sequence tag collection and selective genomic sequenc-
ing. Plant Cell 14:1441–1456. doi:10.1105/tpc.010478
Varshney RK, Graner A, Sorrells ME (2005) Genic microsatellite
markers in plants: features and applications. Trends Biotechnol 23:
48–55. doi:10.1016/j.tibtech.2004.11.005
Whetten R, Sun YH, Zhang Y, Sederoff R (2001) Functional genomics
and cell wall biosynthesis in loblolly pine. Plant Mol Biol 47:275–
291. doi:10.1007/978-94-010-0668-2
Whitbred JM, Schuler MA (2000) Molecular characterization of
CYP73A9 and CYP82A1 P450 genes involved in plant defense in
pea. Plant Physiol 124:47–58. doi:10.1104/pp.124.1.47
Xiong L, Schumaker KS, Zhu JK (2002) Cell signaling during cold,
drought and salt stress. Plant Cell 14:S165–S183
Ye ZH (1997) Association of caffeoyl coenzyme A 3-O-methyltransfer-
ase expression with lignifying tissues in several dicot plants. Plant
Physiol 115:1341–1350. doi:10.1104/pp.115.4.1341