De novo assembly of the pennycress (Thlaspi arvense) transcriptome provides tools for the development of a winter cover crop and biodiesel feedstock

Kevin M. Dorn¹, Johnathon D. Fankhauser¹, Donald L. Wyse² and M. David Marks¹,*
¹Department of Plant Biology, University of Minnesota, 1445 Gortner Avenue, 250 Biological Sciences Center, Saint Paul,
MN 55108, USA, and
²Department of Agronomy and Plant Genetics, University of Minnesota, 411 Borlaug Hall, 1991 Upper Buford Circle, Saint
Paul, MN 55108, USA
Received 13 March 2013; revised 1 June 2013; accepted 10 June 2013.
*For correspondence (e-mail marks004@umn.edu).
SUMMARY
Field pennycress (Thlaspi arvense L.) has potential as an oilseed crop that may be grown during fall
(autumn) and winter months in the Midwestern United States and harvested in the early spring as a biodie-
sel feedstock. There has been little agronomic improvement in pennycress through traditional breeding.
Recent advances in genomic technologies allow for the development of genomic tools to enable rapid
improvements to be made through genomic assisted breeding. Here we report an annotated transcriptome
assembly for pennycress. RNA was isolated from representative plant tissues, and 203 million unique Illumi-
na RNA-seq reads were produced and used in the transcriptome assembly. The draft transcriptome assem-
bly consists of 33 873 contigs with a mean length of 1242 bp. A global comparison of homology between
the pennycress and Arabidopsis transcriptomes, along with four other Brassicaceae species, revealed a high
level of global sequence conservation within the family. The final assembly was functionally annotated,
allowing for the identification of putative genes controlling important agronomic traits such as flowering
and glucosinolate metabolism. Identification of these genes leads to testable hypotheses concerning their
conserved function and to rational strategies to improve agronomic properties in pennycress. Future work
to characterize isoform variation between diverse pennycress lines and develop a draft genome sequence
for pennycress will further direct trait improvement.
Keywords: Thlaspi arvense, pennycress, RNAseq, de novo assembly, comparative transcriptomics, transla-
tional research.
INTRODUCTION
Plant-derived biofuels have the potential to reduce carbon
emissions and provide a renewable source of energy (Hill
et al., 2006). Replacement of fossil fuels with those derived
from plant biomass or oilseeds holds promise to slow global
climate change due to anthropogenic release of
greenhouse gases. The increased access and affordability
of next-generation sequencing resources (i.e. genomics
and transcriptomics) provides new approaches for rapidly
employing new plant species for use as biofuel feedstock.
Many plant species are being considered not only as new
sources of biofuel, but also as components of the land-
scape that improve the environment. Species that have
only been recently removed from the wild will need to be
modified in ways that remove their weedy traits while
enhancing their agronomic properties. Application of next-
generation sequencing resources in the development of
candidate species should allow rapid advancement and
improvement in these species (Varshney et al., 2009).
Biofuel crop species that do not displace land for food
production or encourage the destruction of natural lands
are especially attractive as alternatives to the biofuel stan-
dard: corn-derived ethanol (Fargione et al., 2008; Tilman
et al., 2009). In addition, new species that provide ecosys-
tem services to reduce the effects of large-scale intensive
farming are essential to ensure food security. This is
especially important in the Midwestern United States,
where large portions of the land dedicated to agriculture
are left barren for almost half the year, from the time of
harvest until the next crop establishment. Planting winter
annual crops following the fall harvest has been shown to
© 2013 The Authors
The Plant Journal © 2013 John Wiley & Sons Ltd
The Plant Journal (2013) doi: 10.1111/tpj.12267
alleviate soil degradation, topsoil loss through erosion
and nutrient run-off, to help prevent water pollution by
scavenging excess nitrogen from the soil, and to limit
spring weed growth (Dabney et al., 2001; Snapp et al.,
2005).
Pennycress is especially attractive because it provides a
winter cover that uses excess nitrogen and slows soil ero-
sion, provides a spring cover that suppresses weeds, and
yields a harvestable oilseed. The combination of these
traits makes pennycress one of the best candidate biofuel
plant species. Pennycress may be harvested in the spring
using conventional machinery, and yields up to 1345 kg
seed/hectare (Best and Mcintyre, 1975; Mitich, 1996).
Pennycress seeds are high in oils that can easily be
converted into biodiesel (Moser et al., 2009a,b; Boateng
et al., 2010; Isbell and Cermak, 2012; Hojilla-Evangelista
et al., 2013). A recent study showed that pennycress may
be planted as a winter cover crop after corn in the fall
(autumn) and harvested in the spring without impeding
subsequent soybean cultivation, or dramatically affecting
soybean yield, protein content and oil quantity and quality
(Phippen and Phippen, 2012). Thus, the use of pennycress
does not require any new land or displace traditional food
crops. A recent life cycle assessment indicated that penny-
cress-derived fuels could qualify as advanced biofuels
under the US Environmental Protection Agency Renewable
Fuels Standard (Fan et al., 2013).
While the inherent agronomic properties of pennycress
are already good, efforts are required to maximize oilseed
yield, content and composition while reducing seed dor-
mancy and glucosinolate content. Previous studies com-
pared various genetic aspects of pennycress with those of
its close relative Thlaspi caerulescens, which hyper-accu-
mulates zinc and cadmium (Hammond et al., 2006; Milner
and Kochian, 2008). Analysis of over 600 pennycress ESTs
revealed a close relationship between pennycress and Ara-
bidopsis (Sharma et al., 2007). The limited genetic diver-
gence between Arabidopsis and its wild relatives, such as
pennycress, facilitates translation of basic knowledge
gleaned from years of Arabidopsis research.
Here we report the sequencing, de novo assembly and
annotation of the transcriptome of several pennycress tis-
sues, including roots, leaves, shoots, flowers and seed
pods. The draft transcriptome consists of 33 873 transcripts.
Comparative analyses versus other Brassicaceae
species showed a high degree of conservation, which
serves as a validation of the assembly. Comparative analy-
sis versus Arabidopsis thaliana allowed us to identify
many pennycress orthologs that are probably responsible
for controlling flowering time and glucosinolate metabo-
lism. This pennycress dataset, together with further devel-
opment of genomic tools and germplasm resources will
provide unprecedented tools for starting a breeding pro-
gram.
RESULTS
Generation of RNA-seq reads and de novo assembly
RNA was isolated from five pennycress tissue types and
sequenced on a single lane of the Illumina HiSeq 2000 platform
(100 bp paired-end), yielding 374 725 460 reads with
a mean quality score >Q30 (see Experimental Procedures).
After removing duplicate reads, trimming adaptors, and fil-
tering for low quality sequences, a total of 203 003 444
unique, clean reads were obtained, with a mean length of
87.6 bp. The full, unfiltered short-read dataset was depos-
ited in the National Center for Biotechnology Information
(NCBI) Short Read Archive under accession number
SRR802670.
The filtered reads were de novo assembled using the
CLC Genomics Workbench software package. The effect of
varying de novo assembly parameters was examined by
performing 41 separate assemblies. Word size (k-mer),
match length (the percentage length of a read required to
match the initial contig build), and the match percentage
(the percentage sequence identity required to match a read
to the initial contig build) were varied, and the effect on
various assembly statistics was examined. Regardless of
match length and percentage, assemblies with smaller
word sizes had smaller mean contig lengths. Assemblies
with smaller word sizes also assembled a few contigs
that were significantly larger (16–18 kb) compared to
assemblies with word sizes ≥52 (15 kb). These large contigs
are probably mis-assembled because each contained
sequences similar to multiple Arabidopsis genes. The
assemblies created with 95% match length and 95% match
percentage parameters were chosen for further compari-
son of how word size affected the relative assembly qual-
ity. Increasing the word size caused the percentage of
reads used in the final assembly and the mean contig
length to increase, while decreasing the number of contigs
assembled. The assemblies with larger word sizes also had
a higher percentage of contigs that had at least 1 BLASTX
hit to at least one Arabidopsis peptide. The statistics
regarding the assembly optimization and BLAST results for
the assembly with word size 64, 95% match length and
95% match percentage are shown in Table S1.
The assembly with a word size of 64, 95% match length,
and 95% match percentage was chosen for further analysis
and annotation due to the high quality of assembly statis-
tics and high proportion of assembled transcripts with sig-
nificant matches to Arabidopsis genes compared to the
other assemblies. A summary of sequencing reads and
assembly statistics is shown in Table 1. A total of 33 874
contigs were assembled using these parameters. This
includes a spiked phiX174 genome sequence that serves
as a sequencing control, which was subsequently removed
from the final assembly and total assembly length. The
mean contig length was 1242 bp, with minimum and maxi-
mum contig lengths of 215 and 15 516 bp, respectively.
The size distribution of contig lengths is shown in Fig-
ure 1(a). The N50 was 1729 bp, meaning all contigs this
size or larger encompassed 50% of the total 42 069 800 bp
assembly length. This Transcriptome Shotgun Assembly
project has been deposited at DDBJ/EMBL/GenBank under
the accession GAKE00000000. The version described in this
paper is the first version, GAKE01000000. Approximately
1.5% of the contigs were excluded from the archives
due to the number of ambiguous nucleotides in those
sequences. The complete, annotated FASTA file is available at
http://www.cbs.umn.edu/lab/marks/pennycress/transcriptome.
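The N50 definition given above can be illustrated with a short sketch. The function name and the toy contig lengths below are illustrative assumptions, not the authors' code or data; the assembly itself was produced with CLC Genomics Workbench.

```python
def n50(lengths):
    """Return the N50 of a list of contig lengths.

    The N50 is the length L such that contigs of length >= L together
    account for at least half of the total assembly length.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    total = sum(lengths)
    running = 0
    # Walk from the longest contig downwards until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical toy assembly (lengths in bp), not taken from the paper.
toy_contigs = [200, 300, 400, 500, 600]
print(n50(toy_contigs))
```

For the toy list, the total is 2000 bp, and the two longest contigs (600 and 500 bp) already cover 1100 bp, so the N50 is 500 bp.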
Annotation and functional characterization of pennycress
transcripts
The pennycress transcriptome sequences were annotated
using Blast2GO Pro (Conesa et al., 2005). The database
used in this analysis only contains well-characterized
sequences and does not include sequences from
resources such as newly assembled draft genomes. The
taxonomic distribution from this analysis was examined
(Figure 1b). Over 20 000 transcripts had top hits to an
Arabidopsis species, including 11 936 transcripts with a
top hit to A. thaliana, and 11 364 transcripts with a top
hit to Arabidopsis lyrata. Almost 75% of the pennycress transcripts had top BLAST hits within the
Brassicaceae family. Species of the sister genus, Brassica, had a large proportion of these top
hits: Brassica rapa (283), Brassica napus (233) and Brassica oleracea (164). Top matches to plant
sequences outside the Brassicaceae were found for 713 transcripts. Overall, approximately 23% of
the transcripts either had top BLAST hits to non-plant sequences or lacked significant similarity
to any sequence in the public database (Figure 1b). The complete dataset from the final assembly,
including annotations and associated GO terms from this analysis, is provided in Table S2.
Annotations and associated cellular component, molec-
ular function and biological process gene ontology (GO)
terms were produced for each pennycress transcript. A
total of 27 456 transcripts had a significant hit in the public databases (BLAST E-value ≤ 0.01),
and 26 797 transcripts received at least one GO annotation. The most
highly represented biological process GO terms were oxi-
dation/reduction processes (1403 transcripts) and DNA-
dependent regulation of transcription (1255 transcripts).
GO terms associated with response to cold (727 tran-
scripts), the vegetative to reproductive phase transition of
the meristem (462 transcripts) and the regulation of
flower development (411) were also highly represented.
The 50 most highly represented GO terms are shown in
Figure S1.
Table 1 Illumina RNA-seq reads and de novo assembly statistics

Parameter                                    Value
Number of raw unfiltered reads               374 725 460
Total length of reads pre-filtering (bp)     37 472 546 000
Total length of reads post-filtering (bp)    17 799 652 172
Number of trimmed unique reads               203 003 444
Number of contigs                            33 873
Mean contig length (bp)                      1242
Minimum/maximum contig length (bp)           215/15 516
N50 (bp)                                     1729
Total assembly length (bp)                   42 069 800

A pooled RNA sample consisting of representative plant tissues
was sequenced using the Illumina HiSeq 2000 platform
(2 × 100 bp). Duplicate reads were removed first; the remaining reads
were then filtered for quality score, trimmed, and assembled into contigs
using the de novo assembly tool in CLC Genomics Workbench.
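The figures in Table 1 can be cross-checked with simple arithmetic. The sketch below uses only the values reported in the table; the variable names are illustrative assumptions. The mean length of the filtered reads implied by the table comes out close to the 87.6 bp stated in the text, and the mean contig length rounds to the reported 1242 bp.

```python
# Values copied from Table 1.
raw_reads = 374_725_460
raw_bp = 37_472_546_000
clean_reads = 203_003_444
clean_bp = 17_799_652_172
n_contigs = 33_873
assembly_bp = 42_069_800

raw_read_len = raw_bp / raw_reads          # length of each raw read (bp)
clean_read_len = clean_bp / clean_reads    # mean read length after filtering (bp)
reads_retained = clean_reads / raw_reads   # fraction of reads kept after filtering
mean_contig_len = assembly_bp / n_contigs  # mean contig length (bp)

print(f"raw read length:     {raw_read_len:.1f} bp")
print(f"clean read length:   {clean_read_len:.2f} bp")
print(f"reads retained:      {reads_retained:.1%}")
print(f"mean contig length:  {mean_contig_len:.1f} bp")
```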
[Figure 1 image not reproduced. Panel (a): histogram of number of contigs versus contig length (bp); average = 1,242 bp. Panel (b): top-hit taxa, including Arabidopsis spp. (23,364), Brassica spp. and other Brassicaceae.]
Figure 1. Contig length distribution and taxonomic distribution of top annotation hits.
(a) Histogram of the length distribution of assembled contigs.
(b) Taxonomic distribution of the top BLAST hits for each transcript in the de novo transcriptome assembly from Blast2GO. Only taxonomic data for the top BLAST result of each transcript are shown.
Comparative transcriptomics of pennycress versus other
Brassicaceae species
Previous molecular analyses of the Brassicaceae have divided the family into three basic lineages,
recently reviewed by Franzke et al. (2011). Thlaspi arvense is a member of expanded lineage 2, and
is more closely related to Thellungiella halophila and other Eutrema/Thellungiella species than to
the Brassica species in lineage 2 (Figure 2a). Arabidopsis thaliana, A. lyrata and Capsella rubella
are members of lineage 1. To explore the relationship between pennycress and other Brassicaceae at
the transcriptome level, we compared the assembled translated pennycress transcriptome to a peptide
database derived from the sequenced genomes of A. thaliana, A. lyrata, C. rubella, B. rapa and
T. halophila. A BLASTx comparison of the pennycress transcriptome with this peptide database showed
that 16 298 of the 33 873 pennycress contigs had significant (e ≤ 0.05) top hits to T. halophila
(Figure 2b). B. rapa had the next highest number of top hits (4972), with the lineage 1 species
having approximately 3000 top hits each. A BLASTx comparison of the remaining sequences without
significant hits to one of the five Brassicaceae species revealed that 3386 sequences had no
significant hit in the NCBI non-redundant peptide database. This BLAST search returned 779
pennycress contigs with significant hits in the non-redundant peptide database, including 424 with
hits to fungal sequences. Many of these fungal hits (273) were to fungal plant pathogens, including
Fusarium, Pyrenophora, Phaeosphaeria, Leptosphaeria and Bipolaris species (Table S3). These fungal
transcripts were left in the assembly as the association between pennycress and these fungi may be
informative in future analyses.
[Figure 2 image not reproduced. Panel (a): phylogeny showing lineage 1 (Arabidopsis spp., Capsella rubella), lineage 2 (Brassica spp., Thellungiella halophila, Thlaspi arvense) and lineage 3 (Euclidieae, Anastaticeae). Panel (b): top BLASTx hits by species, including Thellungiella halophila (16298) and Arabidopsis lyrata (3105), with categories for Brassica rapa, Capsella rubella, Arabidopsis thaliana and significant nr BLASTx hits. Panel (c): pairwise comparisons of Arabidopsis thaliana, Arabidopsis lyrata, Brassica rapa, Thellungiella halophila and Capsella rubella with Thlaspi arvense.]
Figure 2. Comparative transcriptomics of pennycress versus five Brassicaceae species.
(a) Representation of the Brassicaceae phylogeny, adapted from Beilstein et al. (2010) and Franzke et al. (2011).
(b) BLASTx comparison of the pennycress transcriptome assembly versus A. thaliana, Arabidopsis lyrata, Brassica rapa, Capsella rubella and Thellungiella halophila. The top BLAST hit (e ≤ 0.05) for each pennycress transcript versus the five species is shown. Contigs without significant hits were then compared to the NCBI peptide non-redundant database.
(c) Five pairwise tBLASTn comparisons of Brassicaceae species to the pennycress transcriptome assembly. Sequences with significant homology (e ≤ 0.05 and positive match percentage ≥70%) shared between the five Brassicaceae species and pennycress (Thlaspi arvense) are shown in the inner circle.
To examine the degree of conservation between pennycress and other sequenced Brassicaceae species,
five pairwise tBLASTn comparisons were performed between pennycress and each of the five
Brassicaceae species (Figure 2c). Thellungiella halophila had the highest number of sequences with
significant hits to the pennycress database (e ≤ 0.05 and ≥70% positive match percentage), together
with the greatest proportion of peptides with significant matches (24 411/29 284). All five species
had at least 72% of their proteins significantly represented in the pennycress database. All five
Brassicaceae genomes share 14 677 of the pennycress transcripts (e ≤ 0.05 and ≥70% positive match
percentage). An additional 4547 sequences were shared between pennycress and at least one of the
other Brassicaceae species. The tBLASTn results from this analysis are provided in Table S4. A
global view of the top pennycress transcripts and the similarity to each A. thaliana peptide
(primary transcripts only) is shown in Figure S2. Of the 27 416 Arabidopsis loci, 14 186 had
transcripts with >70% similarity and >70% coverage in the pennycress transcriptome.
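The two significance thresholds used in these comparisons (e-value and positive match percentage) can be applied to tabular BLAST output with a few lines of code. This is a minimal sketch under stated assumptions: the four-column layout (query, subject, e-value, percentage of positive-scoring matches, as produced by a custom tabular format such as `-outfmt "6 qseqid sseqid evalue ppos"` in BLAST+), the function name and the sequence identifiers are all hypothetical and are not taken from the paper.

```python
import csv
import io


def significant_hits(tsv_text, max_evalue=0.05, min_ppos=70.0):
    """Yield (query, subject) pairs that pass both thresholds.

    Assumes tab-separated columns in this order:
    qseqid, sseqid, evalue, ppos (percentage of positive-scoring matches).
    """
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    for row in reader:
        if not row:
            continue  # skip blank lines
        qseqid, sseqid, evalue, ppos = row[:4]
        if float(evalue) <= max_evalue and float(ppos) >= min_ppos:
            yield qseqid, sseqid


# Hypothetical example rows; identifiers are placeholders.
example = (
    "pep_A\tcontig_1\t1e-50\t92.5\n"   # passes both thresholds
    "pep_B\tcontig_2\t0.2\t88.0\n"     # fails on e-value
    "pep_C\tcontig_3\t1e-10\t55.0\n"   # fails on positive match percentage
)
print(list(significant_hits(example)))
```

Counting the distinct queries that survive this filter, divided by the number of peptides in the query proteome, would give a proportion of the kind reported above for T. halophila (24 411/29 284).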
To more closely examine the level of global sequence
conservation between pennycress and A. thaliana, we fur-
ther examined a BLASTx comparison of the pennycress
transcriptome assembly to the Arabidopsis TAIR10 peptide
database (primary transcripts only). The relative homology
of each predicted peptide to the most similar Arabidopsis
protein was measured by the percentage of positive
sequence similarity (Figure 3a) and percentage coverage
(Figure 3b). A smooth scatter plot representing the per-
centage similarity and percentage coverage for each pen-
nycress sequence compared to the closest Arabidopsis
peptide sequence is shown in Figure 3(c). A large propor-
tion (>85%) of transcripts show at least 70% similarity to
an Arabidopsis protein. A total of 16 556 pennycress pre-
dicted peptides had at least one match to an Arabidopsis
gene with >70% similarity/>70% coverage (Figure 3c,
boxes), of which 4846 pennycress transcripts showed
≥95% similarity and coverage, 9685 transcripts showed
between 80 and 95% similarity and coverage, and 2025
transcripts showed between 70 and 80% similarity and
coverage. A total of 17 317 transcripts showed <70% similarity
and coverage, and 4783 transcripts lacked a significant
BLASTx hit (e ≤ 0.05) to an Arabidopsis peptide.
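The binning of transcripts by similarity and coverage can be sketched as follows. The function name, the bin labels and the exact handling of boundary values are assumptions made for illustration; the text does not state how values falling exactly on 70, 80 or 95% were treated. The final lines check that the counts reported above are internally consistent.

```python
def bin_transcript(similarity, coverage):
    """Assign a transcript to a similarity/coverage bin.

    A transcript belongs to a bin only if BOTH its percentage similarity and
    its percentage coverage reach that bin's threshold, so the lower of the
    two values decides. Boundary handling is an assumption of this sketch.
    """
    lower = min(similarity, coverage)
    if lower >= 95:
        return ">=95"
    if lower >= 80:
        return "80-95"
    if lower > 70:
        return "70-80"
    return "<=70 or no hit"


print(bin_transcript(96.0, 85.0))  # limited by coverage, so falls in the 80-95 bin

# Counts reported in the text.
reported = {">=95": 4846, "80-95": 9685, "70-80": 2025}
above_70 = sum(reported.values())
print(above_70)            # transcripts with >70% similarity and coverage
print(above_70 + 17_317)   # plus the remaining transcripts
```

The three upper bins sum to 16 556, and adding the 17 317 remaining transcripts gives 33 873, the total number of contigs in the assembly.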
Identification of candidate pennycress genes controlling
flowering time and glucosinolate levels
The close evolutionary relationship between pennycress
and Arabidopsis enabled identification of potential penny-
cress orthologs responsible for controlling important
agronomic traits such as time to flower and glucosinolate
metabolism. For each pennycress transcript, the top 20
BLASTx hits against the Arabidopsis peptide database were
mined for hits to Arabidopsis genes known to control these
traits. For these transcripts, the longest theoretical transla-
tion was obtained to explore protein sequence conserva-
tion. The nucleotide sequences and predicted peptides for
each sequence, along with the amino acid alignment to
their respective Arabidopsis homolog, are shown in Data
S1.
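The mining step described above, searching the top hits of each transcript for genes of interest, might look like the following sketch. The function name, the input layout (pairs of transcript and Arabidopsis peptide identifier, ordered best hit first for each transcript) and the contig names are hypothetical; the two locus identifiers are those given in this paper for FLOWERING LOCUS C and FLOWERING LOCUS T.

```python
from collections import defaultdict

# Locus identifiers as listed in the text (upper-cased for matching).
LOCI_OF_INTEREST = {
    "AT5G10140": "FLOWERING LOCUS C",
    "AT1G65480": "FLOWERING LOCUS T",
}


def candidate_transcripts(hits, loci, top_n=20):
    """Map each gene of interest to transcripts with that gene among their top hits.

    hits: iterable of (transcript_id, peptide_id) pairs, ordered best-first
    within each transcript. Peptide identifiers such as 'AT5G10140.1' are
    reduced to their locus ('AT5G10140') before matching.
    """
    seen = defaultdict(int)     # number of hits inspected per transcript
    found = defaultdict(list)   # gene name -> list of transcript ids
    for transcript, peptide in hits:
        seen[transcript] += 1
        if seen[transcript] > top_n:
            continue  # only the top_n hits of each transcript are mined
        locus = peptide.upper().split(".")[0]
        gene = loci.get(locus)
        if gene is not None and transcript not in found[gene]:
            found[gene].append(transcript)
    return dict(found)


# Hypothetical hit list; contig names are placeholders.
example_hits = [
    ("contig_1", "AT5G10140.1"),
    ("contig_1", "AT1G01060.1"),
    ("contig_2", "AT1G65480.1"),
]
print(candidate_transcripts(example_hits, LOCI_OF_INTEREST))
```

Candidates found this way would still need the manual checks described in the text, such as inspecting the longest predicted translation and its alignment to the Arabidopsis protein.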
To investigate the conservation of the flowering time
pathway in pennycress, we attempted to reconstruct the
flowering pathway in Arabidopsis using predicted peptides
from the transcriptome assembly (Jung and Müller, 2009).
Full-length predicted peptides (methionine to stop codon)
with high homology to their respective Arabidopsis pep-
tides were obtained for VERNALIZATION1 (At3G18990),
VERNALIZATION2 (At4G16845), VERNALIZATION INSENSI-
TIVE 3 (At5G57380), LIKE HETEROCHROMATIN PROTEIN 1
(At5G17690), FLOWERING LOCUS C (At5G10140), SHORT
VEGETATIVE PHASE (At2G22540), TWIN SISTER OF FT
(At4G20370), AGAMOUS-LIKE 19 (At4G22950), SUPPRESSOR OF OVEREXPRESSION OF CO 1 (At2G45660),
FLOWERING LOCUS T (At1G65480), LATE ELONGATED HYPOCOTYL
(At1G01060), TIMING OF CAB EXPRESSION 1 (At5G61380),
PSEUDO-RESPONSE REGULATOR 7 (At5G02810), PSEUDO-
RESPONSE REGULATOR 9 (At2G46790), FLAVIN-BINDING
KELCH REPEAT F BOX 1 (At1G68050), and GIGANTEA
(At1G22770). Partial or incomplete matches were found for
MULTICOPY SUPPRESSOR OF IRA1 (At5G58230), ATBZIP14/
FD (At4G35900), TERMINAL FLOWER 1 (At5G03840), APE-
TALA1 (At1G69120), FRUITFULL (At5G60910), CIRCADIAN
CLOCK ASSOCIATED 1 (At2G46830), and CONSTANS
(At5G15840). None of the pennycress transcripts had a top
hit to the Arabidopsis FRIGIDA (FRI) locus (At4G00650);
however, we found a 613 amino acid predicted peptide
similar to B. napus FRI.a (GenBank accession number
AFA43306.1), which was previously shown to be a major
determinant of flowering in rapeseed (Wang et al., 2011a).
No pennycress ortholog for LEAFY (At5G61850) was found
in the final assembly. However, truncated transcripts simi-
lar to the Arabidopsis LEAFY sequence were detected in
the assemblies created using word sizes of 24, 30 and 46
(95% match length and percentage). Putative orthologs of
the FRIGIDA protein complex were also found (Choi et al.,
2011). Full-length predicted peptides with high sequence
similarity were found for EARLY FLOWERING IN SHORT
DAYS (At1G77300), SUPPRESSOR OF FRIGIDA4
(At1G30970), FLC EXPRESSOR (At2G30120), FRIGIDA-
ESSENTIAL 1 (At2G33835), FRIGIDA LIKE 1 (At5G16320),
and HOMOLOG OF YEAST YAF9 A (At5G45600). No unique
matches were found for TBP-ASSOCIATED FACTOR 14
(At2G18000). A reconstruction of the flowering time path-
way using pennycress transcripts is shown in Figure S3.
Through comparative transcriptomics, we have identi-
fied potential orthologs of both myrosinases and specifier
proteins responsible for controlling the breakdown of gluc-
osinolates in pennycress. We performed a BLASTX com-
parison of the pennycress transcriptome assembly against
the Arabidopsis proteome for the main myrosinases
TGG1-6 (THIOGLUCOSIDE GLUCOHYDROLASE 1-6) and
PEN2 (PENETRATION 2), and specifier proteins ESP
(EPITHIOSPECIFIER PROTEIN) and NSP1-5 (NITRILE SPECI-
FIER PROTEIN 1-5) that are responsible for glucosinolate
breakdown in Arabidopsis. An Arabidopsis TGG1 ortholog
was found whose longest ORF produced a predicted pep-
tide with high sequence conservation compared to Arabid-
opsis. The top BLASTp hit against the non-redundant
protein database was a Eutrema wasabi myrosinase (Gen-
Bank accession number BAE16356). The predicted peptide
from another pennycress transcript was found to be highly
similar to Arabidopsis TGG4. This predicted peptide had a
top BLASTp hit to a myrosinase from Armoracia rusticana
(horseradish) (GenBank accession number AEZ01595.1).
An ortholog for the Arabidopsis atypical myrosinase PEN2
was also found. The pennycress PEN2 predicted peptide
has 95% sequence identity to an unnamed
protein product from T. halophila (GenBank accession
number BAJ34425.1).
The conservation of specifier proteins was also examined. Three pennycress transcripts were found
to have high homology to three Arabidopsis NSP genes. A full-length predicted peptide was similar
to the AtNSP1 peptide, but was most similar to the nitrile-specifier protein from another member of
the Brassicaceae, Schouwia purpurea (GenBank accession number AFP47629.1). Another transcript was
found that encoded a 1073 amino acid predicted peptide with high sequence similarity to the
C-terminal Kelch domain-containing region of AtNSP4. The N-terminus of this peptide has high
similarity to other AtNSPs. A third transcript was found to encode a predicted peptide with high
similarity to AtNSP5. We also identified orthologs to the glucosinolate transporters GTR1 and GTR2
in the pennycress transcriptome. These predicted peptides have significant homology to the
Arabidopsis GLUCOSINOLATE TRANSPORTER 1-2 peptides (GTR1, At3G47960; GTR2, At5G62680).
[Figure 3 image not reproduced. Panel (a): "Frequency of percent similarity", frequency versus percent similarity to an Arabidopsis peptide. Panel (b): "Frequency of percent coverage", frequency versus percent coverage of an Arabidopsis peptide. Panel (c): percent similarity versus percent coverage of an Arabidopsis peptide, density scale 0 to 753 transcripts, with boxes labeled: greater than 95% similarity and 95% coverage, 4,846/33,873; greater than 80% similarity and 80% coverage, 14,531/33,873; greater than 70% similarity and 70% coverage, 16,556/33,873; less than 70% similarity and 70% coverage, 17,317/33,873.]
Figure 3. Similarity and coverage of pennycress transcripts versus Arabidopsis genes.
(a) Histogram showing frequency versus percentage similarity (positive amino acid identity) of pennycress contigs versus an Arabidopsis peptide.
(b) Histogram showing frequency versus percentage coverage (longest positive hit/peptide length) of pennycress contigs versus an Arabidopsis peptide. Most assembled pennycress transcripts have high coverage, which greatly skews the histogram to the right.
(c) Smoothed color density representation of the percentage similarity (x axis) of each pennycress transcript plotted against the percent coverage of the Arabidopsis protein similarity (y axis). The plot was produced using the ‘smoothScatter’ function in R (R Development Core Team, 2008), which produces a smoothed density representation of the scatterplot using a kernel density estimate (nbin = 100). Darker color indicates a higher density of transcripts in a given position, with the darkest ‘bin’ containing over 700 transcripts. Boxes encompassing transcripts encoding peptides with 70, 80 and 95% sequence similarity and coverage are shown in the upper right corner. Raw similarity and coverage data are provided in Table S2.
A comparison of assembly coverage across transcripts is at least qualitatively indicative of
expression level differences in the total RNA library. Although directly comparing the
non-normalized statistic of mean coverage across transcripts for quantification is inappropriate,
we observed many high-coverage transcripts related to glucosinolate metabolism. Interestingly, we
observed that, among the 100 transcripts with the highest mean coverage, six were similar to
β-glucosidase (two transcripts), myrosinases (three transcripts) and myrosinase-binding protein
(one transcript). The remaining 94 transcripts in this group may be considered ‘housekeeping’
genes. Predictably, most of these transcripts are involved in photosynthetic processes. It remains
unknown whether the high levels of glucosinolates and glucosinolate by-products in pennycress are
simply due to high expression of these myrosinases and/or specifier proteins, unique activity,
unique hormonal regulation of activity or expression, or some combination of these.
DISCUSSION
Comparative transcriptomics of pennycress and
Arabidopsis
We have sequenced, assembled and annotated the pen-
nycress transcriptome. The draft transcriptome consists
of 33 873 unique sequences, of which 27 442 were anno-
tated using the Blast2GO pipeline. Of these transcripts,
35% were most similar to an A. thaliana gene, and 74%
had top hits in the Brassicaceae, indicating a high level
of sequence conservation across the family. BLAST comparisons
between pennycress and five other sequenced
Brassicaceae species showed that our pennycress transcriptome
has good coverage of homologous sequences.
These analyses are consistent with previous phylogenetic
findings indicating that pennycress is more closely
related to T. halophila than to Brassica species (Franzke
et al., 2011).
The total transcriptome assembly length was over
42 Mbp. The pennycress genome (2n = 14) is approxi-
mately 539 Mbp (Hume et al., 1995; Johnston, 2005). In
comparison, the Arabidopsis genome (2n = 10) is esti-
mated to be 125 Mbp (Kaul et al., 2000), with the latest
genome annotation release (TAIR10) containing 33 602
genomic features, including 27 416 protein-coding genes.
Brassica rapa has 41 174 protein-coding genes, with mean
transcript/coding lengths of 2015/1172 bp (Wang et al.,
2011b). The number of genes identified here in pennycress,
together with the estimated genome size, matches similar
observations on total gene number in the Arabidopsis and
B. rapa genomes.
Characterization of pennycress glucosinolate metabolism
and translocation
Many plants in the order Brassicales produce high levels of
glucosinolates and glucosinolate hydrolysis products,
which are thought to provide a defensive function (Bones
and Rossiter, 1996). Glucosinolates are one of the most
highly characterized secondary metabolites in Arabidopsis
(Wittstock and Burow, 2010). Myrosinases, also known as
thioglucoside glucohydrolases, hydrolyze the glucosino-
late, forming an intermediate aglycone. The aglycone is
either spontaneously rearranged to form isothiocyanates,
or converted to a simple nitrile, epithionitrile or thiocya-
nate by specifier proteins. The characteristic ‘garlic-like’
odor of pennycress has led researchers to investigate the
levels of glucosinolates and glucosinolate by-products in
pennycress (Warwick et al., 2002; Kuchernig et al., 2011).
This odor has also led to another common name for this species:
‘stinkweed’. A single thiocyanate-forming protein has previously
been identified and characterized in pennycress
(Kuchernig et al., 2011). Pennycress seed has also been
investigated for its biofumigant properties, which are probably due
to the high levels of glucosinolates in the seeds (Vaughn
et al., 2005). After oil is pressed from pennycress seed, the
remaining press cake has high levels of protein (25%),
which has the potential to serve as an animal feed supple-
ment or for use in industrial products (Selling et al., 2013).
However, the high levels of glucosinolates, which may be
toxic to animals, prohibit such use (Best and Mcintyre,
1975; Warwick et al., 2002; Vaughn et al., 2005). Previous
work in Arabidopsis identified key glucosinolate transport-
ers that are responsible for translocating glucosinolates
(Nour-Eldin et al., 2012). The Arabidopsis double mutant
gtr1 gtr2 showed significantly reduced levels of glucosino-
lates in seed. We predict that loss-of-function mutations in
the pennycress GTR-like genes identified here would cause
a reduction in seed glucosinolate levels.
Genetics of flowering time in winter annual pennycress
The genetic mechanisms controlling the transition from
vegetative to reproductive growth have been widely stud-
ied in Arabidopsis and other plant species (Simpson, 2002;
Amasino, 2005; Kim et al., 2009). In many species adapted
© 2013 The Authors
The Plant Journal © 2013 John Wiley & Sons Ltd, The Plant Journal, (2013), doi: 10.1111/tpj.12267
to winter climates, a period of cold provided by over-
wintering is required to make plants competent to flower,
a process that is known as vernalization. In many crucifer
species, there is natural variation in populations adapted
to different climates. Much of this variation is attributed to
the complex interaction of FRIGIDA (FRI), the FRIGIDA
protein complex and FLOWERING LOCUS C (FLC), which
provide the main response to vernalization (Choi et al.,
2011). The period of vernalization provided by winter epi-
genetically represses FLC expression (Sheldon et al., 2000;
Michaels and Amasino, 2001). This removes the transcrip-
tional repression by FLC on FLOWERING LOCUS T (FT), a
main integrator of environmental cues promoting flower-
ing. ‘Fast-cycling’ lines of Arabidopsis contain a loss-
of-function mutation in FRI (Johanson et al., 2000; Gazzani,
2003).
Variation of FRI and FLC orthologs in B. rapa (Schranz
et al., 2002; Yuan et al., 2009), B. oleracea (Irwin et al.,
2012) and B. napus (Tadege et al., 2001; Wang et al.,
2011a) is associated with vernalization and flowering. Both
‘early’ and ‘late’ flowering lines of pennycress have been
reported (Best and Mcintyre, 1976). Much like the fast-cycling
lines of Arabidopsis, the ‘early’ pennycress lines
flower without a period of vernalization, exhibiting a spring
annual habit. The late-flowering lines grow for a period of
time in the fall (autumn) as a vegetative rosette, but do not
flower until the spring. The genetic differences between
winter and spring annual pennycress lines were deter-
mined to be caused by a single dominant allele (Mcintyre
and Best, 1978). We predict that the natural variation
between spring and winter lines is due to mutations in FRI
or FLC-like genes. In order for pennycress to be easily used
as a winter cover crop in various climates, precise control
of spring flowering time is required. Perturbations of the
flowering time pathway in cultivated species through
breeding and genetic modification have served as an
important tool for controlling flowering time (Jung and
Müller, 2009). Our identification of likely orthologous
genes responsible for controlling flowering time will be a
useful tool for making rapid improvements in the penny-
cress germplasm.
Considerations regarding de novo transcriptome assembly
Varying de novo assembly parameters using short-read
data has been shown to enable assembly of unique tran-
scripts corresponding to real genes (Zhao et al., 2011). In
this study, we chose a single assembly based on the high
quality assembly statistics and the high number of
transcripts with significant similarity to Arabidopsis pep-
tides. The finding of a high number of potentially ortholo-
gous sequences in the pennycress and its relatives
provides a validation of the pennycress assembly. How-
ever, different assembly programs and parameters affect
the assembly of transcripts expressed at both high and
low levels (Zhao et al., 2011; Gongora-Castillo and Buell,
2013). For example, de novo assemblies created with
large word sizes poorly assemble lowly expressed genes
(Gruenheit et al., 2012). Thus, it is not expected that any
one assembly will truly represent the complete biological
transcriptome. This was highlighted in the current analy-
sis between pennycress and Arabidopsis. We predicted
that a LEAFY-like ortholog should be represented in our
RNA pools, but it was not found in the final assembly.
Assemblies with smaller word sizes (24, 30, and 46) did
assemble LEAFY-like transcripts (see transcript sequences
in Data S1). These transcripts had low coverage
(7× mean) with few mapped reads. Combined with the
high number of reads used to create our final assembly
(over 200 million), this indicates that the pennycress
LEAFY ortholog was expressed at low levels in our sam-
ple and is probably not included in the final assembly due
to the larger word size. In our optimization, smaller word
sizes also resulted in assembly of some obviously mis-
assembled transcripts in which multiple transcripts from
unlinked genes were joined together. These results further
support the requirement for full characterization of the
potential changes caused by various de novo assembly
parameters.
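One way to see why a larger word size can drop lowly expressed transcripts is the standard relationship between base coverage and word (k-mer) coverage in de Bruijn graph assembly: a read of length L contains only L − k + 1 words of size k, so the effective coverage seen by the assembler falls as k grows. The Python sketch below assumes 90 bp reads (100 bp reads with 10 nucleotides trimmed, as described in Experimental Procedures) and an illustrative base coverage of 7×. It is a simplified model under those assumptions and does not describe the internals of the CLC assembler.

```python
def kmer_coverage(base_coverage, read_len, k):
    """Expected k-mer coverage: C_k = C * (L - k + 1) / L."""
    if not 1 <= k <= read_len:
        raise ValueError("k must be between 1 and the read length")
    return base_coverage * (read_len - k + 1) / read_len


READ_LEN = 90         # assumed read length after trimming (bp)
BASE_COVERAGE = 7.0   # illustrative base coverage of a lowly expressed transcript

# A few word sizes from the series tested in this study.
for k in (24, 30, 46, 64):
    ck = kmer_coverage(BASE_COVERAGE, READ_LEN, k)
    print(f"word size {k}: expected k-mer coverage {ck:.2f}")
```

Under these assumptions the expected word coverage at the largest word size is less than half of that at a word size of 24, which is consistent with the loss of the LEAFY-like transcript at larger word sizes.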
Future perspectives
We have identified pennycress transcripts that are proba-
bly responsible for controlling key agronomic traits such
as seed glucosinolate levels and flowering time, which are
primary targets for future research in order to improve the
pennycress germplasm. It should be straightforward to
make improvements using reverse genetic approaches to
identify inactive or altered alleles by using well-established
TILLING protocols (McCallum et al., 2000; Kurowska et al.,
2011). Our ongoing sequencing of the pennycress genome
will enable rapid screening of TILLING populations through
next-generation sequencing. In addition, the ability to
make improvements using transgenic approaches to mod-
ify gene expression should soon be possible as we have
found that pennycress is relatively easy to regenerate
in vitro (Matthew Krause, Kevin M. Dorn, M. David Marks,
unpublished observation). Pennycress has tremendous
agronomic potential as a winter cover crop and new source of
oilseeds. A recent report by the Massachusetts Institute of
Technology Joint Program on Science and Policy of Global
Change indicates that pennycress could be grown on over
40 million acres each year, yielding up to 6 billion gallons
of oil that may be converted to biodiesel (Moser et al.,
2009a; Winchester et al., 2013). This represents approxi-
mately 15% of the 40 billion gallons of diesel consumed
annually in the USA. The recent advances in omics-based
technologies will allow for the use of the resources devel-
oped here to make rapid improvements to the pennycress
germplasm.
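The land-use and fuel figures quoted above can be checked with simple arithmetic. The short Python sketch below (variable names are ours) derives the implied oil yield per acre and the share of annual US diesel consumption from the numbers in the text. Because the text gives these as upper bounds ("over 40 million acres", "up to 6 billion gallons"), the results are approximate.

```python
# Figures quoted in the text; treated here as point values for illustration.
ACRES = 40e6                 # acres on which pennycress could be grown
OIL_GALLONS = 6e9            # gallons of oil per year
US_DIESEL_GALLONS = 40e9     # gallons of diesel consumed annually in the USA

gallons_per_acre = OIL_GALLONS / ACRES
diesel_share = OIL_GALLONS / US_DIESEL_GALLONS

print(f"Implied oil yield: {gallons_per_acre:.0f} gallons per acre")
print(f"Share of US diesel consumption: {diesel_share:.0%}")
```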
EXPERIMENTAL PROCEDURES
Plant growth conditions and RNA extraction
Seed from a small natural population of T. arvense L. was col-
lected near Coates, MN. Seeds were planted in moist Berger BM2
germination mix (Berger Inc., www.berger.ca), stratified for 7 days
at 4°C, and then placed in a 21°C growth chamber. Individual
seedlings were transferred to 4-inch pots after 2 weeks, and were
grown under banks of AgroMax 6400K T5 fluorescent lights
(HTGSupply, http://www.htgsupply.com) with a 16 h/8 h day/night
cycle at 98 micromoles/m²/s PAR. To initiate flowering, 6-week-old
plants with established rosettes were covered and transferred to a
4°C cold room for 14–29 days in the dark. After vernalization,
plants were transferred back to the growth chamber and grown
under 400 W metal halide bulbs (Philips, http://www.usa.lighting.
philips.com) at 50 micromoles/m²/s PAR. Roots, hypocotyls, cotyledons
and young leaves were obtained by planting sterilized seed
on 1× Murashige and Skoog medium with 0.8% agar. Seed was
stratified at 4°C for 3 days, and then grown for 7 days in constant
light under T12 fluorescent bulbs (Philips) at 42 micromoles/m²/s
PAR.
RNA was extracted from (i) roots from 12 seedlings grown on
MS plates, (ii) hypocotyls, cotyledons, young meristems and first
leaves from 12 seedlings grown on MS plates, (iii) four new leaves
from each of two 120-day-old unvernalized plants, (iv) aerial
leaves and stems from 128-day-old flowering plants, and (v)
flowers and seed pods from 128-day-old flowering plants. RNA
was purified using an RNeasy plant mini kit (Qiagen, http://www.
qiagen.com) according to the manufacturer’s instructions.
Following the initial total RNA extraction, samples were treated
with Ambion TURBO DNase (Life Technologies, http://www.
lifetechnologies.com) according to the manufacturer’s instruc-
tions, immediately followed by the RNA clean-up procedure from
the Qiagen RNeasy kit.
High-throughput RNA sequencing and de novo assembly
A pooled sample containing equal amounts of purified total RNA
from each of the five tissue samples was submitted to the Univer-
sity of Minnesota Biomedical Genomics Center for sequencing.
RNA was subjected to quality control using the Invitrogen Ribo-
Green RNA assay (Life Technologies), and RNA integrity was ana-
lyzed by capillary electrophoresis on an Agilent BioAnalyzer 2100
(Agilent Technologies, http://www.agilent.com). Polyadenylated
RNA was selected using oligo(dT) purification and reverse-transcribed
to cDNA. cDNA was fragmented, blunt-ended, and ligated
to the Illumina TruSeq Adaptor Index 3 (Illumina Inc., http://www.
illumina.com). The library was size-selected for an insert size of
200 bp, and quantified using the Invitrogen PicoGreen dsDNA
assay (Life Technologies). The pooled RNA sample was
sequenced using the Illumina HiSeq 2000 platform using 100 bp,
paired-end reads, producing 374 million reads above Q30. Read
pairs had a mean insert size of 200 bp. Duplicate reads were
removed, and the first 10 nucleotides were trimmed from the 5′
end of each read using the tools in the CLC Genomics Workbench
5.5 (CLC Bio, http://www.clcbio.com). The additional trimming
parameters were: removal of low-quality sequence limit = 0.05;
removal of ambiguous nucleotides, maximum two nucleotides
allowed; removal of terminal nucleotides, 10 nucleotides from the
5′ end; removal of Illumina TruSeq Indexed Adaptor 3 and Universal
Adapter sequences.
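Two of the preprocessing steps described above, duplicate removal followed by trimming of the first 10 nucleotides from the 5′ end, can be illustrated in a few lines of Python. This is a minimal sketch, not the CLC Genomics Workbench implementation: the function names are hypothetical, duplicates are assumed to be read pairs with identical sequences in both mates, and quality, ambiguity and adaptor trimming are omitted.

```python
def remove_duplicate_pairs(pairs):
    """Keep the first occurrence of each (read1, read2) sequence pair."""
    seen = set()
    unique = []
    for read1, read2 in pairs:
        key = (read1, read2)
        if key not in seen:
            seen.add(key)
            unique.append(key)
    return unique


def trim_5prime(seq, n=10):
    """Remove the first n nucleotides from the 5' end of a read."""
    return seq[n:]


def preprocess(pairs, n_trim=10):
    """Deduplicate read pairs, then trim both mates at the 5' end."""
    return [
        (trim_5prime(read1, n_trim), trim_5prime(read2, n_trim))
        for read1, read2 in remove_duplicate_pairs(pairs)
    ]


# Toy input: three 100 nt read pairs, two of which are identical.
example_pairs = [
    ("A" * 100, "C" * 100),
    ("A" * 100, "C" * 100),
    ("G" * 100, "T" * 100),
]
cleaned = preprocess(example_pairs)
print(len(cleaned), "unique pairs;", len(cleaned[0][0]), "nt per read after trimming")
```

Removing duplicates before trimming follows the order in which the steps are listed in the text. Whether the two orders differ in practice depends on how duplicates are defined.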
Reads were de novo assembled into contigs using the CLC
Genomics Workbench 5.5 de novo assembly tool. A series of inde-
pendent assemblies were performed to analyze the effects of vary-
ing the de novo assembly parameters. Assemblies were
performed using varying word size (18, 24, 30, 36, 40, 46, 52, 58
and 64), and with length fractions (match length) of 0.7 and 0.95.
An additional 23 assemblies were performed using values outside
these parameters, with a total of 41 assemblies performed. The
remaining assembly parameters were: auto bubble size, yes; mini-
mum contig length, 300 bp; perform scaffolding, yes; mismatch
cost, 3; insertion cost, 3; deletion cost, 3; update contigs, yes.
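The systematic part of this parameter sweep can be written down as a grid of word sizes and length fractions combined with the fixed settings listed above. The Python sketch below enumerates that grid. The dictionary keys are our own labels, not CLC Genomics Workbench option names, and the 23 additional assemblies are not enumerated because their parameters are not given here.

```python
from itertools import product

WORD_SIZES = (18, 24, 30, 36, 40, 46, 52, 58, 64)
LENGTH_FRACTIONS = (0.7, 0.95)

# Settings held constant across assemblies (labels are illustrative).
FIXED_SETTINGS = {
    "auto_bubble_size": True,
    "min_contig_length_bp": 300,
    "perform_scaffolding": True,
    "mismatch_cost": 3,
    "insertion_cost": 3,
    "deletion_cost": 3,
    "update_contigs": True,
}


def assembly_grid(word_sizes=WORD_SIZES, length_fractions=LENGTH_FRACTIONS):
    """Return one settings dictionary per word size / length fraction pair."""
    return [
        dict(FIXED_SETTINGS, word_size=w, length_fraction=f)
        for w, f in product(word_sizes, length_fractions)
    ]


grid = assembly_grid()
print(len(grid))  # 18 combinations in the systematic grid
```

The 18 grid combinations plus the 23 additional assemblies give the total of 41 assemblies reported.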
Functional annotations and gene ontologies were assigned to
each assembled contig from the final assembly using Blast2GO
with the following parameters: BLASTx against the NCBI non-
redundant protein database, BLAST E-value = 0.001, and reporting
the top 20 hits. Comparative BLAST searches against Arabidopsis
were performed using the CLC Genomics Workbench BLAST function,
using sequences obtained from the TAIR10 release of the
Arabidopsis transcriptome and proteome (www.arabidopsis.org)
(Lamesch et al., 2012). Sequences for Arabidopsis lyrata (Hu et al.,
2011), Capsella rubella (Slotte et al., 2013), B. rapa (Wang et al.,
2011b) and T. halophila were obtained from Phytozome v9.1
(www.phytozome.net). Further statistical analysis and figures
were prepared using R (R Development Core Team, 2008). The
final assembly described here has been submitted to DDBJ/EMBL/
GenBank under the accession GAKE01000000. The complete,
annotated FASTA file is available at http://www.cbs.umn.edu/lab/
marks/pennycress/transcriptome.
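The hit-selection criteria used for annotation (E-value cut-off of 0.001, top 20 hits reported) can be illustrated on tabular BLAST output. The Python sketch below assumes the common 12-column tab-delimited layout (query, subject, percent identity, alignment length, mismatches, gap opens, query start, query end, subject start, subject end, E-value, bit score). Blast2GO applies its own filtering internally, so this is only an approximation of the logic. The function name and the example records are hypothetical.

```python
from collections import defaultdict


def filter_hits(lines, max_evalue=1e-3, max_hits=20):
    """Keep hits at or below the E-value cut-off, best max_hits per query."""
    hits = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank and comment lines
        fields = line.split("\t")
        query, subject = fields[0], fields[1]
        evalue, bitscore = float(fields[10]), float(fields[11])
        if evalue <= max_evalue:
            hits[query].append((subject, evalue, bitscore))
    # Rank by ascending E-value, then by descending bit score.
    return {
        query: sorted(query_hits, key=lambda h: (h[1], -h[2]))[:max_hits]
        for query, query_hits in hits.items()
    }


# Made-up records for illustration only.
example_lines = [
    "contig_1\tsubject_A\t92.0\t300\t24\t0\t1\t300\t1\t300\t1e-50\t400",
    "contig_1\tsubject_B\t60.0\t80\t32\t0\t1\t80\t1\t80\t0.5\t30",
    "contig_1\tsubject_C\t85.0\t250\t37\t0\t1\t250\t1\t250\t1e-20\t200",
]
top_hits = filter_hits(example_lines)
print([subject for subject, _, _ in top_hits["contig_1"]])
```

In this toy input the hit to subject_B is discarded because its E-value exceeds the cut-off, and the remaining two hits are ranked by E-value.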
ACKNOWLEDGEMENTS
This work was supported by the University of Minnesota Institute
on the Environment, the University of Minnesota College of Bio-
logical Sciences, the University of Minnesota College of Food,
Agriculture and Natural Resource Sciences, the University of Min-
nesota Alexander and Lydia Anderson Fellowship, and royalties
obtained by D.L.W. for intellectual property rights. The results
reported here are based on work supported by the US National
Science Foundation Graduate Research Fellowship under grant
number 00006595 to K.M.D. and J.D.F. Any opinions, findings and
conclusions or recommendations expressed in this material are
those of the authors, and do not necessarily reflect the views of
the US National Science Foundation.
SUPPORTING INFORMATION
Additional Supporting Information may be found in the online ver-
sion of this article.
Figure S1. Most highly represented GO terms in the pennycress
transcriptome annotation.
Figure S2. Global comparison of the Arabidopsis and pennycress
transcriptomes.
Figure S3. Reconstruction of the flowering time pathway in penny-
cress.
Table S1. Effects of de novo assembly parameters on assembly
statistics.
Table S2. Transcript annotations, GO terms and top BLAST hits to
Arabidopsis.
Table S3. BLASTx results of pennycress transcriptome against five
Brassicaceae species.
Table S4. tBLASTn results of five global BLAST comparisons of five
Brassicaceae species to the pennycress transcriptome assembly.
Data S1. Predicted peptide sequences for candidate orthologs
involved in flowering and glucosinolate metabolism.
REFERENCES
Amasino, R.M. (2005) Vernalization and flowering time. Curr. Opin.
Biotechnol. 16, 154–158.
Beilstein, M.A., Nagalingum, N.S., Clements, M.D. et al. (2010) Dated
molecular phylogenies indicate a Miocene origin for Arabidopsis
thaliana. Proc. Natl Acad. Sci. USA, 107, 18724–18728.
Best, K.F. and Mcintyre, G.I. (1975) Biology of Canadian weeds, 9. Thlaspi
arvense L. Can. J. Plant Sci. 55, 279–292.
Best, K.F. and Mcintyre, G.I. (1976) Studies on flowering of Thlaspi arvense
L. 3. Influence of vernalization under natural and controlled conditions.
Bot. Gaz. 137, 121–127.
Boateng, A.A., Mullen, C.A. and Goldberg, N.M. (2010) Producing stable
pyrolysis liquids from the oil-seed presscakes of mustard family plants:
pennycress (Thlaspi arvense L.) and camelina (Camelina sativa). Energy
Fuels, 24, 6624–6632.
Bones, A.M. and Rossiter, J.T. (1996) The myrosinase-glucosinolate system,
its organisation and biochemistry. Physiol. Plant. 97, 194–208.
Choi, K., Kim, J., Hwang, H.J. et al. (2011) The FRIGIDA complex activates
transcription of FLC, a strong flowering repressor in Arabidopsis, by
recruiting chromatin modification factors. Plant Cell, 23, 289–303.
Conesa, A., Gotz, S., Garcia-Gomez, J.M. et al. (2005) Blast2GO: a universal
tool for annotation, visualization and analysis in functional genomics
research. Bioinformatics, 21, 3674–3676.
Dabney, S.M., Delgado, J.A. and Reeves, D.W. (2001) Using winter cover
crops to improve soil and water quality. Commun. Soil Sci. Plant Anal.
32, 1221–1250.
Fan, J., Shonnard, D., Kalnes, T., Johnsen, P. and Rao, S. (2013) A life cycle
assessment of pennycress (Thlaspi arvense L.)-derived jet fuel and
diesel. Biomass Bioenergy, 55, 87–100.
Fargione, J., Hill, J., Tilman, D. et al. (2008) Land clearing and the biofuel
carbon debt. Science, 319, 1235–1238.
Franzke, A., Lysak, M.A., Al-Shehbaz, I.A. et al. (2011) Cabbage family affairs:
the evolutionary history of Brassicaceae. Trends Plant Sci. 16, 108–116.
Gazzani, S. (2003) Analysis of the molecular basis of flowering time
variation in Arabidopsis accessions. Plant Physiol. 132, 1107–1114.
Gongora-Castillo, E. and Buell, C.R. (2013) Bioinformatics challenges in de
novo transcriptome assembly using short read sequences in the absence
of a reference genome sequence. Nat. Prod. Rep. 30, 490–500.
Gruenheit, N., Deusch, O., Esser, C. et al. (2012) Cutoffs and k-mers:
implications from a transcriptome study in allopolyploid plants. BMC
Genomics, 13, 92.
Hammond, J.P., Bowen, H.C., White, P.J. et al. (2006) A comparison of the
Thlaspi caerulescens and Thlaspi arvense shoot transcriptomes. New
Phytol. 170, 239–260.
Hill, J., Nelson, E., Tilman, D. et al. (2006) Environmental, economic, and
energetic costs and benefits of biodiesel and ethanol biofuels. Proc. Natl
Acad. Sci. USA, 103, 11206–11210.
Hojilla-Evangelista, M.P., Evangelista, R.L., Isbell, T.A. and Selling, G.W.
(2013) Effects of cold-pressing and seed cooking on functional properties
of protein in pennycress (Thlaspi arvense L.) seed and press cakes. Ind.
Crops Prod. 45, 223–229.
Hu, T.T., Pattyn, P., Bakker, E.G. et al. (2011) The Arabidopsis lyrata genome
sequence and the basis of rapid genome size change. Nat. Genet. 43,
476–481.
Hume, L., Devine, M.D. and Shirriff, S. (1995) The influence of temperature
upon physiological processes in early-flowering and late-flowering
strains of Thlaspi arvense L. Int. J. Plant Sci. 156, 445–449.
Irwin, J.A., Lister, C., Soumpourou, E. et al. (2012) Functional alleles of the
flowering time regulator FRIGIDA in the Brassica oleracea genome. BMC
Plant Biol. 12, 21.
Isbell, T.A. and Cermak, S.C. (2012) Extraction of pennycress (Thlaspi
arvense L.) seed oil by full pressing. Ind. Crops Prod. 37, 76–81.
Johanson, U., West, J., Lister, C. et al. (2000) Molecular analysis of
FRIGIDA, a major determinant of natural variation in Arabidopsis flowering
time. Science, 290, 344–347.
Johnston, J.S. (2005) Evolution of genome size in Brassicaceae. Ann. Bot.
95, 229–235.
Jung, C. and Müller, A.E. (2009) Flowering time control and applications in
plant breeding. Trends Plant Sci. 14, 563–573.
Kaul, S., Koo, H.L., Jenkins, J. et al. (2000) Analysis of the genome sequence
of the flowering plant Arabidopsis thaliana. Nature, 408, 796–815.
Kim, D.-H., Doyle, M.R., Sung, S. et al. (2009) Vernalization: winter and the
timing of flowering in plants. Annu. Rev. Cell Dev. Biol. 25, 277–299.
Kuchernig, J.C., Backenkohler, A., Lubbecke, M. et al. (2011) A
thiocyanate-forming protein generates multiple products upon
allylglucosinolate breakdown in Thlaspi arvense. Phytochemistry, 72, 1699–1709.
Kurowska, M., Daszkowska-Golec, A., Gruszka, D. et al. (2011) TILLING – a
shortcut in functional genomics. J. Appl. Genet. 52, 371–390.
Lamesch, P., Berardini, T.Z., Li, D.H. et al. (2012) The Arabidopsis
Information Resource (TAIR): improved gene annotation and new tools. Nucleic
Acids Res. 40, D1202–D1210.
McCallum, C.M., Comai, L., Greene, E.A. et al. (2000) Targeting induced
local lesions in genomes (TILLING) for plant functional genomics. Plant
Physiol. 123, 439–442.
Mcintyre, G.I. and Best, K.F. (1978) Studies on flowering of Thlaspi arvense
L. 4. Genetic and ecological differences between early-flowering and
late-flowering strains. Bot. Gaz. 139, 190–195.
Michaels, S.D. and Amasino, R.M. (2001) Loss of FLOWERING LOCUS C
activity eliminates the late-flowering phenotype of FRIGIDA and
autonomous pathway mutations but not responsiveness to vernalization. Plant
Cell, 13, 935–941.
Milner, M.J. and Kochian, L.V. (2008) Investigating heavy-metal
hyperaccumulation using Thlaspi caerulescens as a model system. Ann. Bot. 102,
3–13.
Mitich, L.W. (1996) Field pennycress (Thlaspi arvense L.) – the stinkweed.
Weed Technol. 10, 675–678.
Moser, B.R., Knothe, G., Vaughn, S.F. et al. (2009a) Production and
evaluation of biodiesel from field pennycress (Thlaspi arvense L.) oil. Energy
Fuels, 23, 4149–4155.
Moser, B.R., Shah, S.N., Winkler-Moser, J.K. et al. (2009b) Composition and
physical properties of cress (Lepidium sativum L.) and field pennycress
(Thlaspi arvense L.) oils. Ind. Crops Prod. 30, 199–205.
Nour-Eldin, H.H., Andersen, T.G., Burow, M. et al. (2012) NRT/PTR
transporters are essential for translocation of glucosinolate defence compounds
to seeds. Nature, 488, 531–534.
Phippen, W.B. and Phippen, M.E. (2012) Soybean seed yield and quality as
a response to field pennycress residue. Crop Sci. 52, 2767–2773.
R Development Core Team (2008) R: A Language and Environment for
Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing.
Schranz, M.E., Quijada, P., Sung, S.B. et al. (2002) Characterization and
effects of the replicated flowering time gene FLC in Brassica rapa.
Genetics, 162, 1457–1468.
Selling, G.W., Hojilla-Evangelista, M.P., Evangelista, R.L. et al. (2013)
Extraction of proteins from pennycress seeds and press cake. Ind. Crops Prod.
41, 113–119.
Sharma, N., Cram, D., Huebert, T. et al. (2007) Exploiting the wild crucifer
Thlaspi arvense to identify conserved and novel genes expressed during
a plant’s response to cold stress. Plant Mol. Biol. 63, 171–184.
Sheldon, C.C., Rouse, D.T., Finnegan, E.J. et al. (2000) The molecular basis
of vernalization: the central role of FLOWERING LOCUS C (FLC). Proc.
Natl Acad. Sci. USA, 97, 3753–3758.
Simpson, G.G. (2002) Arabidopsis, the Rosetta stone of flowering time?
Science, 296, 285–289.
Slotte, T., Hazzouri, K.M., Ågren, J.A. et al. (2013) The Capsella rubella
genome and the genomic consequences of rapid mating system evolution.
Nat. Genet. 45, 831–835.
Snapp, S.S., Swinton, S.M., Labarta, R. et al. (2005) Evaluating cover crops
for benefits, costs and performance within cropping system niches.
Agron. J. 97, 322–332.
Tadege, M., Sheldon, C.C., Helliwell, C.A. et al. (2001) Control of flowering
time by FLC orthologues in Brassica napus. Plant J. 28, 545–553.
Tilman, D., Socolow, R., Foley, J.A. et al. (2009) Beneficial biofuels – the
food, energy, and environment trilemma. Science, 325, 270–271.
Varshney, R.K., Nayak, S.N., May, G.D. et al. (2009) Next-generation
sequencing technologies and their implications for crop genetics and
breeding. Trends Biotechnol. 27, 522–530.
Vaughn, S.F., Isbell, T.A., Weisleder, D. et al. (2005) Biofumigant
compounds released by field pennycress (Thlaspi arvense) seedmeal.
J. Chem. Ecol. 31, 167–177.
Wang, N.A., Qian, W., Suppanz, I. et al. (2011a) Flowering time variation in
oilseed rape (Brassica napus L.) is associated with allelic variation in the
FRIGIDA homologue BnaA.FRI.a. J. Exp. Bot. 62, 5641–5658.
Wang, X.W., Wang, H.Z., Wang, J. et al. (2011b) The genome of the
mesopolyploid crop species Brassica rapa. Nat. Genet. 43, 1035–1039.
Warwick, S.I., Francis, A. and Susko, D.J. (2002) The biology of Canadian
weeds. 9. Thlaspi arvense L. (updated). Can. J. Plant Sci. 82, 803–823.
Winchester, N., McConnachie, D., Wollersheim, C. and Waitz, I. (2013)
Market cost of renewable jet fuel adoption in the United States.
Massachusetts Institute of Technology.
http://globalchange.mit.edu/files/document/MITJPSPGC_Rpt238.pdf
Wittstock, U. and Burow, M. (2010) Glucosinolate breakdown in
Arabidopsis: mechanism, regulation and biological significance. Arabidopsis
Book, 8, e0134.
Yuan, Y.X., Wu, J., Sun, R.F. et al. (2009) A naturally occurring splicing site
mutation in the Brassica rapa FLC1 gene is associated with variation in
flowering time. J. Exp. Bot. 60, 1299–1308.
Zhao, Q.Y., Wang, Y., Kong, Y.M. et al. (2011) Optimizing de novo
transcriptome assembly from short-read RNA-Seq data: a comparative study.
BMC Bioinformatics, 12(Suppl. 14), S2.
© 2013 The Authors
The Plant Journal © 2013 John Wiley & Sons Ltd, The Plant Journal, (2013), doi: 10.1111/tpj.12267
De novo assembly of the pennycress transcriptome 11
... The broad biodistribution of the Chinese T. arvense population demonstrates that T. arvense has relatively strong adaptability to environmental changes. Furthermore, T. arvense has been well recognized as a potential winter cover biofuel crop given its extreme cold tolerance and high seed oil content (Dorn et al., 2013(Dorn et al., , 2015Sedbrook et al., 2014;Claver et al., 2017). ...
... The last four families, i.e., Cullin, Cysteine-rich receptor-like kinases (CRK), Aspartic proteinase inhibitors (API), and Toll/interleukin-1 receptor (TIR), all contain LTR-genes in SEOs belonging to Group A3 and are also expanded by TDs (Supplementary Figures 18-21). Lastly, for each of these nine gene families, we used previously described transcriptomes of T. arvense to obtain expression levels (transcripts per million, TPM) for all retroduplicated genes and tandem duplicated genes that were younger than retroduplicated genes or had divergence time less than 6 MYA (Dorn et al., 2013;Thomas et al., 2017). Overall, 78% of these genes showed expression, especially one tandem duplicated gene (TaChr1G15052) in the SKP1 family, and one free-retrogene (TaChr4G2027) in the AMPB family showed much higher expression levels in nectary tissues (Supplementary Table 22). ...
... After masking the repetitive sequences of the genome as described above, we identified protein-coding gene models using the FGENESH++ pipeline (Softberry Inc., Mount Kisco, NY, United States) with parameters trained with A. thaliana gene models. A transcriptome of T. arvense assembled from Illumina RNA-seq reads was used to facilitate the gene prediction with transcriptome evidence (Dorn et al., 2013). The de novo predicted gene model correction was performed by comparing all known plant protein sequences from the NCBI NR database. ...
Article
Full-text available
Retrotransposons are the most abundant group of transposable elements (TEs) in plants, providing an extraordinarily versatile source of genetic variation. Thlaspi arvense , a close relative of the model plant Arabidopsis thaliana with worldwide distribution, thrives from sea level to above 4,000 m elevation in the Qinghai-Tibet Plateau (QTP), China. Its strong adaptability renders it an ideal model system for studying plant adaptation in extreme environments. However, how the retrotransposons affect the T. arvense genome evolution and adaptation is largely unknown. We report a high-quality chromosome-scale genome assembly of T. arvense with a scaffold N50 of 59.10 Mb. Long terminal repeat retrotransposons (LTR-RTs) account for 56.94% of the genome assembly, and the Gypsy superfamily is the most abundant TEs. The amplification of LTR-RTs in the last six million years primarily contributed to the genome size expansion in T. arvense. We identified 351 retrogenes and 303 genes flanked by LTRs, respectively. A comparative analysis showed that orthogroups containing those retrogenes and genes flanked by LTRs have a higher percentage of significantly expanded orthogroups (SEOs), and these SEOs possess more recent tandem duplicated genes. All present results indicate that RNA-based gene duplication (retroduplication) accelerated the subsequent tandem duplication of homologous genes resulting in family expansions, and these expanded gene families were implicated in plant growth, development, and stress responses, which were one of the pivotal factors for T. arvense ’s adaptation to the harsh environment in the QTP regions. In conclusion, the high-quality assembly of the T. arvense genome provides insights into the retroduplication mediated mechanism of plant adaptation to extreme environments.
... The pennycress genome is about 539 Mb, 4 times larger than that of Arabidopsis ($135 Mb), and is organized into 7 chromosomes (2n ¼ 14). Transcriptomes and a draft genome from the winter annual line MN106 were generated (Dorn et al., 2013(Dorn et al., , 2015 and recently complemented by whole genome sequencing of several accessions , as well as sequencing of inbred line Spring 32-10 (McGinn et al., 2019). Besides a draft genome being available, pennycress can be easily transformed using Agrobacterium-mediated floral dip (McGinn et al., 2019), at efficiencies comparable to those of Arabidopsis (Clough and Bent, 1998). ...
... For the genome-guided transcriptome assembly, the reads were aligned with default parameters to the reference pennycress genome (Dorn et al., 2015) with HISAT v2.0.4 (Kim et al., 2015) and each alignment file in bam format was used as input in StringTie v.2.1.1 (Pertea et al., 2015). In addition, we incorporated previously-generated RNA-seq data (Dorn et al., 2013), into a set of transcripts. ...
Article
Full-text available
The Brassicaceae family comprises more than 3,700 species with a diversity of phenotypic characteristics, including seed oil content and composition. Recently, the global interest in Thlaspi arvense L. (pennycress) has grown as the seed oil composition makes it a suitable source for biodiesel and aviation fuel production. However, many wild traits of this species need to be domesticated in order to make pennycress ideal for cultivation. Molecular breeding and engineering efforts require the availability of an accurate genome sequence of the species. Here, we describe pennycress genome annotation improvements, using a combination of long and short read transcriptome data obtained from RNA derived from embryos of 22 accessions, in addition to public genome and gene expression information. Our analysis identified 27,213 protein-coding genes, as well as on average 6,188 biallelic SNPs. Additionally, we used the identified SNPs to evaluate the population structure of our accessions. The data from this analysis supports that the accession Ames 32872, originally from Armenia, is highly divergent from the other accessions, while the accessions originating from Canada and the USA cluster together. When we evaluated the likely signatures of natural selection from alternative SNPs, we found seven candidate genes under likely recent positive selection. These genes are enriched with functions related to in amino acid metabolism and lipid biosynthesis, and highlight possible future targets for crop improvement efforts in pennycress.
... Containing 30%-32% oil with a fatty acid profile that allows for easy conversion to biofuels meeting the U.S. Renewable Fuels Standard, pennycress can help fill the demand for SAF (Moser et al., 2009;Moser, 2012). Pennycress is unique in that it has a small non-repetitive diploid genome (Dorn et al., 2013;Dorn et al., 2015;McGinn et al., 2018) and can be easily genetically manipulated (Sedbrook et al., 2014;Chopra et al., 2018;McGinn et al., 2018;Chopra et al., 2020), akin to its well-known model relative Arabidopsis thaliana. Extremely winter hardy with high oilseed yields and a short life cycle, pennycress can be integrated into the fallow period of existing cropping systems in the U.S. Midwest as a profitable winter cover crop (Phippen and Phippen, 2012;Fan et al., 2013;Johnson et al., 2015;Jordan et al., 2016;Johnson et al., 2017;Ott et al., 2018). ...
Article
Thlaspi arvense L. (Field Pennycress; pennycress) is being converted into a winter-annual oilseed crop that confers cover crop benefits when grown throughout the 12-million-hectare U.S. Midwest. To ensure a fit with downstream market demand, conversion involves not only improvements in yield and maturity through traditional breeding, but also improvements in the composition of the oil and protein through gene editing tools. The conversion process is similar to the path taken to convert rapeseed into Canola. In the case of field pennycress, the converted product that is suitable as a rotational crop is called CoverCress™ as marketed by CoverCress Inc. or golden pennycress if marketed by others. Off-season integration of a CoverCress crop into existing corn and soybean hectares would extend the growing season on established croplands and avoid displacement of food crops or ecosystems while yielding up to 1 billion liters of seed oil annually by 2030, with the potential to grow to 8 billion liters from production in the U.S. Midwest alone. The aviation sector is committed to carbon-neutral growth and reducing emissions of its global market, which in 2019 approached 122 billion liters of consumption in the U.S. and 454 billion liters globally. The oil derived from a CoverCress crop is ideally suited as a new bioenergy feedstock for the production of drop-in Sustainable Aviation Fuel (SAF), renewable diesel, biodiesel and other value-added coproducts. Through a combination of breeding and genomics-enabled mutagenesis approaches, considerable progress has been made in genetically improving yield and other agronomic traits. With USDA-NIFA funding and continued public and private investments, improvements to CoverCress germplasm and agronomic practices suggest that field-scale production can surpass 1,680 kg ha⁻¹ (1,500 lb ac⁻¹) in the near term.
At current commodity prices, economic modeling predicts this level of production can be profitable across the entire supply chain. Two-thirds of the grain value is in oil converted to fuels and chemicals, and the other one-third is in the meal used as an animal feed, industrial applications, and potential plant-based protein products. In addition to strengthening rural communities by providing income to producers and agribusinesses, cultivating a CoverCress crop potentially offers a myriad of ecosystem services. The most notable service is water quality protection through reduced nutrient leaching and reduced soil erosion. Biodiversity enhancement by supporting pollinators’ health is also a benefit. While the efforts described herein are focused on the U.S., cultivation of a CoverCress crop will likely have a broader application to regions around the world with similar agronomic and environmental conditions.
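The yield target in this abstract is quoted in both metric and U.S. units (1,680 kg ha⁻¹ and 1,500 lb ac⁻¹). As a quick consistency check, the conversion can be sketched as below. The function names are hypothetical and chosen only for illustration; the conversion factors are the standard definitions of the avoirdupois pound and the international acre.

```python
# Minimal sketch: convert crop yields between lb/ac and kg/ha.
# Function names are illustrative assumptions, not taken from the abstract.

KG_PER_LB = 0.45359237        # exact definition of the avoirdupois pound
HA_PER_ACRE = 0.40468564224   # exact definition of the international acre


def lb_per_acre_to_kg_per_ha(yield_lb_ac: float) -> float:
    """Convert a yield in pounds per acre to kilograms per hectare."""
    return yield_lb_ac * KG_PER_LB / HA_PER_ACRE


def kg_per_ha_to_lb_per_acre(yield_kg_ha: float) -> float:
    """Convert a yield in kilograms per hectare to pounds per acre."""
    return yield_kg_ha * HA_PER_ACRE / KG_PER_LB


if __name__ == "__main__":
    print(round(lb_per_acre_to_kg_per_ha(1500), 1))
```

Under these definitions, 1,500 lb ac⁻¹ comes to roughly 1,681 kg ha⁻¹, which agrees with the rounded figure of 1,680 kg ha⁻¹ given in the abstract.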
... During the last decade, significant research has been performed on Thlaspi arvense L. and has primarily focused on the following: (a) description and evaluation of the apparent characteristics of pennycress [24,25]; (b) development and use of different molecular markers to evaluate the genetic variability of pennycress germplasm [26,27]; (c) domestication of crop rotation of cover crops, multisite testing, and performance evaluation [19,28,29,30]; (d) system evaluation and industrial application of pennycress as an oil crop [13,19,30,31]. Exploring and comparing the germplasm resources obtained is the key step in the preservation of germplasm resources, so that high-yield, high-quality, stable and suitable provenances can be selected for practical production [32]. ...
Article
Background: Pennycress (Thlaspi arvense L.) is an annual herbaceous plant of the Cruciferae family that has attracted attention as an oil crop and interseeded cover crop. We collected seeds of pennycress from five provenances in Northeast China and compared their characteristics, i.e. oil content, fatty acid composition, and physical, chemical and antioxidant properties; their correlations with environmental factors were also analysed. Results: There were significant differences in the seed characteristics, oil content, quality indicators and composition among different provenances (P < 0.05). The 1000-seed weight ranged from 0.80 to 1.03 g; seed oil content from 28.89 to 42.57%; iodine value from 79.19 to 99.09; saponification value from 186.51 to 199.60; peroxide value from 0.07 to 10.60; and acid value from 0.97 to 13.02. The seed oil colour values ranged over 66.53–78.78 (L*), 4.51–10.29 (a*), and 105.68–121.35 (b*). Erucic acid (C22:1) was the fatty acid with the highest content in pennycress seed oils (31.12–35.31%), followed by linoleic acid (C18:2, 16.92–18.95%) and α-linolenic acid (C18:3, 14.05–15.34%). The fatty acid 8,11,14-eicosatrienoic acid (C20:3) was detected for the first time in seed oils from Beian city, Panshi city and Kedong county, with contents of 1.13%, 0.84% and 1.03%, respectively. We compare and report for the first time on the radical-scavenging activity of the seed oils of pennycress. The EC50 values of the DPPH radical-scavenging activity and ABTS⁺ radical-scavenging activity of the seed oils from different provenances were 8.65–19.21 mg/mL and 6.82–10.61 mg/mL, respectively. The ferric ion reduction antioxidant capacity (FRAP) ranged from 0.11 to 0.30 mmol Fe²⁺/g, which is equivalent to 4 mg/mL FeSO4 of pennycress seed oils. Conclusions: There was a significant correlation between seed characteristics and changes in geographical factors. With increasing longitude, the thickness of seeds, 1000-seed weight, and seed oil content increased, while the acid and peroxide values of the seed oil decreased. As the latitude increased, the 1000-seed weight and seed oil content increased, while the seed oil peroxide value decreased. Furthermore, mean annual temperature and annual rainfall are the two key environmental factors affecting the quality of pennycress.
Preprint
Amylose biosynthesis is strictly associated with granule-bound starch synthase I (GBSSI) encoded by the Waxy gene. Mutagenesis of single bases in the Waxy gene, which was induced by CRISPR/Cas9 genome editing, caused the absence of intact GBSSI protein in grain of the edited line. Consequently, B-type granules disappeared. The amylose and amylopectin contents of waxy mutants were zero and 31.73%, while those in the wild type were 33.50% and 39.00%, respectively. The absence of waxy protein led to an increase in soluble sugar content to 37.30% compared with only 10.0% in the wild type. Sucrose and β-glucan were 39.16% and 35.40% higher in waxy mutants than in the wild type, respectively. Transcriptome analysis identified differences between the wild type and waxy mutants that could partly explain the reduction in amylose and amylopectin contents and the increase in soluble sugar, sucrose and β-glucan contents. This waxy flour, which showed lower final viscosity and setback, and higher breakdown, could provide more options for food processing.
Article
The transcription factor WRINKLED1 (WRI1) is known as a master regulator of fatty acid synthesis in developing oilseeds of Arabidopsis thaliana and other species. WRI1 is known to directly stimulate the expression of many fatty acid biosynthetic enzymes and a few targets in the lower part of the glycolytic pathway. However, it remains unclear to what extent and how the conversion of sugars into fatty acid biosynthetic precursors is controlled by WRI1. To shortlist possible gene targets for future in-planta experimental validation, here we present a strategy that combines phylogenetic footprinting of cis-regulatory elements with additional layers of evidence. Upstream regions of protein-encoding genes in A. thaliana were searched for the previously described DNA-binding consensus for WRI1, the ASML1/WRI1 (AW)-box. For about 900 genes, AW-box sites were found to be conserved across orthologous upstream regions in 11 related species of the crucifer family. For 145 select potential target genes identified this way, affinity of upstream AW-box sequences to WRI1 was assayed by Microscale Thermophoresis. This allowed definition of a refined WRI1 DNA-binding consensus. We find that known WRI1 gene targets are predictable with good confidence when upstream AW-sites are phylogenetically conserved, specifically binding WRI1 in the in vitro assay, positioned in proximity to the transcriptional start site, and if the gene is co-expressed with WRI1 during seed development. When targets predicted in this way are mapped to central metabolism, a conserved regulatory blueprint emerges that infers concerted control of contiguous pathway sections in glycolysis and fatty acid biosynthesis by WRI1. Several of the newly predicted targets are in the upper glycolysis pathway and the pentose phosphate pathway. Of these, plastidic isoforms of fructokinase (FRK3) and of phosphoglucose isomerase (PGI1) are particularly corroborated by previously reported seed phenotypes of respective null mutations.
Article
Thlaspi arvense (field pennycress) is being domesticated as a winter annual oilseed crop capable of improving ecosystems and intensifying agricultural productivity without increasing land use. It is a selfing diploid with a short life cycle and is amenable to genetic manipulations, making it an accessible field-based model species for genetics and epigenetics. The availability of a high-quality reference genome is vital for understanding pennycress physiology and for clarifying its evolutionary history within the Brassicaceae. Here, we present a chromosome-level genome assembly of var. MN106-Ref with improved gene annotation and use it to investigate gene structure differences between two accessions (MN108 and Spring32-10) that are highly amenable to genetic transformation. We describe non-coding RNAs, pseudogenes, and transposable elements, and highlight tissue specific expression and methylation patterns. Resequencing of forty wild accessions provided insights into genome-wide genetic variation and QTL regions were identified for a seedling color phenotype. Altogether, these data will serve as a tool for pennycress improvement in general and for translational research across the Brassicaceae.
Preprint
Background: Amylose biosynthesis is strictly associated with granule-bound starch synthase I (GBSSI) encoded by the Waxy gene. Waxy barley has extensive prospects for application in functional food development and the brewing industry; however, amylose-free waxy barleys are relatively scarce in nature. Results: Here we created new alleles of the Waxy gene using CRISPR/Cas9 genome editing. Mutagenesis of single bases in these novel alleles caused the absence of intact waxy protein in grain of the edited line. Consequently, B-type granules disappeared. The amylose and amylopectin contents of the edited line were zero and 31.73%, while those in the wild type (WT) were 33.50% and 39.00%, respectively. The absence of waxy protein led to an increase in soluble sugar content to 37.30% compared with only 10.0% in the WT. Typical soluble sugars, sucrose and β-glucan, were 39.16% and 35.40% higher in the edited line than in the WT, respectively. Transcriptome analysis identified differences between the edited line and the WT that could partly explain the reduction in amylose and amylopectin contents and the increase in soluble sugar, sucrose and β-glucan contents. Conclusions: The barley cultivar with novel alleles of the Waxy gene contained zero amylose, lower amylopectin, and higher soluble sugar, sucrose and β-glucan than the wild type. This new cultivar provides a good germplasm resource for improving the quality of barley.
Article
In the Upper Midwest, corn (Zea mays L.) and soybean (Glycine max [L.] Merr.) dominate the landscape, but only for six to seven months of the year. Thus, opportunities exist to establish crops that can utilize the remainder of the growing season and contribute to overall farm profitability. One species of interest is pennycress (Thlaspi arvense L.), but a lack of established agronomic best management practices is a barrier to successful crop production. The objectives of this study were to identify a range of cumulative growing degree days (CGDD) corresponding to pennycress physiological maturity, determine the optimal harvest window that maximizes pennycress seed yield and oil content, and characterize changes in pennycress seed attributes over seed maturation. This study was conducted over the 2016–2017 and 2017–2018 growing seasons with ‘MN106’ pennycress at two locations in Minnesota, USA. Seed dry weight stabilized within the window of maximum seed yield, but oil content did not maximize until after this period. However, there was minimal loss of oil content when pennycress was harvested within the seed yield maximization window. Based on these parameters, as well as seed moisture, it was estimated that pennycress reached physiological maturity between 2230 and 2250 °C d CGDD, or about a week prior to harvest maturity in terms of crop phenology. Delaying harvest to harvest maturity resulted in a 26% loss in harvestable seed due to seed shatter compared with the average maximum seed yield of 928 kg ha⁻¹. Ensuring maximum pennycress seed yield and oil content at harvest is imperative to successful production and contribution to farm economic viability.
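The maturity window in this abstract is expressed in cumulative growing degree days (CGDD). The abstract does not state how CGDD was computed, so the sketch below assumes the common averaging method, in which each day contributes the mean of its minimum and maximum temperature minus a base temperature, floored at zero. The base temperature of 0 °C and all function names are assumptions made for illustration only.

```python
# Minimal sketch of a cumulative growing degree day (CGDD) calculation.
# Assumptions (not from the abstract): simple averaging method and a
# hypothetical base temperature of 0 degrees C.
from typing import Iterable, Optional, Tuple

DailyTemps = Tuple[float, float]  # (t_min, t_max) in degrees C


def daily_gdd(t_min: float, t_max: float, base_c: float = 0.0) -> float:
    """Degree days for one day; days colder than the base contribute zero."""
    return max(0.0, (t_min + t_max) / 2.0 - base_c)


def cumulative_gdd(days: Iterable[DailyTemps], base_c: float = 0.0) -> float:
    """Sum the daily degree days over a sequence of (t_min, t_max) pairs."""
    return sum(daily_gdd(t_min, t_max, base_c) for t_min, t_max in days)


def first_day_reaching(
    days: Iterable[DailyTemps], threshold: float, base_c: float = 0.0
) -> Optional[int]:
    """Return the 1-based day on which CGDD first meets the threshold."""
    total = 0.0
    for day, (t_min, t_max) in enumerate(days, start=1):
        total += daily_gdd(t_min, t_max, base_c)
        if total >= threshold:
            return day
    return None  # threshold never reached in the supplied record
```

With a daily temperature record for a site, `first_day_reaching(record, 2230)` would estimate when the lower bound of the reported maturity window (2230 to 2250 °C d) is met, provided the record and base temperature match those used in the study.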
Article
Intermediate wheatgrass (IWG) breeding with food use as the primary goal has been ongoing for about 30 years. Tremendous improvements in grain yield, shatter resistance, and free-threshing ability have been achieved, coupled with considerable but comparably moderate increases in seed size. Larger seeds have prompted flour refinement evaluations, which has led to pronounced improvements in flour and bread properties.
Article
The myrosinase-glucosinolate system is involved in a range of biological activities affecting herbivorous insects, plants and fungi. The system characteristic of the order Capparales includes sulphur-containing substrates, the degradative enzymes myrosinases, and cofactors. The enzyme-catalyzed hydrolysis of glucosinolates initially involves cleavage of the thioglucoside linkage, yielding D-glucose and an unstable thiohydroximate-O-sulphonate that spontaneously rearranges, resulting in the production of sulphate and one of a wide range of possible reaction products. The products are generally a thiocyanate, isothiocyanate or nitrile, depending on factors such as substrate, pH or availability of ferrous ions. Glucosinolates in crucifers exemplify components that are often present in food and feed plants and are a major problem in the utilization of products from the plants. Toxic degradation products restrict the use of cultivated plants, e.g. those belonging to the Brassicaceae. The myrosinase-glucosinolate system may, however, have several functions in the plant. The glucosinolate degradation products are involved in defence against insects and phytopathogens, and potentially in sulphur and nitrogen metabolism and growth regulation. The compartmentalization of the components of the myrosinase-glucosinolate system and the cell-specific expression of the myrosinase represent a unique plant defence system. In this review, we summarize earlier results and discuss the organisation and biochemistry of the myrosinase-glucosinolate system.
Article
In order to more fully utilize pennycress, a potentially viable bio-diesel source, the proteinaceous components were extracted from pennycress seeds and press cake. The amino acid composition of the proteins present in pennycress was typical for proteins derived from plants, with glycine, glutamic acid and alanine being prevalent. Water, 0.5 M sodium chloride, 60% acetic acid, 0.1 M sodium hydroxide and ethanol were used in sequential order to remove the protein from pennycress seeds and press cake and determine the various soluble protein fractions. Extraction temperature was varied from 5 to 77 °C. The highest yield of material (35%) was obtained by extracting pennycress seeds with water at 77 °C. However, this material had only moderate levels of protein (25%) with the remainder being carbohydrates and oil (as determined by infrared spectroscopy). The use of 0.5 M sodium chloride to remove protein from press cake at 5 °C produced material with the highest protein content (83%), but extraction yield was 25%. When extractions were carried out at 77 °C, oil typically began to be a major impurity in the protein. Using bomb calorimetry, the material remaining after extraction was found to have some value as a fuel source.
Article
The US Federal Aviation Administration (FAA) has a goal that one billion gallons of renewable jet fuel is consumed by the US aviation industry each year from 2018. We examine the cost to US airlines of meeting this goal using renewable fuel produced from a Hydroprocessed Esters and Fatty Acids (HEFA) process from renewable oils. Our approach employs an economy-wide model of economic activity and energy systems and a detailed partial equilibrium model of the aviation industry. If soybean oil is used as a feedstock, we find that meeting the aviation biofuel goal in 2020 will require an implicit subsidy to biofuel producers of $2.69 per gallon of renewable jet fuel. If the aviation goal can be met by fuel from oilseed rotation crops grown on otherwise fallow land, the implicit subsidy is $0.35 per gallon of renewable jet fuel. As commercial aviation biofuel consumption represents less than two per cent of total fuel used by this industry, the goal has a small impact on the average price of jet fuel and carbon dioxide emissions. We also find that, as the product slate for HEFA processes includes diesel and jet fuel, there are important interactions between the goal for renewable jet fuel and mandates for ground transportation fuels.
Article
Field Pennycress (Thlaspi arvense L.)—The Stinkweed - Volume 10 Issue 3 - Larry W. Mitich
Article
One of the most important breakthroughs in the history of genetics was the discovery that mutations can be induced (Muller, 1930; Stadler, 1932). The high frequency with which ionizing radiation and certain chemicals can cause genes to mutate made it possible to perform genetic studies that were not feasible when only spontaneous mutations were available ...
Article
Field pennycress (Thlaspi arvense L.; hereafter pennycress) is an oilseed crop being investigated as an off-season biofuel source that can potentially fit into the existing crop rotation cycle with soybean [Glycine max (L.) Merr.]. The objective of this 2-yr study was to evaluate the effect of pennycress residue on seed yield and quality components of soybean planted during five consecutive weeks, from mid-May to mid-June. In 2009 and 2010, the mean soybean dry weight seed yield after pennycress residue for all planting dates (4108 and 3490 kg ha⁻¹, respectively) was greater than yield from fallow control plots (3636 and 2992 kg ha⁻¹, respectively). However, in 2010, soybean planted after pennycress had slightly lower oil content (202 g kg⁻¹) than that obtained from fallow control plots (207 g kg⁻¹) (P < 0.01). Delayed planting until mid-June resulted in lower population density, plant height, seed yield, and oil concentration. Before June, planting date had no significant influence on soybean seed yield and quality. Protein content in soybean seed was not affected by year, pennycress residue, or planting date. Variation in the experimental year temperature values led to significant changes in oil components. High temperatures decreased levels of linoleic, linolenic, and stearic acids but increased levels of palmitic and oleic acids. Overall, pennycress had no negative effect on the subsequent soybean crop.
Article
An updated review of biological information is provided for Thlaspi arvense. Native to Eurasia, the species is naturalized and widely spread in temperate regions of the northern hemisphere, including all of Canada's provinces and territories, and has recently spread to temperate regions in the southern hemisphere. It is an annual pioneer of disturbed soils and is an important weed of grain, oilseed, and forage crops in Canada, particularly in the prairies. High levels of erucic acid and glucosinolates can contaminate canola. When present in hay or other fodder, its seeds or leaves can be toxic to animals, as well as contaminate milk and meat with unpleasant flavors. It can serve as a host for insect, nematode, fungal and viral pests of canola and mustard crops. A persistent seed bank, high fecundity, and the growth habit of a hardy winter annual with early- (EF) and late-flowering (LF) strains, all contribute to its ability to compete with crops. Effective herbicides include the sulfonylureas, chlorsulfuron and ethametsulphuron, MCPA, tribenuron-methyl, phenoxyacetic acid, flurtamone, 2,4-D, 2,4-D + dicamba, and 2,4-D + picloram. A biotype resistant to Group 2 herbicides, which inhibit acetolactate synthase (ALS), was found at two to five sites in Alberta in 2001. The potential of T. arvense as an industrial oilseed crop is being investigated.