Gene Networks Underlying Cannabinoid and Terpenoid Accumulation in Cannabis

Glandular trichomes are specialized anatomical structures that accumulate secretions with important biological roles in plant-environment interactions. These secretions also have commercial uses in the flavor, fragrance, and pharmaceutical industries. The capitate-stalked glandular trichomes of Cannabis sativa (cannabis), situated on the surfaces of the bracts of the female flowers, are the primary site for the biosynthesis and storage of resins rich in cannabinoids and terpenoids. In this study, we profiled nine commercial cannabis strains with purportedly different attributes, such as taste, color, smell, and genetic origin. Glandular trichomes were isolated from each of these strains, and cell type-specific transcriptome data sets were acquired. Cannabinoids and terpenoids were quantified in flower buds. Statistical analyses indicated that these data sets enable the high-resolution differentiation of strains by providing complementary information. Integrative analyses revealed a coexpression network of genes involved in the biosynthesis of both cannabinoids and terpenoids from imported precursors. Terpene synthase genes involved in the biosynthesis of the major monoterpenes and sesquiterpenes routinely assayed by cannabis testing laboratories were identified and functionally evaluated. In addition to cloning variants of previously characterized genes, specifically CsTPS14CT [(−)-limonene synthase] and CsTPS15CT (β-myrcene synthase), we functionally evaluated genes that encode enzymes with activities not previously described in cannabis, namely CsTPS18VF and CsTPS19BL (nerolidol/linalool synthases), CsTPS16CC (germacrene B synthase), and CsTPS20CT (hedycaryol synthase). This study lays the groundwork for developing a better understanding of the complex chemistry and biochemistry underlying resin accumulation across commercial cannabis strains.
Jordan J. Zager (a), Iris Lange (a), Narayanan Srividya (a), Anthony Smith (b), and B. Markus Lange (a,2,3)
(a) Institute of Biological Chemistry and M.J. Murdock Metabolomics Laboratory, Washington State University, Pullman, Washington 99164-6340
(b) Evio Labs, Central Point, Oregon 97502
ORCID IDs: 0000-0001-6970-5832 (J.J.Z.); 0000-0001-7934-7987 (N.S.); 0000-0001-6565-9584 (B.M.L.).
1 This work was supported by gifts from private individuals, with no association with the cannabis industry. All work with raw materials was conducted by A.S. at a facility accredited to National Environmental Laboratory Accreditation Program standards and licensed by the Oregon Liquor Control Commission. Work of employees of Washington State University (J.J.Z., I.L., and B.M.L.) was performed in accordance with the OR/ORSO Guideline of July 2017.
2 Author for contact: lange-m@wsu.edu.
3 Senior author.
The author responsible for distribution of materials integral to the findings presented in this article in accordance with the policy described in the Instructions for Authors (www.plantphysiol.org) is: B. Markus Lange (lange-m@wsu.edu).
J.J.Z., A.S., and B.M.L. designed the experiments; A.S. harvested and extracted plant materials; A.S. performed metabolite analyses; J.J.Z., I.L., and N.S. cloned terpene synthase genes and performed functional assays; J.J.Z., A.S., and B.M.L. analyzed the data; J.J.Z. and B.M.L. wrote the article, with input from all authors.
[OPEN] Articles can be viewed without a subscription.
www.plantphysiol.org/cgi/doi/10.1104/pp.18.01506
Plant Physiology®, August 2019, Vol. 180, pp. 1877–1897, www.plantphysiol.org © 2019 American Society of Plant Biologists. All Rights Reserved.
Cannabis sativa (cannabis) was originally discovered in Central Asia and has likely been cultivated for tens of thousands of years by human civilizations, with the first mention about 5,000 years ago in Chinese texts (Unschuld, 1986). Whereas the initial utility was primarily as a source of grain and fiber, strains with medicinal properties were already in use in northwest China some 2,700 years ago, as evidenced by the detection of the psychoactive cannabinoid, (−)-trans-Δ9-tetrahydrocannabinol (THC), in plant residues recovered from an ancient grave (Russo et al., 2008). Cannabis strains containing less THC but more of the nonpsychoactive cannabidiol (CBD), commonly referred to as hemp, were grown in Roman Britain for grain and fiber but later found additional uses as a medicine during the Anglo-Saxon period (Grattan and Singer, 1952). The 1925 Geneva International Opium Convention required signatories to control the trade of certain drugs (including cannabis), which was followed by increasingly restrictive resolutions by the League of Nations and later the United Nations (United Nations, 1966). Until very recently, cannabis was considered an illicit substance of abuse by many governments and could only be researched by selected, authorized scientists in tightly supervised laboratories. Despite these restrictions, evidence for the medicinal potential was sufficiently convincing that, by the mid-1980s, the synthetic cannabinoids nabilone and dronabinol had been granted approval by the U.S. Food and Drug Administration to suppress nausea during chemotherapy (Abuhasira et al., 2018). The discovery of the existence of a high-affinity cannabinoid receptor in the rat brain during the late 1980s (Devane et al., 1988) prompted further research to identify the endogenous ligands. This resulted in the characterization, beginning in the early 1990s, of several lipid-based retrograde neurotransmitters (endocannabinoids) and multiple enzymes involved in their biosynthesis, trafficking, and perception (the endocannabinoid system), which were subsequently demonstrated to regulate a multitude of physiological and cognitive processes in humans and other animals (Devane et al., 1992). With receptor targets in hand, follow-up research and clinical trials brought several additional cannabis-related products to the pharmaceutical marketplace, including nabiximols (marketed as Sativex in Canada since 2005), a cannabis extract used to treat symptoms of multiple sclerosis, and a formulation of highly purified, plant-sourced CBD (marketed as Epidiolex in the United States since early 2018) to treat certain forms of epilepsy. In the meantime, several jurisdictions and even entire countries changed their policies on cannabis, endorsing laws that allow its therapeutic use and decriminalizing or even legalizing it for recreational purposes (Abuhasira et al., 2018). Legislation has not been able to keep up with these recent developments, and specific labeling regulations with regard to the composition of active ingredients, serving sizes, and recommended doses are woefully lacking (Subritzky et al., 2016). This situation is exacerbated by an inadequate understanding of how the chemistry (cannabinoids and other specialized metabolites) of cannabis extracts and formulations relates to their biological effects.
Since the original structural elucidation, during the early 1960s, of THC as a psychoactive principle in cannabis (Gaoni and Mechoulam, 1964), the structures of more than 90 biogenic cannabinoids have been reported to occur in members of the genus Cannabis (Andre et al., 2016), with a handful of constituents being the most prominent across strains (Fig. 1). These cannabinoids accumulate primarily in capitate-stalked glandular trichomes of female plants at the flowering stage. A second class of metabolites with high abundance and even greater chemical diversity in cannabis glandular trichomes are monoterpenes and sesquiterpenes (Fig. 1; Brenneisen, 2007). These volatile terpenoids are responsible for the distinctive aromas of different cannabis strains. The popular press and trade magazines liberally use the term "entourage effect" to suggest that synergism among cannabinoids or between cannabinoids and other constituents (in particular terpenoids) may contribute to different psychological perceptions of cannabis varieties by users. In support of this view, β-caryophyllene, a sesquiterpene with almost ubiquitous occurrence in plant oils and resins, was demonstrated to bind with high affinity to the CB2 cannabinoid receptor and has therefore been referred to as a "dietary cannabinoid" (Gertsch et al., 2008). However, there is only limited clinical evidence for entourage effects of terpenoids in cannabis formulations (Gertsch et al., 2010; Russo, 2011). Irrespective of these considerations, the chemical composition of each cannabis strain is unique, and acquiring a metabolic fingerprint is an excellent first step in building a more robust scientific foundation for assessing the correlation between the composition of plant material and the perception by users (Fischedick et al., 2010).
Figure 1. Shared origin of the cannabinoid and terpenoid biosynthetic pathways. A circled P denotes phosphate moieties.
Most of the cannabis products traded licitly or illicitly today are sourced from strains for which minimal documentation is available in the public domain and for which the primary goal was clearly to breed high-THC strains (Cascini et al., 2012). In other words, the genetics underlying chemical diversity in commercial cannabis strains is currently poorly understood (Welling et al., 2016). In this context, it is interesting that cannabinoids and terpenoids share a common biosynthetic origin. The biosynthesis of the prominent cannabinoids involves two direct precursor pathways. The polyketide pathway gives rise to olivetolic acid from a short-chain fatty acid intermediate (hexanoyl-CoA), whereas the methylerythritol 4-phosphate (MEP) pathway provides geranyl diphosphate (GPP; Fig. 1; Fellermeier et al., 2001; Taura et al., 2009; Gagne et al., 2012; Stout et al., 2012). An aromatic prenyltransferase catalyzes the formation of cannabigerolic acid from olivetolic acid and GPP (Fellermeier and Zenk, 1998; Page and Boubakir, 2012). The pathway then branches again toward different cyclized products, such as tetrahydrocannabinolic acid (THCA), cannabidiolic acid (CBDA), and cannabichromenic acid (Fig. 1; Sirikantaramas et al., 2005; Taura et al., 2007). The corresponding neutral cannabinoids (such as THC and CBD) are formed nonenzymatically from these acids by decarboxylation upon exposure to heat. Plant monoterpenes are mostly derived from the plastid-localized MEP pathway, whereas the cytosolic/peroxisomal mevalonate pathway is a common source of precursors for sesquiterpenes, although cross talk between both pathways has also been reported (Fig. 1; Hemmerlin et al., 2012). Terpene synthases catalyze the first committed step in the biosynthesis of a specific terpenoid from a prenyl diphosphate precursor of the appropriate chain length. To date, monoterpene synthases (accepting a C10 precursor) and sesquiterpene synthases (acting on a C15 precursor) that are responsible for the production of about half a dozen terpenoids in cannabis have been reported (Fig. 1; Günnewich et al., 2007; Booth et al., 2017), with many more awaiting functional characterization. In this article, we report the chemical profiles and corresponding gene networks across several cannabis strains, thereby building the foundation for a better understanding of their chemical and biochemical diversity.
RESULTS
Strategic Considerations for Logistics, Strain Selection, and Experimental Design
One of the goals of this pilot study was to test the utility of combining metabolic and transcriptomic data to differentiate cannabis strains with regard to the most relevant traits. To ensure the consistency of data sets, all plant materials were sourced from the same facility, where they had been maintained under comparable growth conditions (Shadowbox Farms in Williams, Oregon). Plant harvest was performed when the appearance of glandular trichome content had changed from a turbid white to clear and before another change to an amber-like color occurred. For most strains, the pistils had changed color from white to yellow or orange. These are the visual cues used by experienced growers to indicate optimal harvest time. All further processing was performed with fresh (uncured) material to avoid the previously reported loss of terpenoid volatiles during drying (Ross and ElSohly, 1996). Cannabinoids and terpenoids were extracted and quantified at a testing facility licensed according to the National Environmental Laboratory Accreditation Program's TNI 2009 Standard (Evio Labs). At this facility, fractions highly enriched in glandular trichomes were obtained and RNA was isolated, with minor modifications, using previously established protocols (Lange et al., 2000). Glandular trichome-specific RNA sequencing (RNA-seq) data were then acquired by a commercial service provider (Quick Biology). Metabolite and transcriptome data were acquired for three biological replicates per strain.
This study involved a selection of strains, some with predominantly C. sativa ancestry, whereas Cannabis indica (formally classified as C. sativa forma indica) was dominant in others (Fig. 2). Strains of C. sativa provenance are generally characterized by fairly thin and narrow leaves, comparatively longer flowering cycles, and a relatively tall stature. A typical example in this study is Mama Thai, which is generally considered a landrace of C. sativa. In contrast, C. indica strains ordinarily have large and thick leaves, a rather short flowering cycle (6–8 weeks), and a proportionately short habitus (Fig. 2A). Our pilot study featured Blackberry Kush as a C. indica-dominant strain. The remaining strains were hybrids of mixed C. sativa and C. indica lineage, plus one strain (Terple) with poorly documented origin (Fig. 2B).
To address our goal of assessing the utility of our data for classifying strains, RNA-seq and chemical data (cannabinoid and terpenoid profiles) were subjected to multivariate statistical analyses. We then tested the hypothesis that cannabinoid and terpenoid pathways are coregulated by performing gene coexpression network analyses. A combination of gene network and phylogenetic analyses was subsequently used to identify candidate genes for hitherto uncharacterized terpene synthases that contribute significantly to the cannabis volatile bouquet.
Strain Differentiation Based on RNA-Seq Data
Figure 2. Characteristics of cannabis strains. A, Floral phenotypes. B, Origins and aroma descriptions (according to https://www.leafly.com).
High-quality libraries reflecting transcripts expressed in isolated glandular trichomes were subjected to RNA-seq analysis (nine strains, three biological replicates each, 27 samples total) on the Illumina HiSeq 4000 platform. A de novo consensus transcriptome assembly was generated using the Trinity suite (Haas et al., 2013; assembly statistics are given in Supplemental Table S1). The reads were assembled into contigs covering a total of 305 Mb of sequence with a GC content of 40.4%. The resulting assembly comprised 514,208 contigs of at least 201 bp in length, with an N50 value of 833 bp (contigs of this length or longer together contain at least 50% of the total transcriptome sequence). The assembled transcriptome data set was searched against the National Center for Biotechnology Information nonredundant protein database, which resulted in the annotation of 82,523 sequences at e-values < 1e-5. Read counts for each transcript in each sample were then processed with the RSEM software package (Li and Dewey, 2011) to calculate normalized expression levels as transcripts per kilobase million (TPM). Transcripts with TPM values lower than 5 across all varieties were removed from subsequent analysis, resulting in 46,559 predicted genes with significant expression (Supplemental Table S2).
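The N50 statistic and the expression filter described in this paragraph can be illustrated with a short Python sketch. It is not the Trinity or RSEM implementation; the function names, the toy contig lengths, and the toy TPM values are assumptions introduced here purely for illustration.

```python
def n50(contig_lengths):
    """Return the N50: the largest length L such that contigs of length >= L
    together contain at least half of the total assembled sequence."""
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("no contigs given")


def filter_by_tpm(tpm_by_transcript, threshold=5.0):
    """Drop transcripts whose TPM is below the threshold in every sample,
    i.e. keep those that reach the threshold in at least one sample."""
    return {
        name: values
        for name, values in tpm_by_transcript.items()
        if max(values) >= threshold
    }


# Hypothetical contig lengths (bp), not taken from the study.
toy_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
toy_n50 = n50(toy_lengths)

# Hypothetical TPM values for two transcripts across three samples.
toy_tpm = {
    "contig_A": [0.5, 7.2, 1.0],   # reaches the threshold once, so it is kept
    "contig_B": [1.1, 4.9, 0.0],   # below the threshold everywhere, so it is dropped
}
kept = filter_by_tpm(toy_tpm)
```

Sorting the contigs from longest to shortest and stopping once the running sum reaches half of the total is the usual way to obtain N50; the default threshold of 5 mirrors the cutoff stated in the text.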
As a first step to investigate the utility of RNA-seq for strain categorization, transcriptome data sets were subjected to principal component analysis (PCA), a statistical procedure that reduces attribute space from a larger number of variables to a smaller number of so-called principal components, thereby decreasing the dimensionality of the original data. The first three principal components accounted for 83% of the variability in the data set (Fig. 3A). The replicates for each strain clustered together in a three-dimensional PCA plot, whereas the component scores for each strain were separated from those of all other strains, indicating that the overall transcriptome of each strain is unique (Fig. 3A). Processing of RNA-seq data by hierarchical clustering analysis (HCA), which builds a cluster hierarchy that is commonly displayed as a dendrogram, grouped strains into two major clades (Fig. 3B). The first clade contained Blackberry Kush, Cherry Chem, and Terple, whereas the second consisted of Mama Thai, White Cookies, Valley Fire, Black Lime, Canna Tsu, and Sour Diesel, indicating a clear separation of strains by heritage (C. indica for clade 1 and C. sativa for clade 2).
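To make the notion of "variability accounted for by the first principal components" concrete, the following Python sketch computes explained-variance fractions for a small samples-by-variables table. It is a minimal illustration that assumes power iteration with deflation on the sample covariance matrix; the software actually used for the PCA in this study is not specified here, and all function names and the toy data are hypothetical.

```python
import math


def covariance_matrix(rows):
    """Sample covariance of the columns (variables) of a samples-by-variables table."""
    n, p = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(p)]
    centered = [[row[j] - means[j] for j in range(p)] for row in rows]
    return [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(p)]
        for i in range(p)
    ]


def leading_eigenpair(matrix, iterations=500):
    """Power iteration: largest eigenvalue and unit eigenvector of a
    symmetric positive semidefinite matrix."""
    p = len(matrix)
    vector = [1.0 / math.sqrt(p)] * p
    eigenvalue = 0.0
    for _ in range(iterations):
        product = [sum(matrix[i][j] * vector[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in product))
        if norm == 0.0:
            break
        vector = [x / norm for x in product]
        eigenvalue = norm
    return eigenvalue, vector


def explained_variance_ratios(rows, components=3):
    """Fraction of the total variance captured by each of the leading components."""
    cov = covariance_matrix(rows)
    p = len(cov)
    total_variance = sum(cov[i][i] for i in range(p))
    ratios = []
    for _ in range(min(components, p)):
        eigenvalue, v = leading_eigenpair(cov)
        ratios.append(eigenvalue / total_variance)
        # Deflation: subtract the component just found before searching for the next.
        cov = [[cov[i][j] - eigenvalue * v[i] * v[j] for j in range(p)] for i in range(p)]
    return ratios


# Hypothetical table: four samples, two uncorrelated variables.
toy_rows = [[0.0, 0.0], [2.0, 0.0], [0.0, 1.0], [2.0, 1.0]]
toy_ratios = explained_variance_ratios(toy_rows, components=2)
```

Summing the first three entries of the returned list gives the quantity reported in the text (83% for the RNA-seq data). Power iteration is chosen here only because it needs no external libraries; a production analysis would normally rely on an established linear algebra routine.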
Figure 3. Cannabis strain differentiation based on glandular trichome-specific RNA-seq data. A, Three-dimensional plot representing outcomes of a PCA. B, Heat map of a two-way HCA. The numerical values and red-white-blue color code indicate the log2 fold change compared with the average gene expression value across all strains. Strain abbreviations at the bottom of B are as follows: BB, Blackberry Kush; BL, Black Lime; CC, Cherry Chem; CT, Canna Tsu; MT, Mama Thai; SD, Sour Diesel; T, Terple; VF, Valley Fire; WC, White Cookies.
Strain Differentiation Based on Metabolite Profiling Data
The highly robust analytical platforms that served as the basis for the analysis of six cannabinoids and 24 terpenoids were described in a previous report (Fischedick et al., 2010) and used here with minor modifications. Cannabinoid concentration was highest in White Cookies (28.4% of flower bud dry weight), with relatively high contents also occurring in Cherry Chem (17.7%), Black Lime (17.5%), Blackberry Kush (15.8%), Valley Fire (15.7%), Terple (15.6%), Sour Diesel (12.4%), and Canna Tsu (12.2%; Table 1). Significantly lower concentrations were detected in Mama Thai (6.4%). In eight of the nine strains investigated, THCA was the major cannabinoid, ranging from 26.3% of the flower bud dry weight in White Cookies to 5.9% in Mama Thai (Table 1). The only exception was the Canna Tsu strain, in which CBDA (7.8% of flower bud dry weight) dominated over THCA (3.2%), whereas CBDA in all other strains remained at 1% or less. Two additional cannabinoids of fairly high abundance were cannabinol, which accumulated to 0.2% to 1.7% of flower bud dry weight, and tetrahydrocannabinol, which amounted to 0.2% to 1.6% (Table 1; for structures, see Fig. 1). Cannabichromene was not detected in any of the sampled varieties.

Table 1. Constituents of cannabis female flower buds (metabolite content in nine strains expressed as percentage of dry weight)
n.d., Not detectable.
Metabolite | Blackberry Kush | Black Lime | Canna Tsu | Cherry Chem | Valley Fire | Mama Thai | Sour Diesel | Terple | White Cookies
Cannabinoids
THCA | 13.56 ± 0.90 | 15.02 ± 1.10 | 3.19 ± 0.20 | 16.55 ± 0.81 | 13.89 ± 1.33 | 5.91 ± 0.60 | 11.31 ± 1.04 | 13.72 ± 1.36 | 26.33 ± 0.54
Tetrahydrocannabinol | 0.31 ± 0.02 | 1.62 ± 0.19 | 0.55 ± 0.055 | 0.15 ± 0.008 | 0.41 ± 0.049 | 0.14 ± 0.02 | 0.22 ± 0.027 | 1.15 ± 0.12 | 0.86 ± 0.091
CBDA | 0.45 ± 0.02 | 0.12 ± 0.012 | 7.76 ± 0.63 | 0.079 ± 0.007 | 0.037 ± 0.001 | 0.016 ± 0.003 | 0.032 ± 0.002 | 0.067 ± 0.002 | 0.088 ± 0.004
CBD | 0.95 ± 0.07 | 0.139 ± 0.016 | 0.085 ± 0.013 | 0.079 ± 0.008 | 0.12 ± 0.004 | 0.047 ± 0.005 | 0.086 ± 0.006 | 0.11 ± 0.008 | 0.098 ± 0.013
Cannabigerol | 0.12 ± 0.015 | 0.086 ± 0.008 | 0.093 ± 0.008 | 0.051 ± 0.005 | 0.15 ± 0.027 | 0.016 ± 0.001 | 0.052 ± 0.004 | 0.093 ± 0.002 | 0.25 ± 0.005
Cannabinol | 1.74 ± 0.20 | 0.55 ± 0.019 | 0.53 ± 0.051 | 0.83 ± 0.019 | 1.12 ± 0.14 | 0.29 ± 0.028 | 0.68 ± 0.033 | 0.502 ± 0.007 | 0.78 ± 0.025
Cannabichromene | n.d. | n.d. | n.d. | n.d. | n.d. | n.d. | n.d. | n.d. | n.d.
Total cannabinoids | 15.87 ± 1.13 | 17.53 ± 1.25 | 12.20 ± 0.85 | 17.74 ± 0.82 | 15.70 ± 1.53 | 6.41 ± 0.65 | 12.387 ± 1.05 | 15.64 ± 1.47 | 28.40 ± 0.54
Monoterpenes
β-Myrcene | 2.35 ± 0.2 | 4.34 ± 0.36 | 1.70 ± 0.15 | 1.61 ± 0.049 | 2.24 ± 0.28 | 0.11 ± 0.009 | 0.70 ± 0.046 | 2.96 ± 0.25 | 1.14 ± 0.17
(−)-Limonene | 0.29 ± 0.02 | 0.89 ± 0.08 | 0.16 ± 0.021 | 0.23 ± 0.015 | 0.65 ± 0.098 | 0.03 ± 0.003 | 0.17 ± 0.011 | 0.23 ± 0.019 | 1.53 ± 0.24
α-Pinene | 0.015 ± 0.001 | 1.99 ± 0.12 | 0.38 ± 0.039 | 0.016 ± 0.001 | 0.044 ± 0.008 | 0.007 ± 0.001 | 0.004 ± 0 | 0.82 ± 0.051 | 0.20 ± 0.032
β-Pinene | 0.086 ± 0.005 | 0.50 ± 0.034 | 0.18 ± 0.025 | 0.056 ± 0.003 | 0.11 ± 0.013 | 0.026 ± 0.002 | 0.039 ± 0.002 | 0.31 ± 0.022 | 0.04 ± 0.007
1,8-Cineole | 0.26 ± 0.02 | 0.38 ± 0.038 | 0.52 ± 0.075 | 0.464 ± 0.012 | 0.22 ± 0.028 | 0.057 ± 0.007 | 0.11 ± 0.011 | 0.00 ± 0 | 0.31 ± 0.037
Linalool | 0.082 ± 0.005 | 0.079 ± 0.004 | 0.052 ± 0.005 | 0.13 ± 0.003 | 0.16 ± 0.027 | 0.023 ± 0.002 | 0.074 ± 0.005 | 0.067 ± 0.006 | 0.57 ± 0.072
Terpinolene | 0.019 ± 0.001 | 0.034 ± 0.003 | 0.019 ± 0.002 | 0.019 ± 0.001 | 0.02 ± 0.003 | 0.13 ± 0.016 | 0.017 ± 0.001 | 0.02 ± 0.002 | 0.041 ± 0.006
Borneol | 0.039 ± 0.002 | 0.041 ± 0.003 | n.d. | 0.032 ± 0.002 | 0.033 ± 0.005 | 0.021 ± 0.002 | 0.026 ± 0.002 | 0.036 ± 0.002 | 0.048 ± 0.008
?-Ocimene | n.d. | 0.039 ± 0.003 | n.d. | n.d. | 0.006 ± 0.001 | 0.13 ± 0.014 | n.d. | 0.086 ± 0.007 | 0.015 ± 0.002
Camphene | n.d. | 0.089 ± 0.008 | 0.055 ± 0.007 | n.d. | 0.004 ± 0.001 | n.d. | n.d. | 0.019 ± 0.002 | 0.07 ± 0.012
δ-3-Carene | 0.029 ± 0.002 | 0.052 ± 0.006 | 0.003 ± 0 | 0.008 ± 0.001 | 0.022 ± 0.003 | 0.003 ± 0.001 | n.d. | 0.027 ± 0.002 | 0.016 ± 0.002
Camphor | 0.044 ± 0.003 | 0.006 ± 0.001 | n.d. | n.d. | n.d. | n.d. | n.d. | n.d. | 0.101 ± 0.013
(1)-Terpinene | 0.001 ± 0.001 | n.d. | n.d. | n.d. | n.d. | 0.005 ± 0.001 | n.d. | n.d. | 0.002 ± 0
Total monoterpenes | 3.23 ± 0.26 | 8.43 ± 0.66 | 3.07 ± 0.32 | 2.56 ± 0.085 | 3.52 ± 0.47 | 0.54 ± 0.057 | 1.14 ± 0.078 | 4.57 ± 0.36 | 4.09 ± 0.60
Sesquiterpenes
β-Caryophyllene | 0.13 ± 0.01 | 0.24 ± 0.023 | 0.21 ± 0.022 | 0.74 ± 0.012 | 0.23 ± 0.034 | 0.12 ± 0.013 | 0.45 ± 0.026 | 0.15 ± 0.009 | 0.60 ± 0.068
α-Humulene | 0.03 ± 0.002 | 0.06 ± 0.005 | 0.051 ± 0.005 | 0.20 ± 0.011 | 0.087 ± 0.014 | 0.068 ± 0.008 | 0.19 ± 0.009 | 0.058 ± 0.003 | 0.15 ± 0.018
Nerolidol | n.d. | 0.06 ± 0.004 | n.d. | n.d. | n.d. | n.d. | n.d. | n.d. | n.d.
Total sesquiterpenes | 0.16 ± 0.015 | 0.361 ± 0.032 | 0.26 ± 0.027 | 0.93 ± 0.019 | 0.32 ± 0.048 | 0.19 ± 0.021 | 0.64 ± 0.035 | 0.21 ± 0.012 | 0.75 ± 0.086
Total terpenoids | 3.39 ± 0.27 | 8.79 ± 0.69 | 3.33 ± 0.35 | 3.49 ± 0.10 | 3.84 ± 0.51 | 0.73 ± 0.078 | 1.78 ± 0.11 | 4.78 ± 0.38 | 4.83 ± 0.69
Terpenoid content was highest in Black Lime (8.8% of flower bud dry weight), with fairly high contents also occurring in White Cookies (4.8%), Terple (4.8%), Valley Fire (3.8%), Cherry Chem (3.5%), Blackberry Kush (3.4%), and Canna Tsu (3.3%; Table 1). Significantly lower concentrations were detected in Sour Diesel (1.8%) and Mama Thai (0.7%). The monoterpene (C10)-to-sesquiterpene (C15) ratio was generally very high (greater than 10), with only three strains in which the ratio was below 3 (Cherry Chem, Mama Thai, and Sour Diesel; Table 1). It should be noted that this ratio only applies to the terpenoids we were able to quantify based on the availability of authentic standards. β-Myrcene was the most abundant monoterpene in most strains (up to 4.3% of flower bud dry weight in Black Lime). The only exceptions were Mama Thai (generally low terpenoid contents, with terpinolene as the most abundant monoterpene at 0.1%) and White Cookies (with limonene at 1.5%; Table 1). Limonene content was also high in Black Lime (0.9%) and Valley Fire (0.7%). α-Pinene and β-pinene amounts were quite high in Black Lime (2% and 0.5%, respectively). 1,8-Cineole was particularly abundant in Canna Tsu and Cherry Chem (0.5% in both; Table 1). All other monoterpenes had concentrations below 0.2%. All strains contained sesquiterpenes, of which β-caryophyllene was consistently the most abundant (0.1%–0.7% of flower bud dry weight). α-Humulene was also detectable in all strains (less than 0.2%), whereas Black Lime was the only strain in which the nerolidol concentration rose above the limit of quantitation (less than 0.1%; Table 1).
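The monoterpene-to-sesquiterpene ratios discussed in this paragraph follow directly from the "Total monoterpenes" and "Total sesquiterpenes" rows of Table 1. The Python sketch below reproduces that arithmetic; the variable and function names are illustrative choices, and only the mean values (without their error terms) are used.

```python
# Mean totals (% of flower bud dry weight) transcribed from Table 1:
# (total monoterpenes, total sesquiterpenes) per strain.
TOTALS = {
    "Blackberry Kush": (3.23, 0.16),
    "Black Lime": (8.43, 0.361),
    "Canna Tsu": (3.07, 0.26),
    "Cherry Chem": (2.56, 0.93),
    "Valley Fire": (3.52, 0.32),
    "Mama Thai": (0.54, 0.19),
    "Sour Diesel": (1.14, 0.64),
    "Terple": (4.57, 0.21),
    "White Cookies": (4.09, 0.75),
}


def mono_to_sesqui_ratios(totals):
    """Monoterpene (C10) to sesquiterpene (C15) content ratio for each strain."""
    return {strain: mono / sesqui for strain, (mono, sesqui) in totals.items()}


ratios = mono_to_sesqui_ratios(TOTALS)

# Strains in which the ratio falls below 3, as singled out in the text.
low_ratio_strains = sorted(strain for strain, r in ratios.items() if r < 3)
```

With these inputs, the strains below a ratio of 3 are Cherry Chem, Mama Thai, and Sour Diesel, in agreement with the text. As noted above, the ratio covers only those terpenoids that could be quantified against authentic standards.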
Processing of the metabolite data (cannabinoid and terpenoid profiles) by PCA resulted in a clear separation of the strains, with individual biological replicates clustering closely together (Fig. 4A). Remarkably, 99% of the data variation across genotypes was captured by the first three principal components. Application of orthogonal projections to latent structures discriminant analysis (OPLS-DA), a statistical modeling tool used commonly in metabolomics research (Worley and Powers, 2013), indicated a separation of strains into two groups based on our metabolite profiling data, one representing the C. indica-dominant strains, whereas the other constituted the C. sativa-dominant strains (Fig. 4B). Biological replicates for each strain once again clustered together, whereas significant separation was observed across strains. In summary, glandular trichome-specific gene expression and metabolite data were consistent in differentiating cannabis strains.
Figure 4. Cannabis strain differentiation based on cannabinoid and terpenoid profiles. A, Three-dimensional plot representing outcomes of a PCA. B, Two-dimensional plot of the outcomes of OPLS-DA.
Evidence for Coexpression of Cannabinoid and Terpenoid Pathways
Our glandular trichome RNA-seq data sets were filtered to eliminate genes with consistently low expression levels (below 50 TPM), thereby retaining roughly 16,000 expressed genes with significant expression levels in at least one strain. Gene abundance across strains was then evaluated using the weighted gene correlation network analysis (WGCNA) package in R (Langfelder and Horvath, 2008), which resulted in the binning of genes (only those with Spearman correlation coefficients [SCCs] ≥ 0.8 were considered) into seven coexpression modules (Supplemental Table S3). Further analysis using the moduleEigengenes function indicated that the accumulation of CBDA, the signature cannabinoid of the Canna Tsu strain, was highly correlated (SCC of 0.97, P value of 2e-17) with one of the coexpression modules (indicated by brown color in Fig. 5A). Interestingly, this module contained the gene coding for CBDA synthase, the enzyme responsible for the conversion of cannabigerolic acid to CBDA (Table 2). An analogous analysis for THCA or THC (which correlated with a module indicated by yellow color in Fig. 5A) and THCA synthase was not possible, because single-nucleotide polymorphisms in this gene (and not lack of expression) result in an inactive enzyme in strains that accumulate primarily CBDA (Kojoma
Figure 5. Coexpression of genes involved in cannabinoid and terpenoid biosynthesis. A, WGCNA of glandular trichome-specific RNA-seq data categorizes transcripts into eight color-coded modules (for gene lists, see Supplemental Table S3). B, Correlation of WGCNA modules with metabolites. A color code is used to visualize the SCCs for each module-metabolite pair, with red color representing positive and blue color indicating negative SCCs. C, Genes involved in cannabinoid and terpenoid biosynthesis are enriched in the yellow coexpression module obtained by WGCNA. Color code for pathways: light blue, hexanoate formation; dark green, precursors for monoterpenes; light green, monoterpene synthases; orange, sesquiterpenes; dark blue, cannabinoids; cyan, remaining genes. D, Functional context of genes highlighted in C in a simplified metabolic pathway scheme. AAE1, Acyl-activating enzyme for short-chain fatty acids; Ac-CoA, acetyl-CoA; ACC1, acetyl-CoA carboxylase; CsTPS1FN/CsTPS14CT, (−)-limonene synthase; CsTPS2SK, (+)-α-pinene synthase; CsTPS3FN/CsTPS15CT, β-myrcene synthase; CsTPS16CC, germacrene B synthase; DHAP, dihydroxyacetone phosphate; DXS, 1-deoxy-D-xylulose-5-phosphate synthase; ENO, enolase; FNR-Root, ferredoxin-NADP+ reductase (isoform of roots and glandular trichomes); FPPS, farnesyl diphosphate synthase; GAP, glyceraldehyde-3-phosphate; GAPDH, glyceraldehyde-3-phosphate dehydrogenase; GPP, geranyl diphosphate; GPPS, geranyl diphosphate synthase; KR, β-ketoacyl reductase (fatty acid synthase complex); OA, olivetolic acid; PDH, pyruvate dehydrogenase; PFK, phosphofructokinase; PGI, phosphoglucoisomerase; PGM, phosphoglucomutase; PK, pyruvate kinase; PT1, cannabigerolic acid synthase; Pyr, pyruvate; THCAS, tetrahydrocannabinolic acid synthase; TPI, triose phosphate isomerase.
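The Spearman correlation coefficient (SCC) used to relate coexpression modules to metabolite levels (Fig. 5B) is the Pearson correlation of rank-transformed values. The Python sketch below illustrates only this statistic; it does not reproduce WGCNA or its moduleEigengenes function, and the function names and the toy profiles are hypothetical.

```python
def ranks(values):
    """Average ranks (1-based); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for k in range(start, end + 1):
            result[order[k]] = shared
        start = end + 1
    return result


def pearson(x, y):
    """Pearson correlation; raises ZeroDivisionError if either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation coefficient: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical module summary profile and metabolite content across five samples.
toy_module_profile = [0.1, 0.4, 0.2, 0.9, 0.7]
toy_metabolite = [1.0, 3.5, 2.0, 9.0, 6.0]
toy_scc = spearman(toy_module_profile, toy_metabolite)
```

Because the statistic depends only on ranks, any monotonic relationship between a module profile and a metabolite yields an SCC of 1 or −1 regardless of its shape, which makes it robust to the skewed distributions typical of expression data.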
Table 2. Transcript abundance (in TPM) for genes involved in the biosynthesis of cannabinoids and terpenoids in cannabis strains
n.d., Not detectable.

| Gene Annotation | UniProt Identifier | Blackberry Kush | Black Lime | Canna Tsu | Cherry Chem | Mama Thai | Sour Diesel | Terple | Valley Fire | White Cookies |
|---|---|---|---|---|---|---|---|---|---|---|
| Cannabinoid pathway | | | | | | | | | | |
| Acyl activating enzyme1 | H9A1V3_CANSA | 80.63 | 160.44 | 316.06 | 840.92 | 377.29 | 397.99 | 93.59 | 188.84 | 229.65 |
| Olivetol synthase | OLIS_CANSA | 3,946.85 | 9,454.00 | 10,400.03 | 14,619.66 | 17,955.05 | 4,984.60 | 9,706.06 | 11,374.75 | 12,373.11 |
| Geranyl diphosphate:olivetolate geranyltransferase | CsPT1 | 422.42 | 222.42 | 189.43 | 407.76 | 649.37 | 263.62 | 246.13 | 175.87 | 115.21 |
| CBDA synthase | CBDAS_CANSA | n.d. | n.d. | 1,282.46 | n.d. | n.d. | 18.39 | n.d. | n.d. | n.d. |
| THCA synthase | THCAS_CANSA | 885.17 | 423.29 | 1,203.31 | 2,321.64 | 2,317.68 | 1,557.54 | 619.22 | 309.23 | 524.08 |
| MEP pathway | | | | | | | | | | |
| 1-Deoxy-D-xylulose-5-phosphate synthase | A0A1V0QSH6_CANSA | 221.85 | 284.41 | 412.74 | 1,627.02 | 319.76 | 1,751.70 | 533.57 | 288.69 | 16.32 |
| 1-Deoxy-D-xylulose 5-phosphate reductoisomerase | A0A1V0QSG8_CANSA | 172.63 | 228.15 | 185.07 | 667.96 | 304.62 | 117.92 | 176.79 | 256.25 | 16.01 |
| 2-C-Methyl-D-erythritol 4-phosphate cytidylyltransferase | A0A1V0QSI6_CANSA | 36.77 | 95.99 | 96.25 | 168.24 | 160.40 | 146.38 | 46.96 | 75.40 | 64.73 |
| 4-(Cytidine 5′-diphospho)-2-C-methyl-D-erythritol kinase | A0A1V0QSI2_CANSA | 35.20 | 3.70 | 67.94 | 211.85 | 212.43 | 109.88 | 57.60 | 104.05 | 80.23 |
| 2-C-Methyl-D-erythritol 2,4-cyclodiphosphate synthase | G9C075_HUMLU | 67.75 | 118.23 | 315.86 | 338.21 | 184.98 | 419.84 | 69.75 | 171.17 | 207.15 |
| (E)-4-Hydroxy-3-methylbut-2-enyl-diphosphate synthase | A0A1V0QSG3_CANSA | 107.65 | 287.57 | 794.25 | 744.09 | 444.09 | 596.36 | 349.56 | 297.07 | 317.55 |
| (E)-4-Hydroxy-3-methylbut-2-enyl-diphosphate reductase | A0A1V0QSH9_CANSA | 1,485.98 | 561.96 | 3,447.50 | 3,468.57 | 3,090.49 | 3,024.22 | 1,889.37 | 1,031.90 | 4,544.35 |
| Isopentenyl diphosphate isomerase | A0A1V0QSG5_CANSA | 165.10 | 272.72 | 433.46 | 1,836.07 | 306.03 | 347.85 | 476.86 | 509.70 | 9.96 |
| Mevalonate pathway | | | | | | | | | | |
| Acetoacetyl-CoA thiolase | A0A1V0QSH3_CANSA | 38.35 | 11.90 | 253.38 | 302.58 | 313.99 | 134.71 | 252.40 | 54.35 | 248.13 |
| 3-Hydroxy-3-methylglutaryl-CoA synthase | A0A1V0QSH3_CANSA | 13.44 | 22.98 | 20.81 | 21.60 | 27.81 | 34.33 | 9.24 | 19.32 | 91.24 |
| 3-Hydroxy-3-methylglutaryl-CoA reductase | A0A1V0QSF5_CANSA | 26.69 | 56.93 | 21.92 | 43.41 | 29.05 | 107.71 | 19.75 | 69.30 | 48.26 |
| Mevalonate kinase | A0A1V0QSI0_CANSA | 1.63 | 1,449.32 | 3.63 | 3.41 | 5.81 | 4.75 | 2.45 | 5.93 | 5.05 |
| Phosphomevalonate kinase | A0A1V0QSH8_CANSA | 3.68 | 7.58 | 7.99 | 6.63 | 8.09 | 6.03 | 3.81 | 7.40 | 305.27 |
| Mevalonate diphosphate decarboxylase | A0A1V0QSG4_CANSA | 5.00 | 11.89 | 10.21 | 14.89 | 21.24 | 19.39 | 9.67 | 9.64 | 9.96 |
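For readers who wish to reuse the values in Table 2 computationally, the following is a minimal sketch of how table cells could be converted to numbers. It assumes only the conventions visible in the table (comma as thousands separator, "n.d." for not detectable); the function name `parse_tpm` is hypothetical and is not part of the original analysis.

```python
from typing import List, Optional


def parse_tpm(cell: str) -> Optional[float]:
    """Convert a table cell such as '1,485.98' or 'n.d.' to a float or None."""
    text = cell.strip()
    if text.lower() == "n.d.":
        return None  # not detectable: kept distinct from a measured 0.00
    return float(text.replace(",", ""))


# CBDA synthase row of Table 2, in the strain order of the table header
cbdas_row = ["n.d.", "n.d.", "1,282.46", "n.d.", "n.d.", "18.39", "n.d.", "n.d.", "n.d."]
cbdas_tpm: List[Optional[float]] = [parse_tpm(cell) for cell in cbdas_row]

# Number of strains with a detectable CBDA synthase transcript
detected = [value for value in cbdas_tpm if value is not None]
```

Keeping "n.d." as `None` rather than zero avoids treating an undetected transcript as a measured value in downstream statistics.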
Plant Physiol. Vol. 180, 2019 1885
Coregulation of Cannabinoid and Terpenoid Pathways
et al., 2006; Laverty et al., 2019; Table 2). Interestingly, the
THCA synthase sequences were essentially identical,
with the exception of that of the Canna Tsu strain, the only
CBDA accumulator in our pilot study (Supplemental Fig.
S1). Consequently, a full-length CBDA synthase gene was
expressed only in the Canna Tsu strain (Supplemental
Fig. S2), which is novel information that furthers our
understanding of the mechanisms underlying CBDA ac-
cumulation. Finally, the yellow-colored module (which as
mentioned above contained THCA synthase) also com-
prised cannabigerolic acid synthase (Table 2), the gene
preceding THCA synthase in the cannabinoid pathway
(Fig. 1), thereby providing additional evidence for gene-
to-metabolite correlation in the cannabinoid pathway.
We then asked if similar gene-to-metabolite correlations occurred in the terpenoid pathway. Interestingly, two coexpression modules (indicated by black and yellow color in Fig. 5A) correlated with β-myrcene accumulation (Fig. 5B). This metabolite is formed by a monoterpene synthase encoded by the CsTPS3FN gene (Booth et al., 2017), which was contained in one of these modules (yellow color in Fig. 5A; Table 3). Analogous gene-to-metabolite correlations were observed for limonene and CsTPS1FN, α-pinene and CsTPS2FN, β-ocimene and CsTPS6FN, and β-caryophyllene/α-humulene and CsTPS9FN (color of modules in Fig. 5A: black, yellow, and yellow, turquoise, respectively; terpene synthase annotation based on Günnewich et al. [2007] and Booth et al. [2017]; Fig. 5B). Transcripts corresponding to CsTPS5FN (β-myrcene/α-pinene synthase), CsTPS4FN (alloaromadendrene synthase), CsTPS8FN (γ-eudesmol/valencene synthase), and CsTPS13PK (a second β-ocimene synthase; Booth et al., 2017) remained below the threshold expression level in our data sets. The corresponding terpenoids were not detected in the strains investigated, indicating that the expressed gene complement was generally sufficient to account for the presence of the major terpenoids (Table 3). Linalool and nerolidol were exceptions for which the corresponding terpene synthases had hitherto not been identified from cannabis. Notably, genes involved in the formation of these terpenoids (and others) were cloned and functionally characterized as part of this study, which contributes significantly to a better understanding of the genetic underpinnings of terpenoid diversity.
The yellow module featured prominently in our gene-to-metabolite correlation analysis for the cannabinoid and terpenoid pathways. A Gene Ontology (GO) analysis implied a substantial enrichment of genes involved in terpenoid biosynthesis in the yellow module (P value of 1.4e-05; Supplemental Table S3; note that GO terms for cannabinoid biosynthesis as a biological process have not yet been released). Moreover, a total of 22 genes involved in the conversion of precursor metabolites into cannabinoid and terpenoid end products were coexpressed with THCA synthase (Fig. 5C). Specifically, these genes code for enzymes involved in glycolysis (conversion of an imported carbon source into triose phosphates and pyruvic acid), the MEP pathway toward GPP and ultimately monoterpenes, the production of sesquiterpenes, the formation of olivetolic acid from fatty acid precursors, and the incorporation of olivetolic acid and GPP into cannabinoids (Fig. 5D).
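As a simple illustration of what coexpression with THCA synthase means at the level of two individual genes, the sketch below computes a Spearman rank correlation between the THCA synthase and acyl activating enzyme1 transcript abundances listed in Table 2. This is not the WGCNA module detection used in this study; it is a pairwise calculation under the assumption that a rank-based coefficient is an acceptable stand-in, and the helper names (`ranks`, `pearson`, `spearman`) are hypothetical.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of tied values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# TPM values from Table 2, in the strain order of the table header
thcas = [885.17, 423.29, 1203.31, 2321.64, 2317.68, 1557.54, 619.22, 309.23, 524.08]
aae1 = [80.63, 160.44, 316.06, 840.92, 377.29, 397.99, 93.59, 188.84, 229.65]

rho = spearman(thcas, aae1)  # evaluates to approximately 0.7 for these nine strains
```

With only nine strains (means of biological replicates), such a coefficient is indicative at best; the network analysis in Fig. 5 draws on the full transcriptome data sets.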
Target Gene Identification and Characterization
Building on our terpenoid profiling and glandular trichome-specific transcriptome data sets, we embarked on gene discovery efforts aimed at characterizing terpene synthases associated with the biosynthesis of major monoterpenes and sesquiterpenes routinely quantified in commercial cannabis testing as well as other terpenoids that are not assayed routinely. The analytical chemistry data were employed to assess which genes would be expected to be expressed to support the observed terpenoid profiles. We then performed BLASTX searches with previously characterized terpene synthases to identify contigs with high sequence identity in our transcriptome data sets. We then asked which of the putative cannabis terpene synthases were expressed at appreciable levels in particular cannabis strains. Sequences of selected contigs were then chosen to perform a sequence relatedness analysis with previously characterized terpene synthases, thereby enabling their categorization by class. cDNAs of putative terpene synthases were cloned into appropriate vectors and expressed heterologously in Escherichia coli, the corresponding recombinant proteins were purified, and assays were performed with appropriate prenyl diphosphate substrates. Expression for genes putatively encoding geranyl diphosphate synthase and trans,trans-farnesyl diphosphate synthase was readily detectable in transcriptome data sets of all strains; in contrast, no putative orthologs of neryl diphosphate (NPP) synthase and cis,cis-farnesyl diphosphate synthase were recognizable based on sequence identity (Supplemental Tables S1 and S2). Nevertheless, terpene synthase assays were performed with GPP, NPP, 2-trans,6-trans-farnesyl diphosphate (tFPP), and 2-cis,6-cis-farnesyl diphosphate (cFPP).
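The step of asking which candidate terpene synthases are expressed at appreciable levels in particular strains can be sketched as a simple threshold filter over the transcript abundances in Table 3. The cutoff of 50 TPM used here is an arbitrary assumption for illustration (the threshold applied in this study is not restated in this section), and the function name `expressed_in` is hypothetical.

```python
# Strain order as listed in the header of Table 3
STRAINS = [
    "Blackberry Kush", "Black Lime", "Canna Tsu", "Cherry Chem", "Valley Fire",
    "Mama Thai", "Sour Diesel", "Terple", "White Cookies",
]

# Transcript abundance (TPM) from Table 3; None stands for n.d. (not detectable)
TPM = {
    "CsTPS6FN": [None, None, None, None, None, 103.41, None, 191.65, None],
    "CsTPS13PK": [None] * 9,
    "CsTPS16CC": [16.14, 19.44, 9.13, 156.08, 20.60, 40.36, 20.22, 7.19, 22.72],
    "CsTPS20CT": [310.43, 27.00, 498.70, 98.21, 19.35, 11.98, 17.67, 0.00, 17.02],
}


def expressed_in(gene, threshold=50.0):
    """Return the strains in which `gene` reaches at least `threshold` TPM.

    The default threshold is an illustrative assumption, not a value
    taken from the study.
    """
    return [
        strain
        for strain, value in zip(STRAINS, TPM[gene])
        if value is not None and value >= threshold
    ]


candidates = {gene: expressed_in(gene) for gene in TPM}
```

Under this assumed cutoff, CsTPS16CC passes the filter only in Cherry Chem, which matches the strain from which it was cloned, whereas CsTPS13PK passes in none.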
β-Myrcene and (−)-limonene were principal monoterpenes in all strains (Table 1), and, as expected, contigs with high sequence identity to the previously characterized β-myrcene and (−)-limonene synthases of cannabis (Günnewich et al., 2007; Booth et al., 2017), which belong to the TPS-b clade of terpene synthases (Fig. 6; Supplemental Table S4), were expressed at high levels across most strains investigated in this study (Table 2). Cloning was successful for the corresponding cDNAs from the Canna Tsu strain (CsTPS14CT and CsTPS15CT), and a functional evaluation confirmed the annotation [(−)-limonene synthase and β-myrcene synthase, respectively; Fig. 7, A and B]. The translated peptide sequences of β-myrcene synthases (CsTPS3FN and CsTPS15CT; excluding plastidial targeting sequence) had 13 mismatches (Supplemental Fig. S3) but identical specificity (100% β-myrcene as product with GPP as substrate).
Table 3. Transcript abundance (in TPM) for terpene synthases across cannabis strains
n.d., Not detectable.

| Gene | GenBank Accession No. | CsTPS Identifier | Blackberry Kush | Black Lime | Canna Tsu | Cherry Chem | Valley Fire | Mama Thai | Sour Diesel | Terple | White Cookies |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Monoterpene synthases (TPS-b clade) | | | | | | | | | | | |
| (−)-Limonene synthaseᵃ | MK801766 | CsTPS14CT | 646.24 | 898.94 | 612.37 | 651.86 | 2,272.48 | 751.48 | 201.94 | 2.46 | 895.86 |
| (+)-α-Pinene synthaseᵇ | KY014565 | CsTPS2FN | 217.36 | 2,041.33 | 1,554.77 | 101.32 | 96.90 | n.d. | n.d. | 1,298.95 | 49.52 |
| β-Myrcene synthaseᵃ | MK801765 | CsTPS15CT | 183.29 | 597.88 | 325.85 | 272.65 | 157.78 | 254.10 | 183.29 | 436.63 | n.d. |
| β-Myrcene/(−)-α-pinene synthaseᵇ | KY014560 | CsTPS5FN | 217.59 | 640.97 | 483.09 | 547.24 | 157.78 | 445.85 | 125.94 | 472.33 | 50.51 |
| (E)-β-Ocimene synthaseᵇ | KY014563 | CsTPS6FN | n.d. | n.d. | n.d. | n.d. | n.d. | 103.41 | n.d. | 191.65 | n.d. |
| (Z)-β-Ocimene synthaseᵇ | KY014558 | CsTPS13PK | n.d. | n.d. | n.d. | n.d. | n.d. | n.d. | n.d. | n.d. | n.d. |
| Acyclic terpene synthases (TPS-g clade) | | | | | | | | | | | |
| (E)-Nerolidol/(+)-linalool synthaseᵃ | MK801764 | CsTPS18VF | 2.82 | 9.41 | 2.62 | 16.21 | 16.39 | 2.51 | 4.80 | 16.77 | 8.76 |
| (E)-Nerolidol/linalool synthaseᵃ | MK801763 | CsTPS19BL | 56.78 | 81.13 | 27.22 | 80.23 | 249.23 | 62.53 | 47.73 | 90.86 | 66.47 |
| Sesquiterpene synthases (TPS-a clade) | | | | | | | | | | | |
| Alloaromadendrene synthaseᵇ | KY014564 | CsTPS4FN | n.d. | 108.92 | n.d. | 639.56 | n.d. | 329.87 | 148.17 | n.d. | 323.36 |
| γ-Eudesmol/valencene synthase (putative)ᵇ | KY014556 | CsTPS8FN | n.d. | n.d. | n.d. | n.d. | n.d. | n.d. | n.d. | n.d. | n.d. |
| δ-Selinene synthase (putative)ᵇ | KY014554 | CsTPS7FN | 356.34 | n.d. | 367.47 | n.d. | 316.74 | 210.58 | n.d. | n.d. | 268.50 |
| β-Caryophyllene/α-humulene synthaseᵇ | KY014555 | CsTPS9FN | 764.18 | 794.46 | 435.11 | 3,241.85 | 1,090.94 | 738.74 | 555.25 | 495.72 | 591.86 |
| Germacrene B synthaseᵃ | MK131289 | CsTPS16CC | 16.14 | 19.44 | 9.13 | 156.08 | 20.60 | 40.36 | 20.22 | 7.19 | 22.72 |
| Hedycaryol synthaseᵃ | MK801762 | CsTPS20CT | 310.43 | 27.00 | 498.70 | 98.21 | 19.35 | 11.98 | 17.67 | 0.00 | 17.02 |

ᵃ Functionally characterized as part of this study.
ᵇ From Booth et al. (2017).
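Tables 2 and 3 report transcript abundance in transcripts per million (TPM). As a reminder of how this unit is conventionally defined, the sketch below converts read counts and transcript lengths into TPM. The counts and lengths are made-up example numbers, and the function `tpm` is a hypothetical helper; the quantification software used for the tables is not implied by this sketch.

```python
def tpm(read_counts, lengths_bp):
    """Return TPM values for parallel lists of read counts and transcript lengths (bp)."""
    if len(read_counts) != len(lengths_bp):
        raise ValueError("read_counts and lengths_bp must have the same length")
    # Step 1: normalize counts by transcript length (reads per kilobase)
    rpk = [count / (length / 1000.0) for count, length in zip(read_counts, lengths_bp)]
    # Step 2: rescale so that the values of one sample sum to one million
    scale = sum(rpk) / 1_000_000.0
    return [value / scale for value in rpk]


# Hypothetical example: three transcripts in one sample
example_counts = [500, 1500, 3000]
example_lengths = [1000, 1500, 3000]
example_tpm = tpm(example_counts, example_lengths)
```

Because the values within each sample sum to the same total, TPM is comparable across strains in the way the tables use it, although it remains a relative rather than an absolute measure.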
The sequence of the (−)-limonene synthase characterized as part of this study (CsTPS14CT; excluding plastidial targeting sequence) had two mismatches when compared with CsTPS1SK and nine mismatches when compared with CsTPS1FN (Supplemental Fig. S3). As described for CsTPS1SK, CsTPS14CT generated several other products, and we report the stereochemistry of those (Fig. 7A).
The monoterpene linalool was accumulated to fairly high amounts in the Valley Fire and White Cookies strains, whereas the sesquiterpene nerolidol was quantifiable only in the Black Lime strain (Table 1). Contigs with moderate sequence identity (slightly above 50%) to bifunctional nerolidol/linalool synthases (strawberry [Fragaria spp.; Aharoni et al., 2004] and snapdragon [Antirrhinum majus; Nagegowda et al., 2008]) and considerable expression in glandular trichomes were identified in our transcriptome data sets (Table 3), and corresponding cDNAs were cloned from the Valley Fire (CsTPS18VF) and Black Lime (CsTPS19BL) strains. These sequences belong to the TPS-g clade of terpene synthases (Fig. 6; Supplemental Table S4). Heterologous expression and functional characterization confirmed that the corresponding recombinant proteins were able to catalyze the formation of (E)-nerolidol from tFPP and linalool from GPP, but no activity was detected with NPP or cFPP (Fig. 8). Interestingly, follow-up chiral separation of products from assays performed with GPP as substrate indicated that CsTPS18VF generated almost exclusively (+)-linalool, whereas CsTPS19BL produced a mixture of (−)-linalool and (+)-linalool (Fig. 7, C and D). Sequence differences across sesquiterpene synthases with different product profiles included residues with potential roles in catalysis (Fig. 9), and the implications are evaluated in Discussion.
To further investigate the genetic potential for generating terpenoid chemical diversity, two representatives of the TPS-a clade of terpene synthases (CsTPS16CC and CsTPS20CT) were selected for functional characterization. CsTPS16CC had very high expression levels in the Cherry Chem strain (Table 3). The sequence was most similar to that of the previously characterized alloaromadendrene synthase (Booth et al., 2017; Fig. 6; Supplemental Table S4). In our assays, the recombinant protein generated germacrene B from tFPP (Fig. 8C), with γ-elemene being detected as a thermal breakdown product (de Kraker et al., 1998). Other prenyl diphosphates were not accepted as substrates with appreciable conversion rates (Fig. 8). The Canna Tsu strain had a particularly high expression level of CsTPS20CT
Figure 6. Maximum likelihood phylogenetic tree of selected, functionally characterized terpene synthases. The tree is rooted with the ancestral ent-kaurene synthase of Physcomitrella patens (PpCPS/KS). A color code is used to indicate different clades (yellow, TPS-a; green, TPS-b; and purple, TPS-g). Abbreviations are as follows: BL, Black Lime; CC, Cherry Chem; Cs, Cannabis sativa; CT, Canna Tsu; FN, Finola; FRAAN, Fragaria ×ananassa; FRAVE, Fragaria vesca; HUMLU, Humulus lupulus; OCIBA, Ocimum basilicum; ROSRU, Rosa rugosa; SALOF, Salvia officinalis; VF, Valley Fire; VITVI, Vitis vinifera. The accession numbers and sequences of the terpene synthases are provided in Supplemental Table S4.
Figure 7. Functional characterization of cannabis terpene synthases that act on GPP as substrate. Left, Chiral gas chromatography (GC) scans; center, mass spectra of primary products; right, product distribution. A, (−)-Limonene synthase (CsTPS14CT). B, β-Myrcene synthase (CsTPS15CT). C, (E)-Nerolidol/(+)-linalool synthase (CsTPS18VF). D, (E)-Nerolidol/(+)-linalool synthase (CsTPS19BL).
Figure 8. Functional characterization of cannabis terpene synthases that act on tFPP as substrate. Left, GC-mass spectrometry scans; center, mass spectra of primary products; right, product distribution. A, (E)-Nerolidol/(+)-linalool synthase (CsTPS18VF). B, (E)-Nerolidol/(+)-linalool synthase (CsTPS19BL). C, Germacrene B synthase (CsTPS16CC). D, Hedycaryol synthase (CsTPS20CT).
(Table 3). Its closest neighbor in the sequence relatedness tree was a putative δ-selinene synthase from cannabis (Booth et al., 2017; Fig. 6; Supplemental Table S4). Functional assays with the purified, recombinant protein indicated a conversion of tFPP to elemol, a thermal breakdown product of the sesquiterpene hedycaryol (Koo and Gang, 2012; Hattan et al., 2016), but there was little or no activity with other prenyl diphosphate substrates (Fig. 8D). In summary, we demonstrate that the resources and approaches described here can be employed to identify candidates and subsequently characterize functions of terpene synthase genes that belong to three different clades, thereby contributing to a better understanding of the genetic determinants of terpenoid chemical diversity in cannabis.
DISCUSSION
Utility of Transcript Profiling for Strain Differentiation
Competition in decriminalized retail markets for cannabis has put pressure on breeders to differentiate their product from that of their competitors. This has led to branding with a plethora of distinct and memorable names, which has caused both confusion and controversy (Small, 2015). Chemical profiling can be employed as a powerful tool in strain differentiation, but adding genotyping information further increases the resolution of the analysis. The differentiation of drug-type and fiber-type cannabis strains can be achieved with standard genotyping analyses (Piluzza et al., 2013). However, a differentiation of genetically related strains has been much more challenging (Sawler et al., 2015; Punja et al., 2017). Traditional genotyping approaches benefit significantly from high-quality reference genome sequences (Scheben et al., 2017), but, unfortunately, only fairly low-quality genome sequences have been published for two cannabis strains (van Bakel et al., 2011). We employed RNA-seq as an alternative approach for genotyping (Haseneyer et al., 2011), which does not depend on prior sequence data (Wang et al., 2009). We used RNA-seq to obtain the transcriptome of glandular trichome cells of nine selected cannabis strains (with three biological replicates each). Importantly, statistical analyses of these data sets allowed the differentiation of strains into broader clades (descendants of landraces of C. sativa or C. indica)
Figure 9. Variation of the residue putatively stabilizing carbocation intermediates correlates with outcome of catalysis in cannabis sesquiterpene synthases. A, Sequence alignment of sesquiterpene synthases (with carbocation-stabilizing residues highlighted). B, Proposed cyclization reactions catalyzed by sesquiterpene synthases. Identifiers for sequences from the literature (Aharoni et al., 2004; Nagegowda et al., 2008) are as follows: AmNES/LIS1, EF433761; AmNES/LIS2, EF433762; FvNES1, AX529002; FaNES2, AX529067; FaNES1, KX450224 (species abbreviations are as follows: Am, Antirrhinum majus; Fa, Fragaria ×ananassa; Fv, Fragaria vesca).
but also resulted in the full separation of all individual strains (with biological replicates clustering closely together; Fig. 3). We fully recognize that RNA-seq is not a viable option for routine genotyping, but it can be used to develop single-nucleotide polymorphism-based genotyping platforms. This approach has been employed successfully for a number of crops, including alfalfa (Medicago sativa; Yang et al., 2011), maize (Zea mays; Hansey et al., 2012), and wheat (Triticum aestivum; Ramirez-Gonzalez et al., 2015). Our data sets are therefore highly valuable for building resources for follow-up research with cannabis. As an added benefit, RNA-seq data can be used for gene expression analysis, thereby providing a functional context, which is discussed in more detail below.
Utility of Metabolite Profiling for Strain Differentiation
We assessed the utility of cannabinoid and terpenoid profiling, in addition to strain differentiation by genotyping as discussed above, to demarcate nine commercial cannabis strains. Two independent statistical approaches, PCA and OPLS-DA, grouped biological replicates closely together while still separating individual strains and classes of strains (those of C. sativa or C. indica heritage; Fig. 4). Several authors have advocated the profiling of both cannabinoids and terpenoids in recent publications (Fischedick et al., 2010; Elzinga et al., 2015; Aizpurua-Olaizola et al., 2016; Hazekamp et al., 2016; Fischedick, 2017; Lewis et al., 2018; Orser et al., 2018; Richins et al., 2018; Sexton et al., 2018). The key advantage of this approach over merely profiling cannabinoids lies in the enormous diversity of terpenoids accumulated in cannabis (and in other plants as well), which significantly increases the power of statistical analyses. It also reflects the fact that many users select cannabis strains based on both the reported THC content and aroma (which is largely imparted by terpenoids; Gilbert and DiVerdi, 2018). A comprehensive analysis of cannabis strains recently indicated the presence of close to 200 detectable volatiles, which were tentatively identified based on searches against various spectral databases (Rice and Koziel, 2015). A notable challenge with terpenoid profiling pertains to the limitation that authentic standards are often very costly or unavailable from commercial sources, which is particularly true for sesquiterpenes (dozens detected by Rice and Koziel [2015]). Commercial cannabis testing laboratories therefore rarely offer services that comprise more than 20 terpenoids. While such analyses may detect the most abundant terpenoids for popular strains, it is not unlikely that important aroma volatiles with a low odor detection threshold could be missed (Chin and Marriott, 2015). Another reason why a comprehensive profiling of terpenoids would be desirable relates to testing the validity of the entourage effect, the proposed synergism between cannabinoids and other constituents (in particular terpenoids) that might affect the experience of the user (Gertsch et al., 2008; Russo, 2011). Should such effects be substantiated by empirical evidence, it would be advisable to reconsider the current laws and rules for formulations containing cannabis extracts, which are based solely on THC. An improved understanding of terpenoid phytochemistry in cannabis would be an important first step in this direction (Booth and Bohlmann, 2019).
Coregulation of Metabolic Pathways in Cannabis Is
Consistent with Gene Expression Patterns Commonly
Observed in Glandular Trichomes
Our statistical analyses using the WGCNA package indicated a tight correlation of biosynthetic genes with cannabinoid and terpenoid end products (Fig. 5). We recently performed a meta-analysis of gene expression patterns in glandular trichomes across various species (Zager and Lange, 2018). One of the conclusions, consistent with the data presented here, was that gene expression patterns correlate well with the metabolic specialization in these anatomical structures. Coregulation has been observed for genes across multiple pathways of specialized metabolism, such as cannabinoids and terpenoids (this study), monoterpenes and diterpenes (Salvia pomifera; Trikka et al., 2015), flavonoids and acyl sugars (Salpiglossis sinuata and Solanum quitoense; Moghe et al., 2017), and bitter acids and prenylflavonoids (Humulus lupulus; Kavalier et al., 2011; Clark et al., 2013). These tight gene-to-metabolite correlations were also reflective of predicted fluxes through the relevant pathways (Zager and Lange, 2018). In contrast, gene expression patterns appear to be less predictive of fluxes through central carbon metabolism, where regulation at the protein level plays a more significant role (Paul and Pellny, 2003; Koch, 2004; Gibon et al., 2006; Schwender et al., 2014; Rocca et al., 2015). This does not mean that feedback regulation of specialized metabolism is negligible in glandular trichomes; there is just a particularly strong overall gene-to-metabolite correlation, and unraveling the details will be an exciting topic for future research.
Functional Characterization of Terpene Synthases
Contributes to an Improved Understanding of the Genetic
Determinants of Terpenoid Diversity
The observed gene-to-metabolite correlations in cannabis glandular trichomes provided opportunities for gene discovery efforts. Booth et al. (2017) analyzed transcriptome data sets obtained with the Finola and Purple Kush strains to obtain candidate genes for terpene synthases that were subsequently characterized to encode enzymes for the production of 14 monoterpenes and sesquiterpenes. Those that contribute to the formation of some of the common monoterpenes and sesquiterpenes [e.g. β-myrcene, (−)-limonene, α-pinene, β-caryophyllene, and α-humulene] were found to be expressed at fairly high levels across the strains included in this analysis, whereas those that generate less
common products [e.g. (Z)-β-ocimene, γ-eudesmol, alloaromadendrene, δ-selinene, and valencene] were found to be expressed only in a limited number of strains or not at all (Table 3). To assess sequence variation among these genes, we cloned genes with high sequence identity to the previously characterized β-myrcene and (−)-limonene synthases.
Prior to this study, a notable gap existed with regard to the terpene synthases underlying the formation of the monoterpene linalool and the sesquiterpene nerolidol, which are both common constituents in cannabis resin. We have now identified a gene coding for an enzyme (CsTPS19BL) that generates a mixture of (+)-linalool and (−)-linalool from GPP and (E)-nerolidol from tFPP in the Black Lime strain. We also cloned a putative ortholog from the Valley Fire strain to evaluate the effects of sequence variation. Interestingly, the encoded enzyme (CsTPS18VF) had the same specificity as CsTPS19BL with regard to the tFPP substrate [(E)-nerolidol as product]; however, with GPP as substrate, (+)-linalool was detected as the essentially exclusive product. This difference in specificity is surprising given that the peptide sequences have only three mismatches (Supplemental Fig. S3).
Finally, we cloned genes that, based on sequence relatedness, were expected to code for enzymes that generate sesquiterpene products not previously detected in assays with cannabis terpene synthases. Indeed, CsTPS16CC was demonstrated to produce germacrene B and CsTPS20CT formed hedycaryol as primary product. In assays with CsTPS16CC, γ-elemene was also detected, but this is a well-known product of thermal degradation in the GC inlet (de Kraker et al., 1998). Elemol, the sole product detected in assays with CsTPS20CT, is likewise a thermal degradation product, in this case of hedycaryol (Koo and Gang, 2012; Hattan et al., 2016). Consequently, the enzyme activities are referred to as germacrene B synthase and hedycaryol synthase, respectively. To the best of our knowledge, the sesquiterpenes generated by these terpene synthases (germacrene B and hedycaryol) have not been identified in cannabis samples yet, indicating the need for a more comprehensive coverage of terpenoids to better understand strain-specific aroma profiles. It should also be noted that several recent studies reporting on comprehensive chemical and sensory analyses of volatiles emitted from cannabis found that nonterpenoid alcohols and aldehydes have potent odor impacts (Rice and Koziel, 2015; Wiebelhaus et al., 2016; Calvi et al., 2018). These considerations indicate that more emphasis needs to be placed on comprehensive metabolite profiling, including cannabinoids and terpenoids but also extending to other volatiles, for future efforts focused on strain characterization.
With a larger number of functionally characterized genes in cannabis, sequence comparisons are now allowing us to ask questions about some of the determinants of specificity. The overall sequence identity of the sesquiterpene synthases characterized here is fairly low (less than 70% at the amino acid level), but there are striking differences in the nature of a conserved aromatic residue (Tyr-527) that had previously been hypothesized to stabilize the positive charge of the carbocation occurring during the formation of a germacrene intermediate in the epi-aristolochene synthase catalytic sequence (Starks et al., 1997). The equivalent residues in sesquiterpene synthases that catalyze the formation of cyclic products (CsTPS16CC and CsTPS20CT) are also Tyr residues (Fig. 9). In contrast, Gln residues occupy this position in CsTPS18VF, CsTPS19BL, and other characterized enzymes of the TPS-g clade (Fig. 9A; Aharoni et al., 2004; Nagegowda et al., 2008), which, possibly because of insufficient carbocation stabilization, generate (E)-nerolidol as a noncyclic product (Fig. 9). Testing this hypothesis will be an important future goal for follow-up research.
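The hypothesis outlined above can be written down as a simple lookup that compares the residue at the position equivalent to Tyr-527 with the type of product formed from tFPP. The sketch below only restates the four cannabis enzymes discussed in the text; the rule itself is the untested hypothesis, not a validated predictor, and the names `RESIDUE_AT_SITE` and `predicted_product_type` are hypothetical.

```python
# Residue at the position equivalent to Tyr-527 of epi-aristolochene synthase,
# as described in the text for Fig. 9A
RESIDUE_AT_SITE = {
    "CsTPS16CC": "Tyr",
    "CsTPS20CT": "Tyr",
    "CsTPS18VF": "Gln",
    "CsTPS19BL": "Gln",
}

# Type of product observed with tFPP as substrate (germacrene B and hedycaryol
# are cyclic; (E)-nerolidol is noncyclic)
OBSERVED_PRODUCT_TYPE = {
    "CsTPS16CC": "cyclic",
    "CsTPS20CT": "cyclic",
    "CsTPS18VF": "acyclic",
    "CsTPS19BL": "acyclic",
}


def predicted_product_type(residue):
    """Apply the working hypothesis: Tyr favors cyclization, Gln does not."""
    if residue == "Tyr":
        return "cyclic"
    if residue == "Gln":
        return "acyclic"
    return "unknown"  # the hypothesis makes no statement about other residues


agreement = {
    enzyme: predicted_product_type(residue) == OBSERVED_PRODUCT_TYPE[enzyme]
    for enzyme, residue in RESIDUE_AT_SITE.items()
}
```

Agreement for all four enzymes is expected by construction, since the hypothesis was derived from these very sequences; an informative test would require site-directed mutants or additional, independently characterized enzymes.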
MATERIALS AND METHODS
Plant Materials and Chemicals
Clonal plant cuttings of nine Cannabis sativa (cannabis) strains (Sour Diesel, Canna Tsu, Black Lime, Valley Fire, White Cookies, Mama Thai, Terple, Cherry Chem, and Blackberry Kush) were placed in 250-L pots and grown in hoop-style, light-deprivation greenhouses at Shadowbox Farms in Williams, Oregon, under an 18-h-light/6-h-dark regime (natural light) to stimulate vegetative growth, before shifting to a 12-h-light/12-h-dark cycle to induce flowering. The length of these time periods varied from strain to strain and was adjusted based on phenotypic evaluations. All aspects of plant growth, harvest, and transport were performed in accordance with the laws and rules under Chapter 475B, as released by the Oregon Liquor Control Commission. Plant harvest was performed when the consistency of glandular trichome content had changed from a turbid white to clear and before another change to an amber-like color occurred. For most strains, the pistils had changed color from white to yellow or orange. Buds were harvested, parts with low glandular trichome content were removed using scissors, and the remainder were placed on ice until further processing (always within 3 h). Monoterpene and sesquiterpene reference standards were purchased from Restek. Cannabinoid reference standards were obtained from Sigma-Aldrich. Solvents for extraction were procured from Sigma-Aldrich. Solvents and chemicals for chromatography were sourced from Burdick & Jackson. Substrates for enzyme assays (GPP, NPP, and E,E-FPP) were prepared synthetically (Davisson et al., 1986) or obtained from a commercial source (Z,Z-FPP; Echelon Biosciences). The sources of standards for enzyme assays were as follows: germacrene B, isolated as a side product from assays with germacrene C synthase (Colby et al., 1998); γ-elemene, obtained by heating germacrene B under argon (de Kraker et al., 1998); elemol, institutional chemical repository (originally purchased from Parchem); hedycaryol, institutional chemical repository (source unknown); (S)-(+)-linalool, isolated from coriander (Coriandrum sativum) oil; (−)-limonene, (+)-limonene, (R)-(−)-linalool, β-myrcene, (E)-nerolidol, (−)-α-pinene, (−)-β-pinene, and α-terpinolene, all purchased from Sigma-Aldrich.
Metabolite Extraction and Analysis
Cannabinoids and terpenoids were extracted and quantified according to Fischedick et al. (2010), with modifications, at a testing facility with accreditation by ISO/IEC 17025 and licensed through the National Environmental Laboratory Accreditation Program (Evio Labs). Briefly, roughly 2 g of fresh bud tissue was crushed in a Falcon tube, suspended in 10 mL of methyl tert-butyl ether (containing 1-octanol as internal standard) with gentle shaking for 15 min, followed by centrifugation at 2,000g for 5 min. The supernatant was transferred to a new vial, and the plant material was extracted two more times as above (no addition of internal standard to solvent). The combined supernatants were filtered through a polytetrafluoroethylene syringe filter (0.45 μm pore size, 25 mm diameter), and an aliquot was transferred to a screw-cap glass vial, which was stored at −20°C until further analysis. Following extraction, the
Plant Physiol. Vol. 180, 2019 1893
Coregulation of Cannabinoid and Terpenoid Pathways
www.plantphysiol.orgon July 31, 2019 - Published by Downloaded from
Copyright © 2019 American Society of Plant Biologists. All rights reserved.
remaining plant material was dried in an oven (50°C) and weighed to determine
dry weights for each sample.
Cannabinoids were separated via HPLC (model LC-2030C; Shimadzu) using a
Kinetex C18 reverse-phase column (50 × 4.6 mm, 2.6 μm particle size;
Phenomenex) and a binary gradient of solvent A (water containing 0.1% [v/v]
formic acid and 10 mM ammonium formate) and solvent B (methanol containing
0.05% [v/v] formic acid) with the following settings: 0 to 9 min, 68% to 78% B;
9 to 11.9 min, 78% to 100% B; 11.9 to 13.5 min, hold at 100% B. Analytes were
monitored at 228 nm in a diode array detector. Peak identification was achieved
based on comparisons of retention times and spectral characteristics with those
of authentic cannabinoid reference standards. Analytes were quantified based on
calibration curves acquired with authentic standards. The validation of the
analytical method was performed according to Fischedick et al. (2010).
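The gradient settings above can be written as a small lookup, which is convenient when checking the mobile-phase composition at a given retention time. The following is a minimal sketch, assuming that each segment is a linear ramp between the stated breakpoints (the text gives only the endpoints); the names GRADIENT and percent_b are illustrative and not part of any instrument software.

```python
from bisect import bisect_right

# Breakpoints (time in min, percent solvent B) taken from the stated settings.
GRADIENT = [(0.0, 68.0), (9.0, 78.0), (11.9, 100.0), (13.5, 100.0)]


def percent_b(t_min: float) -> float:
    """Return %B at time t_min, assuming linear ramps between breakpoints."""
    times = [t for t, _ in GRADIENT]
    if not times[0] <= t_min <= times[-1]:
        raise ValueError("time lies outside the programmed gradient")
    # Index of the breakpoint that closes the segment containing t_min.
    i = min(bisect_right(times, t_min), len(GRADIENT) - 1)
    (t0, b0), (t1, b1) = GRADIENT[i - 1], GRADIENT[i]
    return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)


if __name__ == "__main__":
    for t in (0.0, 4.5, 9.0, 11.9, 13.0):
        print(f"{t:5.1f} min: {percent_b(t):5.1f}% B")
```

Under the linear-ramp assumption, the composition halfway through the first segment (4.5 min) is 73% B, and the final segment holds at 100% B.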
Terpenoids were separated via GC (model 6890; Agilent Technologies) using
a DB5 column (30 m × 25 mm, 25 μm film thickness; Agilent Technologies) and
detected with a flame ionization detector. The conditions for separation were as
follows: injector at 250°C, 20:1 split injection mode (1 μL injected); detector at
250°C (H₂ flow at 30 mL min⁻¹, airflow at 400 mL min⁻¹, makeup flow [He] at
25 mL min⁻¹); oven heating from 40°C to 120°C at 2°C min⁻¹, then ramped to
200°C at 50°C min⁻¹, with a final hold at 200°C for 2 min. GC peaks were
identified based on comparisons of retention times of authentic standards
(purchased from Sigma-Aldrich). Analytes were quantified based on calibration
curves acquired with authentic standards. The validation of the analytical
method was performed according to Fischedick et al. (2010).
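The oven program above fixes the length of each GC run. The sketch below works through that arithmetic, assuming no initial hold at 40°C (none is stated) and instantaneous switching between ramps; RAMPS, FINAL_HOLD_MIN, run_time_min, and oven_temp are illustrative names.

```python
# Ramps as (start °C, end °C, rate in °C per min), followed by the final hold.
RAMPS = [(40.0, 120.0, 2.0), (120.0, 200.0, 50.0)]
FINAL_HOLD_MIN = 2.0


def run_time_min() -> float:
    """Total oven program time: sum of ramp durations plus the final hold."""
    ramp_time = sum((end - start) / rate for start, end, rate in RAMPS)
    return ramp_time + FINAL_HOLD_MIN


def oven_temp(t_min: float) -> float:
    """Oven temperature at t_min, assuming the program starts at the first ramp."""
    elapsed = 0.0
    for start, end, rate in RAMPS:
        duration = (end - start) / rate
        if t_min <= elapsed + duration:
            return start + rate * (t_min - elapsed)
        elapsed += duration
    return RAMPS[-1][1]  # during the final hold


if __name__ == "__main__":
    print(f"run time: {run_time_min():.1f} min")
    print(f"temperature at 20 min: {oven_temp(20.0):.0f} °C")
```

With these assumptions the first ramp takes 40 min, the second 1.6 min, and the hold 2 min, for a run of about 43.6 min per injection.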
RNA Isolation from Glandular Trichomes and cDNA
Library Preparation
Secretory cells of glandular trichomes were removed from 10 to 15 g of bud
tissue by surface abrasion and then collected by filtering through a series of
nylon meshes (Lange et al., 2000). Total RNA was isolated from secretory cells
using the RNeasy Plant kit (Qiagen) according to the manufacturer's
instructions. RNA integrity was determined using a BioAnalyzer 2100 (Agilent
Technologies). cDNA libraries from 1 to 2 μg of total RNA were generated using
the SuperScript III Reverse Transcriptase kit (Invitrogen) according to the
manufacturer's instructions.
RNA-Seq and Transcriptome Assembly
RNA-seq libraries were prepared from 250 ng of total glandular trichome
RNA with the Stranded mRNA-Seq Poly(A) Selection kit (KAPA Biosystems).
The quality and quantity of the sequencing library were assessed using a
Bioanalyzer 2100 and a Qubit 3.0 Fluorometer (Agilent Technologies and Life
Technologies). Sequencing of 150-bp paired-end reads was performed on a
HiSeq 4000 instrument (Illumina). Sequenced reads were trimmed of adapter
sequences with Trimmomatic (Bolger et al., 2014), and sequence quality was
checked with FastQC (Andrews, 2010). Trimmed sequences were merged and
assembled using the Trinity de novo assembler, and downstream functional
annotation of the assembly was performed with Trinotate (Haas et al., 2013).
The resulting transcriptome assembly contained 514,208 contigs, with a mean
contig length of 875 bp and an N50 value of 1,529 bp. Transcript abundance in
each RNA-seq data set (three biological replicates per strain) was determined
with RSEM (Li and Dewey, 2011).
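The N50 value reported above summarizes contig length weighted by bases rather than by contig count, which is why it exceeds the mean contig length. As a minimal sketch using the standard definition (this is not the Trinity implementation, and the contig lengths below are toy values, not data from this study):

```python
def n50(lengths):
    """Return the length L such that contigs of length >= L together
    contain at least half of all assembled bases."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no contigs given")


if __name__ == "__main__":
    toy_contigs = [100, 200, 300, 400, 500]  # hypothetical lengths in bp
    print("mean:", sum(toy_contigs) / len(toy_contigs))
    print("N50:", n50(toy_contigs))
```

In the toy example the mean is 300 bp while the N50 is 400 bp, the same direction of difference as between the reported mean (875 bp) and N50 (1,529 bp).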
Analysis of Global Gene Expression Patterns and
GO Enrichment
Testing for differential gene expression across strains was performed using
the Bioconductor package DESeq2 (version 1.18.1; Love et al., 2014). P values
were adjusted using the Benjamini-Hochberg procedure (Benjamini and
Hochberg, 1995). An adjusted P value (false discovery rate) ≤ 1e-10 and log₂
ratio ≥ 3 were set as thresholds. A cluster analysis of gene expression patterns
between strains was performed within the Trinity suite (Haas et al., 2013) by
partitioning genes into clusters by cutting the hierarchically clustered gene tree
at 60% height of the tree. A GO enrichment analysis of differentially expressed
genes was performed using the GOseq package in R (Young et al., 2010). GO
terms with an adjusted P < 0.01 were considered significantly enriched.
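For readers unfamiliar with the Benjamini-Hochberg adjustment, the sketch below shows the step-up calculation and how the two thresholds stated above could be applied to a gene. It is an illustration only and not the DESeq2 code path (DESeq2 also applies steps such as independent filtering). The function names are hypothetical, and applying the log₂ threshold to the absolute value is an assumption, since the text states only "log₂ ratio ≥ 3".

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def passes_thresholds(padj, log2_ratio, max_padj=1e-10, min_log2=3.0):
    """Apply the stated cutoffs; use of abs() is an assumption of this sketch."""
    return padj <= max_padj and abs(log2_ratio) >= min_log2


if __name__ == "__main__":
    raw = [0.01, 0.04, 0.03, 0.005]  # toy P values
    print(benjamini_hochberg(raw))
    print(passes_thresholds(1e-12, -4.2))
```

A gene is retained here only when both conditions hold, so a very small adjusted P value with a modest fold change is still excluded.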
Gene Coexpression Network Analysis
A gene coexpression network was built using the WGCNA package in R
(Langfelder and Horvath, 2008). Transcriptome data sets were filtered to
remove genes with an average expression value of 50 TPM or smaller.
Coexpression modules were identified using the function blockwiseModules with
the following settings: power at 7, mergeCutHeight at 0.55, and minModuleSize
at 30. Eigengene values were determined for each coexpression module to test
for association significance. Modules with similar eigengene values were
merged to obtain the final coexpression modules.
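The expression filter and the soft-thresholding power described above can be illustrated with a few lines of code. This is a sketch of the underlying arithmetic only, not the WGCNA implementation; it assumes an unsigned network, in which the adjacency between two genes is the absolute Pearson correlation raised to the chosen power (the network type is not stated in the text). All function names and the toy expression values are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def filter_by_mean_tpm(expr, min_mean=50.0):
    """Drop genes whose average expression is min_mean TPM or smaller."""
    return {g: v for g, v in expr.items() if sum(v) / len(v) > min_mean}


def adjacency(expr, power=7):
    """Unsigned soft-threshold adjacency |cor| ** power for each gene pair."""
    genes = sorted(expr)
    return {(g, h): abs(pearson(expr[g], expr[h])) ** power
            for g in genes for h in genes if g < h}


if __name__ == "__main__":
    toy = {  # hypothetical TPM values across three samples
        "geneA": [100.0, 200.0, 300.0],
        "geneB": [110.0, 190.0, 310.0],
        "geneC": [1.0, 2.0, 3.0],
    }
    kept = filter_by_mean_tpm(toy)
    print(sorted(kept))
    print(adjacency(kept))
```

Raising the correlation to the seventh power suppresses weak correlations much more strongly than strong ones, which is the purpose of the soft threshold.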
Phylogenetic Analysis of TPS Candidates
The identification of TPS candidate genes was accomplished by searching the
translated transcriptome consensus assembly against a manually curated
protein database specific to characterized plant TPSs using the BLASTx
algorithm. A reciprocal search (tBLASTn) was performed with sequences of 114
characterized angiosperm TPSs against the assembly for each individual strain.
Predicted TPS sequences were then analyzed for gene expression values across
strains. Translated amino acid sequences of these and reference TPSs (from
C. sativa and Humulus lupulus) were aligned using the MUSCLE algorithm.
Alignments were analyzed with maximum likelihood analysis using a
Jones-Taylor-Thornton model with gamma distribution for rates among amino acid
sites. One thousand bootstrap replicates were then used to construct a
phylogeny using MEGA7 (Jones et al., 1992; Kumar et al., 2016).
Cloning of TPS cDNAs
First-strand cDNA was prepared from RNA with the SuperScript III First
Strand Synthesis kit (Invitrogen) with random hexamer oligonucleotides. Open
reading frames for TPSs were amplified using gene-specific primers
(Supplemental Table S5; amplicons for full-length cDNAs were generated for
putative sesquiterpene synthases, whereas cDNAs devoid of the plastidial
targeting sequence were amplified for putative monoterpene synthases).
Amplicons were ligated into the pGEM-T Easy vector (Promega) and sequence
verified. For expression in Escherichia coli, full-length or truncated genes were
subcloned into the pSBET expression vector (predigested with NdeI and
BamHI). Several terpene synthase cDNAs (CsTPS18VF, CsTPS19BL, and
CsTPS20CT) were purchased as synthetic products (in the pET28B expression
vector) from GenScript.
In Vitro Functional Assays for Recombinant TPSs
Plasmids were transformed into chemically competent cells of several E. coli
strains [BL21 (DE3), C41 (DE3), C43 (DE3), C43 (DE3) pLysS, and ArcticExpress
(DE3)], which were then grown in 25 mL of liquid Luria-Bertani medium at
37°C with shaking to an OD₆₀₀ of 0.8. Expression of TPS genes was induced with
0.1 or 0.5 mM isopropyl β-D-1-thiogalactopyranoside (Goldbio), and cells were
grown for another 24 h at three different temperatures (16°C, 10°C, and 4°C).
Bacterial cells were harvested by centrifugation at 5,000g and resuspended in
300 mL of MOPSO buffer, pH 7, supplemented with 1 mM DTT (Goldbio). Cells
were lysed using a model 475 sonicator (VirTis), with three 15-s bursts and
cooling on ice between bursts. The resulting homogenate was centrifuged at
15,000g for 30 min at 4°C, and the clear supernatant was mixed with ceramic
hydroxyapatite (Bio-Rad). The purification of recombinant protein was
performed as described by Srividya et al. (2016) for constructs in the pSBET
expression vector, whereas those in the pET28B expression vector were purified
over Ni²⁺ affinity columns according to the manufacturer's instructions
(Novagen-EMD Millipore). In vitro assays were performed in 2-mL glass vials
containing 200 μg of purified enzyme in MOPSO buffer containing DTT and
MgCl₂ (total volume of 100 μL). A prenyl diphosphate substrate (GPP, NPP,
tFPP, or cFPP) was added to a final concentration of 0.5 mM. The assay mixtures
were overlaid with 100 μL of n-hexane (Avantor) and incubated at 30°C for 16 h
on a multitube rotator (Labquake; Barnstead Thermolyne). The enzymatic
reaction was stopped by vigorous mixing of the contents of the tubes, followed by
30 min at −80°C for phase separation. The organic phase was removed and
transferred to glass vial inserts and stored in GC vials at −20°C until further
analysis.
Enzymatically formed products were analyzed on a 6890N gas chromatograph
coupled to a 5973 mass selective detector (Agilent). Analyte separation
was achieved under the conditions developed by Adams (2007), which includes
a comprehensive resource for spectral comparisons of volatiles. The chiral
separation of monoterpenes was achieved as described by Turner et al. (2019).
Enzymatically generated products were identified based on retention times and
mass spectral properties when compared with those of authentic standards.
1894 Plant Physiol. Vol. 180, 2019
Zager et al.
Statistical Analyses
For metabolite analyses, statistical analyses were performed in R using the
MetaboAnalystR package (Chong and Xia, 2018). Quantitative terpenoid and
cannabinoid data were scaled by dividing mean-centered values by the SD of
each variable to generate principal component loadings. Principal components
were then plotted in three dimensions within the R environment. OPLS-DA
analysis was also performed in the same way using the MetaboAnalystR
package. Differential gene expression patterns were assessed using the
Bioconductor package DESeq2 (version 1.18.1; Love et al., 2014), with
thresholds of a Benjamini-Hochberg adjusted P value (false discovery rate) of
1e-10 or less and a log₂ fold-change ratio of 3 or greater. Cluster analysis of
differential gene expression was performed within the Trinity suite (Haas et
al., 2013) by cutting the clustered gene tree at 60% tree height, and
differentially expressed genes were subjected to further analysis within GOseq
as described above (Young et al., 2010). TPS candidates were identified based on
sequence identity with functionally characterized TPSs in tBLASTn searches.
Candidates with e-values > 0.001 and bitscores < 250 were removed from further
consideration.
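The scaling step described above (mean-centering each variable and dividing by its SD, often called autoscaling) gives every metabolite equal weight before principal component analysis, regardless of its absolute concentration. A minimal sketch follows; it assumes the sample SD (n − 1 denominator) and is not the MetaboAnalystR code. The function name and the concentration values are hypothetical.

```python
from statistics import mean, stdev


def autoscale(columns):
    """Mean-center each variable and divide by its sample SD.

    Assumes the n - 1 (sample) SD; a variable with zero variance would
    raise ZeroDivisionError and should be removed beforehand.
    """
    scaled = {}
    for name, values in columns.items():
        mu, sd = mean(values), stdev(values)
        scaled[name] = [(v - mu) / sd for v in values]
    return scaled


if __name__ == "__main__":
    toy = {  # hypothetical concentrations for three samples
        "abundant_metabolite": [10.0, 20.0, 30.0],
        "trace_metabolite": [0.1, 0.2, 0.3],
    }
    for name, values in autoscale(toy).items():
        print(name, [round(v, 3) for v in values])
```

After scaling, both toy variables have mean 0 and SD 1, so an abundant metabolite no longer dominates the loadings simply because of its magnitude.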
Accession Numbers
The raw transcriptome sequence data for cannabis strains are available at the
National Center for Biotechnology Information Sequence Read Archive, project
number PRJNA498707. Nucleotide sequences for genes characterized as part
of this study were deposited in GenBank and received the accession
numbers MK131289 (CsTPS16CC), MK801762 (CsTPS20CT), MK801763
(CsTPS19BL), MK801764 (CsTPS18VF), MK801765 (CsTPS15CT), and
MK801766 (CsTPS14CT).
Supplemental Data
The following supplemental materials are available.
Supplemental Figure S1. Alignment of translated peptide sequences,
based on RNA-seq data, of THCA synthase across cannabis strains.
Supplemental Figure S2. Nucleotide and translated peptide sequence,
based on RNA-seq data, of CBDA synthase from the cannabis strain
Canna Tsu.
Supplemental Figure S3. Alignment of terpene synthase sequences.
Supplemental Table S1. Statistics of de novo assemblies performed based
on cannabis glandular trichome-specific RNA-seq data sets.
Supplemental Table S2. Annotation of transcripts represented in cannabis
glandular trichome-specific RNA-seq data sets.
Supplemental Table S3. Clustering of genes into coexpression modules
obtained by WGCNA of cannabis glandular trichome-specific RNA-seq
data sets.
Supplemental Table S4. Accession numbers and sequences of terpene
synthases considered for phylogenetic analysis.
Supplemental Table S5. Primers used to clone cannabis cDNAs for
functional characterization.
ACKNOWLEDGMENTS
This study was supported by gifts from private individuals, and we are
grateful for their generosity. We also thank Shadowbox Farms for allowing A.S.
to harvest plant materials.
Received December 5, 2018; accepted May 15, 2019; published May 28, 2019.
LITERATURE CITED
Abuhasira R, Shbiro L, Landschaft Y (2018) Medical use of cannabis and cannabinoids containing products: Regulations in Europe and North America. Eur J Intern Med 49: 2–6
Adams RP (2007) Identification of Essential Oil Components By Gas Chromatography/Mass Spectrometry, Ed 4. Allured Publishing Corporation, Carol Stream, IL
Aharoni A, Giri AP, Verstappen FW, Bertea CM, Sevenier R, Sun Z, Jongsma MA, Schwab W, Bouwmeester HJ (2004) Gain and loss of fruit flavor compounds produced by wild and cultivated strawberry species. Plant Cell 16: 3110–3131
Aizpurua-Olaizola O, Soydaner U, Öztürk E, Schibano D, Simsir Y, Navarro P, Etxebarria N, Usobiaga A (2016) Evolution of the cannabinoid and terpene content during the growth of Cannabis sativa plants from different chemotypes. J Nat Prod 79: 324–331
Andre CM, Hausman JF, Guerriero G (2016) Cannabis sativa: The plant of the thousand and one molecules. Front Plant Sci 7: 19
Andrews S (2010) FastQC: A quality control tool for high throughput sequence data. http://www.bioinformatics.babraham.ac.uk/projects/fastqc
Benjamini Y, Hochberg Y (1995) Controlling the false discovery rate: A practical and powerful approach to multiple testing. J R Stat Soc B 57: 289–300
Bolger AM, Lohse M, Usadel B (2014) Trimmomatic: A flexible trimmer for Illumina sequence data. Bioinformatics 30: 2114–2120
Booth JK, Bohlmann J (2019) Terpenes in Cannabis sativa: From plant genome to humans. Plant Sci 284: 67–72
Booth JK, Page JE, Bohlmann J (2017) Terpene synthases from Cannabis sativa. PLoS ONE 12: e0173911
Brenneisen R (2007) Chemistry and analysis of phytocannabinoids and other cannabis constituents. In M.A. ElSohly, ed, Forensic Science and Medicine. Marijuana and the Cannabinoids. Humana Press, New York, pp 17–49
Calvi L, Pentimalli D, Panseri S, Giupponi L, Gelmini F, Beretta G, Vitali D, Bruno M, Zilio E, Pavlovic R, et al (2018) Comprehensive quality evaluation of medical Cannabis sativa L. inflorescence and macerated oils based on HS-SPME coupled to GC-MS and LC-HRMS (q-exactive orbitrap®) approach. J Pharm Biomed Anal 150: 208–219
Cascini F, Aiello C, Di Tanna G (2012) Increasing delta-9-tetrahydrocannabinol (Δ-9-THC) content in herbal cannabis over time: Systematic review and meta-analysis. Curr Drug Abuse Rev 5: 32–40
Chin ST, Marriott PJ (2015) Review of the role and methodology of high resolution approaches in aroma analysis. Anal Chim Acta 854: 1–12
Chong J, Xia J (2018) MetaboAnalystR: An R package for flexible and reproducible analysis of metabolomics data. Bioinformatics 34: 4313–4314
Clark SM, Vaitheeswaran V, Ambrose SJ, Purves RW, Page JE (2013) Transcriptome analysis of bitter acid biosynthesis and precursor pathways in hop (Humulus lupulus). BMC Plant Biol 13: 12
Colby SM, Crock J, Dowdle-Rizzo B, Lemaux PG, Croteau R (1998) Germacrene C synthase from Lycopersicon esculentum cv. VFNT cherry tomato: cDNA isolation, characterization, and bacterial expression of the multiple product sesquiterpene cyclase. Proc Natl Acad Sci USA 95: 2216–2221
Davisson VJ, Woodside AB, Neal TR, Stremler KE, Muehlbacher M, Poulter CD (1986) Phosphorylation of isoprenoid alcohols. J Org Chem 51: 4768–4779
de Kraker JW, de Groot A, Franssen MC, Konig WA, Bouwmeester HJ (1998) (+)-Germacrene A biosynthesis: The committed step in the biosynthesis of bitter sesquiterpene lactones in chicory. Plant Physiol 117: 1381–1392
Devane WA, Dysarz FA III, Johnson MR, Melvin LS, Howlett AC (1988) Determination and characterization of a cannabinoid receptor in rat brain. Mol Pharmacol 34: 605–613
Devane WA, Hanus L, Breuer A, Pertwee RG, Stevenson LA, Griffin G, Gibson D, Mandelbaum A, Etinger A, Mechoulam R (1992) Isolation and structure of a brain constituent that binds to the cannabinoid receptor. Science 258: 1946–1949
Elzinga S, Fischedick J, Podkolinski R, Raber JC (2015) Cannabinoids and terpenes as chemotaxonomic markers in cannabis. Nat Prod Chem Res 3: 2
Fellermeier M, Zenk MH (1998) Prenylation of olivetolate by a hemp transferase yields cannabigerolic acid, the precursor of tetrahydrocannabinol. FEBS Lett 427: 283–285
Fellermeier M, Eisenreich W, Bacher A, Zenk MH (2001) Biosynthesis of cannabinoids. Incorporation experiments with (13)C-labeled glucoses. Eur J Biochem 268: 1596–1604
Fischedick JT (2017) Identification of terpenoid chemotypes among high (−)-trans-Δ9-tetrahydrocannabinol-producing Cannabis sativa L. cultivars. Cannabis Cannabinoid Res 2: 34–47
Fischedick JT, Hazekamp A, Erkelens T, Choi YH, Verpoorte R (2010) Metabolic fingerprinting of Cannabis sativa L., cannabinoids and terpenoids for chemotaxonomic and drug standardization purposes. Phytochemistry 71: 2058–2073
Gagne SJ, Stout JM, Liu E, Boubakir Z, Clark SM, Page JE (2012) Identification of olivetolic acid cyclase from Cannabis sativa reveals a unique catalytic route to plant polyketides. Proc Natl Acad Sci USA 109: 12811–12816
Gaoni Y, Mechoulam R (1964) Isolation, Structure, and Partial Synthesis of an Active Constituent of Hashish. J Am Chem Soc 86: 1646–1647
Gertsch J, Leonti M, Raduner S, Racz I, Chen JZ, Xie XQ, Altmann KH, Karsak M, Zimmer A (2008) Beta-caryophyllene is a dietary cannabinoid. Proc Natl Acad Sci USA 105: 9099–9104
Gertsch J, Pertwee RG, Di Marzo V (2010) Phytocannabinoids beyond the cannabis plant - do they exist? Br J Pharmacol 160: 523–529
Gibon Y, Usadel B, Blaesing OE, Kamlage B, Hoehne M, Trethewey R, Stitt M (2006) Integration of metabolite with transcript and enzyme activity profiling during diurnal cycles in Arabidopsis rosettes. Genome Biol 7: R76
Gilbert AN, DiVerdi JA (2018) Consumer perceptions of strain differences in Cannabis aroma. PLoS ONE 13: e0192247
Grattan JHG, Singer CJ (1952) Anglo-Saxon Magic and Medicine. Oxford University Press, London
Günnewich N, Page JE, Köllner TG, Degenhardt J, Kutchan TM (2007) Functional expression and characterization of trichome-specific (−)-limonene synthase and (+)-α-pinene synthase from Cannabis sativa. Nat Prod Commun 2: 223–232
Haas BJ, Papanicolaou A, Yassour M, Grabherr M, Blood PD, Bowden J, Couger MB, Eccles D, Li B, Lieber M, et al (2013) De novo transcript sequence reconstruction from RNA-seq using the Trinity platform for reference generation and analysis. Nat Protoc 8: 1494–1512
Hansey CN, Vaillancourt B, Sekhon RS, de Leon N, Kaeppler SM, Buell CR (2012) Maize (Zea mays L.) genome diversity as revealed by RNA-sequencing. PLoS ONE 7: e33071
Haseneyer G, Schmutzer T, Seidel M, Zhou R, Mascher M, Schön CC, Taudien S, Scholz U, Stein N, Mayer KF, et al (2011) From RNA-seq to large-scale genotyping: Genomics resources for rye (Secale cereale L.). BMC Plant Biol 11: 131
Hattan J, Shindo K, Ito T, Shibuya Y, Watanabe A, Tagaki C, Ohno F, Sasaki T, Ishii J, Kondo A, et al (2016) Identification of a novel hedycaryol synthase gene isolated from Camellia brevistyla flowers and floral scent of Camellia cultivars. Planta 243: 959–972
Hazekamp A, Tejkalová K, Papadimitriou S (2016) Cannabis: From cultivar to chemovar. II. A metabolomics approach to Cannabis classification. Cannabis Cannabinoid Res 1: 202–215
Hemmerlin A, Harwood JL, Bach TJ (2012) A raison d'être for two distinct pathways in the early steps of plant isoprenoid biosynthesis? Prog Lipid Res 51: 95–148
Jones DT, Taylor WR, Thornton JM (1992) The rapid generation of mutation data matrices from protein sequences. Comput Appl Biosci 8: 275–282
Kavalier AR, Litt A, Ma C, Pitra NJ, Coles MC, Kennelly EJ, Matthews PD (2011) Phytochemical and morphological characterization of hop (Humulus lupulus L.) cones over five developmental stages using high performance liquid chromatography coupled to time-of-flight mass spectrometry, ultrahigh performance liquid chromatography photodiode array detection, and light microscopy techniques. J Agric Food Chem 59: 4783–4793
Koch K (2004) Sucrose metabolism: Regulatory mechanisms and pivotal roles in sugar sensing and plant development. Curr Opin Plant Biol 7: 235–246
Kojoma M, Seki H, Yoshida S, Muranaka T (2006) DNA polymorphisms in the tetrahydrocannabinolic acid (THCA) synthase gene in "drug-type" and "fiber-type" Cannabis sativa L. Forensic Sci Int 159: 132–140
Koo HJ, Gang DR (2012) Suites of terpene synthases explain differential terpenoid production in ginger and turmeric tissues. PLoS ONE 7: e51481
Kumar S, Stecher G, Tamura K (2016) MEGA7: Molecular Evolutionary Genetics Analysis version 7.0 for bigger datasets. Mol Biol Evol 33: 1870–1874
Lange BM, Wildung MR, Stauber EJ, Sanchez C, Pouchnik D, Croteau R (2000) Probing essential oil biosynthesis and secretion by functional evaluation of expressed sequence tags from mint glandular trichomes. Proc Natl Acad Sci USA 97: 2934–2939
Langfelder P, Horvath S (2008) WGCNA: An R package for weighted correlation network analysis. BMC Bioinformatics 9: 559
Laverty KU, Stout JM, Sullivan MJ, Shah H, Gill N, Bolbrook L, Deikus G, Sebra R, Hughes TR, Page JE, et al (2019) A physical and genetic map of Cannabis sativa identifies extensive rearrangement at the THC/CBD acid synthase locus. Genome Res 29: 146–156
Lewis MA, Russo EB, Smith KM (2018) Pharmacological foundations of Cannabis chemovars. Planta Med 84: 225–233
Li B, Dewey CN (2011) RSEM: Accurate transcript quantification from RNA-Seq data with or without a reference genome. BMC Bioinformatics 12: 323
Love MI, Huber W, Anders S (2014) Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol 15: 550
Moghe GD, Leong BJ, Hurney SM, Daniel Jones A, Last RL (2017) Evolutionary routes to biochemical innovation revealed by integrative analysis of a plant-defense related specialized metabolic pathway. eLife 6: e28468
Nagegowda DA, Gutensohn M, Wilkerson CG, Dudareva N (2008) Two nearly identical terpene synthases catalyze the formation of nerolidol and linalool in snapdragon flowers. Plant J 55: 224–239
Orser C, Johnson S, Speck M, Hilyard A, Afia I (2018) Terpenoid chemoprofiles distinguish drug-type Cannabis sativa L. cultivars in Nevada. Nat Prod Chem Res 6: 304
Page JE, Boubakir Z (2012) Aromatic prenyltransferase from Cannabis. US Patent 20120144523.
Paul MJ, Pellny TK (2003) Carbon metabolite feedback regulation of leaf photosynthesis and development. J Exp Bot 54: 539–547
Piluzza G, Delogu G, Cabras A, Marceddu S, Bullitta S (2013) Differentiation between fiber and drug types of hemp (Cannabis sativa L.) from a collection of wild and domesticated accessions. Genet Resour Crop Evol 60: 2331–2342
Punja ZK, Rodriguez G, Chen S (2017) Assessing genetic diversity in Cannabis sativa using molecular approaches. In S Chandra, H Lata, MA ElSohly, eds, Cannabis sativa L.: Botany and Biotechnology. Springer International Publishing, Cham, Switzerland, pp 395–418
Ramirez-Gonzalez RH, Segovia V, Bird N, Fenwick P, Holdgate S, Berry S, Jack P, Caccamo M, Uauy C (2015) RNA-Seq bulked segregant analysis enables the identification of high-resolution genetic markers for breeding in hexaploid wheat. Plant Biotechnol J 13: 613–624
Rice S, Koziel JA (2015) Characterizing the smell of marijuana by odor impact of volatile compounds: An application of simultaneous chemical and sensory analysis. PLoS ONE 10: e0144160
Richins RD, Rodriguez-Uribe L, Lowe K, Ferral R, O'Connell MA (2018) Accumulation of bioactive metabolites in cultivated medical Cannabis. PLoS ONE 13: e0201119
Rocca JD, Hall EK, Lennon JT, Evans SE, Waldrop MP, Cotner JB, Nemergut DR, Graham EB, Wallenstein MD (2015) Relationships between protein-encoding gene abundance and corresponding process are commonly assumed yet rarely observed. ISME J 9: 1693–1699
Ross SA, ElSohly MA (1996) The volatile oil composition of fresh and air-dried buds of Cannabis sativa. J Nat Prod 59: 49–51
Russo EB (2011) Taming THC: Potential cannabis synergy and phytocannabinoid-terpenoid entourage effects. Br J Pharmacol 163: 1344–1364
Russo EB, Jiang HE, Li X, Sutton A, Carboni A, del Bianco F, Mandolino G, Potter DJ, Zhao YX, Bera S, et al (2008) Phytochemical and genetic analyses of ancient cannabis from Central Asia. J Exp Bot 59: 4171–4182
Sawler J, Stout JM, Gardner KM, Hudson D, Vidmar J, Butler L, Page JE, Myles S (2015) The genetic structure of marijuana and hemp. PLoS ONE 10: e0133292
Scheben A, Batley J, Edwards D (2017) Genotyping-by-sequencing approaches to characterize crop genomes: Choosing the right tool for the right application. Plant Biotechnol J 15: 149–161
Schwender J, König C, Klapperstück M, Heinzel N, Munz E, Hebbelmann I, Hay JO, Denolf P, De Bodt S, Redestig H, et al (2014) Transcript abundance on its own cannot be used to infer fluxes in central metabolism. Front Plant Sci 5: 668
Sexton M, Shelton K, Haley P, West M (2018) Evaluation of cannabinoid and terpenoid content: Cannabis flower compared to supercritical CO₂ concentrate. Planta Med 84: 234–241
Sirikantaramas S, Taura F, Tanaka Y, Ishikawa Y, Morimoto S, Shoyama Y (2005) Tetrahydrocannabinolic acid synthase, the enzyme controlling marijuana psychoactivity, is secreted into the storage cavity of the glandular trichomes. Plant Cell Physiol 46: 1578–1582
Small E (2015) Evolution and classification of Cannabis sativa (marijuana, hemp) in relation to human utilization. Bot Rev 81: 189–294
Srividya N, Lange I, Lange BM (2016) Generation and functional evaluation of designer monoterpene synthases. Methods Enzymol 576: 147–165
Starks CM, Back K, Chappell J, Noel JP (1997) Structural basis for cyclic terpene biosynthesis by tobacco 5-epi-aristolochene synthase. Science 277: 1815–1820
Stout JM, Boubakir Z, Ambrose SJ, Purves RW, Page JE (2012) The hexanoyl-CoA precursor for cannabinoid biosynthesis is formed by an acyl-activating enzyme in Cannabis sativa trichomes. Plant J 71: 353–365
Subritzky T, Lenton S, Pettigrew S (2016) Legal cannabis industry adopting strategies of the tobacco industry. Drug Alcohol Rev 35: 511–513
Taura F, Sirikantaramas S, Shoyama Y, Shoyama Y, Morimoto S (2007) Phytocannabinoids in Cannabis sativa: Recent studies on biosynthetic enzymes. Chem Biodivers 4: 1649–1663
Taura F, Tanaka S, Taguchi C, Fukamizu T, Tanaka H, Shoyama Y, Morimoto S (2009) Characterization of olivetol synthase, a polyketide synthase putatively involved in cannabinoid biosynthetic pathway. FEBS Lett 583: 2061–2066
Trikka FA, Nikolaidis A, Ignea C, Tsaballa A, Tziveleka LA, Ioannou E, Roussis V, Stea EA, Božić D, Argiriou A, et al (2015) Combined metabolome and transcriptome profiling provides new insights into diterpene biosynthesis in S. pomifera glandular trichomes. BMC Genomics 16: 935
Turner GW, Parrish AN, Zager JJ, Fischedick JT, Lange BM (2019) Assessment of flux through oleoresin biosynthesis in epithelial cells of loblolly pine resin ducts. J Exp Bot 70: 217–230
United Nations (1966) Commission on Narcotic Drugs. Document E/4294; Economic and Social Council: Official Records
Unschuld PU (1986) Medicine in China: A History of Pharmaceutics. University of California Press, Berkeley
van Bakel H, Stout JM, Cote AG, Tallon CM, Sharpe AG, Hughes TR, Page JE (2011) The draft genome and transcriptome of Cannabis sativa. Genome Biol 12: R102
Wang Z, Gerstein M, Snyder M (2009) RNA-Seq: a revolutionary tool for transcriptomics. Nat Rev Genet 10: 57–63
Welling MT, Shapter T, Rose TJ, Liu L, Stanger R, King GJ (2016) A belated green revolution for cannabis: Virtual genetic resources to fast-track cultivar development. Front Plant Sci 7: 1113
Wiebelhaus N, Kreitals NM, Almirall JR (2016) Differentiation of marijuana headspace volatiles from other plants and hemp products using capillary microextraction of volatiles (CMV) coupled to gas-chromatography-mass spectrometry (GC-MS). Forensic Chem 2: 1–8
Worley B, Powers R (2013) Multivariate analysis in metabolomics. Curr Metabolomics 1: 92–107
Yang SS, Tu ZJ, Cheung F, Xu WW, Lamb JF, Jung HJG, Vance CP, Gronwald JW (2011) Using RNA-Seq for gene identification, polymorphism detection and transcript profiling in two alfalfa genotypes with divergent cell wall composition in stems. BMC Genomics 12: 199
Young MD, Wakefield MJ, Smyth GK, Oshlack A (2010) Gene Ontology analysis for RNA-seq: Accounting for selection bias. Genome Biol 11: R14
Zager JJ, Lange BM (2018) Assessing flux distribution associated with metabolic specialization of glandular trichomes. Trends Plant Sci 23: 638–647
... Methyl-erythritol phosphate (MEP) pathway has as its final products monoterpenes, of which the best known are β-myrcene, α-pynene and limonene, but also a multitude of minor compounds such as camphene, α-terpineol, terpinene, and β-pinene [26][27][28][29][30][31][32]. Briefly, the process begins by coupling glyceraldehyde-3-phosphate with a molecule of Nutrients 2025, 17, 861 5 of 24 pyruvic acid, followed by decarboxylation and formation of deoxy-xylulose-5-phosphate. ...
... Methyl-erythritol phosphate (MEP) pathway has as its final products monoterpenes, of which the best known are β-myrcene, α-pynene and limonene, but also a multitude of minor compounds such as camphene, α-terpineol, terpinene, and β-pinene [26][27][28][29][30][31][32]. Briefly, the process begins by coupling glyceraldehyde-3-phosphate with a molecule of pyruvic acid, followed by decarboxylation and formation of deoxy-xylulose-5-phosphate. ...
Article
Full-text available
Objectives/Background: The Cannabis genus contain a mixture of cannabinoids and other minor components which have been studied so far. In this narrative review, we highlight the main aspects of the polarized discussion between abuse and toxicity versus the benefits of the compounds found in the Cannabis sativa plant. Methods: We investigated databases such as PubMed, Google Scholar, Web of Science and World Anti-doping Agency (WADA) documents for scientific publications that can elucidate the heated discussion related to the negative aspects of addiction, organ damage and improved sports performance and the medical benefits, particularly in athletes, of some compounds that are promising as nutrients. Results: Scientific arguments bring forward the harmful effects of cannabinoids, ethical and legislative aspects of their usage as doping substances in sports. We present the synthesis and metabolism of the main cannabis compounds along with identification methods for routine anti-doping tests. Numerous other studies attest to the beneficial effects, which could bring a therapeutic advantage to athletes in case of injuries. These benefits recommend Cannabis sativa compounds as nutrients, as well as potential pharmacological agents. Conclusions and Future Perspectives: From the perspective of both athletes and illegal use investigators in sport, there are many interpretations, presented and discussed in this review. Despite many recent studies on cannabis species, there is very little research on the beneficial effects in active athletes, especially on large groups compared to placebo. These studies may complete the current vision of this topic and clarify the hypotheses launched as discussions in this review.
... Terpenoids are made up of isoprene units and show a wide range of structural complexity. They include simple monoterpenes like limonene and more complex compounds such as cannabinoids, which demonstrate various pharmacological effects [41,42] (Figure 3). Phenolic compounds consist of a hydroxyl group attached to an aromatic ring and include important subgroups like flavonoids, tannins, and lignans. ...
Article
Chemotaxonomic profiling based on secondary metabolites offers a reliable approach for identifying and authenticating medicinal plants, addressing limitations associated with traditional morphological and genetic methods. Recent advances in microfluidics and nanoengineered technologies—including lab-on-a-chip systems as well as nano-enabled optical and electrochemical sensors—enable the rapid, accurate, and portable detection of key metabolites, such as alkaloids, flavonoids, terpenoids, and phenolics. Integrating artificial intelligence and machine learning techniques further enhances the analytical capabilities of these technologies, enabling automated, precise plant identification in field-based applications. Therefore, this review aims to highlight the potential applications of micro- and nanoengineered devices in herbal medicine markets, medicinal plant authentication, and biodiversity conservation. We discuss strategies to address current challenges, such as biocompatibility and material toxicity, technical limitations in device miniaturization, and regulatory and standardization requirements. Furthermore, we outline future trends and innovations necessary to fully realize the transformative potential of these technologies in real-world chemotaxonomic applications.
... Their dynamics under various hormone and abiotic stress treatments were also evaluated. CsCBDAS and CsPT1 genes, along with other structural genes of cannabinoid biosynthesis, are primarily expressed in glandular trichomes of C. sativa, which also act as the primary storage organs of these metabolites (Zager et al. 2019). Thus, the promoter regions of these two genes are expected to drive trichome-specific reporter expression. ...
Article
Main conclusion: The functional characterization of promoter regions of CBDAS and PT genes of cannabinoid biosynthesis suggests that multiple factors, including tissue-specific, phytohormonal, and stress-related signals, modulate their activity. Abstract: Cannabis sativa L. has tremendous potential as a future crop for producing clinically important cannabinoid metabolites. While the cannabinoid biosynthetic pathway is largely known, the mechanistic details of its regulation are less understood. Deciphering the environmental and developmental factors regulating the cannabinoid biosynthesis pathway may prove beneficial in pathway engineering and molecular breeding programs. Functional characterization of the promoter regions of key cannabinoid biosynthesis genes can provide useful insights into their transcriptional regulation. This study therefore focuses on uncovering the role of different phytohormones and abiotic factors in influencing the activity of CsCBDAS and CsPT1 promoters through the development of promoter-GUS fusion expressing transgenic lines of Nicotiana tabacum. Spatial analysis across different tissues revealed that CsCBDAS and CsPT1 promoters drive a high level of GUS staining in leaves and flowers of the transgenic lines. Strong GUS staining was detected in the glandular trichomes of both tobacco transgenic lines. The results showed that out of the five hormones, three (IAA, GA3, and SA) and four (IAA, GA3, SA, and ABA) caused significant activation of CsCBDAS and CsPT1 promoters, respectively. While light, heat, cold, salt, and wound stress induced promoter activity of both CsCBDAS and CsPT1, drought stress was found to induce the activity of the CsCBDAS promoter only. Validation of the expression patterns of these genes under different conditions in C. sativa through qRT-PCR suggested that phytohormones and abiotic factors may influence cannabinoid biosynthesis in C. sativa by modulating their promoter activity.
... This implies that the production of cannabinoids is not only directly tied to the genes responsible for their synthesis in the cannabinoid pathway [8,17], but rather that additional areas of the genome also control these biosynthetic pathways [47,57,58]. This finding aligns with the earlier studies, further supporting the notion that multiple genetic factors contribute to cannabinoid content variation [3,59–63]. Here we contribute further to this literature by confirming some of these important loci, but also identifying other novel loci for targeted future selection and breeding. ...
Article
Background: Future breeding and selection of Cannabis sativa L. for both drug production and industrial purposes require a source of germplasm with wide genetic variation, such as that found in wild relatives and progenitors of highly cultivated plants. Limited directional selection and breeding have occurred in this crop, especially informed by molecular markers. Results: This study investigated the population genomics of a natural cannabis collection comprising male and female individuals from various climatic zones in Iran. Using Genotyping-By-Sequencing (GBS), we sequenced 228 individuals from 35 populations. The data obtained enabled an association analysis, linking genotypes with key phenotypes such as inflorescence characteristics, flowering time, plant morphology, tetrahydrocannabinol (THC) and cannabidiol (CBD) content, and sex. We detected approximately 23,266 significant high-quality Single Nucleotide Polymorphisms (SNPs), establishing associations between markers and traits. The population structure analysis revealed that Iranian cannabis plants fall into five distinct groups. Additionally, a comparison with global data suggested that the Iranian populations are distinctive and generally closer to marijuana than to hemp, with some populations showing a closer affinity to hemp. The GWAS identified novel genetic loci associated with sex, yield, and chemotype traits in cannabis, which had not been previously reported. Conclusion: The study's findings highlight the distinct genetic structure of Iranian Cannabis populations. The identification of novel genetic loci associated with important traits suggests potential targets for future breeding programs. This research underscores the value of the Iranian cannabis germplasm as a resource for breeding and selection efforts aimed at improving Cannabis for various uses.
... Terpenoids in Cannabis sativa L. play an important role in the biosynthesis of the cannabinoids that contribute to the much-appreciated aroma and flavor of cannabis seed oil (Booth and Buhlmann, 2019; Zager et al., 2019). More than 200 terpenoids have been identified in cannabis, with the main constituents being mono- and sesquiterpenes (Gallily et al., 2018). ...
Article
Introduction: Cannabis terpenoids, especially volatile terpenes, were used for the classification of cannabis strains. The leaves of Cannabis sativa L. subsp. sativa Thai strain ‘Hang Krarok’ are used legally in traditional Thai medicines, cosmetics, and food ingredients in Thailand, provided that the tetrahydrocannabinol content is lower than 0.2% dry weight. One of the specific characteristics of this plant is its volatile oil, which consists of mono- and sesquiterpenoids. Materials and methods: Fresh cannabis leaves were ground and 1 g samples were kept in gas chromatography/mass spectrometry glass vials at 4 °C prior to measurement using headspace. Results: More than 50 terpenoids were identified in the fresh leaves of the cannabis samples. The major compounds were ?–ocimene, L–limonene, terpinolene, p–cymenene, ?–(E)–caryophyllene, (Z,E)–?–farnesene, ?–bisabolene, and (E)–?–bisabolene. Conclusion: The variation in the unique terpenoids in the Thai strain could be used in novel medicines and food and cosmetic products.
... However, a significant number of studies emerged only after 2016 (Figure S1e Figure S1h). These RNA-Seq studies addressed a myriad of Cannabis research questions, including fiber production and quality (Guerriero et al., 2017), biotic and abiotic resistance (Gao et al., 2018; Jiang et al., 2021; McKernan et al., 2020; Pépin et al., 2021; Yan et al., 2023; Yin et al., 2022), sex determination (Adal et al., 2021; Dowling et al., 2023; Prentout et al., 2020), metabolite production, quality, chemotype identification (Braich et al., 2019; Busta et al., 2022; Laverty et al., 2019; Livingston et al., 2020; McGarvey et al., 2020; McKernan et al., 2020; Mi et al., 2023; Tang et al., 2023; Yeo et al., 2022; Zager et al., 2019), and genome assembly (Bakel et al., 2011; Braich et al., 2020; Gao et al., 2020; McKernan et al., 2020). ...
Article
Cannabis sativa L., a plant originating from Central Asia, is a versatile crop with applications spanning textiles, construction, pharmaceuticals, and food products. This study aimed to compile and analyze publicly available Cannabis RNA‐Seq data and develop an integrated database tool to help advance Cannabis research in various topics such as fiber production, cannabinoid biosynthesis, sex determination, and plant development. We identified 515 publicly available RNA‐Seq samples that, after stringent quality control, resulted in a high‐quality dataset of 394 samples. Utilizing the Jamaican Lion genome as reference, we constructed a comprehensive database and developed the Cannabis Expression Atlas (https://cannatlas.venanciogroup.uenf.br/), a web application for visualization of gene expression, annotation, and functional classification. Key findings include the quantification of 27,640 Cannabis genes and their classification into seven expression categories: not‐expressed, low‐expressed, housekeeping, tissue‐specific, group‐enriched, mixed, and expressed‐in‐all tissues. The study revealed substantial variability and coherence in gene expression across different tissues and chemotypes. We found 2,382 tissue‐specific genes, including 177 transcription factors. The Cannabis Expression Atlas constitutes a valuable tool for exploring gene expression patterns and offers insights into Cannabis biology, supporting research in plant breeding, genetic engineering, biochemistry, and functional genomics.
Article
Cannabis sativa is a globally important seed oil, fibre and drug-producing plant species. However, a century of prohibition has severely restricted development of breeding and germplasm resources, leaving potential hemp-based nutritional and fibre applications unrealized. Here we present a cannabis pangenome, constructed with 181 new and 12 previously released genomes from a total of 144 biological samples including both male (XY) and female (XX) plants. We identified widespread regions of the cannabis pangenome that are surprisingly diverse for a single species, with high levels of genetic and structural variation, and propose a novel population structure and hybridization history. Across the ancient heteromorphic X and Y sex chromosomes, we observed a variable boundary at the sex-determining and pseudoautosomal regions as well as genes that exhibit male-biased expression, including genes encoding several key flowering regulators. Conversely, the cannabinoid synthase genes, which are responsible for producing cannabidiol acid and delta-9-tetrahydrocannabinolic acid, contained very low levels of diversity, despite being embedded within a variable region with multiple pseudogenized paralogues, structural variation and distinct transposable element arrangements. Additionally, we identified variants of acyl-lipid thioesterase genes that were associated with fatty acid chain length variation and the production of the rare cannabinoids, tetrahydrocannabivarin and cannabidivarin. We conclude that the C. sativa gene pool remains only partially characterized, the existence of wild relatives in Asia is likely and its potential as a crop species remains largely unrealized.
Article
Canada has made significant contributions to the field of plant biochemistry, with numerous researchers actively focusing on elucidating the biosynthetic pathways of plant specialized metabolites and producing these compounds in heterologous systems, such as bacteria, yeast, or other plant species. The review aims to highlight the strengths of Canadian research in this domain over the last three decades. It will describe advances in pathway elucidation, enzyme characterization, and production of enzymes and metabolites in heterologous systems, particularly in the areas of alkaloids, terpenoids, and phenolic compounds. Canadian researchers have not only made pivotal scientific discoveries but have also ensured the continuity of scientific excellence by mentoring new generations of principal investigators in plant specialized metabolites. These advances warrant recognition and financial support to retain future talent and to maintain Canada's leadership in scientific progress on the global stage.
Article
Cannabis sativa L. is an important medicinal plant with high commercial value. In recent years, research interest in cannabidiol (CBD) and terpene-rich cannabis has been rapidly expanding due to their high therapeutic potential. The present study aims to explore phytocannabinoid and terpene diversity in Cannabis sativa collected from different parts of northern India. Our findings revealed that cannabinoids and terpenes are synthesized together in capitate stalked and capitate sessile glandular trichomes, whereas bulbous glands synthesize only terpenes. The North Indian C. sativa is mainly dominated by tetrahydrocannabinol (THC). CBD-rich plants are rare (1.11%) in the studied North Indian C. sativa. The essential oil profiling reveals (E)-caryophyllene (10.30-36.80%) as the major constituent, followed by α-humulene (0.50-15.29%) and α-bisabolol (0.00-16.40%) in the North Indian population. The cannabinoid and terpene content showed significant diversity among and within the five studied populations. The correlation analysis between cannabinoids and terpenes indicates that α-pinene, β-pinene, and limonene are positively correlated with CBD content. Similarly, α- and β-selinene correlate positively with tetrahydrocannabinolic acid (THCA) content. This study could help to identify the key cultivars from India and establish a consistent chemotype for future breeding programs.
Article
Angiosperms are prolific producers of structurally diverse terpenes, which are essential for plant defense responses, as well as the formation of floral scents, fruit flavors, and medicinal constituents. Terpene synthase genes (TPSs) play crucial roles in the biosynthesis of terpenes. This study specifically focuses on the catalytic products of 222 functionally characterized TPSs in 24 angiosperms, which mainly comprise monoterpenes, sesquiterpenes, diterpenes, and sesterterpenes. Our systematic analysis of these TPSs uncovered a significant expansion of the angiosperm-specific TPS-a, b, and g subfamilies in comparison to the TPS-e/f and c subfamilies. The expanded subfamilies can be further partitioned into distinct branches, within which considerable functional innovation and diversification have been observed. Numerous TPSs exhibit bifunctional or even trifunctional activities in vitro, yet they exhibit only a single activity in vivo, which may be largely determined by their inherent properties, subcellular localization, and the availability of endogenous substrates. Additionally, we explored the biological functions of terpenes in various organs and tissues of angiosperms. We propose that the expansion and functional divergence of TPSs contribute to the adaptability and diversity of angiosperms, facilitating the production of a broad spectrum of terpenes that enable diverse interactions with the environment and other organisms. Our findings provide a foundation for comprehending the correlation between the evolutionary features of TPSs and the diversity of terpenes in angiosperms, which is significant for terpene biosynthesis research.
Article
We present the latest version of the Molecular Evolutionary Genetics Analysis (MEGA) software, which contains many sophisticated methods and tools for phylogenomics and phylomedicine. In this major upgrade, MEGA has been optimized for use on 64-bit computing systems for analyzing bigger datasets. Researchers can now explore and analyze tens of thousands of sequences in MEGA. The new version also provides an advanced wizard for building timetrees and includes a new functionality to automatically predict gene duplication events in gene family trees. The 64-bit MEGA is made available in two interfaces: graphical and command line. The graphical user interface (GUI) is a native Microsoft Windows application that can also be used on Mac OSX. The command line MEGA is available as native applications for Windows, Linux, and Mac OSX. They are intended for use in high-throughput and scripted analysis. Both versions are available from www.megasoftware.net free of charge.
Article
Cannabis sativa (cannabis) produces a resin that is valued for its psychoactive and medicinal properties. Despite being the foundation of a multi-billion dollar global industry, scientific knowledge and research on cannabis is lagging behind compared to other high-value crops. This is largely due to legal restrictions that have prevented many researchers from studying cannabis, its products, and their effects in humans. Cannabis resin contains hundreds of different terpene and cannabinoid metabolites. Many of these metabolites have not been conclusively identified. Our understanding of the genomic and biosynthetic systems of these metabolites in cannabis, and the factors that affect their variability, is rudimentary. As a consequence, there is concern about lack of consistency with regard to the terpene and cannabinoid composition of different cannabis ‘strains’. Likewise, claims of some of the medicinal properties attributed to cannabis metabolites would benefit from thorough scientific validation.
Article
Cannabis sativa is widely cultivated for medicinal, food, industrial, and recreational use, but much remains unknown regarding its genetics, including the molecular determinants of cannabinoid content. Here, we describe a combined physical and genetic map derived from a cross between the drug-type strain Purple Kush and the hemp variety “Finola.” The map reveals that cannabinoid biosynthesis genes are generally unlinked but that aromatic prenyltransferase (AP), which produces the substrate for THCA and CBDA synthases (THCAS and CBDAS), is tightly linked to a known marker for total cannabinoid content. We further identify the gene encoding CBCA synthase (CBCAS) and characterize its catalytic activity, providing insight into how cannabinoid diversity arises in cannabis. THCAS and CBDAS (which determine the drug vs. hemp chemotype) are contained within large (>250 kb) retrotransposon-rich regions that are highly nonhomologous between drug- and hemp-type alleles and are furthermore embedded within ~40 Mb of minimally recombining repetitive DNA. The chromosome structures are similar to those in grains such as wheat, with recombination focused in gene-rich, repeat-depleted regions near chromosome ends. The physical and genetic map should facilitate further dissection of genetic and molecular mechanisms in this commercially and medically important plant.
Article
The shoot system of pines contains abundant resin ducts, which harbor oleoresins that play important roles in constitutive and inducible defenses. In a pilot study, we assessed the chemical diversity of oleoresins obtained from mature tissues of loblolly pine trees (Pinus taeda L.). Building on these data sets, we designed experiments to assess oleoresin biosynthesis in needles of 2-year-old saplings. Comparative transcriptome analyses of single cell types indicated that genes involved in the biosynthesis of oleoresins are significantly enriched in isolated epithelial cells of resin ducts, compared with those expressed in mesophyll cells. Simulations using newly developed genome-scale models of epithelial and mesophyll cells, which incorporate our data on oleoresin yield and composition as well as gene expression patterns, predicted that heterotrophic metabolism in epithelial cells involves enhanced levels of oxidative phosphorylation and fermentation (providing redox and energy equivalents). Furthermore, flux was predicted to be more evenly distributed across the metabolic network of mesophyll cells, which, in contrast to epithelial cells, do not synthesize high levels of specialized metabolites. Our findings provide novel insights into the remarkable specialization of metabolism in epithelial cells.
Article
There has been an increased use of medical Cannabis in the United States of America as more states legalize its use. Complete chemical analyses of this material can vary considerably between producers and are often not fully provided to consumers. As phytochemists in a state with legal medical Cannabis we sought to characterize the accumulation of phytochemicals in material grown by licensed commercial producers. We report the development of a simple extraction and analysis method, amenable to use by commercial laboratories for the detection and quantification of both cannabinoids and terpenoids. Through analysis of developing flowers on plants, we can identify sources of variability of floral metabolites due to flower maturity and position on the plant. The terpenoid composition varied by accession and was used to cluster cannabis strains into specific types. Inclusion of terpenoids with cannabinoids in the analysis of medical cannabis should be encouraged, as both of these classes of compounds could play a role in the beneficial medical effects of different cannabis strains.
Article
The MetaboAnalyst web application has been widely used for metabolomics data analysis and interpretation. Despite its user-friendliness, the web interface has presented its inherent limitations (especially for advanced users) with regard to flexibility in creating customized workflow, support for reproducible analysis, and capacity in dealing with large data. To address these limitations, we have developed a companion R package (MetaboAnalystR) based on the R code base of the web server. The package has been thoroughly tested to ensure that the same R commands will produce identical results from both interfaces. MetaboAnalystR complements the MetaboAnalyst web server to facilitate transparent, flexible and reproducible analysis of metabolomics data. Availability: MetaboAnalystR is freely available from https://github.com/xia-lab/MetaboAnalystR. Supplementary information: Supplementary data are available at Bioinformatics online.
Article
The smell of marijuana (Cannabis sativa L.) is of interest to users, growers, plant breeders, law enforcement and, increasingly, to state-licensed retail businesses. The numerous varieties and strains of Cannabis produce strikingly different scents but to date there have been few, if any, attempts to quantify these olfactory profiles directly. Using standard sensory evaluation techniques with untrained consumers we have validated a preliminary olfactory lexicon for dried cannabis flower, and characterized the aroma profile of eleven strains sold in the legal recreational market in Colorado. We show that consumers perceive differences among strains, that the strains form distinct clusters based on odor similarity, and that strain aroma profiles are linked to perceptions of potency, price, and smoking interest.
Article
Many aromatic plants accumulate mixtures of secondary (or specialized) metabolites in anatomical structures called glandular trichomes (GTs). Different GT types may also synthesize different mixtures of secreted metabolites, and this contributes to the enormous chemical diversity reported to occur across species. Over the past two decades, significant progress has been made in characterizing the genes and enzymes that are responsible for the unique metabolic capabilities of GTs in different lineages of flowering plants. Less is known about the processes that regulate flux distribution through precursor pathways toward metabolic end-products. We discuss here the results from a meta-analysis of genome-scale models that were developed to capture the unique metabolic capabilities of different GT types.
Article
In 1937, the United States of America criminalized the use of cannabis and as a result its use decreased rapidly. In recent decades, there has been growing interest in the wide range of medical uses of cannabis and its constituents; however, the laws and regulations differ substantially between countries. Laws differentiate between raw herbal cannabis, cannabis extracts, and cannabinoid-based medicines. Neither the European Medicines Agency (EMA) nor the United States Food and Drug Administration (FDA) approves the use of herbal cannabis or its extracts. The FDA has approved several cannabinoid-based medicines, as have 23 European countries and Canada. However, only four of the reviewed countries have fully authorized the medical use of herbal cannabis (Canada, Germany, Israel and the Netherlands), together with more than 50% of the states in the United States. Most regulators allow physicians to decide the specific indications for which they will prescribe cannabis, but some regulators dictate only specific indications. The aim of this article is to review the current (as of November 2017) regulations of medical cannabis use in Europe and North America.