A Systematic Investigation of Cannabis

A SYSTEMATIC INVESTIGATION OF CANNABIS
KARL WILLIAM HILLIG
Submitted to the faculty of the University Graduate School
in partial fulfillment of the requirements
for the degree
Doctor of Philosophy
in the Department of Biology,
Indiana University
March 2005
Accepted by the Graduate Faculty, Indiana University, in partial
fulfillment of the requirements for the degree of Doctor of Philosophy.
Doctoral Committee:
Jeffrey Palmer, Ph.D.
Paul Mahlberg, Ph.D.
Gerald Gastony, Ph.D.
Keith Clay, Ph.D.
Date of Oral Examination:
February 8, 2005
© 2005
KARL WILLIAM HILLIG
ALL RIGHTS RESERVED
DEDICATION
This dissertation is dedicated to my dear parents, Dr. William B. Hillig and Dr. Beth C.
Hillig, for their love, encouragement, and generous support.
Of all that Orient lands can vaunt
Of marvels with our own competing,
The strangest is the Haschish plant,
And what will follow on its eating.
John Greenleaf Whittier
ACKNOWLEDGEMENTS
I am grateful to Professor Paul G. Mahlberg for facilitating this investigation and for
many helpful discussions. I also thank Professor Jeffrey Palmer for chairing my
committee, and Professors Gerald Gastony and Keith Clay for their guidance.
Many thanks to all those who helped along the way, including Valerie Savage for her
technical assistance; Donald Burton, Michael Peppler, John Lemon, and David Campbell
for their horticultural assistance; and John McPartland, Steve Arbuckle, Don Wirtshafter,
James Wynn, Rob Clarke, David Watson, David Pate, Etienne de Meijer, Hayo van der
Werf, and all my friends and family for their encouragement and good cheer.
I appreciate the generosity of all those who contributed germplasm for this investigation,
including David Watson and Etienne de Meijer.
This research was supported in part by a grant from HortaPharm B.V., The
Netherlands.
ABSTRACT
Botanists disagree whether Cannabis (Cannabaceae) is a monotypic or polytypic
genus. A systematic investigation was undertaken to elucidate underlying evolutionary
and taxonomic relationships within the genus. Genetic, morphological, and
chemotaxonomic analyses were conducted on 157 Cannabis accessions of known
geographic origin. Sample populations of each accession were surveyed for allozyme
variation at 17 gene loci. Principal component (PC) analysis of the allozyme allele
frequencies revealed that most accessions were derived from two major gene pools
corresponding to C. sativa L. and C. indica Lam. A third putative gene pool
corresponds to C. ruderalis Janisch. Previous taxonomic treatments were tested for
goodness of fit to the pattern of genetic variation. Based on these results, a working
hypothesis for a taxonomic circumscription of Cannabis was proposed that is a synthesis
of previous polytypic concepts. Putative infraspecific taxa were assigned to “biotypes”
pending formal taxonomic revision. Genetic variation was highest in the hemp and feral
biotypes and lowest in the drug biotypes. Morphometric traits were analyzed by PC and
canonical variates (CV) analysis. PC analysis failed to differentiate the putative species,
but provided objective support for recognition of infraspecific taxa of C. sativa and C.
indica. CV analysis resulted in a high degree of discrimination of the putative species
and infraspecific taxa. Variation in qualitative and quantitative levels of cannabidiol
(CBD), tetrahydrocannabinol (THC), and other cannabinoids was determined, as were
frequencies of alleles that control CBD and THC biosynthesis. The patterns of variation
support a two-species concept, but not recognition of C. ruderalis as a separate species
from C. sativa. PC analysis of terpenoid variation showed that the wide-leaflet drug
(WLD) biotype of C. indica produced enhanced mean levels of guaiol and isomers of
eudesmol, and is distinct from the other putative taxa. In summary, the results of this
investigation show that a taxonomic revision of Cannabis is warranted. However,
additional studies of putative wild populations are needed to further substantiate the
proposed taxonomic treatment.
TABLE OF CONTENTS
LIST OF TABLES
LIST OF FIGURES
CHAPTER ONE GENERAL INTRODUCTION
  References
CHAPTER TWO GENETIC EVIDENCE FOR SPECIATION IN CANNABIS
  Abstract
  Introduction
  Materials and Methods
  Results
  Discussion
  References
CHAPTER THREE A MULTIVARIATE ANALYSIS OF PHENOTYPIC VARIATION IN CANNABIS
  Abstract
  Introduction
  Materials and Methods
  Results
  Discussion
  References
CHAPTER FOUR A CHEMOTAXONOMIC ANALYSIS OF CANNABINOID VARIATION IN CANNABIS
  Abstract
  Introduction
  Materials and Methods
  Results
  Discussion
  References
CHAPTER FIVE A CHEMOTAXONOMIC ANALYSIS OF TERPENOID VARIATION IN CANNABIS
  Abstract
  Introduction
  Materials and Methods
  Results
  Discussion
  References
CHAPTER SIX GENERAL DISCUSSION
  References
APPENDIX A – Passport data for 157 Cannabis accessions
APPENDIX B – Enzymes, buffers, and staining recipes
APPENDIX C – Allozyme migration ratios
LIST OF TABLES
2-1 Mean allele frequencies for accessions assigned to the indica, sativa, and
ruderalis gene pools.
2-2 Means for the number of alleles per locus (A), number of alleles per polymorphic
locus (Ap), percentage of polymorphic loci (P), and average expected
heterozygosity (He) for gene pools and putative taxa of Cannabis.
3-1 Taxonomic circumscription of the Cannabis germplasm collection based on a
previous analysis of allele frequencies and geographic origins.
3-2 Phenotypic characters evaluated in the 1996 trial.
3-3 Traits with the largest eigenvector loadings for the first three PC axes.
3-4 Stepwise variable selection for pairs of putative Cannabis species.
3-5 Arithmetic means ± SD and ranges of selected characters for seven putative
taxa of Cannabis.
3-6 Number of significant differences between pairs of putative taxa for the
34 traits in Table 3-5.
4-1 Taxonomic circumscription of the Cannabis germplasm collection based on
a previous analysis of allozyme allele frequencies.
4-2 Retention times and peak identities for the gas chromatogram in Figure 4-2
of a Cannabis indica extract.
4-3 Arithmetic means, standard deviations, and ranges of the THC/CBD
ratios and dry-weight percentages of CBD and THC for chemotype I, II,
and III plants.
4-4 Arithmetic means, standard deviations, and ranges of the dry-weight
percentages of CBC, CBD, CBG, THC, and (CBD+THC) for 253 Cannabis
plants assigned to seven putative taxa.
4-5 Arithmetic means, standard deviations, and ranges of the B_T allele
frequencies for sample populations of 157 Cannabis accessions assigned
to seven putative taxa.
5-1 Working hypothesis of a taxonomic circumscription of the Cannabis germplasm
collection.
5-2 Relative retention times, known identities, and mass spectral data for 48
terpenoid compounds in the essential oil of Cannabis.
5-3 Numbered terpenoid peaks with largest eigenvector loadings for PC axis 1 and
PC axis 2.
5-4 Arithmetic means ± SD of the ratios of the peak areas (multiplied by 100)
relative to the total area under all 48 terpenoid peaks for 162 Cannabis plants
assigned to seven putative taxa.
LIST OF FIGURES
2-1 Starch gels stained for enzyme activity.
2-2 Scatterplot of 156 Cannabis accessions on PC axis 1 and PC axis 2.
2-3 Density contour overlay of the PC scatterplot.
2-4 Density ellipses drawn on the PC scatterplot for the countries of origin of the
various accessions.
2-5 Map showing the countries of origin of accessions assigned to the indica and
sativa gene pools.
2-6 The PC scatterplot with density ellipses showing how well various conceptual
groups coincide with the genetic data.
3-1 Map showing the approximate regions of origin of the Cannabis accessions
utilized in this study.
3-2 Plots of leaf traits vs. nodal positions for 135 pistillate plants in the 1996 trial.
3-3 PC scatterplot of 135 Cannabis plants, each representing a different
accession.
3-4 Bivariate normal density ellipses drawn on the PC scatterplot for plants of
accessions assigned to C. indica, C. sativa, or C. ruderalis.
3-5 Scatterplot on the 1st and 2nd canonical axes resulting from a discriminant
analysis of 135 Cannabis plants, each representing a different accession.
The accessions were preassigned to C. sativa, C. indica, or C. ruderalis.
3-6 Scatterplot on the 1st and 2nd canonical axes resulting from a discriminant
analysis of 135 Cannabis plants, each representing a different accession.
The accessions were preassigned to the seven putative taxa in Table 3-1.
4-1 Chemical structures of various cannabinoids in a Cannabis extract.
4-2 Gas chromatogram of a chemotype II plant of an Indian accession assigned
to the feral biotype of Cannabis indica.
4-3 Histogram of log10 values of the dry-weight ratios of THC/CBD for 194
Cannabis plants.
4-4 Plot of THC% vs. CBD% for 253 Cannabis plants.
5-1 Gas chromatogram of the essential oil of a plant of an accession from Afghanistan
assigned to the wide-leaflet drug (WLD) biotype of Cannabis indica.
5-2 Molecular structure of three sesquiterpene alcohols typically found at relatively
high ratios in the essential oil of Cannabis indica plants of Afghani origin.
5-3 Scatterplot of 162 Cannabis plants on the 1st and 2nd PC axes.
5-4 Scatterplot of 162 Cannabis plants on the 1st and 2nd canonical axes.
CHAPTER 1 GENERAL INTRODUCTION
Historical considerations. Since prehistoric times, Cannabis has provided humans
with fiber to clothe them, food to nourish them, oil to light their lamps, euphoriant to
inspire them, and medicine to alleviate their suffering (Schultes, 1973). In return,
humans have disseminated Cannabis from its indigenous range in central Asia to the far
corners of the world. Cultivated Cannabis is conceptually divided into “hemp” grown for
fiber and seed, and “marijuana” (the flowering tops of pistillate plants) grown for
medicinal or recreational use. The underlying relationships between hemp and drug
strains, and between wild populations and landraces (locally adapted populations usually
associated with traditional agriculture) are uncertain, so much so that hemp and
marijuana are often described as “cousins” in the popular press (Schreiner, 2000).
Although no fossilized remains of Cannabis have been found, there is a substantial
body of historical and archaeological evidence that must be integrated with botanical
studies to further elucidate our relationship with this ancient provider (Dewey, 1914;
Schultes, 1973; Abel, 1980; Schultes and Hofmann, 1980). The use of Cannabis is
recorded in ancient texts from China, India, and elsewhere. Hemp has been cultivated
in China for over 4,000 years, and the achenes or “seeds” were once a principal grain of
the Chinese diet. Legend has it that the Buddha’s only sustenance on the path to
enlightenment was the daily consumption of a single Cannabis seed. A beverage made
from Cannabis leaves steeped in milk and spices has been imbibed in India for
thousands of years to celebrate religious festivals and is said to be the favorite drink of
the Hindu god Shiva. The plant plays a central role in the Ayurvedic system of medicine
and is used to treat a wide range of illnesses. Nomadic Scythians of central Asia are
believed to have introduced Cannabis into Europe about 3,500 years ago. Herodotus
(ca. 400 BCE) described how the Scythians placed Cannabis material on heated rocks
under a small tent and breathed the vapors to induce a state of exuberance. Cannabis
may have been introduced into Africa by Arab traders one to two thousand years ago,
and is widely used on that continent for medicinal and recreational purposes. Hemp was
extensively cultivated in Europe and Russia during the Age of Exploration for the
manufacture of canvas sails and rope to rig sailing ships. Cannabis is not native to the
western hemisphere, and was introduced by the Spaniards into South America in the
16th century. Early colonists in North America planted hemp seed imported from
Europe, but seed of Chinese origin was favored for fiber production in the United States
after the mid-nineteenth century (Dewey, 1914). Indentured servants from India may
have introduced drug strains to the West Indies and the Americas. Recreational use of
Cannabis gained popularity in Europe in the early 1800’s after Napoleon’s army brought
back tales of hashish (detached glandular trichomes usually obtained by rubbing or
sifting mature pistillate inflorescences) use from their exploits in Egypt. In the 19th and
early 20th centuries, Cannabis from the Indian subcontinent was used in the formulation
of patent medicines in Europe and North America. The medicinal herb was commonly
referred to as “Cannabis indica” to distinguish it from the common hemp of Europe,
which was considered unsuitable for medicinal use.
Professional plant breeders began developing improved hemp strains in the early
20th century. Dewey (1914) described differences between hemp landraces from
various countries. Several of these landraces were selected for increased bast fiber
content and other agronomic traits. Monoecious strains were developed in which all of
the plants matured and were ready for harvest at the same time. Despite these
advances, worldwide hemp cultivation declined throughout most of the 20th century,
primarily due to its labor-intensive processing, replacement by synthetic materials, and
political pressure because of its association with marijuana.
In the mid-1960’s, marijuana use was popularized in the United States and was
associated with the anti-war movement. Marijuana from Mexico, Jamaica, Panama,
Colombia, and Thailand, and hashish from Morocco, Lebanon, Nepal, India, Pakistan,
and Afghanistan were available on the black market in the late 1960’s and 1970’s.
Cannabis strains from different regions were given names like “Acapulco Gold” and
“Panama Red,” and were prized for their different tastes (when smoked), aromas, and
“highs.” Marijuana production in the United States increased in the 1970’s and 1980’s.
The illicit sale of exotic marijuana seeds became a lucrative industry in the mid-1980’s,
with seeds selling for as much as $10 apiece. Since then there has been much
emphasis on breeding high yielding crosses between so-called “sativa” and “indica” drug
strains, and as a result there has been considerable mixing of the marijuana gene pool.
These mixed parentage strains have been introduced into most marijuana producing
regions and have largely replaced the “heirloom” drug strains of the 1960’s and 1970’s.
Since 1989, when this investigation was initiated, there has been a worldwide
resurgence of interest in hemp and medical marijuana. The hemp industry has been
revived in several countries, and so-called “industrial” hemp strains certified to produce
low levels of the inebriant Δ9-tetrahydrocannabinol (THC) are commercially available for
cultivation. Industrial hemp is an ecofriendly renewable resource with minimal potential
for illicit use. The recently founded International Hemp Association publishes a peer-
reviewed journal dedicated to Cannabis research. Several states in the USA have
passed hemp initiatives and have legalized marijuana for medicinal use. However, the
federal government does not differentiate between industrial hemp and marijuana and
classifies marijuana as a drug of abuse with no medicinal uses despite a growing body
of scientific evidence to the contrary.
A brief overview of Cannabis taxonomy. The taxonomy of Cannabis has been in
turmoil for well over two centuries, ever since Jean Baptiste Pierre Antoine de Monet,
Chevalier de Lamarck (1785) recognized Cannabis indica Lam. as a taxon separate
from the single species of Cannabis (Cannabis sativa L.) described by Linnaeus (1753).
Lamarck’s taxonomic treatment opened a veritable “can of worms,” a fitting prelude to
his appointment as “professor of insects and worms” at the Muséum National d'Histoire
Naturelle in Paris, France. His newly designated species included drug plants from the
East Indies that he differentiated from the common hemp of Europe on the basis of their
shorter stature, narrower leaflets, harder stems, thinner bark, stronger aroma, and
greater inebriant potential. Lamarck’s taxonomic treatment did not gain
widespread support because plant taxonomists traditionally rely on outward appearance
to differentiate taxa, and not all botanists agreed that the phenotypic traits described by
Lamarck were adequate to delimit the two putative species. Other species of Cannabis
have been described (reviewed in Schultes et al., 1974; Small and Cronquist, 1976).
Of these, only C. ruderalis Janisch. is a commonly (but not universally) accepted
species.
Cannabis has suffered from “taxonomic neglect” in recent decades, due in part to
stringent regulations on Cannabis research in the United States and elsewhere. This
situation is exacerbated by the fact that field studies of Cannabis throughout its
indigenous range are impractical for Western botanists. Obtaining a diverse array of
germplasm of known geographic origin is also problematic, because Cannabis seeds
were eradicated from many germplasm repositories due to political pressure from the
United States.
Taxonomy is not often a newsworthy discipline, but the taxonomy of Cannabis
became a cause célèbre in the 1970’s because of the legal definition of marijuana as
leaves and flowers of the species C. sativa. A few enterprising lawyers took advantage
of this legal loophole and successfully defended their clients from drug possession
charges, because the prosecution failed to prove that the defendants were in possession
of C. sativa and not another species. Botanists arguing for and against a polytypic
interpretation of Cannabis testified as expert witnesses and entered into a heated debate
over the species issue (Emboden, 1974, 1981; Schultes et al., 1974; Small, 1979).
Meanwhile, the courts ruled that the intent of the lawmakers was to proscribe all types of
marijuana and that the “species defense” is not a valid argument in a court of law.
A primary purpose of taxonomy is to assign unique names to important taxonomic
groups in a hierarchical arrangement so that these groups can be referred to
unambiguously. Nevertheless, the names C. sativa, C. indica, and C. ruderalis are a
source of confusion because not all botanists agree on the circumscriptions of these
species, or that C. indica and C. ruderalis deserve species recognition. Unambiguous
common names of plants are as important to everyday conversation as Latin binomials
are to scientific discourse. Most marijuana aficionados who refer to tall, laxly branched,
narrow-leaflet drug strains as “sativa” and to short, compactly branched, wide-leaflet
drug strains as “indica” are probably unaware that they are referring to Schultes et al.’s
(1974) and Anderson’s (1980) circumscriptions of C. sativa and C. indica, as opposed to
Lamarck’s (1785) delimitation of these same species. The terms “hemp” and
“marijuana” also mean different things to different people. Often, the intended meaning
of scientific and common names for Cannabis can be determined only by the context in
which they are used. Unless we can all agree on the same taxonomic treatment and
colloquial terminology, the scientific and common names for the various Cannabis taxa
will remain a source of confusion. However, agreement seems unlikely anytime soon
given the long history of taxonomic debate and the ingrained terminology in common
usage.
Taxonomists are often categorized as “lumpers” or “splitters.” The practical
advantage of lumping all Cannabis plants together under C. sativa is that an all-inclusive
name is unambiguous. The advantage of splitting Cannabis into multiple taxa is that it
enables one to specify commonly recognized subgroups within the genus. The problem
with splitting arises when it is carried to excess and a distinction is made without a
recognizable difference. Taxonomists must decide when formal recognition of a given
group is appropriate and when it is best to make informal distinctions without assignment
of taxonomic names and ranks. This is particularly relevant for cultivated plants, where
the distinction between human-selected groups is at issue.
There is no consensus among plant taxonomists on what constitutes a “good”
species. Previous taxonomic treatments delimit Cannabis species primarily on the basis
of phenotypic differences in accord with the “taxonomic” species concept. The difficulty
in delimiting putative species of Cannabis on this basis has been a stumbling block for
well over two centuries. Modes of speciation differ depending on a number of factors.
Speciation can occur rapidly if one or two gene mutations result in the formation of
biological reproductive barriers, or gradually as a result of prolonged reproductive
isolation. In a dioecious, wind-pollinated, annual herb such as Cannabis, speciation may
result from isolated gene pools gradually diverging as a consequence of natural
selection, the accumulation of mutations, and genetic drift (Rieseberg, 1998). This mode
of speciation is consistent with the “biological” species concept, although concerns have
been expressed about its limitations (Mayr, 1976; Rieseberg and Brouillet, 1994).
Reproductive isolating barriers, if they exist in Cannabis, are probably geographic (e.g.,
an intervening mountain range) and/or physiological (e.g., different flowering times)
rather than a result of breeding incompatibility. Grant (1971) described stages of
evolutionary divergence proceeding from separate colonies to geographical races to
semi-species and finally to sympatric species. This is likely the trajectory that Cannabis
followed if speciation occurred in nature. If there is no apparent substructure to the
Cannabis gene pool then there may be little or no genetic support for speciation.
However, if separate gene pools with geographic integrity can be discerned by genetic
analysis then a case can be made for speciation. Even so, it will be necessary to
correlate these patterns of variation with phenotypic traits to arrive at a taxonomic
treatment that is of general utility. My taxonomic approach is to synthesize all available
information into a “practical and natural” treatment of Cannabis that is useful to others,
and provides a foundation for further studies.
Statement of purpose. The primary purpose of this investigation was to elucidate
evolutionary and systematic relationships within the Cannabis gene pool. It was
anticipated that an analysis of allozyme allele frequencies would provide new insights
into the interrelationships among important groups within the genus. To integrate the
results of the genetic analysis with phenotypic and biochemical studies by other
researchers, morphological and chemotaxonomic studies were conducted on the same
set of accessions that was surveyed for allozyme variation. By testing the goodness of
fit of previous taxonomic treatments to the pattern of genetic variation, a working
hypothesis was devised that provided a basis for analyzing and interpreting the
phenotypic and chemotaxonomic data sets. This informal taxonomic treatment
challenges previous assumptions about the role of humans in the evolution of Cannabis
and will hopefully stimulate further studies along these lines.
This dissertation is a compilation of four manuscripts that were previously published
or are in the process of publication. Minor additions and clarifications were made to the
text of these manuscripts, and errors that have come to light have been corrected. The
chapters are presented in the order in which the corresponding manuscripts were
written. Unfortunately, this is not the order of appearance in their respective journals
because of delays in peer review and publication. The table of passport data in the
allozyme manuscript (Chapter 2) was moved to the appendix because it is referred to in
subsequent chapters. The appendices also include supplementary information not
included in the published manuscripts. Chapter 3 is a revision of the morphology
manuscript submitted to Systematic Botany. It was referenced as “in press” in
subsequent manuscripts because it was tentatively accepted for publication, but it will
have to be resubmitted to another journal. The chemotaxonomic manuscripts (Chapters
4 and 5) were written last, but appeared in print prior to the allozyme manuscript on
which they were based. Chapter 6 is a general discussion that helps tie together the
results of the different studies. It includes the usual caveats and disclaimers, and looks
to what future steps might be taken to further resolve the evolutionary and systematic
relationships within the Cannabis genus.
REFERENCES
Abel E.L. 1980. Marihuana. The First Twelve Thousand Years. Plenum Press, New
York, New York.
Anderson L.C. 1980. Leaf variation among Cannabis species from a controlled garden.
Harvard University Botanical Museum Leaflets 28: 61–69.
Dewey L.H. 1914. Hemp. In: USDA Yearbook 1913. United States Department of
Agriculture, Washington, DC, pp. 283–347.
Emboden W.A. 1974. Cannabis – a polytypic genus. Economic Botany 28: 304–310.
Emboden W.A. 1981. The genus Cannabis and the correct use of taxonomic
categories. Journal of Psychoactive Drugs 13: 15–21.
Grant V. 1971. Plant Speciation. Columbia University Press, New York, New York.
de Lamarck J.B. 1785. Encyclopédie Méthodique de Botanique, Vol. 1, Pt. 2.
Panckoucke, Paris, France, pp. 694–695. [In French]
Linnaeus C. 1753. Species Plantarum 2: 1027. Salvius, Stockholm. [Facsimile edition,
1957–1959, Ray Society, London]
Mayr E. 1976. The biological meaning of species. In: Slobodchikoff C.N. (ed.),
Concepts of Species, Benchmark Papers in Systematic and Evolutionary Biology,
Dowden, Hutchinson, and Ross, Inc., Stroudsburg, Pennsylvania, pp. 267–276.
Rieseberg L.H. 1998. Genetic mapping as a tool for studying speciation. In: Soltis D.E.,
Soltis P.S. and Doyle J.J. (eds.), Molecular Systematics of Plants II: DNA
Sequencing. Kluwer Academic Publishers, Boston, Massachusetts, pp. 459–487.
Rieseberg L.H. and Brouillet L. 1994. Are many plant species paraphyletic? Taxon 43:
21–32.
Schreiner B. 2000. Bill would allow production of industrial hemp. March 7, 2000.
The Louisville Courier-Journal, Louisville, Kentucky.
Schultes R.E. 1973. Man and marijuana. Natural History 82: 58–63, 80, 82.
Schultes R.E., Klein W.M., Plowman T., and Lockwood T.E. 1974. Cannabis: an
example of taxonomic neglect. Harvard University Botanical Museum Leaflets
23: 337–367.
Schultes R.E. and Hofmann A. 1980. The Botany and Chemistry of Hallucinogens.
Charles C. Thomas, Springfield, Illinois.
Small E. 1979. The Species Problem in Cannabis, Vols. 1 and 2: Science and
Semantics. Corpus Information Services, Toronto, Canada.
Small E. and Cronquist A. 1976. A practical and natural taxonomy for Cannabis. Taxon
25: 405–435.
CHAPTER 2 GENETIC EVIDENCE FOR SPECIATION IN CANNABIS
Based upon: Hillig K.W. 2005. Genetic evidence for speciation in Cannabis
(Cannabaceae). Genetic Resources and Crop Evolution 52: 161–180. (Includes
corrections, clarifications, and appendices).
ABSTRACT
Sample populations of 157 Cannabis accessions of diverse geographic origin were
surveyed for allozyme variation at 17 gene loci. The frequencies of 52 alleles were
subjected to principal component analysis. A scatterplot revealed two major groups of
accessions. The sativa gene pool includes fiber/seed landraces from Europe, Asia
Minor, and central Asia, and ruderal populations from eastern Europe. The indica gene
pool includes fiber/seed landraces from eastern Asia, narrow-leaflet drug strains from
southern Asia, Africa, and Latin America, wide-leaflet drug strains from Afghanistan and
Pakistan, and feral populations from India and Nepal. A third putative gene pool
includes ruderal populations from central Asia. None of the previous taxonomic
concepts that were tested adequately circumscribe the sativa and indica gene pools.
A polytypic concept of Cannabis is proposed that recognizes three species, C. sativa,
C. indica and C. ruderalis, and seven putative taxa.
INTRODUCTION
Cannabis is believed to be one of humanity’s oldest cultivated crops, providing a
source of fiber, food, oil, medicine, and inebriant since Neolithic times (Chopra and
Chopra, 1957; Schultes, 1973; Li, 1974; Fleming and Clarke, 1998). Cannabis is
normally a dioecious, wind-pollinated, annual herb, although plants may live for more
than a year in subtropical regions (Cherniak, 1982), and monoecious plants occur in
some populations (Migal, 1991). The indigenous range of Cannabis is believed to be in
central Asia, the northwest Himalayas, and possibly extending into China (de Candolle,
1885; Vavilov, 1926; Zhukovskij, 1962; Li, 1974). The genus may have two centers of
diversity, Hindustani and European-Siberian (Zeven and Zhukovsky, 1975). Cannabis
retains the ability to escape from cultivation and return to a weedy growth habit and is
considered to be only semi-domesticated (Vavilov, 1926; Bredemann et al., 1956).
Methods of Cannabis cultivation are described in the ancient literature of China, where it
has been utilized continuously for at least six thousand years (Li, 1974). The genus may
have been introduced into Europe ca. 1500 BCE by nomadic tribes from central Asia
(Schultes, 1970). Arab traders may have introduced Cannabis into Africa, perhaps one
to two thousand years ago (Du Toit, 1980). The genus is now distributed worldwide from
the equator to about 60°N latitude and throughout much of the southern hemisphere.
Cannabis cultivated for fiber and/or achenes (i.e., “seeds”) is herein referred to as
“hemp.” Cannabis breeders distinguish eastern Asian hemp from the common hemp of
Europe (Bócsa and Karus, 1998; de Meijer, 1999). Russian botanists recognize four
“eco-geographical” groups of hemp: northern, middle-Russian, southern, and far-eastern
(Serebriakova and Sizov, 1940; Davidyan, 1972). Northern hemp landraces are smaller
in stature and earlier maturing than landraces from more southerly latitudes, with a
series of overlapping gradations in phenotypic traits between northern, middle-Russian,
and southern types. Far-east Asian hemp landraces are most similar to the southern
eco-geographical group (Dewey, 1914). Two basic types of drug plant are commonly
distinguished in accord with the taxonomic concepts of Schultes et al. (1974) and
Anderson (1980): narrow-leafleted drug strains and wide-leafleted drug strains
(Cherniak, 1982; anonymous, 1989; de Meijer, 1999).
The taxonomic treatment of Cannabis is problematic. Linnaeus (1753) considered
the genus to consist of a single undivided species, Cannabis sativa L. Lamarck (1785)
determined that Cannabis strains from India are distinct from the common hemp of
Europe and named the new species C. indica Lam. Distinguishing characteristics
include more branching, a thinner cortex, narrower leaflets, and the general ability of
C. indica to induce a state of inebriation. Opinions differ whether Lamarck adequately
differentiated C. indica from C. sativa, but they are both validly published species. Other
species of Cannabis have been proposed (reviewed in Schultes et al., 1974; and Small
and Cronquist, 1976), including C. chinensis Delile, and C. ruderalis Janisch. Vavilov
(1926) considered C. ruderalis to be synonymous with his own concept of C. sativa L.
var. spontanea Vav. He later recognized wild Cannabis populations in Afghanistan to be
distinct from C. sativa var. spontanea and named the new taxon C. indica Lam. var.
kafiristanica Vav. (Vavilov and Bukinich, 1929).
Small and Cronquist (1976) proposed a monotypic treatment of Cannabis, which is a
modification of the concepts of Lamarck and Vavilov. They reduced C. indica in rank to
C. sativa L. subsp. indica (Lam.) Small & Cronq. and differentiated it from C. sativa L.
subsp. sativa, primarily on the basis of “intoxicant ability” and purpose of cultivation.
Small and Cronquist bifurcated both subspecies into “wild” (sensu lato) and
domesticated varieties on the basis of achene size and other achene characteristics.
This concept was challenged by other botanists, who used morphological traits to delimit
three species: C. indica, C. sativa, and C. ruderalis (Anderson, 1974, 1980; Emboden,
1974; Schultes et al., 1974). Schultes et al. and Anderson narrowly circumscribed
C. indica to include relatively short, densely branched, wide-leaflet strains from
Afghanistan. The differences of opinion between taxonomists supporting monotypic
and polytypic concepts of Cannabis have not been resolved (Emboden, 1981).
Few studies of genetic variation in Cannabis have been reported. Lawi-Berger et al.
(1982) studied seed protein variation in five fiber strains and five drug strains of
Cannabis and found no basis for discriminating these predetermined groups. De Meijer
and Keizer (1996) conducted a more extensive investigation of protein variation in
bulked seed lots of 147 Cannabis accessions, and on the basis of five variable proteins
concluded that fiber cultivars, fiber landraces, drug strains, and wild or naturalized
populations could not be discriminated. A method that shows greater promise for
taxonomic investigation of Cannabis is random amplified polymorphic DNA (RAPD)
analysis. Using this technique, Cannabis strains from different geographic regions can
be distinguished (Faeti et al., 1996; Jagadish et al., 1996; Siniscalco Gigliano, 2001;
Mandolino and Ranalli, 2002), but the number and diversity of accessions that have
been analyzed in these investigations are too small to provide a firm basis for drawing
taxonomic inferences.
Allozyme analysis has proven useful in resolving difficult taxonomic issues in
domesticated plants (Doebley, 1989). Allozymes are enzyme variants that have arisen
through the process of DNA mutation. The genetic markers (allozymes) that are
commonly assayed are part of a plant’s primary metabolic pathways and presumed
neutral to the effects of human selection. Through allozyme analysis it is possible to
discern underlying patterns of variation that have been outwardly obscured by the
process of domestication. Because these genetic markers are cryptic, it is necessary to
associate allozyme frequencies with morphological differences in order to synthesize the
genetic data into a formal taxonomic treatment (Pickersgill, 1988). Other types of
biosystematic data may be included in the synthesis as well.
The purpose of this research is 1) to elucidate underlying genetic relationships
among Cannabis accessions of known geographic origin, and 2) to assess previous
taxonomic concepts in light of the genetic evidence. The research reported herein is
part of a broader systematic investigation of morphological, chemotaxonomic, and
genetic variation in Cannabis, which will be reported separately.
MATERIALS AND METHODS
The Cannabis germplasm collection - A diverse collection of 157 Cannabis
accessions of known geographic origin was obtained from breeders, researchers, gene
banks, and law enforcement agencies (Appendix A). Each accession consisted of an
unspecified number of viable achenes. Many of the landraces that were studied are no
longer cultivated and exist only in germplasm repositories. Sixty-nine accessions were
from hemp landraces conserved at the N. I. Vavilov Institute of Plant Industry (VIR) in
Russia (Lemeshev et al., 1994). Ten accessions were from Small’s taxonomic
investigation of Cannabis (Small and Beckstead, 1973; Small et al., 1976). Thirty-three
accessions were from de Meijer’s study of agronomic diversity in Cannabis (de Meijer
and van Soest, 1992; de Meijer, 1994; de Meijer, 1995; de Meijer and Keizer, 1996).
The accessions from Afghanistan were obtained from Cannabis breeders in Holland,
and at least three of these strains (Af-4, Af-5, Af-9) are inbred (anonymous, 1989). Six
Asian accessions were collected from extant populations, including a drug landrace from
Pakistan (Pk-1), three feral populations from India (In-2, In-3, In-5), and fiber landraces
from India (In-4) and China (Ch-4). Accession Ch-4 was collected in Shandong Province
from seed propagated on the island of Hainan (Clarke, 1995). Five accessions from
central Asia were collected from roadsides and gardens in the Altai region of Russia,
and identified by the provider as C. ruderalis. Several weedy accessions from Europe
were identified as C. ruderalis, “ssp. ruderalis,” or “var. spontanea.”
A priori grouping of accessions - The accessions were assigned to drug or hemp
plant-use groups, or ruderal (wild or naturalized) populations as shown in Appendix A.
They were also assigned to putative taxa according to the concepts of Lamarck (1785),
Delile (1849), Schultes et al. (1974) and Anderson (1980), and Small and Cronquist
(1976), based on morphological differences, geographic origin, and presumed purpose
for cultivation. Not all of the accessions could be unambiguously assigned to a taxon for
each concept. To depict the various groups of interest, bivariate density ellipses were
drawn on the PC scatterplot. A probability value of 0.75 was chosen because at this
value the ellipses encompass the majority of accessions in a given group, but not the
outliers.
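As a rough illustration of what a 0.75 density ellipse represents, the sketch below assumes that the ellipse is derived from a bivariate normal fit to the PC scores of a group. That is an assumption about the software's method, not something stated in this chapter. Under that assumption a point lies inside the ellipse when its squared Mahalanobis distance from the group mean is at most -2 ln(1 - P). The function names and the toy scores are hypothetical.

```python
import math

def ellipse_threshold(p):
    """Squared Mahalanobis radius of a bivariate-normal ellipse with coverage p."""
    # With 2 degrees of freedom the chi-square quantile is -2 * ln(1 - p).
    return -2.0 * math.log(1.0 - p)

def fit_group(points):
    """Return the mean and sample covariance (sxx, sxy, syy) of (x, y) points."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)

def inside_ellipse(point, mean, cov, p=0.75):
    """True if point falls within the coverage-p ellipse of the fitted group."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy ** 2
    dx, dy = point[0] - mean[0], point[1] - mean[1]
    # Squared Mahalanobis distance via the inverse of the 2x2 covariance matrix.
    d2 = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    return d2 <= ellipse_threshold(p)

# Hypothetical PC1/PC2 scores for one group of accessions.
scores = [(-2.1, 0.3), (-1.8, -0.2), (-2.5, 0.1), (-1.6, 0.4), (-2.0, -0.5)]
mean, cov = fit_group(scores)
print(round(ellipse_threshold(0.75), 3))
print(inside_ellipse(mean, mean, cov))
```

A larger probability value enlarges the ellipse, so outliers are more likely to fall inside it.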
Allozyme analysis - An initial survey was conducted to identify enzymes that
produce variable banding patterns in Cannabis (Wendel and Weeden, 1989). Only
enzymes showing variable banding patterns that could be visualized and interpreted
reliably were selected for further analysis. Of the 37 enzymes that were initially
assayed, 11 enzymes encoded at 17 putative loci were selected for a genetic survey of
the entire Cannabis germplasm collection. Previously published methods of starch-gel
electrophoresis and staining were employed (Shields et al., 1983; Soltis et al., 1983;
Morden et al., 1987; Wendel and Weeden, 1989; Kephart, 1990).
Gel/electrode buffer systems - Three gel/electrode buffer systems were utilized
(Appendix B). A tris-citrate buffer system (modified from Wendel and Weeden, 1989)
was used to resolve aconitase (ACN), leucine aminopeptidase (LAP), malic enzyme
(ME), 6-phosphogluconate dehydrogenase (6PGD), phosphoglucoisomerase (PGI),
phosphoglucomutase (PGM), and shikimate dehydrogenase (SKDH). A lithium-borate
buffer system (modified from Soltis et al., 1983) was used to resolve hexokinase (HK)
and triosephosphate isomerase (TPI). A morpholine-citrate buffer system (modified from
Wendel and Weeden, 1989) was used to resolve LAP, malate dehydrogenase (MDH),
ME, PGI, PGM, and an unknown monomeric enzyme (UNK) that appeared on gels
stained for isocitrate dehydrogenase (IDH). IDH could not be interpreted reliably and
was not used in the analysis. A phosphate buffer (modified from Soltis et al., 1983) was
used for enzyme extraction.
Gel preparation - Starch gels were prepared the day preceding electrophoresis,
using hydrolyzed potato starch in gel buffer. The proportion of starch to gel-buffer was
between about 10.0 and 13.0% (w/v) depending on the starch lot, to produce gels of the
proper consistency. The gels were poured into 191 mm (l) x 165 mm (w) x 5 (or 10) mm
deep acrylic molds. After cooling, the gels were covered with plastic wrap and stored
overnight in a refrigerator.
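The quantities involved can be estimated with simple arithmetic. The sketch below assumes that each mold is filled to its full depth and that the buffer volume equals the mold volume. Both are simplifications, and the function names are hypothetical.

```python
def gel_volume_ml(length_mm, width_mm, depth_mm):
    """Volume of a rectangular mold in millilitres (1 mL = 1000 cubic mm)."""
    return length_mm * width_mm * depth_mm / 1000.0

def starch_mass_g(volume_ml, percent_wv):
    """Grams of starch for a given % (w/v), i.e. grams per 100 mL of buffer."""
    return volume_ml * percent_wv / 100.0

# Mold dimensions and starch proportions are taken from the text above.
for depth in (5, 10):
    vol = gel_volume_ml(191, 165, depth)
    low, high = starch_mass_g(vol, 10.0), starch_mass_g(vol, 13.0)
    print(f"{depth} mm mold: {vol:.1f} mL, {low:.1f} to {high:.1f} g starch")
```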
Sample preparation and loading of gels - About 40 mg of unweighed sample
material was added to about 120 µL of cold phosphate extraction buffer on a cold
porcelain spot plate and ground to a slurry. The extract was absorbed onto thick filter-paper
wicks that were then placed into a vertical slit made toward one end of each gel.
Electrophoresis was conducted in a refrigerator at 4°C.
Electrophoresis and staining - For both the tris-citrate and morpholine-citrate
buffer systems, 5 mm thick gels were held at 30 mA, and 10 mm thick gels at 45 mA
throughout electrophoresis. For the lithium-borate buffer system, only 5 mm thick gels
were used. These were held at 50 mA for the first ten minutes (after which the wicks
were removed), and at 200 V subsequently. Current was applied for about six hours to
obtain good band separation. Immediately following electrophoresis the gels were
horizontally sliced and the top slice discarded. The three (for a 5 mm thick gel) or five
(for a 10 mm gel) remaining slices were each assayed for a different enzyme. After
covering the gels with staining solution they were placed in a dark drawer at room
temperature, or in an oven at 37°C. Bands appeared after about 20 minutes to two
hours, depending on the enzyme. The banding patterns were interpreted and the gels
photographed on a light table using high contrast Kodak Technical Pan film (Eastman
Kodak Company, Rochester, NY). Staining recipes (Appendix B) for all enzymes except
HK were modified from Soltis et al. (1983). The HK recipe was modified from Morden
et al. (1987). Some recipes call for the addition of coenzymes that varied in efficacy
between different sources and lots. More (or less) of these coenzymes than indicated
was sometimes needed to produce satisfactory results. Stock MTT and PMS solutions
were stored in a refrigerator and added to the assay solution just before staining.
Coenzymes were also added at this time.
Tissue sample collection - Sample populations of each accession were grown in
two secure greenhouses. Voucher specimens are deposited in the Deam herbarium
(IND) at Indiana University. About ten plants of each accession were surveyed except
for accessions obtained late in the investigation. Thirty Cannabis plants were sampled
for each gel. To make the gels easier to interpret, two lanes were left blank or loaded
with a plant other than Cannabis. Tissue samples were collected the afternoon before
extraction and electrophoresis and stored overnight on moist filter paper in small Petri
dishes under refrigeration. Shoot tips generally produced the darkest bands, although
mature leaf tissue was better for visualizing PGM.
Multivariate analysis - Putative genotypes were inferred from the allozyme banding
patterns, and allele frequencies were calculated for small populations of each accession
(Wendel and Weeden, 1989). Allele frequencies were analyzed using JMP version 5.0
(SAS Institute, 2002). Principal component analysis (PCA), commonly employed in
numerical taxonomic investigations, was used to visualize the underlying pattern of
genetic variation. The principal components were extracted from the correlation matrix
of allele frequencies. Each PC axis is defined by a linear combination of the allele
frequencies. PC axis 1 accounts for the largest amount of variance that can be
attributed to a single multivariate axis, and each succeeding axis accounts for a
progressively smaller proportion of the remaining variance. PC analysis simplifies the
original n-dimensional data set (n = the number of alleles) by enabling the data to be
plotted on a reduced number of orthogonal axes while minimizing the loss of information.
The degree of similarity among the accessions can be inferred from their proximity in PC
space (Wiley, 1981; Hillig and Iezzoni, 1988).
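The following is a minimal sketch of principal component extraction from a correlation matrix, written with the Python standard library only. It is not the procedure used by the statistical package. The eigen-decomposition uses cyclic Jacobi rotations as a stand-in, the function names are hypothetical, and the allele frequencies are invented for illustration. The sign of each eigenvector is arbitrary.

```python
import math

def correlation_matrix(rows):
    """Return (correlation matrix, standardized data) for accessions x alleles."""
    n, m = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(m)]
    # Assumes every allele frequency varies among accessions (non-zero SD).
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
           for j in range(m)]
    z = [[(r[j] - means[j]) / sds[j] for j in range(m)] for r in rows]
    corr = [[sum(zr[a] * zr[b] for zr in z) / (n - 1) for b in range(m)]
            for a in range(m)]
    return corr, z

def jacobi_eigen(mat, sweeps=100, tol=1e-20):
    """Eigenvalues (descending) and eigenvectors of a symmetric matrix."""
    m = len(mat)
    a = [row[:] for row in mat]
    v = [[float(i == j) for j in range(m)] for i in range(m)]
    for _ in range(sweeps):
        if sum(a[i][j] ** 2 for i in range(m) for j in range(i)) < tol:
            break
        for p in range(m - 1):
            for q in range(p + 1, m):
                if a[p][q] == 0.0:
                    continue
                diff = a[q][q] - a[p][p]
                theta = (math.pi / 4 if diff == 0.0
                         else 0.5 * math.atan(2.0 * a[p][q] / diff))
                c, s = math.cos(theta), math.sin(theta)
                for mtx in (a, v):        # right-multiply by the rotation
                    for k in range(m):
                        xp, xq = mtx[k][p], mtx[k][q]
                        mtx[k][p], mtx[k][q] = c * xp - s * xq, s * xp + c * xq
                for k in range(m):        # left-multiply a by its transpose
                    xp, xq = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * xp - s * xq, s * xp + c * xq
    order = sorted(range(m), key=lambda j: a[j][j], reverse=True)
    values = [a[j][j] for j in order]
    vectors = [[v[k][j] for k in range(m)] for j in order]
    return values, vectors

def pca(rows, n_axes=2):
    """Scores, proportion of variance, and eigenvectors for the first n_axes PCs."""
    corr, z = correlation_matrix(rows)
    values, vectors = jacobi_eigen(corr)
    total = sum(values)                   # equals the number of alleles
    scores = [[sum(zj * vj for zj, vj in zip(zr, vec)) for vec in vectors[:n_axes]]
              for zr in z]
    return scores, [val / total for val in values[:n_axes]], vectors[:n_axes]

# Hypothetical allele frequencies: 5 accessions (rows) x 3 alleles (columns).
freqs = [
    [0.90, 0.05, 0.30],
    [0.85, 0.10, 0.40],
    [0.20, 0.60, 0.35],
    [0.10, 0.70, 0.20],
    [0.50, 0.30, 0.45],
]
scores, explained, loadings = pca(freqs)
print([round(e, 3) for e in explained])
```

Because the components are extracted from the correlation matrix, every allele contributes one unit of variance regardless of how common it is, which is why the eigenvalues sum to the number of alleles.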
The average number of alleles per locus (A), number of alleles per polymorphic locus
(Ap), and percent polymorphic loci (P) were calculated for each accession, and the
expected heterozygosity (He) averaged over all loci was calculated using the mean allele
frequencies of each sample population, for the 11 enzymes that were assayed (Nei,
1987; Doebley, 1989).
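These statistics can be sketched as follows. The code assumes that a locus counts as polymorphic when more than one allele is observed, and that expected heterozygosity at a locus is one minus the sum of squared allele frequencies, with no small-sample correction. Neither criterion is spelled out above, so both are assumptions, and the function names and genotypes are hypothetical.

```python
from collections import Counter

def allele_frequencies(genotypes):
    """Allele frequencies at one locus from diploid genotypes such as ("B", "C")."""
    counts = Counter(allele for pair in genotypes for allele in pair)
    total = sum(counts.values())          # two alleles per diploid plant
    return {allele: n / total for allele, n in counts.items()}

def diversity_stats(freqs_by_locus):
    """Return A, Ap, P (%) and He for one accession.

    freqs_by_locus maps locus name -> {allele: frequency}.
    """
    n_alleles = [len(f) for f in freqs_by_locus.values()]
    poly = [k for k in n_alleles if k > 1]        # loci with more than one allele
    a = sum(n_alleles) / len(n_alleles)
    ap = sum(poly) / len(poly) if poly else float("nan")
    p = 100.0 * len(poly) / len(n_alleles)
    # Expected heterozygosity per locus is 1 - sum(p_i^2), averaged over all loci.
    he = sum(1.0 - sum(x * x for x in f.values())
             for f in freqs_by_locus.values()) / len(n_alleles)
    return a, ap, p, he

# Hypothetical sample of four plants scored at two loci.
sample = {
    "PGM": allele_frequencies([("B", "C"), ("C", "C"), ("C", "C"), ("B", "C")]),
    "HK": allele_frequencies([("A", "A")] * 4),
}
print(diversity_stats(sample))
```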
Several industrial hemp strains developed in European breeding programs were
genetically characterized but excluded from the statistical analysis because of their
possible hybrid origin (de Meijer and van Soest, 1992; de Meijer, 1995). For the purpose
of this investigation an accession was considered hybrid if the parental strains came
from more than one country. Nine Chinese accessions from the VIR collection were
excluded because of suspected hybridization during seed regeneration (Hillig, 2004).
Only accessions analyzed in this investigation are shown in Appendix A.
RESULTS
Gel interpretation - The allozyme banding patterns were interpreted as shown in
Figure 2-1. Only diploid banding patterns were observed. When more than one set of
bands appeared on a gel the loci were numbered sequentially starting with the fastest
migrating (most anodal) locus. Alleles at a given locus were lettered sequentially
starting with the fastest migrating band. Monomeric enzymes (ACN, HK, LAP, PGM,
SKDH, UNK) showed a single band for homozygous individuals and two bands for
heterozygous individuals. Dimeric enzymes (6PGD, MDH, PGI, TPI) typically showed
one band for homozygotes and three bands for heterozygotes. Malic enzyme (ME) is
tetrameric (Weeden and Wendel, 1989) and heterozygous individuals produced a five-banded
pattern. The rates of migration of the allozymes through the starch gels relative
to the most common allozyme at a given locus are given in Appendix C. Curiously, a
pair of bands appeared toward the bottom of gels stained for LAP, presumably due to
cannabidiolic acid (CBDA) and Δ9-tetrahydrocannabinolic acid (THCA) migrating into the
gels (Figure 2-1e). Cannabinoid data were not included in the statistical analysis.
FIGURE 2-1. Starch gels stained for enzyme activity. The scale (cm) shows the distance
of migration from the origin. (a) ACN (b) HK; so-called “ghost” bands are artifacts and
can be ignored (c) IDH (not used in analysis) and UNK (d) PGM (e) LAP; cannabinoids
CBDA and THCA appear toward the bottom of the gel (f) MDH (g) 6PGD (h) ME (i)
SKDH (j) TPI (k) PGI (l) PGI; the two-banded pattern in lane 3 is attributed to the
expression of a “silent” allele (As).
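The band counts reported for heterozygotes follow a simple rule: an enzyme built from n subunits yields n + 1 bands when two allelic subunit types are present. The sketch below illustrates this rule, together with the relative band intensities expected if the two subunit types are equally abundant and associate at random. That condition is an idealized assumption, not a result of this study, and the function names are hypothetical.

```python
from math import comb

def heterozygote_bands(subunits):
    """Number of bands expected for a heterozygote at one locus.

    Two allelic subunit types combining into an enzyme of `subunits`
    chains give subunits + 1 distinct compositions.
    """
    return subunits + 1

def band_proportions(subunits):
    """Relative band intensities under equal abundance and random
    association of the two subunit types (a binomial distribution)."""
    return [comb(subunits, k) / 2 ** subunits for k in range(subunits + 1)]

for name, n in (("monomer", 1), ("dimer", 2), ("tetramer", 4)):
    print(name, heterozygote_bands(n), band_proportions(n))
```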
A total of 65 alleles were detected for the 11 enzymes that were assayed. Thirteen of
these were excluded from the analysis because they appeared in only a single
accession. Although they are not useful in this study for taxonomic discrimination these
alleles may indicate regions of high genetic diversity. Ten of the 13 rare alleles were
detected in accessions from southern and eastern Asia (India, Japan, Pakistan, South
Korea), and only two were detected in accessions from Europe. The 52 alleles that were
detected in more than one accession were included in the statistical analysis.
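The exclusion of alleles detected in only a single accession amounts to a simple filter, sketched below. The data layout, accession codes, and function name are hypothetical.

```python
def alleles_in_multiple_accessions(freq_table, min_accessions=2):
    """Keep alleles detected (frequency > 0) in at least min_accessions accessions.

    freq_table maps accession code -> {allele name: frequency}.
    """
    counts = {}
    for freqs in freq_table.values():
        for allele, f in freqs.items():
            if f > 0:
                counts[allele] = counts.get(allele, 0) + 1
    return sorted(a for a, n in counts.items() if n >= min_accessions)

# Hypothetical frequencies for three accessions at one locus.
table = {
    "Acc-1": {"PGM-B": 0.3, "PGM-C": 0.7, "PGM-D": 0.0},
    "Acc-2": {"PGM-B": 0.0, "PGM-C": 0.9, "PGM-D": 0.1},
    "Acc-3": {"PGM-B": 0.2, "PGM-C": 0.8, "PGM-D": 0.0},
}
print(alleles_in_multiple_accessions(table))
```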
Principal component analysis - The Cannabis accessions were plotted on PC axis
1 (PC1) and PC axis 2 (PC2), which account for 12.3 and 7.3% of the total variance,
respectively (Figure 2-2). Two large clusters of accessions as well as several outliers
are evident on a density contour overlay of the PC scatterplot (Figure 2-3). A line
separating the two major groups is arbitrarily drawn at PC1 = -1. The geographic
distribution of the accessions was visualized by drawing bivariate density ellipses
(P = 0.75) on the PC plot for the 19 countries of origin represented by three or more
accessions (Figure 2-4). It can be seen in Figure 2-4 that the ellipses cluster into the
two major groups visualized in Figure 2-3. Accessions with values of PC1 < -1 are
mostly from Asian and African countries, including Afghanistan, Cambodia, China, India,
Japan, Nepal, Pakistan, South Korea, Thailand, and Uzbekistan, as well as Gambia,
Lesotho, Nigeria, Sierra Leone, South Africa, Swaziland, Uganda, and Zimbabwe.
Accessions from Colombia, Jamaica, and Mexico are also associated with this group.
The other major group, with values of PC1 > -1, is composed of accessions from
Europe, Asia Minor, and Asiatic regions of the former Soviet Union, including Armenia,
Belorus, Bulgaria, Germany, Hungary, Italy, Kazakhstan, Moldavia, Poland, Romania,
Russia, Spain, Syria, Turkey, Ukraine, and former Yugoslavia. Although the ellipses for
Russia and former Yugoslavia extend into the neighboring cluster, none of the
Yugoslavian accessions and only two of the Russian accessions (Rs-1, Rs-3) had
values of PC1 < -1. The ellipse for Russia is relatively large because of several outliers
including a group of five accessions (Rs-7, Rs-9, Rs-10, Rs-14, Rs-21), three of which
are from the Altai region of central Asia. Three ruderal accessions from the same region
(Rs-1, Rs-4, Rs-5) are also outliers, but situated apart from the previous group. Two
ruderal Romanian accessions (Rm-1, Rm-2) are outliers, resulting in an elongated
ellipse that extends beyond the main cluster and envelops five ruderal Hungarian
accessions (Hn-5, Hn-6, Hn-7, Hn-8, Hn-9) as well.
FIGURE 2-2. Scatterplot of 156 Cannabis accessions on PC axis 1 and PC axis 2. Accession codes are given in Appendix A.
Rs-5, a distant outlier, is not shown.
FIGURE 2-3. Density contour overlay of the PC scatterplot. The two large clusters of
accessions are separated by a line drawn at PC1 = -1. Several outlying accessions are
evident including Rs-5, not shown in Figure 2-2. Density contours are in 10%
increments with 0.7 kernel sizes for both axes.
For further analysis, accessions with values of PC1 < -1 were assigned to the indica
gene pool and those with values of PC1 > -1 were assigned to the sativa gene pool.
The gene pools are so named because they correspond (more or less) to the
indica/sativa dichotomy perceived by Lamarck and others. A map showing the countries
of origin of accessions from Eurasia and Africa is shaded to indicate the approximate
geographic range of the indica and sativa gene pools on these continents (Figure 2-5).
A third ruderalis gene pool was hypothesized, to accommodate the six central Asian
ruderal accessions (Rs-1 through Rs-5, Uz-1) situated on the PC plot between the indica
and sativa gene pools. The ruderalis accessions correspond to Janischevsky’s (1924)
description of C. ruderalis. The indigenous range of the putative ruderalis gene pool is
believed to be in central Asia. A more detailed analysis of spontaneous Cannabis
populations along the migratory routes of ancient nomadic people, ranging from central
Asia to the Carpathian Basin, may reveal further details regarding the ruderalis gene
pool.
FIGURE 2-4. Density ellipses (P = 0.75) drawn on the PC scatterplot for the countries of
origin of the various accessions. Ellipses were only generated for countries represented
by a minimum of three accessions.
FIGURE 2-5. Map showing the countries of origin of accessions assigned to the indica
and sativa gene pools. The arrows indicate human-vectored dispersal from the
presumed origin of Cannabis in central Asia.
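The assignment rule described above can be written as a short function. This is only a sketch: the PC1 scores in the usage lines are invented, the function name is hypothetical, and the treatment of a score exactly equal to -1 (assigned here to sativa) is an assumption, since the text does not address that case.

```python
# The six central Asian ruderal accessions named in the text.
RUDERALIS = {"Rs-1", "Rs-2", "Rs-3", "Rs-4", "Rs-5", "Uz-1"}

def assign_gene_pool(code, pc1, threshold=-1.0):
    """Assign an accession to a gene pool from its code and PC1 score."""
    if code in RUDERALIS:
        # The putative ruderalis pool is defined by accession, not by PC1.
        return "ruderalis"
    return "indica" if pc1 < threshold else "sativa"

# Hypothetical PC1 scores, for illustration only.
print(assign_gene_pool("Af-1", -2.0))
print(assign_gene_pool("Hn-5", 0.5))
print(assign_gene_pool("Rs-3", -1.4))
```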
The frequencies (ƒ) of 29 out of 52 alleles differed significantly (P ≤ 0.05) between
accessions assigned to the indica and sativa gene pools (Table 2-1). The most common
allele at each locus is the same for both gene pools, but their frequencies differed
significantly for ten of the 17 loci surveyed. The absolute values of the eigenvector
loadings (Table 2-1) indicate the relative contribution of each allele to a given PC axis.
Several alleles that account for much of the differentiation between the two major gene
pools on PC1 (ACN1-F, LAP1-B, 6PGD2-A, PGM-B, SKDH-D, UNK-C) are relatively
common (ƒ ≥ 0.10) in the sativa gene pool and uncommon (ƒ ≤ 0.05) in the indica gene
pool. Four of these alleles (ACN1-F, 6PGD2-A, PGM-B, SKDH-D) are also common in
the ruderalis gene pool. Several other alleles that largely contribute to the differentiation
of accessions on PC2 (ACN1-A, LAP1-C, ME-C, UNK-A) are significantly more common
in the ruderalis gene pool than in the indica or sativa gene pools. Only two alleles
(ACN2-C, LAP1-D) were found that are common (ƒ ≥ 0.10) in accessions assigned to
the indica gene pool, and uncommon in accessions assigned to the sativa gene pool.
However, several less-common (0.05 ≤ ƒ < 0.10) alleles in the indica gene pool were
uncommon or rare (ƒ < 0.03) in the sativa gene pool (PGI2-C, SKDH-A, SKDH-B,
SKDH-F).
TABLE 2-1. Mean allele frequencies for accessions assigned to the indica, sativa and ruderalis
gene pools. For a given allele, means (in rows) not connected by the same letter are significantly
different using Student’s t-test (P ≤ 0.05). The most common allele at each locus is marked
with an asterisk. N = number of accessions assigned to each group. Also shown are the eigenvector
loadings for the first two principal component axes (PC1 and PC2).
Allele    indica (N=62)  sativa (N=89)  ruderalis (N=6)  Eigenvector PC1  Eigenvector PC2
ACN1-A    0.02 b         0.01 b         0.11 a           -0.039            0.280
ACN1-B*   0.95 a         0.89 b         0.79 b           -0.082           -0.183
ACN1-D    0.02 a         0.00 a         0.02 a           -0.023            0.039
ACN1-E    0.01 a         0.00 a         0.00 a           -0.045            0.067
ACN1-F    0.00 b         0.10 a         0.09 a            0.161            0.025
ACN2-B*   0.90 b         0.99 a         0.80 b            0.105           -0.342
ACN2-C    0.10 a         0.01 b         0.20 a           -0.104            0.341
HK-A*     0.92 a         0.85 b         0.82 ab          -0.080           -0.189
HK-B      0.08 b         0.15 a         0.18 ab           0.080            0.187
LAP1-A    0.00 a         0.01 a         0.00 a            0.095            0.082
LAP1-B    0.03 b         0.33 a         0.00 b            0.231           -0.154
LAP1-C*   0.68 b         0.64 b         0.93 a           -0.037            0.288
LAP1-D    0.30 a         0.03 b         0.07 b           -0.190           -0.189
LAP2-A    0.01 b         0.07 a         0.20 a            0.126            0.178
LAP2-B*   0.99 a         0.92 b         0.81 b           -0.154           -0.175
LAP2-C    0.00 a         0.02 a         0.00 a            0.140            0.030
MDH1-A    0.01 a         0.00 a         0.00 a           -0.017            0.017
MDH1-B*   0.99 a         0.94 b         0.93 ab          -0.218           -0.132
MDH1-C    0.00 b         0.06 a         0.07 ab           0.237            0.133
MDH2-B*   1.00 a         0.99 a         1.00 a           -0.154            0.059
MDH2-C    0.00 a         0.01 a         0.00 a            0.156           -0.060
MDH3-A    0.00 a         0.00 a         0.00 a            0.030            0.067
MDH3-C*   0.99 a         0.98 a         0.97 a           -0.045           -0.092
MDH3-E    0.00 b         0.02 a         0.03 a            0.077            0.041
ME-B*     0.99 a         0.99 a         0.93 b            0.011           -0.160
ME-C      0.01 b         0.01 b         0.07 a           -0.004            0.168
6PGD1-A   0.00 a         0.00 a         0.00 a           -0.047            0.040
6PGD1-B*  0.99 a         1.00 a         1.00 a            0.038           -0.143
6PGD2-A   0.02 b         0.17 a         0.15 a            0.252            0.045
6PGD2-B*  0.98 a         0.82 b         0.85 b           -0.249           -0.044
6PGD2-C   0.00 a         0.00 a         0.00 a            0.022           -0.011
PGI2-A    0.08 b         0.21 a         0.00 b            0.143           -0.066
PGI2-As   0.01 a         0.00 a         0.00 a           -0.036            0.111
PGI2-B*   0.86 a         0.79 b         0.98 a           -0.095            0.033
PGI2-C    0.05 a         0.00 b         0.02 ab          -0.083            0.025
PGM-B     0.01 c         0.34 a         0.20 b            0.294           -0.011
PGM-C*    0.98 a         0.66 c         0.80 b           -0.291            0.009
PGM-D     0.01 a         0.00 b         0.00 ab          -0.035            0.040
SKDH-A    0.05 a         0.00 b         0.00 ab          -0.124           -0.123
SKDH-B    0.09 a         0.02 b         0.00 ab          -0.104           -0.058
SKDH-C    0.31 a         0.37 a         0.04 b            0.083           -0.132
SKDH-D    0.05 b         0.14 a         0.20 a            0.137            0.105
SKDH-E*   0.42 b         0.43 ab        0.63 a           -0.036            0.068
SKDH-F    0.08 a         0.03 b         0.13 a           -0.098            0.239
TPI1-A    0.05 b         0.10 a         0.11 ab           0.098            0.097
TPI1-B*   0.95 a         0.90 b         0.89 ab          -0.096           -0.097
TPI2-A    0.01 a         0.01 a         0.00 a           -0.023            0.034
TPI2-B*   0.99 a         0.99 a         1.00 a            0.019           -0.013
TPI2-C    0.00 a         0.00 a         0.00 a            0.005           -0.049
UNK-A     0.00 b         0.00 b         0.03 a            0.029            0.219
UNK-B*    0.99 a         0.60 b         0.97 a           -0.305            0.114
UNK-C     0.01 b         0.39 a         0.00 b            0.304           -0.126
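For readers unfamiliar with the comparison of group means in Table 2-1, the sketch below computes the textbook two-sample Student's t statistic with a pooled variance. It is a simplification: the statistical package may pool the error variance over all three groups rather than over each pair, and the resulting statistic must still be compared with a critical value of the t distribution. The function name and the sample frequencies are hypothetical.

```python
import math

def pooled_t(sample_a, sample_b):
    """Student's two-sample t statistic (equal variances assumed) and its df."""
    na, nb = len(sample_a), len(sample_b)
    ma, mb = sum(sample_a) / na, sum(sample_b) / nb
    ssa = sum((x - ma) ** 2 for x in sample_a)
    ssb = sum((x - mb) ** 2 for x in sample_b)
    df = na + nb - 2
    pooled_var = (ssa + ssb) / df          # pooled within-group variance
    t = (ma - mb) / math.sqrt(pooled_var * (1.0 / na + 1.0 / nb))
    return t, df

# Hypothetical frequencies of one allele in accessions of two groups.
group_1 = [0.0, 0.1, 0.0, 0.05]
group_2 = [0.3, 0.4, 0.35, 0.25]
t, df = pooled_t(group_1, group_2)
print(round(t, 3), df)
```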
The ruderal accessions from Europe and central Asia tend to group apart. Although
Rs-5 is a distant outlier, plants of this accession appeared morphologically similar to
others from the same region. The outlying position of Rs-5 may be partially due to
sampling error since only four viable achenes were obtained. Allele LAP2-A is common
among the ruderal accessions from Europe and central Asia but relatively uncommon
among the other accessions in the collection, particularly those assigned to the indica
gene pool.
The germplasm collection included two very early maturing Russian hemp
accessions typical of the northern eco-geographical group (Rs-22, Rs-23). These are
situated on the PC plot with early maturing accessions from nearby regions (Rs-25,
Rs-26), and with three ruderal accessions (Hn-7, Hn-9, Rs-2). However, accessions
from more southerly latitudes in Europe also cluster nearby (Bg-4, Rm-3, Sp-3).
No formal distinction was made in this investigation between the middle-Russian and
southern eco-geographic groups of hemp, or between fiber and seed accessions. There
appears to be little basis for differentiating these groups on the PC scatterplot. The large
ellipse for Russia (Figure 2-4) envelops accessions assigned to both the sativa and
ruderalis gene pools. Allele MDH2-C was detected in four of the five Russian outliers
situated toward the right side of the PC scatterplot (Rs-7, Rs-9, Rs-14, Rs-21). This
allele was not found in any of the other accessions. The taxonomic significance of this
group, if any, is unknown.
The fiber/seed accessions assigned to the indica gene pool are genetically diverse.
All but six of the 57 alleles detected in the indica gene pool were present in this group,
including seven rare alleles that were each detected in only a single accession. The
outliers in the upper left corner of the PC scatterplot are mostly hemp landraces from
eastern Asia that had allele frequencies outside the normal range, which sets them apart
from the other indica accessions.
The narrow-leaflet drug accessions are relatively devoid of genetic variation
compared to the other conceptual groups recognized in this study. Even so, geographic
patterns of genetic variation are apparent within this group. The 12 African accessions
are from three regions: western Africa (Nigeria, Gambia, Sierra Leone), east-central
Africa (Uganda) and southern Africa (South Africa, Swaziland, Lesotho, Zimbabwe).
Sample populations of the two Ugandan accessions (Ug-1, Ug-2) consisted entirely of
monoecious plants devoid of detectable allozyme variation. The position of these two
accessions on the PC scatterplot represents a region of low genetic variation, with drug
accessions from southern Africa and Southeast Asia situated nearby. A rare allele
(SKDH-A) was found in all seven southern African accessions, but in only two other
accessions, from Nigeria and Colombia. For the African accessions, an allele (SKDH-C)
that was commonly found in most other accessions was not detected.
The wide-leaflet drug accessions from Afghanistan and Pakistan (Af-1 through Af-10,
Pk-1) cluster with the other accessions assigned to the indica gene pool. Allele HK-B
was found in 9 of the 11 wide-leaflet drug accessions and in a few hemp accessions
from China and South Korea, but not in any of the narrow-leaflet drug accessions or feral
indica accessions. HK-B is common in the sativa gene pool, being found in 60 of the 89
accessions assigned to that group. However, several other alleles that are common in
the sativa gene pool (ACN1-F, LAP1-B, 6PGD2-A, PGM-B, TPI1-A, UNK-C) were rare or
undetected in the wide-leaflet drug accessions.
Taxonomic interpretation - One objective of this study is to assess previous
taxonomic concepts in light of the genetic evidence. Cannabis is commonly divided into
drug and hemp plant-use groups, and a third group of feral (wild or naturalized)
populations. The density ellipse for the drug accessions (Figure 2-6a) overlies the indica
gene pool, while the ellipse for the hemp accessions overlies both major gene pools, as
does the ellipse for the feral accessions.
Delile’s (1849) concept of C. chinensis is given consideration because hemp
accessions from southern and eastern Asia group separately from those assigned to the
sativa gene pool, and Delile was the first taxonomist to describe a separate taxon of
eastern Asian hemp. The density ellipse for accessions assigned to C. chinensis (Figure
2-6b) shows that they comprise a subset of the indica gene pool.
Lamarck’s (1785) taxonomic concept differentiates the narrow-leaflet C. indica drug
accessions from C. sativa, but it is ambiguous how he would have classified the wide-
leaflet drug accessions or the eastern Asian hemp accessions. Figure 2-6c shows good
separation of the two species proposed by Lamarck, but his concept of C. indica does
not circumscribe all of the accessions assigned to the indica gene pool.
Schultes et al. (1974) and Anderson (1980) narrowly circumscribed C. indica to
include wide-leaflet strains from Afghanistan. Narrow-leaflet drug strains together with
hemp strains from all locations are circumscribed under C. sativa. The density ellipse for
C. indica shows that the accessions assigned to this concept comprise a subset of the
indica gene pool (Figure 2-6d) while the ellipse for C. sativa includes accessions
assigned to both the indica and sativa gene pools. Schultes et al. and Anderson
also recognized C. ruderalis and emphasized that it exists only in regions where
Cannabis is indigenous. The ellipse for the six central Asian accessions assigned to
C. ruderalis lies between and overlaps both the indica and sativa gene pools.
FIGURE 2-6. The PC scatterplot with density ellipses (P = 0.75) showing how well various conceptual groups coincide with the
genetic data. The accessions were sorted according to the following concepts: (a) plant-use group (b) Delile (c) Lamarck (d)
Schultes et al. and Anderson (e) Small and Cronquist (f) author’s concept.
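As a rough illustration of how a density ellipse such as those in Figure 2-6 can be constructed, the sketch below assumes a bivariate normal model for the PC scores of a group. Under that assumption, the ellipse enclosing probability P has a squared Mahalanobis radius of -2 ln(1 - P). Whether the software used for Figure 2-6 applies exactly this construction is an assumption, and the scores shown are invented for illustration; they are not data from this study.

```python
import math

def ellipse_radius_sq(p):
    """Squared Mahalanobis radius of the ellipse enclosing probability p
    under a bivariate normal model (chi-square quantile with 2 df)."""
    return -2.0 * math.log(1.0 - p)

def mean_and_cov(points):
    """Sample mean and 2x2 covariance (sxx, sxy, syy) of (x, y) points."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)

def inside_ellipse(point, mean, cov, p=0.75):
    """True if point lies within the density ellipse for probability p."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy ** 2
    dx, dy = point[0] - mean[0], point[1] - mean[1]
    # Squared Mahalanobis distance via the inverse of the 2x2 covariance
    d2 = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    return d2 <= ellipse_radius_sq(p)

# Hypothetical (PC1, PC2) scores for one conceptual group of accessions
scores = [(-1.2, 0.3), (-0.8, -0.1), (-1.5, 0.6), (-1.0, 0.2), (-0.6, -0.4)]
mean, cov = mean_and_cov(scores)
```

A call such as `inside_ellipse((0.4, 0.1), mean, cov)` then tests whether an accession from another group falls inside this group's ellipse, which is one way to quantify the overlap described above.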
Small and Cronquist (1976) proposed two subspecies and four varieties of C. sativa.
Their circumscription of C. sativa L. subsp. sativa var. sativa includes hemp strains from
all regions and the resulting ellipse overlaps the indica and sativa gene pools (Figure
2-6e). Cannabis sativa L. subsp. sativa var. spontanea (Vav.) Small & Cronq. includes
ruderal accessions from both Europe and central Asia. The resulting ellipse
encompasses most of the sativa gene pool and a portion of the indica gene pool,
although only two accessions assigned to var. spontanea (Rs-1, Rs-3) had values of
PC1 < -1. The density ellipses for C. sativa L. subsp. indica Lam. var. indica (Lam.)
Wehmer and for C. sativa L. subsp. indica Lam. var. kafiristanica (Vav.) Small & Cronq.
encompass different subsets of the indica gene pool.
The author’s concept is illustrated by density ellipses for the indica, sativa, and
ruderalis gene pools (Figure 2-6f). The ellipses for accessions assigned to the indica
and sativa gene pools overlay the two major clusters of accessions while the ellipse for
the ruderalis accessions is intermediate and overlaps the other two. Since the existence
of a separate ruderalis gene pool is less certain, it is indicated with a dotted line.
Genetic diversity statistics - Genetic diversity statistics for gene pools and putative
taxa of Cannabis are given in Table 2-2. The taxa listed in Table 2-2 circumscribe
different subsets of the indica and sativa gene pools. Cannabis ruderalis is also
included here. The circumscriptions of C. sativa subsp. sativa var. sativa and C. sativa
subsp. sativa var. spontanea exclude accessions assigned to C. chinensis and
C. ruderalis, respectively, while C. indica sensu Lamarck excludes accessions assigned
to C. sativa subsp. indica var. kafiristanica. In general, the sativa accessions exhibited
greater genetic diversity than the indica accessions (including C. sativa subsp. indica
var. kafiristanica and C. chinensis), and the ruderalis accessions were intermediate.
Within the indica gene pool, the accessions assigned to C. chinensis exhibited the
greatest genetic diversity and the narrow-leaflet drug accessions (C. indica sensu
Lamarck) exhibited the least. Within the sativa gene pool, the cultivated (var. sativa) and
weedy (var. spontanea) accessions exhibited virtually identical levels of genetic diversity.
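The four diversity statistics reported in Table 2-2 can be sketched for a single accession as follows. This is a minimal illustration under stated assumptions: a locus is treated as polymorphic when more than one allele is detected, expected heterozygosity is computed as one minus the sum of squared allele frequencies without a small-sample correction, and the allele frequencies are hypothetical. The criteria actually applied in this study may differ, and the tabulated values are means rather than single-accession values.

```python
def diversity_stats(loci):
    """loci: list of dicts mapping allele name -> frequency at one locus.
    Returns (A, Ap, P, He) for a single population sample."""
    # Count only alleles actually detected (frequency > 0)
    counts = [sum(1 for f in freqs.values() if f > 0) for freqs in loci]
    # Assumption: a locus is polymorphic if more than one allele was detected
    poly = [c for c in counts if c > 1]
    A = sum(counts) / len(loci)                     # alleles per locus
    Ap = sum(poly) / len(poly) if poly else float("nan")  # per polymorphic locus
    P = 100.0 * len(poly) / len(loci)               # percent polymorphic loci
    # Expected heterozygosity per locus: 1 - sum of squared allele frequencies
    He = sum(1.0 - sum(f * f for f in freqs.values()) for freqs in loci) / len(loci)
    return A, Ap, P, He

# Hypothetical allele frequencies at four loci (not data from this study)
example = [
    {"a1": 1.0},
    {"a1": 0.5, "a2": 0.5},
    {"a1": 0.8, "a2": 0.1, "a3": 0.1},
    {"a1": 1.0},
]
A, Ap, P, He = diversity_stats(example)
```

For this toy sample, two of the four loci are polymorphic, so P is 50% while Ap (2.5) exceeds A (1.75). The indica gene pool shows the same pattern in Table 2-2: a low A alongside a comparatively high Ap, because few loci vary but those that do carry several alleles.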
TABLE 2-2. Means for the number of alleles per locus (A), number of alleles per
polymorphic locus (Ap), percentage of polymorphic loci (P) and average expected
heterozygosity (He) for gene pools and putative taxa of Cannabis. Means (in columns)
not connected by the same letter are significantly different using Student’s t-test
(0.05 significance level). The gene pools and putative taxa were tested separately.
N = number of accessions.

Gene Pool                                                    N    A         Ap        P          He
sativa                                                       89   1.60 a    2.20 b    48.3 a     0.17 a
indica                                                       62   1.35 b    2.39 a    22.2 c     0.08 c
ruderalis                                                     6   1.39 b    2.13 b    34.0 b     0.13 b

Putative Taxon                                               N    A         Ap        P          He
C. sativa subsp. sativa var. sativa [a] Small & Cronq.       81   1.60 a    2.20 bc   48.4 a     0.17 a
C. sativa subsp. sativa var. spontanea [b] Small & Cronq.     8   1.59 ab   2.19 bc   47.0 a     0.17 a
C. sativa subsp. indica var. kafiristanica Small & Cronq.     5   1.44 bc   2.38 ab   22.4 cde   0.09 cd
C. indica Lam. [c]                                           27   1.19 d    2.43 a    12.8 e     0.05 e
C. indica sensu Schultes et al. and Anderson                 11   1.29 c    2.21 bc   22.1 d     0.07 d
C. chinensis Delile                                          19   1.59 a    2.44 a    35.6 b     0.12 bc
C. ruderalis Janisch.                                         6   1.39 c    2.13 c    34.0 bc    0.13 b

[a] excluding accessions assigned to C. chinensis
[b] excluding accessions assigned to C. ruderalis
[c] excluding accessions assigned to C. sativa subsp. indica var. kafiristanica
DISCUSSION
The allozyme data show that the Cannabis accessions studied in this investigation
were derived from two major gene pools, ruling out the hypothesis of a single undivided
species. The genetic divergence of the cultivated accessions approximates the
indica/sativa split perceived by previous investigators. However, none of the earlier
taxonomic treatments of Cannabis adequately represent the underlying relationships
discovered in the present study.
The allozyme data in conjunction with the different geographic ranges of the indica
and sativa gene pools and previous investigations that demonstrate significant
morphological and chemotaxonomic differences between these two taxa (Small and
Beckstead, 1973; Small et al., 1976) support the formal recognition of C. sativa,
C. indica, and possibly C. ruderalis as separate species. This opinion represents a
synthesis of the species concepts of Lamarck, Delile, Janischevsky, Vavilov, Schultes
et al., and Anderson. It rejects the single-species concepts of Linnaeus, and Small and
Cronquist because the genetic data demonstrate a fundamental split within the Cannabis
gene pool. It is more “practical and natural” to assign the indica and sativa gene pools to
separate species and to leave the ranks of subspecies and variety available for further
classification of the putative taxa recognized herein.
The C. sativa gene pool includes hemp landraces from Europe, Asia Minor and
central Asia, as well as weedy populations from eastern Europe. The C. indica gene
pool is more diverse than Lamarck originally conceived. Besides the narrow-leaflet drug
strains, the C. indica gene pool includes wide-leaflet drug strains from Afghanistan and
Pakistan, hemp landraces from southern and eastern Asia, and feral populations from
India and Nepal. Cannabis ruderalis, assumed to be indigenous to central Asia, is
delimited to exclude naturalized C. sativa populations occurring in regions where
Cannabis is not native. The existence of a separate C. ruderalis gene pool is less
certain because only six accessions of this type were available for study.
The first two PC axes account for a relatively small proportion of the total variance
(19.6%) compared with a typical PC analysis of morphological data. Morphological data
sets often have a high degree of “concomitant character variation,” such as the size
correlation between different plant parts (Small, 1979). As a result, the first few PC axes
often account for a relatively large proportion of the variance. This type of “biological
correlation” was absent from the data set of allele frequencies. Although the less
common alleles are of taxonomic importance, the common alleles largely determined the
outcome of the PC analysis. When only the most frequent allele at each locus was
entered into the analysis, the first two PC axes accounted for 25.8% of the total variance
and the C. indica and C. sativa gene pools were nearly as well discriminated.
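The effect of correlated characters on the variance captured by the first PC axis can be illustrated with an idealized correlation matrix in which every pair of p characters shares the same correlation r. For such a matrix the largest eigenvalue is 1 + (p - 1)r, so the first axis accounts for (1 + (p - 1)r)/p of the total variance. The sketch below is an idealization and not a model of either data set; the values of p and r are hypothetical, and the power-iteration routine is included only as a numerical cross-check of the closed form.

```python
def pc1_share(p, r):
    """Share of total variance on PC1 for a p x p correlation matrix
    with the same correlation r (0 <= r < 1) between every pair."""
    return (1.0 + (p - 1) * r) / p

def pc1_share_numeric(p, r, iters=200):
    """Same quantity estimated by power iteration, as a cross-check."""
    m = [[1.0 if i == j else r for j in range(p)] for i in range(p)]
    v = [1.0 + 0.1 * i for i in range(p)]  # arbitrary positive start vector
    for _ in range(iters):
        w = [sum(m[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    # Rayleigh quotient gives the largest eigenvalue; the trace equals p
    mv = [sum(m[i][j] * v[j] for j in range(p)) for i in range(p)]
    top = sum(v[i] * mv[i] for i in range(p))
    return top / p
```

With ten characters correlated at r = 0.6, the first axis alone carries about 64% of the variance. With r = 0, as is more nearly the case for allele frequencies at independent loci, it carries only 10%, and each further axis adds the same small share.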
The role of human selection in the divergence of the C. indica and C. sativa gene
pools is uncertain. Small (1979) presumed the dichotomy to be largely a result of
selection for enhanced drug production in the case of the indica taxon, and selection for
enhanced fiber/seed production in the case of sativa. The genetic evidence challenges
this assumption because the fiber/seed accessions from India, China, Japan, South
Korea, Nepal, and Thailand all cluster with the C. indica gene pool. An alternative
hypothesis is that the C. indica and C. sativa hemp landraces were derived from different
primordial gene pools and independently domesticated, and that the drug strains were
derived from the same primordial gene pool as the C. indica hemp landraces. It is
assumed that, in general, when humans introduced Cannabis into a region where it did
not previously exist, the gene pool of the original introduction largely determined the
genetic make-up of the Cannabis populations inhabiting the region thereafter. It remains
to be determined whether the C. indica and C. sativa gene pools diverged before or after
the beginning of human intervention in the evolution of Cannabis.
The amount of genetic variation in Cannabis is similar to levels reported for other
crop plants (Doebley, 1989). Hamrick (1989) compiled data from different sources that
show relatively high levels of genetic variation within out-crossed and wind-pollinated
populations, and low levels of variation within weedy populations. Differentiation
between populations is relatively low for dioecious and out-crossed populations, and
high for annuals and plants (such as Cannabis) with gravity-dispersed seeds. Hamrick
reported the within-population means of 74 dicot taxa. The number of alleles per locus
(1.46), percentage of polymorphic loci (31.2%) and mean heterozygosity (0.113) are
within the ranges estimated for the putative taxa of Cannabis. The extensive overlap of
the density ellipses for the countries of origin of accessions assigned to the C. sativa
gene pool (Figure 2-4) suggests that this group is relatively homogeneous throughout its
range. In comparison, the ellipses for the C. indica gene pool do not all overlap,
suggesting that regional differences within this gene pool are more distinct.
Divergence in allele frequencies between populations (gene pools) can occur in two
principal ways (Witter, cited in Crawford, 1989). Initially, a founder population can
diverge partly or wholly by genetic drift. The second process, which presumably takes
much longer, involves the accumulation of new mutations in the two populations. Both
of these processes may help to explain the patterns of genetic variation present in
Cannabis, albeit on a larger scale. The alleles that differentiate C. indica from C. sativa
on PC1 are common in the C. sativa gene pool and uncommon in the C. indica gene
pool, which suggests that a founder event may have narrowed the genetic base of
C. indica. However, a considerable number of mutations appear to have subsequently
accumulated in both gene pools, indicating that the indica/sativa split may be quite
ancient.
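The first of these processes can be sketched with a simple neutral simulation: a small founder group is drawn from a source gene pool, and allele frequencies are then resampled each generation in a population of fixed size (a Wright-Fisher model). The source frequencies, founder number, population size, and number of generations below are hypothetical, and the sketch does not model the second process, the accumulation of new mutations.

```python
import random
from collections import Counter

def allele_freqs(gene_copies):
    """Allele frequencies from a list of sampled gene copies."""
    counts = Counter(gene_copies)
    total = len(gene_copies)
    return {allele: n / total for allele, n in counts.items()}

def resample(freqs, n_individuals, rng):
    """Draw 2 * n_individuals gene copies (diploid) from the given frequencies."""
    alleles = list(freqs)
    weights = [freqs[a] for a in alleles]
    copies = rng.choices(alleles, weights=weights, k=2 * n_individuals)
    return allele_freqs(copies)

def founder_then_drift(source, n_founders, pop_size, generations, seed=1):
    """Founder event followed by neutral drift at constant population size."""
    rng = random.Random(seed)
    freqs = resample(source, n_founders, rng)   # founder event
    for _ in range(generations):                # subsequent drift
        freqs = resample(freqs, pop_size, rng)
    return freqs

# Hypothetical source gene pool with two common and two rare alleles
source = {"A": 0.55, "B": 0.30, "C": 0.10, "D": 0.05}
derived = founder_then_drift(source, n_founders=5, pop_size=50, generations=20)
```

Because no mutation is modeled, the derived population can only retain a subset of the source alleles, and rare alleles are the most likely to be lost. This is the sense in which a founder event narrows the genetic base.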
The assumption that the alleles that were surveyed in this study are selectively
neutral does not imply that humans have not affected allele frequencies in Cannabis.
It means only that these genetic markers are “cryptic” and not subject to deliberate
manipulation. Humans have undoubtedly been instrumental in both the divergence and
mixing of the Cannabis gene pools. For example, the commercial hemp strain ‘Kompolti
Hybrid TC’ takes advantage of heterosis (hybrid vigor) in a cross between a European
hemp strain corresponding to C. sativa, and a Chinese “unisexual” hemp strain
corresponding to C. indica (Bócsa, 1999). Evidence of gene flow from eastern Asian
hemp to cultivated C. sativa is provided by certain alleles (e.g., LAP1-D, PGI2-C,
SKDH-B, SKDH-F) that occur in low frequency in the C. sativa gene pool and are
significantly more common among the hemp accessions assigned to C. indica. There is
also limited evidence of gene flow in the reverse direction; allele PGM-B, which is
common in accessions assigned to C. sativa, was detected at low frequency in a few of
the hemp accessions assigned to C. indica.
Some of the accessions in the collection encompass little genetic variation, which
may be a result of inbreeding, genetic drift, or sampling error (e.g., the achenes may
have been collected from a single plant). In general, the accessions cultivated for drug
use, particularly the narrow-leaflet drug accessions, show more signs of inbreeding than
those cultivated for fiber or seed. The absence of allele PGM-B in the gene pool of
narrow-leaflet drug accessions indicates a lack of gene flow from C. sativa. Although it
is possible that the entire gene pool of narrow-leaflet drug strains passed through a
“genetic bottleneck,” the low genetic diversity of this group may also be a result of the
way these plants are often cultivated. It is not unusual for growers to select achenes
from the few best plants in the current year’s crop to sow the following year, thereby
reducing the genetic diversity of the initial population. Since staminate plants are often
culled before flowering, the number of pollinators may also be extremely limited.
The gene pool of a cultivated taxon is expected to contain a subset of the alleles
present in the ancestral gene pool (Doebley, 1989). In the case of Cannabis, the
available evidence is insufficient to make an accurate determination of progenitor-
derivative relationships. Aboriginal populations may have migrated from central Asia
into Europe as “camp followers,” along with the cultivated landraces (Vavilov, 1926).
If so, then the weedy populations of Europe may represent the aboriginal gene pool into
which individuals that have escaped from cultivation have merged. Although fewer
alleles were detected in the feral accessions from central Asia and Europe than in the
cultivated C. sativa gene pool, this result is preliminary given the relatively small number
of feral accessions available for study. Similarly, the feral C. indica accessions from
India and Nepal do not encompass as much genetic variation as the cultivated
accessions of C. indica, but again this result is based on insufficient data to draw firm
conclusions. Even so, both results suggest that feral populations are secondary to the
domesticated ones. From the evidence at hand, it appears that the feral C. indica
accessions could represent the ancestral source of the narrow-leaflet drug accessions
but perhaps not of the wide-leaflet drug accessions, because allele HK-B was found in
nine of the eleven wide-leaflet drug accessions but not in any of the feral C. indica or
narrow-leaflet drug accessions. Vavilov and Bukinich (1929) reported finding wild
Cannabis populations in eastern Afghanistan (C. indica Lam. f. afghanica Vav.), which
could represent the progenitor of the wide-leaflet drug strains. Unfortunately, wild
populations from Afghanistan were not represented in the present study.
Conclusion - This study substantiates the existence of a fundamental split within the
Cannabis gene pool. A synthesis of previous taxonomic concepts best describes the
underlying patterns of variation. The progenitor-derivative relationships within Cannabis
are not well understood; resolving them will require more extensive sampling and
additional genetic analyses. A revised circumscription of the infraspecific taxonomic
groups is warranted, in conjunction with analyses of morphological and chemotaxonomic
variation within the germplasm collection under study.
REFERENCES
Anderson L.C. 1974. A study of systematic wood anatomy in Cannabis. Harvard
University Botanical Museum Leaflets 24: 29–36.
Anderson L.C. 1980. Leaf variation among Cannabis species from a controlled garden.
Harvard University Botanical Museum Leaflets 28(1): 61–69.
Anonymous. 1989. The Seed Bank Catalogue. Ooy, The Netherlands. [Authorship
attributed to N. Schoenmakers.]
Bócsa I. 1999. Genetic improvement: conventional approaches. In: Ranalli P. (ed.),
Advances in Hemp Research. Haworth Press, Binghamton, New York, pp. 153–184.
Bócsa I. and Karus M. 1998. The Cultivation of Hemp. Hemptech, Sebastopol,
California.
Bredemann G., Schwanitz Fr., and von Sengbusch R. 1956. Problems of modern hemp
breeding, with particular reference to the breeding of varieties with little or no
hashish. Bulletin on Narcotics 8: 31–35.
de Candolle A. 1885. Hemp – Cannabis sativa L. In: Origin of Cultivated Plants,
D. Appleton, New York, New York, pp. 148–149.
Cherniak L. 1982. The Great Books of Cannabis, Vol. I, Book II. Cherniak/Damele
Publishing, Oakland, California.
Chopra I.C. and Chopra R.N. 1957. The use of Cannabis drugs in India. Bulletin on
Narcotics 9: 4–29.
Clarke R.C. 1995. Hemp (Cannabis sativa L.) cultivation in the Tai’an district of
Shandong Province, Peoples Republic of China. Journal of the International Hemp
Association 2(2): 57, 60–65.
Crawford D.J. 1989. Enzyme electrophoresis and plant systematics. In: Soltis D.E. and
Soltis P.S. (eds), Isozymes in Plant Biology. Dioscorides Press, Portland, Oregon,
pp. 146–164.
Davidyan G.G. 1972. [Hemp: biology and initial material for breeding.] Trudy po
Prikladnoi Botanike, Genetike i Selektsii 48(3): 1–160. [In Russian]
Delile A.R-. 1849. Index seminum horti botanici Monspeliensis. Annales des Sciences
Naturelles Botanique et Biologie Vegetale 12: 365–366.
Dewey L.H. 1914. Hemp. In: USDA Yearbook 1913. United States Department of
Agriculture, Washington, D.C., pp. 283–347.
Doebley J. 1989. Isozymic evidence and the evolution of crop plants. In: Soltis D.E.
and Soltis P.S. (eds), Isozymes in Plant Biology. Dioscorides Press, Portland,
Oregon, pp. 165–191.
Du Toit B.M. 1980. Cannabis in Africa. A. A. Balkema, Rotterdam, The Netherlands.
Emboden W.A. 1974. Cannabis – a polytypic genus. Economic Botany 28(3):
304–310.
Emboden W.A. 1981. The genus Cannabis and the correct use of taxonomic
categories. Journal of Psychoactive Drugs 13(1): 15–21.
Faeti V., Mandolino G., and Ranalli P. 1996. Genetic diversity of Cannabis sativa
germplasm based on RAPD markers. Plant Breeding 115: 367–370.
Fleming M.P. and Clarke R.C. 1998. Physical evidence for the antiquity of Cannabis
sativa L. Journal of the International Hemp Association 5(2): 80–93.
Hamrick J.L. 1989. Isozymes and the analysis of genetic structure in plant populations.
In: Soltis D.E. and Soltis P.S. (eds), Isozymes in Plant Biology. Dioscorides Press,
Portland, Oregon, pp. 87–105.
Hillig K.W. 2004. A multivariate analysis of allozyme variation in 93 Cannabis
accessions from the VIR germplasm collection. Journal of Industrial Hemp 9(2):
5–22.
Hillig K.W. and Iezzoni A.F. 1988. Multivariate analysis of a sour cherry germplasm
collection. Journal of the American Society for Horticultural Science 113(6):
928–934.
Jagadish V., Robertson J., and Gibbs A. 1996. RAPD analysis distinguishes Cannabis
sativa samples from different sources. Forensic Science International 79: 113–121.
Janischevsky D.E. 1924. Forma konopli na sornykh mestakh v Yugovostochnoi Rossii.
In: Chiuevsky I.A. (ed.), Uchenye Zapiski Gosudarstvennogo Saratovskogo imeni
N. G. Chernyshevskogo Universiteta Fisiko-Matematicheskoye otdelenie
Pedagogicheskogo Fakul’teta 2(2): 3–17, Saratov University Press, Saratov, USSR.
[In Russian]
Kephart S.R. 1990. Starch gel electrophoresis of plant isozymes: a comparative
analysis of techniques. American Journal of Botany 77(5): 316–368.
de Lamarck J.B. 1785. Encyclopédie Méthodique de Botanique, Vol. 1, Pt. 2.
Panckoucke, Paris, France, pp. 694–695. [In French]
Lawi-Berger C., Miège M.N., Kapétanidis I., and Miège J. 1982. Contribution a l’etude
chimiotaxonomique de Cannabis sativa L. Les Comptes Rendus de l'Académie des
Sciences Paris 295(3): 397–402. [In French]
Lemeshev N., Rumyantseva L., and Clarke R.C. 1994. Maintenance of Cannabis
germplasm in the Vavilov Research Institute gene bank 1993. Journal of the
International Hemp Association 1(1): 1, 3–5.
Li Hui-Lin 1974. An archaeological and historical account of Cannabis in China.
Economic Botany 28(4): 437–448.
Linnaeus C. 1753. Species Plantarum 2: 1027. Salvius, Stockholm. [Facsimile edition,
1957–1959, Ray Society, London, U.K.]
Mandolino G. and Ranalli P. 2002. The applications of molecular markers in genetics
and breeding of hemp. Journal of Industrial Hemp 7(1): 7–23.
de Meijer E.P.M. 1994. Diversity in Cannabis. Doctoral thesis, Wageningen
Agricultural University, Wageningen, The Netherlands.
de Meijer E.P.M. 1995. Fibre hemp cultivars: a survey of origin, ancestry, availability
and brief agronomic characteristics. Journal of the International Hemp Association
2(2): 66–73.
de Meijer E.P.M. 1999. Cannabis germplasm resources. In: Ranalli P. (ed.), Advances
in Hemp Research. Haworth Press, Binghamton, New York, pp. 133–151.
de Meijer E.P.M. and Keizer L.C.P. 1996. Patterns of diversity in Cannabis. Genetic
Resources and Crop Evolution 43: 41–52.
de Meijer E.P.M. and van Soest L.J.M. 1992. The CPRO Cannabis germplasm
collection. Euphytica 62: 201–211.
Migal N.D. 1991. Genetics of polymorphic sex evolution in hemp. Genetika 27:
1561–1569. [Transl. from Russian in Soviet Genetics, March 1992: 1095–1102.]
Morden C.W., Doebley J., and Schertz K.F. 1987. A Manual of Techniques for Starch
Gel Electrophoresis of Sorghum Isozymes. Texas Agricultural Experiment Station
Miscellaneous Publication 1635, College Station, Texas.
Nei M. 1987. Molecular Evolutionary Genetics. Columbia University Press, New York,
New York.
Pickersgill B. 1988. The genus Capsicum: a multidisciplinary approach to the taxonomy
of cultivated and wild plants. Biologisches Zentralblatt 107(4): 381–389.
SAS Institute. 2002. JMP Statistics and Graphics Guide. SAS Institute, Cary, North
Carolina.
Schultes R.E. 1970. Random thoughts and queries on the botany of Cannabis.
In: Joyce C.R.B. and Curry S.H. (eds), The Botany and Chemistry of Cannabis.
J. and A. Churchill, London, U.K., pp. 11–38.
Schultes R.E. 1973. Man and marijuana. Natural History 82: 58–63, 80, 82.
Schultes R.E., Klein W.M., Plowman T., and Lockwood T.E. 1974. Cannabis: an
example of taxonomic neglect. Harvard University Botanical Museum Leaflets 23:
337–367.
Serebriakova T.Ya. and Sizov I.A. 1940. Cannabinaceae Lindl. In: Vavilov N.I. (ed.),
Kulturnaia Flora SSSR Vol. 5. Moscow-Leningrad, USSR, pp. 1–53. [In Russian]
Shields C.R., Orton T.J., and Stuber C.W. 1983. An outline of general resource needs
and procedures for the electrophoretic separation of active enzymes from plant
tissue. In: Tanksley S.D. and Orton T.J. (eds), Isozymes in Plant Genetics and
Breeding, Part A. Elsevier Science Publishers, Amsterdam, pp. 443–516.
Siniscalco Gigliano G. 2001. Cannabis sativa L. Botanical problems and molecular
approaches in forensic investigations. Forensic Science Review 13(1): 2–17.
Small E. 1979. The Species Problem in Cannabis, Vol. 1: Science. Corpus Information
Services, Toronto, Canada.
Small E. and Beckstead H.D. 1973. Common cannabinoid phenotypes in 350 stocks of
Cannabis. Lloydia 36(2): 144–165.
Small E. and Cronquist A. 1976. A practical and natural taxonomy for Cannabis. Taxon
25(4): 405–435.
Small E., Jui P.Y., and Lefkovitch L.P. 1976. A numerical taxonomic analysis of
Cannabis with special reference to species delimitation. Systematic Botany 1(1):
67–84.
Soltis D.E., Haufler C.H., Darrow D.C., and Gastony G.J. 1983. Starch gel
electrophoresis of ferns: a compilation of grinding buffers, gel and electrode buffers,
and staining schedules. American Fern Journal 73(1): 9–27.
Vavilov N.I. 1926. The origin of the cultivation of ‘primary’ crops, in particular cultivated
hemp. In: Studies on the Origin of Cultivated Plants, Institute of Applied Botany and
Plant Breeding, Leningrad, USSR, pp. 221–233.
Vavilov N.I. and Bukinich D.D. 1929. Konopli. Zemledel’cheskii Afghanistan. Trudy po
Prikladnoi Botanike, Genetike i Selektsii Supplement 33: 380–382, 474, 480,
584–585, 604. [Reissued in 1959 by Izdatel’stvo Akademii Nauk SSSR, Moskva-
Leningrad.]
Weeden N.F. and Wendel J.F. 1989. Genetics of plant isozymes. In: Soltis D.E. and
Soltis P.S. (eds), Isozymes in Plant Biology. Dioscorides Press, Portland, Oregon,
pp. 46–72.
Wendel J.F. and Weeden N.F. 1989. Visualization and interpretation of plant isozymes.
In: Soltis D.E. and Soltis P.S. (eds), Isozymes in Plant Biology. Dioscorides Press,
Portland, Oregon, pp. 5–45.
Wiley E.O. 1981. Phylogenetics. John Wiley and Sons, New York, New York.
Zeven A.C. and Zhukovsky P.M. 1975. Cannabidaceae. In: Dictionary of Cultivated
Plants and their Centres of Diversity. Centre for Agricultural Publishing and
Documentation, Wageningen, The Netherlands, pp. 62–63, 129–130.
Zhukovskij P.M. 1962. Cultivated Plants and their Wild Relatives. Commonwealth
Agricultural Bureaux, Farnham Royal, Bucks, U.K., pp. 83–84. [Transl. from
Russian by P. S. Hudson]
CHAPTER 3 – A MULTIVARIATE ANALYSIS OF PHENOTYPIC VARIATION IN CANNABIS
Based on: Hillig K.W. A multivariate analysis of phenotypic variation in Cannabis.
Submitted to: Systematic Botany.
ABSTRACT
This paper reports one aspect of a systematic investigation of a Cannabis
(Cannabaceae) germplasm collection. Phenotypic traits were evaluated for 135
greenhouse-grown plants of diverse geographic origin including hemp (cultivated for
fiber or seed), drug, and feral accessions. The data were interpreted with respect to a
taxonomic hypothesis based on earlier genetic and chemotaxonomic studies of the
same set of accessions. A scatterplot on the first two principal component (PC) axes
shows a central cluster of plants most of which were previously assigned to the hemp
biotype of C. sativa, surrounded by plants mostly assigned to the hemp, drug, and feral
biotypes of C. indica, to the feral biotype of C. sativa, and to C. ruderalis. The pattern of
variation on the PC plot was largely attributed to domestication of C. indica for multiple
uses and to selection of hemp landraces of C. sativa and C. indica for similar agronomic
traits. Canonical variates analysis discriminated the putative species and infraspecific
taxa (biotypes) with > 99 % accuracy. Because few differences were found between
plants initially assigned to the feral biotype of C. sativa and to C. ruderalis, a two-species
concept for Cannabis is proposed.
INTRODUCTION
Plants of the genus Cannabis (Cannabaceae) have been cultivated since prehistoric
times, providing fiber from their bast, food and oil from their achenes (“seeds”), and
medicine and euphoriant from their glandular trichomes (Schultes and Hofmann, 1980).
Humans consciously and unconsciously selected plants exhibiting beneficial traits
including branchless stems with long internodes, large achenes that remain attached to
the plant at maturity, and compact pistillate inflorescences with abundant resin
production. Recurrent selection over countless generations resulted in locally adapted
landraces, each suited to a particular environment, for a particular use (Dewey, 1914).
Cannabis is a genus of wind-pollinated annual herbs, and the plants are normally
dioecious. Sexual reproduction by means of cross-pollination is relatively rare in
domesticated plants (Zohary, 1984). Cannabis is only semi-domesticated, able to
escape cultivation and return to a feral existence (Vavilov, 1926). Primitive populations
may still grow wild within its aboriginal range, presumed to be in central Asia, extending
into the Indian subcontinent and western China (de Candolle, 1885; Vavilov, 1926;
Vavilov and Bukinich, 1929; Li, 1974). Two centers of diversity are recognized:
Hindustani and European-Siberian (Zeven and Zhukovsky, 1975).
Cannabis cultivated for fiber or seed is here referred to as “hemp.” Two more-or-less
distinct types of hemp are recognized by Cannabis breeders: the common hemp of
Europe, and hemp landraces of eastern Asia (Bócsa and Karus, 1998; de Meijer, 1999).
Russian agronomists differentiate European hemp into northern, middle-Russian, and
southern eco-geographical groups (Davidyan, 1972). Two types of drug plant are
recognized in accord with taxonomic circumscriptions of Schultes et al. (1974) and
Anderson (1980). These two types are commonly referred to as “sativa” and “indica.”
Tall, laxly branched “sativa” strains have leaflets that are narrow relative to their length
and are primarily grown for the production of marijuana (i.e., pistillate inflorescences).
Short, densely branched “indica” strains have relatively wide leaflets and are traditionally
grown for the production of hashish (i.e., detached glandular trichomes)
(de Meijer, 1999). Drug strains generally produce high levels of Δ⁹-tetrahydrocannabinol
(THC), the primary psychoactive component of Cannabis resin (Small and Beckstead,
1973; de Meijer et al., 1992; McPartland and Russo, 2001; Hillig and Mahlberg, 2004).
Whether Cannabis is a monotypic or polytypic genus is a matter of divided opinion.
Linnaeus (1753) considered Cannabis to consist of a single highly variable species,
C. sativa L. Lamarck (1785) differentiated Cannabis native to the Indian subcontinent
from C. sativa and assigned the name C. indica Lam. Cannabis indica was
distinguished by its narrower leaflets, branchier growth habit, harder stem, thinner
cortex, and stronger odor relative to the same characters in C. sativa, as well as its
different geographic range and greater inebriant ability. Delile (1849) distinguished
eastern Asian hemp from C. sativa and C. indica and assigned the name C. chinensis
Delile, but his brief taxonomic treatment did not gain widespread acceptance. Other
species of Cannabis have been proposed (reviewed in Schultes et al., 1974; Small and
Cronquist, 1976); of these only C. ruderalis Janisch. is commonly accepted. Cannabis
ruderalis is differentiated by its small achenes that fall from the plant at maturity and
germinate unevenly. The achenes often have a constricted base, a swollen abscission
zone (elaiosome), and an adherent mottled perianth (Janischevsky, 1924).
Vavilov conducted field studies of “wild” and cultivated Cannabis populations within
its indigenous range (Vavilov, 1926; Vavilov and Bukinich, 1929). He initially espoused
a single-species concept, but later recognized C. sativa and C. indica as separate
species. Vavilov (1926) considered C. ruderalis synonymous with C. sativa var.
spontanea Vav., which he differentiated from cultivated C. sativa. Vavilov and Bukinich
(1929) assigned wild populations from Afghanistan and the northwest Himalayas to
C. indica var. kafiristanica Vav. Feral plants from eastern Afghanistan having small,
light-colored achenes and a perianth that easily flakes off were given the name C. indica
f. afghanica Vav. (= C. sativa f. afghanica Vav.). Vavilov considered this taxon to be a
morphological link between wild and cultivated races of C. indica.
Nassonov (1940) studied the stem anatomy of 20 Cannabis strains and reported
significant differences between Japanese hemp (here assigned to C. indica), and Italian
and “Orlovian” (middle-Russian) hemp (here assigned to C. sativa). Lignified wood
vessels of the xylem were typical only of Japanese and Indian strains, the latter differing
“very markedly” from all other strains. Wild strains from southern latitudes were
differentiated from wild northern strains by the same features that differentiated
cultivated southern strains from cultivated northern strains. This suggests that these
anatomical differences between northern and southern strains are not a result of
domestication. Anderson (1974) reported differences in wood anatomy between
C. sativa and C. indica and noted that these traits are likely to be highly conserved.
Lawi-Berger et al. (1984) found four times as many laticifers in plants of a South African
drug