Evolutionary origins of the endocannabinoid system
John M. McPartland a,⁎, Isabel Matias b, Vincenzo Di Marzo b, Michelle Glass c
a GW Pharmaceuticals, 53 Washington Street Ext., Middlebury, VT, 05753, USA
b Endocannabinoid Research Group, Institute of Biomolecular Chemistry, Consiglio Nazionale delle Ricerche, Via Campi Flegrei 34, 80078 Pozzuoli (Napoli), Italy
c Department of Pharmacology, University of Auckland, Private Bag 92019, New Zealand
Received 30 September 2005; received in revised form 4 November 2005; accepted 9 November 2005
Received by M. Di Giulio
Abstract
Endocannabinoid system evolution was estimated by searching for functional orthologs in the genomes of twelve phylogenetically diverse
organisms: Homo sapiens, Mus musculus, Takifugu rubripes, Ciona intestinalis, Caenorhabditis elegans, Drosophila melanogaster,
Saccharomyces cerevisiae, Arabidopsis thaliana, Plasmodium falciparum, Tetrahymena thermophila, Archaeoglobus fulgidus, and
Mycobacterium tuberculosis. Sequences similar to human endocannabinoid exon sequences were derived from filtered BLAST searches, and
subjected to phylogenetic testing with ClustalX and tree building programs. Monophyletic clades that agreed with broader phylogenetic evidence
(i.e., gene trees displaying topological congruence with species trees) were considered orthologs. The capacity of orthologs to function as
endocannabinoid proteins was predicted with pattern profilers (Pfam, Prosite, TMHMM, and pSORT), and by examining queried sequences for
amino acid motifs known to serve critical roles in endocannabinoid protein function (obtained from a database of site-directed mutagenesis
studies). This novel transfer of functional information onto gene trees enabled us to better predict the functional origins of the endocannabinoid
system. Within this limited number of twelve organisms, the endocannabinoid genes exhibited heterogeneous evolutionary trajectories, with
functional orthologs limited to mammals (TRPV1 and GPR55), or vertebrates (CB2 and DAGLβ), or chordates (MAGL and COX2), or animals
(DAGLα and CB1-like receptors), or opisthokonta (animals and fungi, NAPE-PLD), or eukaryotes (FAAH). Our methods identified fewer
orthologs than did automated annotation systems, such as HomoloGene. Phylogenetic profiles, nonorthologous gene displacement, functional
convergence, and coevolution are discussed.
© 2005 Elsevier B.V. All rights reserved.
Keywords: Anandamide (AEA); 2-arachidonyl glycerol (2-AG); N-acyl-phosphatidylethanolamine (NAPE); NAPE-selective phospholipase D enzyme (NAPE-PLD);
Diacylglycerol lipase α (DAGLα); Diacylglycerol lipase β (DAGLβ); Fatty acid amide hydrolase (FAAH); Monoglyceride lipase (MAGL); Cyclooxygenase 2 (COX2);
Cannabinoid receptors (CB1 and CB2); Vanilloid receptor (TRPV1); GPR55
Gene xx (2005) xxx –xxx
www.elsevier.com/locate/gene
Abbreviations: AEA, anandamide; BLAST, Basic Local Alignment Search Tool; CB1, cannabinoid receptor subtype 1; CB2, cannabinoid receptor subtype 2;
COX2, cyclooxygenase subtype 2; DAGLα, diacylglycerol lipase subtype α; DAGLβ, diacylglycerol lipase subtype β; FAAH, fatty acid amide hydrolase; FAS,
functional assessment score; GPCR, G-protein coupled receptor; MAGL, monoglyceride lipase; NAPE, N-acyl-phosphatidylethanolamine; NAPE-PLD,
NAPE-selective phospholipase D enzyme; PCR, polymerase chain reaction; THC, tetrahydrocannabinol; TM, transmembrane; TRPV1, vanilloid receptor; 2-AG,
2-arachidonyl glycerol; genomes examined: Hs, human, Homo sapiens; Mm, mouse, Mus musculus; Tr, puffer fish, Takifugu rubripes; Ci, sea squirt, Ciona intestinalis;
Ce, nematode, Caenorhabditis elegans; Dm, fruit fly, Drosophila melanogaster; Sc, brewer's yeast, Saccharomyces cerevisiae; At, thale cress, Arabidopsis thaliana;
Pf, malaria apicomplexan, Plasmodium falciparum; Tt, ciliate, Tetrahymena thermophila; Af, archaeon, Archaeoglobus fulgidus; Mt, bacterium, Mycobacterium
tuberculosis; A, alanine; C, cysteine; D, aspartate; E, glutamate; F, phenylalanine; G, glycine; H, histidine; I, isoleucine; K, lysine; L, leucine; M, methionine; N,
asparagine; P, proline; Q, glutamine; R, arginine; S, serine; T, threonine; V, valine; W, tryptophan; Y, tyrosine.
⁎Corresponding author. Tel./fax: +1 802 388 8304.
E-mail address: mcpruitt@verizon.net (J.M. McPartland).
0378-1119/$ - see front matter © 2005 Elsevier B.V. All rights reserved.
doi:10.1016/j.gene.2005.11.004
ARTICLE IN PRESS
1. Introduction
Ten genes that encode proteins involved in endocannabinoid
signalling have been identified to date. More await discovery.
Half a dozen endocannabinoid ligands are recognized, but
research has elucidated the proteins that metabolize only
anandamide (AEA) and 2-arachidonyl glycerol (2-AG). AEA
is biosynthesized from N-acyl-phosphatidylethanolamine
(NAPE) by a NAPE-selective phospholipase D enzyme
(NAPE-PLD) (Okamoto et al., 2004). 2-AG is biosynthesized
by two diacylglycerol lipases, DAGLα and DAGLβ (Bisogno
et al., 2003). Fatty acid amide hydrolase (FAAH) primarily
catabolizes AEA (Deutsch et al., 2002). 2-AG is catabolized by
monoglyceride lipase (MAGL, monoacylglycerol lipase, Dinh
et al., 2002) and by cyclooxygenase 2 (COX2, prostaglandin-
endoperoxide synthase, Kozak et al., 2003). AEA and 2-AG act
as agonists at cannabinoid receptors 1 and 2 (CB1 and CB2), a
pair of G-protein-coupled receptors (GPCRs) named after their
exogenous ligand, Δ9-tetrahydrocannabinol (THC) (Mechoulam
et al., 1998). AEA also gates the vanilloid receptor
(TRPV1, transient receptor potential channel vanilloid receptor
1, Zygmunt et al., 1999) and GPR55, an orphan GPCR (Brown
et al., 2005; Baker et al., 2005).
Phylogenetic histories of the four receptors and six enzymes
have not been explored very deeply. We use CB1 as an example:
the GenBank website (National Center for Biotechnology
Information, www.ncbi.nlm.nih.gov) lists CB1 orthologs in 65
mammalian species, but only four non-mammalian vertebrates,
and one invertebrate. Elphick and colleagues used phylogenetic
tree analysis to identify orthologs of cannabinoid receptors in
Takifugu (Fugu) rubripes (Elphick, 2002) and the sea squirt
Ciona intestinalis (Elphick et al., 2003). They performed
BLAST searches and constructed gene trees with CB1 and CB2
sequences from other species, based on ClustalX alignments
and neighbor-joining methods. If gene tree topology was
congruent with species tree topology, the sequences were
deemed orthologous. This approach has widespread appeal,
although it does not represent a comprehensive phylogenetic
analysis, due to its automated nature and lack of functional
analysis (Brinkman and Leipe, 2001). McPartland (2004)
explored endocannabinoid gene phylogeny by screening the
genomes of twelve completely (>95%) sequenced organisms
that spanned the phylogenetic “Tree of Life” (Benton and Ayala,
2003). This method merged twelve pairwise sequence similarity
searches for each queried human endocannabinoid gene.
Although many automated annotation systems use this simple
comparative approach, the method may produce false positives
in the absence of functional analysis (Zmasek and Eddy, 2002).
Functional analysis requires in vitro experiments, but can be
estimated in silico by subjecting sequences to algorithms that
predict protein domains or specialized structures. Unfortunately,
these automated profilers may propagate errors, because the great majority
of the proteins they annotate have not been experimentally characterized.
McPartland and Glass (2003) used experimentally determined
motifs for functional analysis. The positions of amino acid
motifs serving critical roles in protein function (e.g., catalytic
residues and ligand-binding sites, obtained from site-directed
mutagenesis studies) were mapped upon a ClustalX alignment,
and the presence or absence (conservation or mutation) of each
motif was visually examined.
The purpose of this study is to combine phylogenetic tree
analysis (e.g., Elphick et al., 2003) with phylogenomic
comparisons (e.g., McPartland, 2004). Both of these methods,
however, utilize similarity-based algorithms. Sequence similar-
ity may not overlap with orthology, and certainly does not
equate with functional conservation (Zmasek and Eddy, 2002).
Thus we mapped functional analysis onto phylogenomically
assembled gene trees, to better estimate the functional origins of
the endocannabinoid system.
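The congruence criterion used throughout this study (a gene tree whose clades agree with the species tree) can be illustrated with a minimal sketch. The code below is not the procedure used here, which relied on ClustalX, TreeView, and visual inspection; it assumes trees written as nested tuples, one sequence per species, and a simplified species topology for six of the sampled genomes. The function names are hypothetical.

```python
# Minimal sketch of a clade-congruence test. Trees are nested tuples whose
# leaves are species codes; one sequence per species is assumed.

def clades(tree):
    """Return (leaves under this node, leaf sets of all internal nodes)."""
    if isinstance(tree, str):
        return frozenset([tree]), set()
    leaves, found = frozenset(), set()
    for child in tree:
        child_leaves, child_clades = clades(child)
        leaves |= child_leaves
        found |= child_clades
    found.add(leaves)
    return leaves, found


def is_congruent(gene_tree, species_tree):
    """True if every gene-tree clade is also a clade of the species tree,
    after restricting the species tree to the species in the gene tree."""
    sampled, gene_clades = clades(gene_tree)
    _, species_clades = clades(species_tree)
    restricted = {clade & sampled for clade in species_clades}
    return all(clade in restricted for clade in gene_clades)


# Simplified species tree (assumed topology, for illustration only).
SPECIES_TREE = (((("Hs", "Mm"), "Tr"), "Ci"), ("Ce", "Dm"))

congruent_gene_tree = ((("Hs", "Mm"), "Tr"), "Ci")
incongruent_gene_tree = ((("Hs", "Tr"), "Mm"), "Ci")

print(is_congruent(congruent_gene_tree, SPECIES_TREE))    # True
print(is_congruent(incongruent_gene_tree, SPECIES_TREE))  # False
```

A gene tree with paralogs would need its leaves mapped to species first; that step is omitted from this sketch.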
2. Methods
2.1. Genome sampling
To access deep-level phylogenetic signal, we screened the
genomes of twelve phylogenetically diverse organisms (Fig. 1).
Unfortunately not all major clades were sampled in this study,
because representative organisms await genome sequencing,
such as reptiles, lophotrochozoans, cnidarians, and poriferans.
Some clades contained one available whole-genome sequence:
primates (human, Homo sapiens, Hs), tunicates (sea squirt,
Ciona intestinalis, Ci), apicomplexans (Plasmodium falciparum,
Pf), and ciliates (Tetrahymena thermophila, Tt). Other
clades offered several choices, from which we chose the
best-characterized model organism per branch: for rodents, we chose
mouse Mus musculus (Mm); for fish, puffer fish Takifugu
rubripes (Tr); for insects, fruit fly Drosophila melanogaster
(Dm); for nematodes, Caenorhabditis elegans (Ce); for fungi,
the brewer's yeast Saccharomyces cerevisiae (Sc); for plants,
thale cress Arabidopsis thaliana (At); for prokaryotes, the
archaeon Archaeoglobus fulgidus (Af) and the bacterium
Mycobacterium tuberculosis (Mt). The chimp genome has
been partially sequenced, but the current draft lacks four
endocannabinoid genes (data not shown). Genome databases of
Hs, Mm, Tr, Ci, Dm, Ce, Sc, At, Pf, Af, and Mt were obtained
from GenBank (www.ncbi.nlm.nih.gov), with Tr and Ci
cross-referenced with Ensembl (www.ensembl.org/Multi/blastview)
and the Joint Genome Institute (JGI, http://genome.jgi-psf.org/cgi-bin/runBlast).
The Tt genome came from TIGR (http://www.tigr.org/tdb/e2k1/ttg).
Fig. 1. Species tree. A phylogenetic “Tree of Life,” based on broad phylogenetic
evidence (reviewed by Benton and Ayala, 2003), including the twelve
phylogenetically diverse species used in this study (in bold font) and species
discussed in the text. Mammals are marked by a bar labeled “M.” Animals are
divided into vertebrates (V) and invertebrates (I), and also divided into five
physiological groups: deuterostomes (D), lophotrochozoans (L), ecdysozoans
(E), cnidarians (C), and poriferans (P). The three domains (supraphyla) are
eukaryotes, archaea (A), and bacteria (B).
2.2. Similarity screening and tree building
The 12 genomes were searched with gapped BLAST (blastP
and tblastn, Altschul et al., 1997) using ten query sequences:
HsCB1 (GenBank NP_057167), HsCB2 (NP_001832),
HsTRPV1 (NP_542437), HsGPR55 (NP_005674), HsFAAH
(NP_001432), HsMAGL (NP_009214, not NP_001003794),
HsCOX2 (NP_000954), HsNAPE-PLD (NP_945341),
HsDAGLα (NP_006124, not BAA31634), and HsDAGLβ
(NP_631918). Reciprocal best hits were filtered by means of a
threshold E value (<0.01) and sequence length (75% of query
over subject). Paired hits with nearly equal E values were
analyzed as paralogs and provided useful information for gene
trees (e.g., the paralogs CB1 and CB2). Best hits from each
genome were aligned with ClustalX
(www-igbmc.u-strasbg.fr/BioInfo/ClustalX/Top.html) using default parameters.
Alignments were manually edited to minimize indel events before
generating gene trees with a neighbor-joining (NJ) algorithm.
We detected several polyphyletic sequences by placing two
outgroups per tree: the Hs sequence with closest similarity to
the query sequence found in the Hs genome, and a more
distantly related Hs outgroup to root the tree. Outgroups
characterized by in vitro functional studies were included
whenever possible. TreeView (http://taxonomy.zoology.gla.ac.uk/rod/treeview.html)
recursively optimized the alignments
and generated graphic outputs, with branch lengths propor-
tional to distances between sequences. Confidence values for
NJ trees were generated by bootstrapping, based on 1000
resampling replicates. Bootstrap support for clade stability
was given at nodes; bootstrap values ≥500 supported
monophyly of the clade (Brinkman and Leipe, 2001).
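The reciprocal-best-hit filter described at the start of this section can be sketched as follows. This is an illustration under stated assumptions, not the pipeline used in the study: the Hit record, the function names, and the toy hits are hypothetical, and the length criterion is interpreted here as the aligned region covering at least 75% of the query, which is one possible reading of "75% of query over subject".

```python
from dataclasses import dataclass

E_THRESHOLD = 0.01    # E value cut-off stated in the text
MIN_COVERAGE = 0.75   # assumed reading of the 75% length criterion


@dataclass(frozen=True)
class Hit:
    query: str
    subject: str
    evalue: float
    aligned_length: int
    query_length: int


def passes_filter(hit):
    """Keep hits below the E value threshold that cover enough of the query."""
    coverage = hit.aligned_length / hit.query_length
    return hit.evalue < E_THRESHOLD and coverage >= MIN_COVERAGE


def best_hit(hits, query):
    """Lowest-E-value hit for a query among hits that pass the filter."""
    candidates = [h for h in hits if h.query == query and passes_filter(h)]
    return min(candidates, key=lambda h: h.evalue, default=None)


def reciprocal_best_hit(forward_hits, reverse_hits, query):
    """Return the subject only if its own best hit points back to the query."""
    forward = best_hit(forward_hits, query)
    if forward is None:
        return None
    back = best_hit(reverse_hits, forward.subject)
    if back is not None and back.subject == query:
        return forward.subject
    return None


# Toy hits with invented values, for illustration only.
forward_hits = [Hit("q1", "s1", 1e-50, 400, 472), Hit("q1", "s2", 1e-5, 200, 472)]
reverse_hits = [Hit("s1", "q1", 1e-48, 400, 460), Hit("s1", "x9", 1e-10, 380, 460)]

print(reciprocal_best_hit(forward_hits, reverse_hits, "q1"))  # s1
```

Hits with nearly equal E values, which the study retained as candidate paralogs, would need a tolerance on the E value comparison; that refinement is left out here.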
2.3. Functional assessment
Whereas BLAST evaluated sequence similarity, a number of
algorithms evaluated the patterns and motifs within sequences,
and classified the sequences into function-based protein
families. We used two protein prediction programs. Pfam
(www.sanger.ac.uk/Software/Pfam) implemented a hidden
Markov model (HMM) upon gapped multiple sequence
alignments. Prosite (http://au.expasy.org/prosite) used a posi-
tion-specific scoring matrix (PSSM) upon ungapped multiple
sequence alignments. Pfam and Prosite are primary profilers
and combined automated and human curation, unlike other
databases (e.g., CDD, COG, SMART). We also queried
sequences with two algorithms that predicted secondary
structures. TMHMM (www.cbs.dtu.dk/services/TMHMM)
was chosen over TMpred, SVMtm, and PSORT after it
outperformed the other transmembrane region prediction
programs in an accuracy trial run upon the ten human
endocannabinoid proteins (data not shown). PSORT (http://
psort.nibb.ac.jp) was chosen as a subcellular localization
predictor, after it outperformed ESLpred, LOC-target, PA-Pence,
SubLoc, and TargetP (data not shown). When the
prediction for a potential ortholog matched the prediction for an
endocannabinoid protein, the potential ortholog scored 1. If the
algorithm predicted a different structure or function, the
potential ortholog scored 0. The positions of critical AA motifs
that served vital roles in endocannabinoid protein function (e.g.,
ligand-binding and catalysis) were mapped upon ClustalX
alignments, and the presence or absence of each AA motif was
visually examined, and scored with either 1 (indicating presence
of a motif) or 0 (indicating substitution or deletion of a motif).
Table 1
Specific amino acid residues utilized for functional mapping of ten endocannabinoid receptors or enzymes in this study
Protein | Amino acid residue motifs | References
CB1 + CB2 | F3.25, K3.28, V3.32, W5.43, L5.50, C175 in EL-2, L6.33-A6.34 | McPartland and Glass, 2003 [a,b]
CB2-specific motifs | S3.31G, T3.35S, S4.53A, S4.57A, F5.46V, LDV not MDI in IC-3, C313M; C2.59Y | McPartland and Glass, 2003 [a,b]; Zhang et al., 2005 [a]
TRPV1 | R114, R491, Y511, S512, T550, E761, C-terminus motif | McPartland, 2004 [a,b]
GPR55 | K2.60, FV3.28, βxxβ(6.43-6), K7.36 | Unpublished data
FAAH | P129, PPLP (310-3) part of SH3, I491 | Matias et al., 2005 [a,b]
MAGL | GxSxG(120-4)-D239-H269; C242 | McPartland, 2004 [a,b]; Saario et al., 2005 [a]
COX1 + 2; COX2-specific motifs | R120, V349, Y355, Y385, G526, S530, L531; T383H, R513H, V523I, L505F | Matias et al., 2005 [a,b]; Schneider et al., 2004 [a]; Bambai et al., 2004 [a]
NAPE-PLD | D147, HxHxDH(185-90), H253, D284, H331 | Okamoto et al., 2004 [c]
DAGLαβ | GxSxG(441-5), D495, H429 | Bisogno et al., 2003 [c]
[a] Motifs evaluated in point-mutation studies.
[b] Review article that cites primary literature.
[c] Motifs evaluated in computer-modeling studies.
Conservative AA substitutions were allowed, based on the
BLOSUM62 substitution matrix, using a cut-off value of +1.
The list of critical AA motifs is presented in Table 1. Each
queried sequence was given a functional assessment score
(FAS), a sum of the four profilers (Pfam, Prosite, PSORT, and
TMHMM) and a variable number of functional mapping motifs
(between three and eight, see Table 1). FAS scores finalized the
transition from sequence similarity (a quantitative measure) to
phylogenetic homology (a statement of common ancestral
origins) to functional equivalence (a qualitative characterization
of protein utility).
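The FAS tally described above can be sketched in a few lines. This is a hedged illustration rather than the scoring code used in the study: the function names are hypothetical, only a handful of BLOSUM62 pair scores are included (a full matrix would be needed in practice), and the +1 cut-off is assumed to be inclusive, so that a substitution scoring exactly +1 counts as conservative.

```python
# Excerpt of BLOSUM62 substitution scores for a few residue pairs.
BLOSUM62_EXCERPT = {
    frozenset("KR"): 2,
    frozenset("DE"): 2,
    frozenset("FW"): 1,
    frozenset("VM"): 1,
    frozenset("AS"): 1,
    frozenset("AF"): -2,
    frozenset("GW"): -2,
}
CUTOFF = 1  # assumed inclusive: a score of +1 or more counts as conservative


def motif_score(expected, observed):
    """1 if the motif residue is conserved or conservatively substituted, else 0."""
    if observed == expected:
        return 1
    if observed == "-":  # alignment gap, i.e. the motif residue is deleted
        return 0
    score = BLOSUM62_EXCERPT.get(frozenset((expected, observed)))
    if score is None:
        raise KeyError(f"pair {expected}/{observed} is not in the excerpt")
    return 1 if score >= CUTOFF else 0


def functional_assessment_score(profiler_matches, motif_pairs):
    """Sum of profiler agreements (0 or 1 each) and motif scores (0 or 1 each)."""
    profiler_points = sum(1 for matched in profiler_matches.values() if matched)
    motif_points = sum(motif_score(exp, obs) for exp, obs in motif_pairs)
    return profiler_points + motif_points


# Invented example: three of four profilers agree with the human protein,
# and four of five motif positions are conserved or conservatively substituted.
profilers = {"Pfam": True, "Prosite": True, "PSORT": False, "TMHMM": True}
motifs = [("F", "W"), ("K", "R"), ("V", "M"), ("W", "G"), ("L", "L")]

print(functional_assessment_score(profilers, motifs))  # 7
```

Because each profiler contributes the same single point as a residue-level motif, broad profiles such as a generic family assignment can raise the total without indicating conserved function.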
3. Results
All twelve genomes contained one or more sequences that
shared similarities with the queried human endocannabinoid
sequences. Databases for the Tr and Ci genomes did not cross-
reference with consistency, as noted below. Pfam and Prosite
tended to classify sequences too broadly, in large functional
families, and not in specific homologous series. For example, all
the sequences BLASTed from queries CB1, CB2, and GPR55
fell into the same profile, “GPCR.” This lack of specificity
inflated FAS scores. Sequences that claded together in
phylogenetic trees shared similar FAS scores.
3.1. CB1 and CB2
BLAST found no sequences that met threshold in the
genomes of Sc, At, Pf, Tt, Af, or Mt. In the combined CB1–CB2
gene tree (Fig. 2), HsCB1 claded with MmCB1 and a pair of
TrCB1 paralogs, all with high FAS scores. HsCB2 claded with
MmCB2 and TrCB2, all with high CB2-specific FAS scores.
The Ci and Ce orthologs sistered basal to the CB1 and CB2
clades. Together, this monophyletic clade separated with good
bootstrap support from outgroups HsEDG1 and HsEDG2, the
sequences with greatest similarity to CB1 and CB2 in the Hs
genome (E = 2e-28 and E = 3e-29, respectively, with HsCB1).
The Dm sequence passed the BLAST threshold (E = 3.9e-39)
but lacked synapomorphy with cannabinoid receptors; it placed
between the EDG clade and the distal outgroup HsADRA1A
(an alpha adrenergic GPCR).
3.2. TRPV1
BLAST found no sequences that met threshold in the
genomes of Sc, At, Pf, Tt, Af, or Mt. Paralogs from Tr did not
cross-reference consistently between databases: the best hit
(E = 7.9e-282) was identified as SINFRUP00000155081 by
Ensembl but SINFRUP00000066444 by JGI. Its nearly equal
paralog (E = 1.1e-262) was identified as SINFRUP00000162427
by Ensembl but FRUP00000162428 by JGI (with indels), and
SINFRUP00000085541 by GenBank. In the phylogenetic tree
(Fig. 3), Tr162427 sistered with HsTRPV1 and MmTRPV1 (all
with high FAS scores), whereas Tr155081 (with a low FAS
score) claded with outgroup HsTRPV4, the sequence with
greatest similarity to TRPV1 in the Hs genome. The Ci, Ce,
and Dm sequences passed the BLAST threshold but lacked
synapomorphy with TRPV1; they placed between the TRPV4
clade and the distal outgroup HsTRPA1, with low FAS
scores.
3.3. GPR55
BLAST found no sequences that met threshold in the
genomes of Sc, At, Pf, Tt, Af, or Mt. Functional mapping of
GPR55 posed a problem: no point-mutation studies have been
done. GPR55 shares ligands with CB1 but little sequence
similarity (E>1 with HsCB1); according to Brown et al.
(2005), GPR55's affinity for AEA represents a case of
convergent evolution. Within the Hs genome, the GPR55
sequence most closely resembles that of GPR23 (E = 3e-35).
The ligand of GPR23 is lysophosphatidic acid (LPA), which
GPR23 shares with EDG2 (Noguchi et al., 2003). EDG2
shares little sequence similarity with GPR23 (E>1), but does
resemble CB1 (E = 3e-29). Arachidonoyl-(20:4)-LPA shares
structural characteristics with AEA (although LPA is charged),
so the ligand-binding sites of GPR55 and GPR23 may
overlap. We aligned GPR55, GPR23, CB1, and EDG2, and
examined the point-mutation studies done on EDG receptors
(Fujiwara et al., 2005). Several residues aligned as potential
ligand-binding sites for GPR55 (Table 1). In the gene tree
(Fig. 4), HsGPR55 and MmGPR55 claded together, with
perfect FAS scores. Tr141960 claded with outgroup
HsGPR23. The Ci, Ce, and Dm sequences placed between
the GPR23 clade and the distal outgroup HsCB1, with lower
FAS scores.
Fig. 2. Gene tree of cannabinoid receptor orthologs and outgroups. All sequence
names are followed by two functional assessment scores (FAS). The first FAS
tallies the number of cannabinoid receptor motifs in the sequence, out of twelve
scored motifs. The second FAS tallies the number of CB1- or CB2-specific
motifs, presented as a ratio of CB1 / CB2 / neither. Sequences include HsCB1
(accession number NP_057167), MmCB1 (NP_031752), TrCB1A
(FRUP00000058680), TrCB1B (FRUP00000081454), HsCB2 (NP_001832), MmCB2
(NP_034054), TrCB2 (FRUP_00000161224), CiCBR (ci0100149095),
CeC02H7.2 (NP_508147), DmCG9753-PA (NP_651772), and outgroup
sequences HsEDG1 (NP_001391), HsEDG2 (NP_476500) and HsADRA1A
(NP_000671). ClustalX-TreeView neighbor-joining tree, with bootstrap support
for clade stability given at nodes, based on 1000 resampling replicates.
3.4. FAAH
BLAST identified sequences with similarity to HsFAAH in
all genomes. Tr sequences did not cross-reference between
databases: the best hit, Ensembl SINFRUP00000146100,
lacked 31 residues at its C-terminus that were present in
JGI FRUP00000146100 and GenBank FRUP00000055162, but
the JGI and GenBank sequences exhibited different, inconsistent
deletions. The FAAH tree (Fig. 5) bifurcated into two major
clades: HsFAAH claded with sequences from Mm, Tr, Ci, Ce,
Sc, and Tt, with variable FAS scores. HsAmidase, the sequence
with greatest similarity to FAAH in the Hs genome, claded with
sequences from Tr and Dm, with low FAS scores.
3.5. MAGL
We tested HsMAGL isoform A (NP_009214) and not
isoform B (NP_001003794), although the latter produced
identical lists of putative orthologs (data not shown). BLAST
identified sequences with similarity to HsMAGL in all
genomes. The MAGL tree (Fig. 6) bifurcated into two major
clades: HsMAGL claded with sequences from Mm, Tr, and Ci,
with good FAS scores. A second clade bearing no congruence
with the species tree (Fig. 1) contained sequences from Tt, Pf,
At, Sc, and Dm, sister to the prokaryote sequences Af and Mt,
with variable FAS scores.
3.6. COX2
BLAST found no sequences that met threshold in the genomes
of Sc, Pf, Tt, Af, or Mt. Four sequences with nearly equal E values
were BLASTed from the Ci genome. The best hit, Ensembl
ENSCINP00000012732, was identical to JGI ci0100149048
and nearly identical to the second hit (ENSCINP00000012733)
and the fourth hit (ENSCINP00000012734), except for
inconsistent indels in the latter two sequences. The third hit,
ENSCINP00000013352, was identical to JGI ci0100139817,
except for a 24-residue N-terminal insertion. In the COX2 tree
(Fig. 7), HsCOX2 claded with sequences from Mm and Tr, with
high FAS scores. This clade separated with good bootstrap
support from outgroup HsCOX1, the sequence with greatest
similarity to COX2 in the Hs genome. Basal to the COX2 and
COX1 clades, Ci paralogs sistered with good bootstrap support and high
FAS scores. These were followed by a series of sequences that
showed no congruence with the species tree (Fig. 1), and low
FAS scores.
Fig. 3. Gene tree of vanilloid receptor orthologs and outgroups. Sequence names
are followed by FAS scores, tallying the presence of vanilloid receptor motifs in
the following sequences: HsTRPV1 (NP_542437), MmTRPV1 (NP_001001445),
Tr162427 (SINFRUP00000085541), Tr155081 (SINFRUP00000155081),
Ci148845 (ci0100148845), CeOsm-9 (NP_501172), DmCG4536-PA
(NP_572353), and outgroup sequences HsTRPV4 (NP_067638), MmTRPV4
(NP_071300), and HsTRPA1 (NP_015628).
Fig. 4. Gene tree of GPR55 orthologs and outgroups. Sequence names are
followed by FAS scores, tallying the presence of GPR55 receptor motifs in the
following sequences: HsGPR55 (NP_005674), MmGPR55 (XP_136804),
Tr141960 (SINFRUP00000141960), CiQ869J2 (ENSCINP0000000761),
CeZK455.3 (NP_509896), DmCG2872-PB (NP_524700), and outgroup
sequences HsGPR23 (NP_005287) and HsCB1 (NP_057167).
Fig. 5. Gene tree of FAAH orthologs and outgroups. Sequence names are
followed by FAS scores, tallying the presence of FAAH enzyme motifs in
the following sequences: HsFAAH (NP_001432), MmFAAH (NP_034303),
Tr14600 (FRUP00000146100), Ci153926 (ci0100153926), CeB0218.1a
(NP_501368), ScAmd2p (NP_010528), TtTC159 (159.m00087), Tr168827
(SINFRUP00000168827), DmCG8839-PE (NP_725139), At5g64440
(NP_201249), MtMT1301 (NP_335746), AfGlutRNAase (NP_070778),
PfGlutRNAase (NP_702811), and outgroup sequences HsAmidase (NP_777572) and
HsGlutRNAsyn (NP_060762).
3.7. NAPE-PLD
BLAST found no sequences that met threshold in the Ci,
Dm, At, or Tt genomes. In the tree (Fig. 8), HsNAPE-PLD claded
with sequences from Mm and Tr, with very high FAS scores.
Below this group, the gene tree did not display topological
congruence with the species tree; a clade of prokaryotes sistered
with metazoans. The FAS scores were problematic; Pfam and
Prosite had difficulty profiling NAPE-PLD, and functional
mapping was based upon motifs identified by computer
modeling rather than robust experimental point-mutation
studies (Table 1). The sequence with greatest similarity to
NAPE-PLD in the Hs genome, the interleukin 20 receptor (the
outgroup), bears little resemblance to it. This may have created
long branch length artifacts in the gene tree.
Fig. 7. Gene tree of COX2 orthologs and outgroups. All sequence names are
followed by two FAS scores. The first FAS tallies the number of COX motifs in
the sequence, out of eleven scored motifs. The second FAS tallies the number of
COX2-specific motifs, presented as a ratio of COX2 / COX1 / neither. Sequences
include HsCOX2 (NP_000954), MmCOX2 (NP_035328), Tr139832
(SINFRUP00000139832), Ci12732 (ENSCINP00000012732), Ci13352
(ENSCINP00000013352), CeC46A5.4 (NP_501272), DmCG10211-PA
(NP_609883), and outgroup sequences HsCOX1 (NP_000953), HsThyPeroxA
(NP_000538) and HsThyPeroxB (NP_783650).
Fig. 6. Gene tree of MAGL orthologs and outgroups. Sequence names are
followed by FAS scores, tallying the presence of MAGL enzyme motifs in the
following sequences: HsMAGL (NP_009214), MmMAGL (NP_035974),
Tr149915 (SINFRUP00000149915), Ci131319 (ci0100131319), CeF01D5.7b
(NP_001022067), DmCG1882-PD (NP_724611), ScYJU3 (NP_012829),
PfLyso (NP_702627), Tt29996 (29996), At1g73480 (NP_565066), MtLyso
(NP_214697), AfLyso (NP_070581), and outgroup sequence HsAbhydrolase
(NP_078803).
Fig. 8. Gene tree of NAPE-PLD orthologs and outgroups. Sequence names are
followed by FAS scores, tallying the presence of NAPE-PLD enzyme motifs in
the following sequences: HsNAPE-PLD (NP_945341), MmNAPE-PLD
(NP_848843), Tr142348 (SINFRUP00000142348), CeY37E11AR.4
(NP_500408), ScFmp30p (NP_015222), PfPF11_0452 (NP_701308),
AfAF1265 (NP_070093), MtMT0929 (NP_335362), and outgroup sequence
HsIL20RA (NP_055247).
Fig. 9. Gene tree of DAGL orthologs and outgroups. Sequence names are followed
by FAS scores, tallying the presence of DAGLα enzyme motifs in the following
sequences: HsDAGLα (NP_006124), HsDAGLβ (NP_631918), MmDAGLα
(NP_932782), MmDAGLβ (NP_659164), Tr167103 (SINFRUP00000167103),
Tr148896 (FRUP00000148896), Ci146113 (ci0100146113), CeF42G9.6a
(NP_741084), DmCG33174-PD (NP_788900), ScYjr107wp (NP_012641),
At1g05790 (NP_172070), Tt165.m00064 (8254448), and the outgroup HsMAGL1
(NP_009214).
3.8. DAGLα and DAGLβ
BLASTing with either HsDAGLα or HsDAGLβ found no
sequences with significant identity in the Pf, Af, or Mt genomes,
and hit upon single sequences in the Ci, Ce, Dm, Sc, At, and Tt
genomes. BLASTing the Tr Ensembl database with HsDAGLα
identified SINFRUP00000167103, whereas HsDAGLβ scored
no hits that met threshold. BLASTing the Tr JGI database with
HsDAGLα and HsDAGLβ hit on the same sequence,
FRUP00000148896. The DAGLα clade (Fig. 9) had poor
bootstrap support below the sistered vertebrate sequences.
Depending upon the degree of manual editing, the Dm and Ci
sequences claded with DAGLα or DAGLβ, or sistered basal to
both clades. The final tree was optimized, with all gaps in the
alignment removed. The chordate sequences produced high
FAS scores, with lower scores in the other sequences.
4. Discussion
BLAST (threshold E<0.01) sensitively identified sequences
that shared similarity to queried human sequences. We
confidently predict this method did not commit type 2 errors
(false negatives). Conversely, we faced the challenge of sorting
homologous sequences from homoplastic sequences (sequences
sharing similarity because of convergent evolution, not
common descent).
4.1. CB1 and CB2
The Hs and Mm orthologs identified herein (Fig. 2) have
been well-characterized in functional studies. The Tr sequences
have not been functionally tested and would benefit from the
attention. TrCB1A and TrCB1B have been described as lineage-specific
expansions of CB1, not CB2 (Elphick, 2002). Yet
TrCB1A and TrCB1B expressed CB2-specific motifs (S4.53
and S4.57 in Table 1) whilst TrCB2 expressed three CB1-
specific motifs (G3.31, MDI in IC-3, and A[conserved S]3.35).
The TrCB2 sequence identified herein was 83 amino acid
residues longer than TrCB2 described by Elphick (2002).
The CiCBR sequence shared 29% identity with HsCB1 and
24% identity with HsCB2. These divergences are greater than
the divergence between HsCB1 and HsCB2 (47% identity),
suggesting the ancestor of CiCBR evolved prior to the CB1–
CB2 duplication event, as previously hypothesized by Elphick
et al. (2003). The phylogenetic tree upheld this hypothesis,
albeit with weak bootstrap support (Fig. 2). The ancestor of
CiCBR may have functioned like present-day CB1 rather than
CB2, judging from CiCBR's CB1-specific FAS score (3/0/5,
Fig. 2). CiCBR expressed substitutions at two motifs, F3.36 and
W5.43, required for mammalian cannabinoid receptors to bind
AEA, CP55,940, WIN55212-2, and SR141716A. Transfected
receptors with mutations at these sites lost affinity for
WIN55212-2 and SR141716A, but retained affinity for AEA
and CP55,940 (McAllister et al., 2003). Similarly, Ci tissues
demonstrated high-affinity binding with [3H]CP55,940
(McPartland et al., accepted for publication), but less specific
binding with [3H]SR141716A (Matias et al., 2005). CiCBR
expressed an aromatic residue located i-4 from W6.48,
indicating it may lack the constitutive activity seen in vertebrate
receptors (Singh et al., 2002).
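The percent-identity argument made earlier in this section for CiCBR (29% and 24% identity to HsCB1 and HsCB2, against 47% between the two human paralogs) can be expressed as a short sketch. The identity definition used below (identical columns divided by gap-free columns) is an assumption, since several conventions exist, and both function names and the toy alignment are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * identical / compared


def predates_duplication(identity_to_a, identity_to_b, identity_between_paralogs):
    """True if a sequence is less similar to each paralog than the paralogs
    are to each other, the pattern read here as divergence before duplication."""
    return max(identity_to_a, identity_to_b) < identity_between_paralogs


# Toy aligned fragments, invented for illustration.
print(round(percent_identity("ACDE-FG", "ACDQWFG"), 1))  # 83.3

# Identities reported in the text: CiCBR/HsCB1, CiCBR/HsCB2, HsCB1/HsCB2.
print(predates_duplication(29, 24, 47))  # True
```

Identity alone cannot separate an early divergence from accelerated evolution in one lineage, which is why the tree topology and its bootstrap support are reported alongside it.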
CeC02H7.2 was previously rejected as an ortholog because
of its low similarity to HsCB1 (Elphick and Egertova, 2001)
and because of its low FAS score (McPartland and Glass,
2001). Nevertheless, CeC02H7.2 claded with cannabinoid
receptors, with good bootstrap support. These results show
how a combination of BLAST and phylogenetic tree analysis can
improve the sensitivity and specificity of ortholog discovery.
CeC02H7.2 may be an ortholog, but it might not function as a
cannabinoid receptor; its FAS score indicated substitutions at
half the motifs required for mammalian CB1 to function.
Nevertheless, a radioligand study of nematode neural tissues
demonstrated high-affinity binding of [3H]CP55,940
(McPartland et al., accepted for publication). Within the Ce genome,
CeC02H7.2 closely resembled sequences that coded for
olfactory receptors. Perhaps the ancestral cannabinoid receptor
diverged from a nematode olfactory receptor. Moderate to high
densities of CB1 are retained in the human olfactory cortex
(Glass et al., 1997) as well as limbic structures, which share
primeval connections with the olfactory system.
The proximal outgroups, HsEDG1 and HsEDG2, shared the
closest similarity to CB1 and CB2 in the Hs genome (E = 2e-28
and E = 3e-29, respectively, with HsCB1). These sequences
have been selected as outgroups in previous cannabinoid studies
(Elphick, 2002; Elphick et al., 2003). The Dm sequence shared
greater similarity with HsCB1 (E = 3.9e-39) than HsEDG1 and
HsEDG2, but the CLUSTAL NJ algorithm placed the Dm
sequence distal to HsEDG1 and HsEDG2 in the phylogenetic
tree (Fig. 2). DmCG9753-PA was previously rejected as a
cannabinoid receptor (McPartland et al., 2001), despite its
annotation as a CB1 ortholog by GenBank. In concurrence,
radioligand binding studies of Dm found no high-affinity
binding of [3H]CP55,940 or [3H]SR141716A (McPartland et
al., 2001). Radioligand binding studies have shown high-affinity
binding in Hydra vulgaris (a cnidarian, De Petrocellis et
al., 1999) and in earthworm (Lumbricus terrestris, a
lophotrochozoan, McPartland et al., accepted for publication).
Unfortunately, no cnidarian or lophotrochozoan genomes have
been sequenced yet. The FAS scores of outgroups EDG1 and
EDG2 were unexpectedly high, due in part to the lack of
specificity by Pfam and Prosite, and due to a cluster of
substituted-but-conserved residues (F3.25W, K3.28R, and
V3.32M).
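The gap between BLAST similarity and tree placement described above can be illustrated with a minimal Python sketch. It uses only the three E-values quoted in this section (all against HsCB1). The dictionary keys and the `placed_outside_outgroups` set are illustrative labels introduced here, not identifiers from the analysis itself.

```python
# E-values against HsCB1, as quoted in the text above.
e_values = {
    "HsEDG1": 2e-28,   # proximal outgroup
    "HsEDG2": 3e-29,   # proximal outgroup
    "Dm": 3.9e-39,     # Drosophila sequence
}
outgroups = {"HsEDG1", "HsEDG2"}

# Assumption for illustration: sequences the NJ tree placed distal to the outgroups.
placed_outside_outgroups = {"Dm"}

# Rank hits from most to least similar (a smaller E-value is a stronger hit).
ranked = sorted(e_values, key=e_values.get)

# Flag hits that beat every outgroup on E-value yet fall outside them in the tree.
best_outgroup_e = min(e_values[name] for name in outgroups)
conflicts = [
    name for name in ranked
    if name not in outgroups
    and e_values[name] < best_outgroup_e
    and name in placed_outside_outgroups
]

print(ranked)
print(conflicts)
```

A sequence that appears in `conflicts` is one where similarity ranking alone would suggest orthology but the tree topology does not, which is the situation reported for the Dm sequence.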
Several web-based, automated annotation systems now post
lists of putative orthologs of human genes. These systems
conflicted with our results and lacked fidelity with each other,
due to their reliance upon incorrect or imprecise annotations
present in sequence databases. HomoloGene
(www.ncbi.nlm.nih.gov/entrez/query.fcgi?CMD=search&DB=homologene)
identified “putative homologs” of CB1 in the genomes of Hs
and Mm, and not in Ce, Dm, Sc, Pf, or At (HomoloGene does not
examine the Tr, Ci, Tt, Mt, and Af genomes). UniGene
(www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=unigene), an automated
system that partitions GenBank sequences into sets of gene-oriented
clusters, identified “protein similarities” between
HsCB1 and sequences in Hs and Mm, as well as Ce
(NP_508760, a sequence removed from GenBank in January
2005) and Dm (NP_477007, a dopamine receptor). UniGene
identified similarities between HsCB2 and sequences in Hs and
Mm, as well as Ce (T24659, an opioid receptor) and Dm
(S68780, a dopamine receptor).
4.2. TRPV1
HsTRPV1 and MmTRPV1 have been well-characterized in
functional studies. The Tr paralogs await functional analysis; the
gene tree (Fig. 3) suggested the Tr paralogs descended from a
duplication event that gave rise to the TRPV1 and TRPV4
lineages, reciprocal best hits in the Hs genome. TRPV1 and
TRPV4 are polymodal receptors: both respond to heat, while
TRPV1 also responds to tissue acidity and TRPV4 to
osmolarity. TRPV1 is gated by capsaicin and AEA; TRPV4 is
gated not by AEA but by its FAAH-mediated metabolite,
arachidonic acid, and its cytochrome P450-mediated metabolite
5′,6′-epoxyeicosatrienoic acid (Watanabe et al., 2003).
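The "reciprocal best hits" criterion mentioned above is a standard similarity test: two sequences qualify when each is the other's top-scoring hit. The following Python sketch shows the logic only. The sequence labels and bit scores are hypothetical placeholders, not values from this study, and the helper names are ours.

```python
def best_hit(scores, query):
    """Return the highest-scoring subject for a query, or None if it has no hits."""
    hits = scores.get(query, {})
    return max(hits, key=hits.get) if hits else None


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return pairs (a, b) where b is a's best hit and a is b's best hit."""
    pairs = []
    for a in a_vs_b:
        b = best_hit(a_vs_b, a)
        if b is not None and best_hit(b_vs_a, b) == a:
            pairs.append((a, b))
    return pairs


# Hypothetical bit scores (higher means more similar); placeholders only.
set_a_vs_set_b = {
    "seqA1": {"seqB1": 410.0, "seqB2": 305.0},
    "seqA2": {"seqB1": 298.0, "seqB2": 450.0},
    "seqA3": {"seqB1": 120.0, "seqB2": 90.0},
}
set_b_vs_set_a = {
    "seqB1": {"seqA1": 405.0, "seqA2": 300.0, "seqA3": 118.0},
    "seqB2": {"seqA1": 310.0, "seqA2": 447.0, "seqA3": 95.0},
}

rbh_pairs = reciprocal_best_hits(set_a_vs_set_b, set_b_vs_set_a)
print(rbh_pairs)
```

In this toy data `seqA3` has `seqB1` as its best hit, but the relationship is not reciprocated, so it is excluded. Reciprocity is a similarity criterion only and does not replace the tree-based test of orthology used here.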
TRPV1's affinity for AEA may have evolved recently, in
rodents and primates. The TRPV1 ortholog in chickens has little
functional affinity for AEA or capsaicin (Jordt and Julius,
2002). T550 may be a key residue for ligand affinity;
substitutions at this site were expressed in four capsaicin-insensitive
sequences: chicken TRPV1 (Jordt and Julius, 2002),
the Ce sequence identified here (CeOsm9, Tobin et al., 2002),
rabbit TRPV2 (Gavva et al., 2005), and HsTRPV2, HsTRPV4,
and frog TRPV1 (data not shown). The Tr paralogs also
carried other residues in place of T550.
Polytomic branching in the TRPV1 tree (Fig. 3) suggested
the Ci–Ce–Dm clade may have expressed greater dissimilarity
than the outgroup chosen to root the tree, HsTRPA1. The latter
sequence (also known as ANKTM1, a member of the TRP
channel family), has some affinity for THC (Jordt et al., 2004).
The highly divergent Ci, Ce, and Dm sequences may have
grouped together as a consequence of long branch attraction
and may not be truly related. HomoloGene agreed with our results;
it identified putative homologs of TRPV1 in Hs and Mm, and
not in Ce, Dm, Sc, Pf, or At (HomoloGene does not include the
Tr, Ci, Tt, Mt, and Af genomes). UniGene did not agree entirely
with our results; it identified protein similarities between
HsTRPV1 and sequences in Hs and Mm, but also in Ce
(NP_500372, another Osm-9 receptor).
4.3. GPR55
Functional mapping based upon our estimation of critical
motifs (Table 1) proved effective, although FAS scores were
inflated by Pfam, Prosite, and TMHMM classifying all the
sequences identically as GPCRs. The GPR55 gene tree (Fig. 4)
demonstrated the utility of placing two outgroups per tree to
detect polyphyletic sequences. Removing the HsGPR23
sequence collapsed the Hs, Mm, Tr, and Ci sequences into
one spurious clade. Instead, placement of HsGPR23 in the
alignment demonstrated that Tr141960 shared similarity with
GPR55 (thus its detection by BLAST) but was not orthologous
with GPR55. HomoloGene identified putative homologs of
GPR55 in Hs and Mm, and not in Ce, Dm, Sc, Pf, or At
(HomoloGene does not include the Tr, Ci, Tt, Mt, and Af
genomes). UniGene identified protein similarities between
HsGPR55 and sequences in Hs and Mm, but also in Ce and Dm.
4.4. FAAH
The FAAH gene tree (Fig. 5) did not display topographical
congruence with the species tree (Fig. 1), necessitating some
interpretation, aided by FAS scores. We interpreted the Hs-to-Tt
clade as FAAH orthologs. This phylogenetic concept departed
from Patricelli and Cravatt (2000), who characterised FAAH as
a mammalian enzyme. The Hs-to-Tt clade separated with good
bootstrap support from the Hs–Tr–Dm clade of “other
amidases.” The literature supports this concept. FAAH-like
amidohydrolases have been extracted from Ci (Matias et al.,
2005), the leech Hirudo medicinalis (Matias et al., 2001),
Tetrahymena pyriformis, a species closely related to Tt (Karava
et al., 2001), and even Hydra vulgaris (De Petrocellis et al.,
1999). Endocannabinoids have been extracted from Sc (Merkel
et al., 2005), consistent with the presence of an FAAH ortholog
in the Sc genome. On the other hand, Dm lacks detectable AEA
(McPartland et al., 2001), so the lack of an FAAH ortholog in
this species should not be surprising. Segregating FAAH from
“FAAH-like amidases”may be pedantic, because the At
sequence in Fig. 5 has been demonstrated to metabolize AEA
with kinetics equal to FAAH (Shrestha et al., 2003). HomoloGene
did not agree with our results; it identified putative
homologs of FAAH in Hs, Mm, Ce, and Dm, and not in Sc, Pf,
and At. UniGene, on the other hand, identified Sc and At
sequences sharing similarities with HsFAAH, along with Hs,
Mm, and Ce sequences.
4.5. MAGL
The MAGL tree (Fig. 6) resolved into two major clades. We
interpreted the Hs, Mm, Tr, and Ci sequences as MAGL
orthologs, with good bootstrap support and good FAS scores.
The clade that included Tt, Pf, At, Sc, and Dm had poor
bootstrap support and no congruence with the species tree
(Fig. 1). Interpreting the Tt-to-Dm clade as MAGL orthologs
would suggest that MAGL evolved in organisms with ancient
lineages. Indeed, a sequence with similarity to HsMAGL was
previously reported in the Cowpox virus genome (McPartland,
2004). Functional studies are required to resolve this. HomoloGene
identified putative homologs of MAGL in Hs, Mm, and
At, and not in Ce, Dm, Sc, or Pf (HomoloGene does not include
the Tr, Ci, Tt, Mt, and Af genomes). UniGene identified protein
similarities between HsMAGL and the Hs, Mm, Sc, and At
sequences evaluated herein.
4.6. COX2
The top branches of the COX tree (Fig. 7) agreed with a
COX phylogram by Järving et al. (2004), except that Järving and
colleagues additionally BLASTed with HsCOX1, so the COX1
clade included sequences from Mm and Tr. Järving and
colleagues also placed Ci paralogs in a clade basal to COX2
and COX1, suggesting the ancestor of the Ci sequences evolved
prior to the COX2–COX1 duplication event. The ancestral
COX gene may have functioned like present-day COX2 rather
than COX1, judging from FAS scores. Ci12732 conserved three
COX2-specific motifs (R513, V523, L505) and no COX1-
specific motifs; Ci13352 conserved two COX2-specific motifs
(R513, V523) and one COX1-specific motif (F505). The other
sequences had low FAS scores and placed in spurious clades
that contradicted broader phylogenetic evidence (e.g., monophyly
of plant and metazoan sequences). Whether these
sequences function like COX2 or COX1 can be ascertained in
studies similar to Knight et al. (1999), who inhibited Ci
prostaglandin synthesis with COX2-selective drugs (etodolac)
but not with COX1-selective drugs (resveratrol). HomoloGene
identified COX2 homologs in Hs and Mm, and not in Ce, Dm,
Sc, Pf, or At (Tr, Ci, Tt, Mt, and Af genomes were not examined).
UniGene identified protein similarities between HsCOX2 and
sequences in Hs, Mm, and Ce.
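The motif comparison for the two Ci paralogs can be restated as a short Python sketch. It encodes only the residues named in this section (R513, V523, and L505 as COX2-specific; F505 as COX1-specific). The function and variable names are ours, and the tally is a simplified stand-in for the FAS scoring, which drew on more motifs and on pattern profilers.

```python
# Isoform-specific residues named in the text (position -> residue).
COX2_MOTIFS = {513: "R", 523: "V", 505: "L"}
COX1_MOTIFS = {505: "F"}  # the only COX1-specific residue named in this section


def count_motifs(residues, motifs):
    """Count positions where a sequence carries the motif residue."""
    return sum(residues.get(pos) == aa for pos, aa in motifs.items())


# Residues at the three positions, as reported above for the two Ci paralogs.
ci_sequences = {
    "Ci12732": {513: "R", 523: "V", 505: "L"},
    "Ci13352": {513: "R", 523: "V", 505: "F"},
}

# (number of COX2-specific motifs, number of COX1-specific motifs) per sequence.
summary = {
    name: (count_motifs(res, COX2_MOTIFS), count_motifs(res, COX1_MOTIFS))
    for name, res in ci_sequences.items()
}
print(summary)
```

Both paralogs carry more COX2-specific than COX1-specific residues under this tally, in line with the suggestion that the ancestral gene may have resembled present-day COX2.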
4.7. NAPE-PLD
The NAPE-PLD gene tree (Fig. 8) did not display
topographical congruence with the species tree (Fig. 1). When
gene tree topology lacks congruence with species tree topology,
the sequences in the gene tree may not be orthologous
(Brinkman and Leipe, 2001). The incongruent placement of
the Mt–Af clade may be explained by the fact that NAPE-PLD
shares no similarity with other phospholipase D enzymes; rather,
it resembles β-lactamase enzymes in prokaryotes (Okamoto et
al., 2004). Sister to the prokaryotes were Sc and Ce sequences,
which we interpreted as functional NAPE-PLD orthologs.
When the prokaryotes were removed from analysis, Sc and Ce
claded with the vertebrates, with adequate bootstrap support
(≥555 at both nodes), and Sc and Ce reversed their positions on
the gene tree, becoming topographically congruent with the
species tree (data not shown). Sc and Ce FAS scores supported
our interpretation, as does the literature. The lipid fraction of Ce
includes phosphatidylethanolamine and arachidonic acid
(Tanaka et al., 1996), which are the feedstock phospholipid
and fatty acid necessary for NAPE-PLD to produce AEA. The
presence of arachidonic acid in Sc is debated (Zank et al., 2000;
Merkel et al., 2005). The Sc ortholog identified by our methods
was recently examined by Merkel et al. (2005), who described it
as a NAPE hydrolyzing phospholipase D. Lack of NAPE-PLD
in Ci was previously reported (Matias et al., 2005) but is still
surprising, because AEA has been extracted from Ci tissues
(Matias et al., 2005). Sun et al. (2004) proposed a second
mechanism for AEA biosynthesis, where NAPE is hydrolyzed
by a secretory phospholipase A2 (PLA2 group Ib) to
NA-lysoPE, which is then cleaved by a lysophospholipase D
enzyme to yield AEA. This possibility can be explored after the
specific mammalian enzymes have been cloned. Similarly, lack
of NAPE-PLD in Tt was surprising, because Siafaka-Kapadai et al.
(2005) reported AEA and NAPE-PLD-like activity in Tetrahymena
pyriformis. The absence of NAPE-PLD in the At genome was also
unexpected, because an ortholog of NAPE-PLD was reported in
tobacco plants (Chapman, 2000). Absence of a NAPE-PLD
ortholog in Dm agreed with ligand extraction studies that found
no measurable amounts of AEA in Dm tissues (McPartland et
al., 2001). HomoloGene identified putative homologs of
NAPE-PLD in Hs, Mm, and Ce, and not in Dm, Sc, or Pf
(HomoloGene does not include the Tr, Ci, Tt, Mt, and Af
genomes). UniGene identified protein similarities between
HsNAPE-PLD and sequences in Mm and Ce.
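The congruence test applied in this section (does the gene tree repeat the branching order of the species tree?) can be sketched in Python as a comparison of clades. The topologies below are toy examples that assume the conventional placement of Ce (an animal) closer to vertebrates than Sc (a fungus). They are not reproductions of Fig. 1 or Fig. 8, and the sketch assumes each species contributes one sequence to the gene tree.

```python
def leaf_set(tree):
    """Leaves of a rooted tree given as nested tuples of species labels."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def clades(tree, keep):
    """Clades with at least two leaves, after restricting every clade to `keep`."""
    found = set()

    def walk(node):
        if isinstance(node, str):
            return frozenset([node])
        members = frozenset().union(*(walk(child) for child in node))
        restricted = members & keep
        if len(restricted) >= 2:
            found.add(restricted)
        return members

    walk(tree)
    return found


def is_congruent(gene_tree, species_tree):
    """True if every gene-tree clade also occurs in the pruned species tree."""
    taxa = leaf_set(gene_tree)
    return clades(gene_tree, taxa) <= clades(species_tree, taxa)


# Toy topologies (assumptions for illustration only).
species_tree = (((("Hs", "Mm"), "Tr"), "Ce"), "Sc")
gene_tree_a = (((("Hs", "Mm"), "Tr"), "Sc"), "Ce")  # Sc and Ce swapped
gene_tree_b = ((("Hs", "Mm"), "Ce"), "Sc")          # Tr absent, order preserved

print(is_congruent(gene_tree_a, species_tree))
print(is_congruent(gene_tree_b, species_tree))
```

The first toy gene tree mirrors the kind of swap described for Sc and Ce and fails the test. The second lacks a species but keeps the branching order, so it passes once the species tree is pruned to the shared taxa.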
4.8. DAGLα and DAGLβ
The combined DAGLα and DAGLβ tree proved most
challenging. We interpreted the Hs-to-Ce sequences as DAGL
orthologs, with a duplication event that led to the eventual
divergence of DAGLα and DAGLβ in vertebrates. Poor
bootstrap support, despite optimized editing of the alignment,
made the timing of the duplication event difficult to interpret.
The tree (Fig. 9) suggested a duplication event ancestral to Dm,
with subsequent loss of DAGLβ paralogs in Dm and Ci. This is
not a very parsimonious scenario. Alternatively, Matias et al.
(2005) presented a DAGL tree constructed without editing, and
the Ci sequence sistered basal to the DAGLα and DAGLβ
clades. This tree better explains the appearance of paralogs in
vertebrates. Additional sequences are needed to better resolve
the timing of the DAGL duplication event. 2-AG, the product of
DAGLα and DAGLβ, has been extracted from the tissues of
vertebrates as well as several invertebrates: sea urchin
(Paracentrotus lividus), sea slug (Aplysia sp.), leeches (Hirudo
and Theromyzon tessulatum), mussel (Mytilus galloprovincialis),
clam (Tapes decussatus), oyster (Crassostrea sp.), insects
(Apis mellifera and D. melanogaster), and even Hydra vulgaris
(reviewed by McPartland, 2004). HomoloGene identified
DAGLα homologs in Hs, Mm, and Dm, and DAGLβ
homologs in Hs and Mm, and not in Ce, Sc, Pf, or At. UniGene
identified proteins similar to HsDAGLα in Hs, Mm, and Ce, as
well as Sc and At. UniGene identified proteins similar to
HsDAGLβ in Hs, Ce, and At.
4.9. Conclusions
The use of sequence profilers, phylogenetic tree analysis,
and functional mapping produced conservative lists of orthologs.
We found fewer orthologs than did automated annotation
systems. Our results suggested the endocannabinoid system was
heterogeneously distributed: functional TRPV1 and GPR55
receptors were limited to mammals; CB2 and DAGLβ were
limited to vertebrates; MAGL and COX2-like enzymes were
limited to chordates; CB1-like receptors and DAGLα were
limited to bilaterian animals; NAPE-PLD was limited to the
opisthokonta (animals and fungi); and FAAH was limited to
eukaryotes.
Some genes shared phylogenetic profiles; that is, the genes
were present/absent in the same species. Phylogenetic profiles
indicate functional relationships (Pellegrini et al., 1999). For
example, CB1, CB2, TRPV1, GPR55, FAAH, and NAPE-PLD
were absent in the Dm genome; these genes code for proteins
associated with AEA, and Dm tissues lacked detectable levels of
AEA (McPartland et al., 2001). Indeed the absence of
cannabinoid receptors in insects has been described as a sorting
event secondary to the loss of AEA (McPartland et al., accepted
for publication). This phylogenetic profile was not shared by
DAGLα, which was present in the Dm genome, functionally
linked to the presence of 2-AG in Dm tissues. We did not see
complementary patterns suggestive of nonorthologous gene
displacement, in which unrelated or distantly related proteins
are responsible for the same function in different organisms.
However, the lack of NAPE-PLD in the Ci genome, despite
AEA in its tissues (Matias et al., 2005), suggested that an
unelucidated AEA biosynthesis pathway had evolved in the
chordates. This may also explain the appearance of three “new”
AEA-gated receptors in chordates (CB2, GPR55, TRPV1).
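A phylogenetic profile is simply a presence/absence vector across species, and genes with identical vectors share a profile. The Python sketch below builds coarse profiles from two statements in this section only: the broadest taxonomic group given for each gene in the summary sentence, and the absences stated explicitly for Dm and Ci. This is an assumption-laden approximation. "Limited to" is treated as an upper bound that is filled in completely, so the vectors should not be read as the per-species results of the study.

```python
from collections import defaultdict

SPECIES = ["Hs", "Mm", "Tr", "Ci", "Ce", "Dm", "Sc", "At", "Pf", "Tt", "Af", "Mt"]

# Nested taxonomic groups of the twelve study species.
mammals = {"Hs", "Mm"}
vertebrates = mammals | {"Tr"}
chordates = vertebrates | {"Ci"}
bilaterians = chordates | {"Ce", "Dm"}
opisthokonts = bilaterians | {"Sc"}
eukaryotes = opisthokonts | {"At", "Pf", "Tt"}

# Broadest group per gene, taken from the summary sentence in the Conclusions.
gene_range = {
    "TRPV1": mammals, "GPR55": mammals,
    "CB2": vertebrates, "DAGLβ": vertebrates,
    "MAGL": chordates, "COX2": chordates,
    "CB1": bilaterians, "DAGLα": bilaterians,
    "NAPE-PLD": opisthokonts,
    "FAAH": eukaryotes,
}

# Absences stated explicitly in the text.
stated_absent = {
    "Dm": {"CB1", "CB2", "TRPV1", "GPR55", "FAAH", "NAPE-PLD"},
    "Ci": {"NAPE-PLD"},
}


def profile(gene):
    """Presence/absence vector for a gene, ordered as in SPECIES."""
    present = {
        sp for sp in gene_range[gene]
        if gene not in stated_absent.get(sp, set())
    }
    return tuple(sp in present for sp in SPECIES)


# Group genes whose vectors are identical.
groups = defaultdict(list)
for gene in gene_range:
    groups[profile(gene)].append(gene)

shared = [sorted(genes) for genes in groups.values() if len(genes) > 1]
print(shared)
```

Under these assumptions CB1 and DAGLα fall into different groups because only DAGLα is present in Dm, which matches the contrast drawn above between the AEA-associated genes and DAGLα.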
Mapping our limited data set of 12 extant organisms onto an
evolutionary time scale may be premature, but our study
suggested that endocannabinoid ligand-metabolizing enzymes
evolved prior to cannabinoid receptors. In the absence of
cannabinoid receptors, AEA and 2-AG may serve other roles.
This should not be surprising; AEA is known to modulate
Shaker-related K+ channels, TASK-1 K+ channels, and T-type Ca2+
channels, and to stimulate ERK phosphorylation and AP-1
transcription activity (reviewed in McPartland, 2004). Insects
lack cannabinoid receptors but they biosynthesize 2-AG,
perhaps as a feeding deterrent against organisms that express
CB1, a case of convergent evolution. Receptors also exhibit
convergence, exemplified by TRPV1's recently evolved affinity
for AEA. Cases of parallel evolution exist in plants: they do
not have CB receptors, nor do they produce endocannabinoids,
yet plants produce endocannabinoid-like “entourage compounds,”
and these compounds activate “CB-like” receptors
(Shrestha et al., 2003).
The endocannabinoid system is complex, involving convergent,
divergent, and parallel evolution. Over evolutionary time
we see duplications and mutations of endocannabinoid
receptors, resulting in sorting events (gene extinctions) or new
structures and new functions. This weave of receptors
intertwines with a weave of ligands, whose metabolic enzymes
similarly duplicate and mutate and evolve new ligand structures
and functions. Given this scenario, the symbol for endocannabinoid
evolution is not a bifurcating tree, but an interweaving
arabesque of coevolving receptors and ligands (McPartland and
Guy, 2004). Receptors and their ligands coevolve (Park et al.,
2002), and this was not addressed by our analysis. For example,
CB2 first appeared in vertebrates, as did NAPE-PLD and
DAGLβ. Does CB2 show evidence of coevolution with NAPE-PLD
(i.e., as an AEA receptor) or with DAGLβ (i.e., as a 2-AG
receptor)? Questions concerning coevolution will be addressed
in our next study.
Acknowledgements
This work was partially supported by an unrestricted grant
from GW Pharmaceuticals, Salisbury, Wiltshire SP4 0JQ, UK.
The authors thank anonymous journal reviewers for substantial
improvements to this manuscript.
References
Altschul, S.F., et al., 1997. Gapped BLAST and PSI-BLAST: a new generation
of protein database search programs. Nucleic Acids Res. 25, 3389–3402.
Baker, D., Pryce, G., Davies, W.L., Hiley, C.R., 2005. In silico patent searching
reveals a new cannabinoid receptor. Trends Pharmacol. Sci. (Nov 27;
[Electronic publication ahead of print]).
Bambai, B., Rogge, C.E., Stec, B., Kulmacz, R.J., 2004. Role of Asn-382 and
Thr-383 in activation and inactivation of human prostaglandin H synthase
cyclooxygenase catalysis. J. Biol. Chem. 279, 4084–4092.
Benton, M.J., Ayala, F.J., 2003. Dating the tree of life. Science 300, 1698–1700.
Bisogno, T., et al., 2003. Cloning of the first sn1-DAG lipases points to the
spatial and temporal regulation of endocannabinoid signaling in the brain.
J. Cell Biol. 163, 463–468.
Brinkman, F.S., Leipe, D.D., 2001. Phylogenetic analysis. Methods Biochem.
Anal. 43, 323–358.
Brown, A.J., Ueno, S., Suen, K., Dowell, S.J., Wise, A., 2005. Molecular
identification of GPR55 as a third G-protein coupled receptor responsive to
cannabinoid ligands. 2005 Symposium on the Cannabinoids. International
Cannabinoid Research Society, Burlington, Vermont, p. 16.
Chapman, K.D., 2000. Emerging physiological roles for N-acylphosphatidylethanolamine
metabolism in plants: signal transduction and membrane
protection. Chem. Phys. Lipids 108, 221–229.
De Petrocellis, L., Melck, D., Bisogno, T., Milone, A., Di Marzo, V., 1999.
Finding of the endocannabinoid signalling system in Hydra, a very
primitive organism: possible role in the feeding response. Neuroscience
92, 377–387.
Deutsch, D.G., Ueda, N., Yamamoto, S., 2002. The fatty acid amide hydrolase
(FAAH). Prostaglandins Leukot. Essent. Fat. Acids 66, 201–210.
Dinh, T.P., et al., 2002. Brain monoglyceride lipase participating in endocannabinoid
inactivation. Proc. Natl. Acad. Sci. U. S. A. 99, 10819–10824.
Elphick, M.R., 2002. Evolution of cannabinoid receptors in vertebrates:
identification of a CB(2) gene in the puffer fish Fugu rubripes. Biol. Bull.
202, 104–107.
Elphick, M.R., Egertova, M., 2001. The neurobiology and evolution of
cannabinoid signalling. Philos. Trans. R. Soc. Lond., B Biol. Sci. 356,
381–408.
Elphick, M.R., Satou, Y., Satoh, N., 2003. The invertebrate ancestry of
endocannabinoid signalling: an orthologue of vertebrate cannabinoid
receptors in the urochordate Ciona intestinalis. Gene 302, 95–101.
Fujiwara, Y., et al., 2005. Identification of residues responsible for ligand
recognition and regioisomeric selectivity of LPA receptors expressed in
mammalian cells. J. Biol. Chem. 280, 35038–35050.
Gavva, N.R., et al., 2005. Proton activation does not alter antagonist interaction
with the capsaicin-binding pocket of TRPV1. Mol. Pharmacol. 68,
1524–1533.
Glass, M., Dragunow, M., Faull, R.L., 1997. Cannabinoid receptors in the
human brain: a detailed anatomical and quantitative autoradiographic
study in the fetal, neonatal and adult human brain. Neuroscience 77,
299–318.
Järving, R., Järving, I., Kurg, R., Brash, A.R., Samel, N., 2004. On the
evolutionary origin of cyclooxygenase (COX) isozymes: characterization
of marine invertebrate COX genes points to independent duplication
events in vertebrate and invertebrate lineages. J. Biol. Chem. 279 (14),
13624–13633.
Jordt, S.E., Julius, D., 2002. Molecular basis for species-specific sensitivity to
“hot”chili peppers. Cell 108, 421–430.
Jordt, S.E., et al., 2004. Mustard oils and cannabinoids excite sensory nerve
fibres through the TRP channel ANKTM1. Nature 427, 260–265.
Karava, V., Fasia, L., Siafaka-Kapadai, A., 2001. Anandamide amidohydrolase
activity, released in the medium by Tetrahymena pyriformis. Identification
and partial characterization. FEBS Lett. 508, 327–331.
Knight, J., Taylor, G.W., Wright, P., Clare, A.S., Rowley, A.F., 1999. Eicosanoid
biosynthesis in an advanced deuterostomate invertebrate, the sea squirt
(Ciona intestinalis). Biochim. Biophys. Acta 1436, 467–478.
Kozak, K.R., Prusakiewicz, J.J., Rowlinson, S.W., Prudhomme, D.R., Marnett,
L.J., 2003. Amino acid determinants in cyclooxygenase-2 oxygenation of
the endocannabinoid anandamide. Biochemistry 42, 9041–9049.
Matias, I., et al., 2001. Evidence for an endocannabinoid system in the central
nervous system of the leech Hirudo medicinalis. Brain Res. Mol. Brain Res.
87, 145–159.
Matias, I., McPartland, J.M., DiMarzo, V., 2005. Occurrence and possible
biological role of the endocannabinoid system in the sea squirt Ciona
intestinalis. J. Neurochem. 93, 1141–1156.
McAllister, S.D., et al., 2003. An aromatic microdomain at the cannabinoid CB(1)
receptor constitutes an agonist/inverse agonist binding region. J. Med.
Chem. 46, 5139–5152.
McPartland, J.M., 2004. Phylogenomic and chemotaxonomic analysis of the
endocannabinoid system. Brain Res. Rev. 45, 18–29.
McPartland, J.M., Glass, M., 2001. The nematocidal effects of Cannabis may
not be mediated by cannabinoid receptors. N.Z. J. Crop Hortic. Sci. 29,
301–307.
McPartland, J.M., Glass, M., 2003. Functional mapping of cannabinoid receptor
homologs in mammals, other vertebrates, and invertebrates. Gene 312,
297–303.
McPartland, J.M., Guy, G., 2004. The evolution of Cannabis and coevolution
with the cannabinoid receptor—a hypothesis. In: Guy, G., Robson, R.,
Strong, K., Whittle, B. (Eds.), The Medicinal Use of Cannabis. Royal
Society of Pharmacists, London, pp. 71–102.
McPartland, J.M., Di Marzo, V., De Petrocellis, L., Mercer, A., Glass, M.,
2001. Cannabinoid receptors are absent in insects. J. Comp. Neurol. 436,
423–429.
McPartland, J.M., Agraval, J., Gleeson, D., Heasman, K., Glass, M., accepted
for publication. Evidence for cannabinoid receptors in invertebrates. Journal
of Evolutionary Biology.
Mechoulam, R., Fride, E., Di Marzo, V., 1998. Endocannabinoids. Eur. J.
Pharmacol. 359, 1–18.
Merkel, O., Schmid, P.C., Paltauf, F., Schmid, H.H., 2005. Presence and
potential signaling function of N-acylethanolamines and their phospholipid
precursors in the yeast Saccharomyces cerevisiae. Biochim. Biophys. Acta
1734, 215–219.
Noguchi, K., Ishii, S., Shimizu, T., 2003. Identification of p2y9/GPR23 as a
novel G protein-coupled receptor for lysophosphatidic acid, structurally
distant from the Edg family. J. Biol. Chem. 278, 25600–25606.
Okamoto, Y., Morishita, J., Tsuboi, K., Tonai, T., Ueda, N., 2004. Molecular
characterization of a phospholipase D generating anandamide and its
congeners. J. Biol. Chem. 279, 5298–5305.
Park, Y., Kim, Y.J., Adams, M.E., 2002. Identification of G protein-coupled
receptors for Drosophila PRXamide peptides, CCAP, corazonin, and AKH
supports a theory of ligand–receptor coevolution. Proc. Natl. Acad. Sci.
U. S. A. 99, 11423–11428.
Patricelli, M.P., Cravatt, B.F., 2000. Clarifying the catalytic roles of conserved
residues in the amidase signature family. J. Biol. Chem. 275, 19177–19184.
Pellegrini, M., Marcotte, E.M., Thompson, M.J., Eisenberg, D., Yeates, T.O.,
1999. Assigning protein functions by comparative genome analysis: protein
phylogenetic profiles. Proc. Natl. Acad. Sci. U. S. A. 96, 4285–4288.
Saario, S.M., et al., 2005. Characterization of the sulfhydryl-sensitive site in the
enzyme responsible for hydrolysis of 2-arachidonoyl-glycerol in rat
cerebellar membranes. Chem. Biol. 12, 649–656.
Schneider, C., Boeglin, W.E., Brash, A.R., 2004. Identification of two
cyclooxygenase active site residues, Leucine 384 and Glycine 526, that
control carbon ring cyclization in prostaglandin biosynthesis. J. Biol. Chem.
279, 4404–4414.
Shrestha, R., Dixon, R.A., Chapman, K.D., 2003. Molecular identification of a
functional homologue of the mammalian fatty acid amide hydrolase in
Arabidopsis thaliana. J. Biol. Chem. 278, 34990–34997.
Siafaka-Kapadai, A., Anagnostopoulos, D., Zafiriou, M.P., Farmaki, E.,
Maccarrone, M., 2005. The endocannabinoid system in unicellular
eukaryotes. 2005 Symposium on the Cannabinoids. International Cannabinoid
Research Society, Burlington, Vermont, p. 31.
Singh, R., Hurst, D.P., Barnett-Norris, J., Lynch, D.L., Reggio, P.H., Guarnieri,
F., 2002. Activation of the cannabinoid CB1 receptor may involve a W6.48/
F3.36 rotamer toggle switch. J. Pept. Res. 60, 357–370.
Sun, Y.X., et al., 2004. Biosynthesis of anandamide and N-palmitoylethanolamine
by sequential actions of phospholipase A2 and lysophospholipase D.
Biochem. J. 380, 749–756.
Tanaka, T., Ikita, K., Ashida, T., Motoyama, Y., Yamaguchi, Y., Satouchi, K.,
1996. Effects of growth temperature on the fatty acid composition of the
free-living nematode Caenorhabditis elegans. Lipids 31, 1173–1178.
Tobin, D., et al., 2002. Combinatorial expression of TRPV channel proteins
defines their sensory functions and subcellular localization in C. elegans
neurons. Neuron 35, 307–318.
Watanabe, H., Vriens, J., Prenen, J., Droogmans, G., Voets, T., Nilius, B., 2003.
Anandamide and arachidonic acid use epoxyeicosatrienoic acids to activate
TRPV4 channels. Nature 424, 434–438.
Zank, T.K., Zähringer, U., Lerchl, J., Heinz, E., 2000. Cloning and functional
expression of the first plant fatty acid elongase specific for Δ6-polyunsaturated
fatty acids. Biochem. Soc. Trans. 28, 654–658.
Zhang, R., Hurst, D.P., Barnett-Norris, J., Reggio, P.H., Song, Z.H., 2005.
Cysteine 2.59(89) in the second transmembrane domain of human CB2
receptor is accessible within the ligand binding crevice: evidence for
possible CB2 deviation from a rhodopsin template. Mol. Pharmacol. 68,
69–83.
Zmasek, C.M., Eddy, S.R., 2002. RIO: analyzing proteomes by automated
phylogenomics using resampled inference of orthologs. BMC Bioinformatics
3, 14.
Zygmunt, P.M., et al., 1999. Vanilloid receptors on sensory nerves mediate the
vasodilator action of anandamide. Nature 400, 452–457.