DNA AND CELL BIOLOGY
Volume 24, Number 12, 2005
© Mary Ann Liebert, Inc.
Identification and Characterization of Tandem Repeats in
Exon III of Dopamine Receptor D4 (DRD4) Genes from
Different Mammalian Species
SVEND ARILD LARSEN,1 LINE MOGENSEN,1 RUNE DIETZ,2 HANS JØRGEN BAAGØE,3
MOGENS ANDERSEN,3 THOMAS WERGE,1 and HENRIK BERG RASMUSSEN1
In this study we have identified and characterized dopamine receptor D4 (DRD4) exon III tandem repeats in
33 publicly available nucleotide sequences from different mammalian species. We found that the tandem
repeat in canids could be described in a novel and simple way, namely, as a structure composed of 15- and 12-
bp modules. Tandem repeats composed of 18-bp modules were found in sequences from the horse, zebra, on-
ager, and donkey, Asiatic bear, polar bear, common raccoon, dolphin, harbor porpoise, and domestic cat.
Several of these sequences have been analyzed previously without a tandem repeat being found. In the do-
mestic cow and gray seal we identified tandem repeats composed of 36-bp modules, each consisting of two
closely related 18-bp basic units. A tandem repeat consisting of 9-bp modules was identified in sequences from
mink and ferret. In the European otter we detected an 18-bp tandem repeat, while a tandem repeat consist-
ing of 27-bp modules was identified in a sequence from European badger. Both these tandem repeats were
composed of 9-bp basic units, which were closely related to the 9-bp repeat modules identified in the mink
and ferret. Tandem repeats could not be identified in sequences from rodents. All tandem repeats possessed
a high GC content with a strong bias for C. On phylogenetic analysis of the tandem repeats, evolutionarily
related species were clustered into the same groups. The degree of conservation of the tandem repeats varied
significantly between species. The deduced amino acid sequences of most of the tandem repeats exhibited a
high propensity for disorder. This was also the case with an amino acid sequence of the human DRD4 exon
III tandem repeat, which was included in the study for comparative purposes. We identified proline-con-
taining motifs for SH3 and WW domain binding proteins, potential phosphorylation sites, PDZ domain bind-
ing motifs, and FHA domain binding motifs in the amino acid sequences of the tandem repeats. The numbers
of potential functional sites varied pronouncedly between species. Our observations provide a platform for
future studies of the architecture and evolution of the DRD4 exon III tandem repeat, and they suggest that
differences in the structure of this tandem repeat contribute to specialization and generation of diversity in
1Research Institute of Biological Psychiatry, H:S Sct. Hans Hospital, Roskilde, Denmark.
2National Environmental Research Institute, Roskilde, Denmark.
3Zoological Museum, Vertebrate Department, Copenhagen University, Copenhagen, Denmark.

TANDEM REPEATS are nucleic acid sequences composed of
modules of a nucleotide pattern serially arranged one after
another. In minisatellites, often designated variable number of
tandem repeats (VNTRs), each module consists of 10–100
nucleotides. The number of repeat modules differs between loci,
and may reach several hundreds. Some minisatellites are
unstable, with a high propensity for expansion, but the majority
of minisatellites mutate at much lower rates, leading to a
restricted allele size range (Bois and Jeffreys, 1999). There are
several reports of minisatellites in coding regions. Some
coding minisatellites exhibit a high degree of interspecies variation
and are sources for rapid generation of new functional variants
(Tompa, 2003). Coding minisatellites are frequently found in
unstructured protein domains, including low-complexity
regions and regions with a high proline content (Tompa, 2003).
Protein disorder, that is, the lack of a regular secondary
structure combined with a high degree of flexibility in the
polypeptide strand (Wright and Dyson, 1999), appears to play an
important role in molecular recognition and signaling (Iakoucheva
et al., 2002).
Dopamine receptor D4 (DRD4) is a seven-transmembrane
helical receptor that couples to G proteins. This receptor,
a target for several antipsychotics, is expressed at high
levels in regions of the brain implicated in the control of
cognitive functions (Tarazi and Baldessarini, 1999). A tandem repeat
has been identified in exon III of the gene encoding DRD4 in
several mammalian species. In prosimians, primates, and hu-
mans it is composed of 48-bp modules (Livak et al., 1995; Mat-
sumoto et al., 1995; Inoue-Murayama et al., 1998), whereas the
module size is 18 bp in equine species (Hasegawa et al., 2002)
and in Cetaceans (L. Mogensen, unpublished data).
A compound tandem repeat consisting of 39- and 12-bp mod-
ules in addition to a single 27-bp module has been detected in
the domestic dog and wolf (Niimi et al., 2001; Inoue-Murayama
et al., 2002; Ito et al., 2004). In a sequence from the raccoon
dog, another member of the Canidae family, stretches with
homology to the repetitive motifs in the domestic dog
have been identified, but an exact description of the structure was
not provided (Inoue-Murayama et al., 2002). Attempts to iden-
tify a tandem repeat in DRD4 exon III sequences from other
carnivores, namely, the Asiatic bear, common raccoon, and do-
mestic cat, were unsuccessful (Inoue-Murayama et al., 2002).
Nor has such a tandem repeat been detected in rodent
sequences (O’Malley et al., 1992; Fishburn et al., 1995).
There is evidence that the length of the DRD4 exon III tan-
dem repeat affects receptor function. In humans, size variation
of this tandem repeat seems to modulate the expression and ef-
ficiency of maturation of the receptor (Schoots and Van Tol,
2003; Van Craenenbroeck et al., 2005). Moreover, a specific
size variant of the human DRD4 exon III tandem repeat has
been shown to require significantly higher levels of dopamine
to produce a response of the same magnitude as other size vari-
ants (Asghari et al., 1995). This particular size variant has been
associated with novelty seeking (Kluger et al., 2002) and with
susceptibility to psychiatric diseases (Holmes et al., 2002; Mil-
let et al., 2003). In dogs, a relation between the structure of the
DRD4 exon III tandem repeat and aggression-related behavior
has been reported (Ito et al., 2004).
In this study we have identified and characterized DRD4
exon III tandem repeats in sequences from various mammals.
Several of these sequences have been analyzed previously with-
out a tandem repeat being found.
MATERIALS AND METHODS
We retrieved 33 partial or entire nucleotide sequences of the
DRD4 exon III from GenBank®. These sequences were derived
from the domestic dog (Canis familiaris; AB030234,
AB030235, AB030236, AB030237, AB044885, AB044886, and
AB044887), the gray wolf (Canis lupus; AB069661), the rac-
coon dog (Nyctereutes procyonoides; AB069662), the Asiatic
black bear (Ursus thibetanus; AB069664), the polar bear (Ur-
sus maritimus; AY611807), the gray seal (Halichoerus grypus;
DQ071548), the common raccoon (Procyon lotor; AB069663),
the domestic mink (Mustela vison; AY611808), the domestic
ferret (Mustela putorius furo; AY394848), the European badger
(Meles meles; DQ029099), the European otter (Lutra lutra;
DQ029098), the domestic cat (Felis catus; AB069665), the do-
mestic cow (Bos taurus; AB069666), the white-beaked dolphin
(Lagenorhynchus albirostris; AY615861), the harbor porpoise
(Phocoena phocoena; AY615862), the domestic horse (Equus
caballus; AB080626 and AB080627), the wild horse (Equus
przewalskii; AB080628), the donkey (Equus asinus; AB080629
and AB080630), the onager (Equus hemionus; AB080631 and
AB080632), the plains zebra (Equus burchelli; AB080633), the
Grevy’s zebra (Equus grevyi; AB080634), the mountain zebra
(Equus zebra; AB080635), the house mouse (Mus musculus;
U19880), and the rat (Rattus norvegicus; U03551). The amino
acid translations of these sequences and an amino acid sequence
of the 7R variant of the human DRD4 exon III tandem repeat
(Homo sapiens; P21917) were also analyzed.
Detection and characterization of tandem
repeats in nucleic acid sequences
The Tandem Repeats Finder (TRF) program developed by
Benson (1999) was used to identify and characterize tandem
repeats in the nucleotide sequences. In this program the weight
for a match is +2 and cannot be varied, while the weights for
mismatches and indels are variable with three options, namely 3, 5,
or 7 (interpreted as negative numbers). Lower numbers permit
more mismatches or indels; higher numbers increase the
stringency of the search conditions. If successful in identifying a
tandem repeat, TRF characterizes it and reports the module size,
the consensus pattern of the repetitive modules, and its percent GC.
The percent of matches is also calculated, that is, a measure of the
identity between adjacent modules in a tandem repeat, not
between the consensus pattern and the single modules.
Occasionally, suggestions for different module sizes are provided.
In these cases selection of the most appropriate is based upon
the score values calculated by TRF. The program can be
downloaded from the site: http://tandem.bu.edu/trf/trf.html.
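The percent-of-matches measure can be illustrated with a minimal sketch. It is not TRF's algorithm: TRF aligns modules with mismatches and indels, whereas the sketch below assumes ungapped modules of equal length and simply compares each module with its neighbor, position by position. The function name and the example sequence are hypothetical.

```python
def percent_matches_adjacent(repeat: str, module_size: int) -> float:
    """Percent identity between neighboring modules (ungapped, equal length)."""
    repeat = repeat.upper()
    # Split the region into complete modules; a trailing partial module is ignored.
    modules = [repeat[i:i + module_size]
               for i in range(0, len(repeat) - module_size + 1, module_size)]
    if len(modules) < 2:
        raise ValueError("need at least two complete modules")
    matches = 0
    compared = 0
    # Compare module 1 with 2, module 2 with 3, and so on.
    for left, right in zip(modules, modules[1:]):
        matches += sum(a == b for a, b in zip(left, right))
        compared += module_size
    return 100.0 * matches / compared


# Hypothetical 9-bp modules: the second differs from the first at one position,
# the third is identical to the second, so 17 of 18 compared positions match.
example = "CCCGCGCCC" + "CCCGCGCCT" + "CCCGCGCCT"
print(round(percent_matches_adjacent(example, 9), 1))
```

Because neighbors rather than a consensus are compared, a single substitution that is propagated to all downstream modules lowers the value only once.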
GC-compositional strand bias was calculated as (C − G)/(C + G),
where C and G denote the number of cytosine and
guanine residues, respectively.
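As a worked illustration of the two composition measures, the sketch below computes percent GC and the strand bias (C − G)/(C + G) for a nucleotide string. The function name and the 18-bp example module are hypothetical and are not taken from the sequences analyzed here.

```python
def gc_stats(seq: str) -> tuple:
    """Return (percent GC, strand bias (C - G)/(C + G)) for a nucleotide sequence."""
    seq = seq.upper()
    c = seq.count("C")
    g = seq.count("G")
    if c + g == 0:
        raise ValueError("sequence must contain at least one C or G")
    percent_gc = 100.0 * (c + g) / len(seq)
    # Positive values indicate an excess of C over G on the analyzed strand.
    bias = (c - g) / (c + g)
    return percent_gc, bias


# Hypothetical 18-bp module with 12 C, 4 G, 1 A, and 1 T.
percent_gc, bias = gc_stats("CCCGCCCCGCACCCGCTG")
print(round(percent_gc, 1), bias)
```

A bias of 0 would mean equal numbers of C and G, while a value approaching 1 means that nearly all G/C positions on the analyzed strand are C.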
Alignment analyses and construction of
a phylogenetic tree
ClustalX 1.83 (Thompson et al., 1997) was used for align-
ment analysis of entire repeat regions. Multiple alignments were
edited and refined manually.
The exact start of a tandem repeat is often difficult to deter-
mine, and may vary by several nucleotides even in closely
related repeats. This complicates a comparison of the consen-
sus patterns from different tandem repeats. Therefore, nu-
cleotide consensus patterns were subjected to cyclic alignment,
which allows every position in one of two sequences to be the
first. After cyclic alignment we constructed a phylogenetic tree
of the consensus patterns using the neighbor-joining method
implemented in MEGA version 3.0 (Kumar et al., 2004), which
is accessible at http://www.megasoftware.net/. Tree construc-
tion was also done with ClustalX 1.83.
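The idea behind cyclic alignment can be sketched as follows, under simplifying assumptions: both consensus patterns have the same length and no gaps are allowed, so the comparison reduces to trying every rotation of one pattern and keeping the best identity. This is an illustration only and not the alignment procedure used in the study; the function name and the example patterns are hypothetical.

```python
def best_cyclic_identity(a: str, b: str) -> tuple:
    """Return (best percent identity, rotation offset of b) over all rotations of b."""
    a, b = a.upper(), b.upper()
    if len(a) != len(b) or not a:
        raise ValueError("patterns must be non-empty and of equal length")
    best = (-1.0, 0)
    for shift in range(len(b)):
        # Let position `shift` of b be the first position of the pattern.
        rotated = b[shift:] + b[:shift]
        identity = 100.0 * sum(x == y for x, y in zip(a, rotated)) / len(a)
        if identity > best[0]:
            best = (identity, shift)
    return best


# Hypothetical 18-bp patterns: b equals a with a shifted start and one substitution.
a = "CCCGCGCCCGGACCCTGC"
b = "TGCCCCGCGCCCGGACCA"
identity, shift = best_cyclic_identity(a, b)
print(round(identity, 1), shift)
```

Without the rotation step, the same two patterns would share only 3 of 18 positions, which shows why a shifted repeat start can mask a close relationship.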
Detection and characterization of repeats in
The program Statistical Analysis of Protein Sequences
(SAPS) was applied for detection of repeats in protein se-
quences. This program is capable of identifying simple tandem
repeats as well as separated repeats. The SAPS program was
developed in the group of Samuel Karlin at Stanford University,
and is freely accessible for on-line analysis at http://www.
Protein disorder prediction
Prediction of structural disorder in the deduced amino acid
sequences of the tandem repeats was carried out using Glob-
plot™ version 2.1 (Linding et al., 2003). This program calcu-
lates a running sum of the propensity of the various amino acids
for disorder in a polypeptide strand with proline, glycine, as-
paragine, aspartic acid, and serine, conferring the highest dis-
order propensity values. Globplot™ is freely accessible at
Detection of potential functional sites in the
deduced amino acid sequences
The program NetPhos 2.0 (Blom et al., 1999), which is ac-
cessible at http://www.cbs.dtu.dk/services/NetPhos/, was used
to identify potential phosphorylation sites in the deduced amino
acid sequences of the tandem repeats. Detection of other types
of functional sites in the sequences was accomplished with
ELM (Puntervoll et al., 2003) and Scansite 2.0 (Obenauer et
al., 2003). These programs are accessible at http://elm.eu.org/
and http://scansite.mit.edu/, respectively.
RESULTS
We attempted to detect DRD4 exon III tandem repeats in 33
nucleotide sequences retrieved from GenBank®. Often multi-
ple suggestions for the module size of a tandem repeat were
obtained. The most important results from these analyses are
summarized in Table 1.
The GenBank® sequences from the domestic dog did not
include the regions flanking the tandem repeats. To compensate
for this, the 5′ flank from another dog sequence (Niimi et al.,
1999) was added to these tandem repeats. This flank is
identical with that present in the wolf and raccoon dog. Multiple
suggestions for the module size in the canine tandem repeat were
given by TRF, including sizes of 51, 39, 12, and 27 bp, with
the two former producing the best score. Closer inspection of
the proposal for a 12-bp structure revealed the presence of three
additional nucleotides in some of the modules, suggesting the
tandem repeat to consist of 12-bp modules intermingled with
15-bp repeats. In accordance with this, a single 15-bp module
and three different 12-bp modules could be identified in the 51-
bp consensus pattern (Table 2). The first of the 51-bp modules
in the tandem repeats started one nucleotide upstream of the
previously identified repeat start (Niimi et al., 2001).
Alignment analysis of entire repeat regions from canids revealed
blocks of 51 bp, which consisted of a 15-bp module and three
12-bp modules in the proximal end of the tandem repeats (Fig.
1). This pattern was less distinct in the distal part of the tan-
dem repeats, where the 15-bp modules were associated with
one or two 12-bp modules rather than three. Moreover, the se-
quences of these 12-bp modules differed slightly from those
further upstream. Some Canidae sequences were characterized
by the lack of one or more 51-bp blocks including the raccoon
dog tandem repeat, which differed from the others by being
Examination of the deduced amino acid sequences from the
domestic dog, wolf, and raccoon dog revealed several patterns
TANDEM REPEATS IN EXON III OF DOPAMINE RECEPTOR D4 GENES FROM MAMMALS
TABLE 1. DETECTION OF DRD4 EXON III TANDEM REPEATS IN DIFFERENT MAMMALIAN SPECIES
Module size under different stringenciesa
Dog, wolf, and raccoon dog
Asiatic bear and polar bear
Dolphin and harbor porpoise
Horse, zebra, onager, and donkey
51, 39, 12, 27
27, 18, 9
36, 18, 6
51, 39, 39,b12, 27
54, 18, 18,b36
aStringencies of the alignment criteria were low (+2, −3, −5) and high (+2, −7, −7), where numbers refer to parameter
settings for match, mismatch, and indel, respectively. If multiple suggestions for the module size are given, they are listed in the
order of decreasing score values. ND means that a tandem repeat was not detected.
bOverlapping tandem repeats with the same module sizes but different consensus patterns were detected.
TABLE 2. CHARACTERISTICS OF THE 51-BP MODULE FROM THE DRD4 EXON III TANDEM REPEAT IN CANIDS
Amino acid consensus
Nucleotide consensus patterna
aThe 51-bp consensus pattern consisted of one 15-bp segment and three 12-bp segments. Two different 15-bp consensus patterns differing at position 5 (A or G) were detected.
Nucleotides, which vary between the 12-bp segments, are underlined.
bServes as a measure of the degree of identity between adjacent modules.
cGC-compositional strand bias was calculated as (C − G)/(C + G), where C and G denote the number of cytosine and guanine residues, respectively.
dIdentified by examination of the deduced amino acid sequence. Note that the first triplet starts at the second nucleotide.
FIG. 1. Alignment of DRD4 tandem repeats from canids. The sequences were derived from the domestic dog (AB030234, AB030235,
AB030236, AB030237, AB044885, AB044886, and AB044887), gray wolf (AB069661), and raccoon dog (AB069662). Single
nucleotide substitutions are underlined. Dashes indicate deletions. The sequences flanking the tandem repeats in the domestic dog
had not been included in the submissions to GenBank®. Therefore, the 5′ flank from another dog tandem repeat (Niimi et al.,
1999) was added to these sequences. Note the presence of a 15-bp module followed by 12-bp modules. The start of the tandem
repeats was one nucleotide upstream of that described previously (Niimi et al., 2001; Ito et al., 2004).
of tandemly arranged repeat modules, including a 17-amino
acid motif. This motif was identical with that produced by trans-
lation of the 51-bp nucleotide consensus pattern (Table 2).
In the nucleotide sequences from the Asiatic bear and polar
bear a tandem repeat composed of 18-bp modules or 9-bp mod-
ules was found using permissive alignment conditions. Under
more stringent search conditions only an 18-bp module struc-
ture was identified. A 36-bp module tandem repeat composed
of two related 18-bp basic units was identified in the gray seal.
Using permissive search conditions we identified tandem re-
peats composed of 18-bp modules in the sequences from the
common raccoon and domestic cat. On increased stringency
search conditions TRF reported the detection of an 18-bp re-
peat structure in the feline sequence, and made no other sug-
gestions. In the mink and ferret, a 9-bp module tandem repeat
was identified. The badger harbored a tandem repeat, which
most appropriately was described as an array of 27-bp mod-
ules. An 18-bp tandem repeat was detected in the European ot-
ter. The 27-bp and 18-bp patterns from the badger and otter
were composed of 9-bp basic units closely related to the
9-bp consensus patterns detected in the mink and ferret. These
four species all belong to the family Mustelidae.
A 36-bp module tandem repeat composed of two related
18-bp basic units was identified in the domestic cow. The tan-
dem repeat of the harbor porpoise was classified as an 18-bp
module structure, while that of the white-beaked dolphin was
better described as being composed of 36-bp modules.
We identified tandem repeats composed of 18-bp modules
in the sequences from the horse species. Two 18-bp consensus
patterns differing at position number 17 were discovered (Table
3). In the proximal portions of the equine sequences small dele-
tions were identified suggesting the repeat structure to start with
two or three 15-bp modules followed by 18-bp modules. Using
high stringency conditions we identified two overlapping tan-
dem repeats, each composed of 18-bp modules in four of the
sequences. In three of these sequences the percent match be-
tween adjacent modules was markedly higher in the down-
stream tandem repeat than it was in the upstream one. More-
over, the GC content was lower in the downstream tandem
repeats due to an increased number of Ts.
Examination of sequences from the mouse and rat did not
reveal a tandem repeat, not even under permissive alignment conditions.
A module size of 18 bp was shared by most of the species
included in this study (Table 3). The number of modules iden-
tified in a tandem repeat was dependent upon the search strin-
gency. Under permissive conditions it varied from about two
in the badger to more than 20 in the horse species. The percent
of matches was low in the tandem repeats from all animal
species except the dog and domestic cat, reflecting a significant
degree of sequence variation between adjacent modules. The
GC-percent ranged from 75 to 88 in the tandem repeats with a
marked overrepresentation of C. This bias for C was particularly
prominent in the tandem repeat from the whale spp. and
Cyclic alignment was used for comparison of the 18-bp con-
sensus patterns from various species, including the 18-bp basic
unit of the 36-bp tandem repeat from the cow and gray seal.
Also included in this analysis were 18-bp modules from the fer-
ret and mink each constructed using two (identical) 9-bp con-
sensus modules. On cyclic alignment of consensus patterns, we
found that the percent identity varied from 63 to 94 between
different species (data not shown), with high values being ob-
served for comparisons of whale with horse and cow. To fur-
ther explore the relationship between the patterns from the dif-
ferent species a phylogenetic tree was constructed (Fig. 2).
Inspection of this tree revealed a phylogenetic group consist-
ing of whales, the horse spp., and cow. Raccoons, cats, and
mustelids could be classified into another group, while the seal
and bear were located in between these two groups.
We were unable to detect repeat motifs in the deduced amino
acid sequences from the two bear species, ferret, badger, and
otter (Table 3). In the sequences from the other species listed
in Table 3, repeat motifs consisting of four to five conserved
amino acids were detected.
The content of disorder-promoting amino acids, namely pro-
line, glycine, asparagine, aspartic acid, and serine (Linding et
al., 2003) amounted roughly to 75% in the tandem repeats from
the various species (Table 4). The entire tandem repeats from
some species such as the domestic dog and humans were in a
potentially disordered state, while stretches with high and low
propensities for disorder were intermingled in the tandem re-
peats from other species. Relatively large ordered stretches were
detected in the tandem repeats from the raccoon, bear, ferret,
badger, and otter. Short disordered stretches protruded from the
tandem repeats into the repeat flanks. In the human sequence
the disordered region consisted of 130 consecutive amino acids.
Only 18 of these were not located within the tandem repeat (7R
allele with 112 residues).
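The content of disorder-promoting amino acids referred to above can be computed as a simple fraction. The sketch below only counts residues and does not reproduce the running-sum calculation of the disorder prediction program; the function name and the example fragment are hypothetical.

```python
# Proline, glycine, asparagine, aspartic acid, and serine (one-letter codes).
DISORDER_PROMOTING = frozenset("PGNDS")


def disorder_promoting_fraction(protein: str) -> float:
    """Fraction of residues that are P, G, N, D, or S."""
    protein = protein.upper()
    if not protein:
        raise ValueError("empty sequence")
    return sum(res in DISORDER_PROMOTING for res in protein) / len(protein)


# Hypothetical 16-residue fragment; 11 of its residues are P, G, N, D, or S.
print(disorder_promoting_fraction("PAPGLPPDPCGSNCAP"))
```

A composition-based fraction of this kind gives one number per sequence, whereas a running sum along the polypeptide strand is needed to locate ordered and disordered stretches.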
Different species varied markedly with respect to number of
potential functional sites in the tandem repeats (Table 4). No-
tably, potential phosphorylation sites were absent in the tandem
repeats from the bear, domestic cat, and humans. This was not
surprising, since these three tandem repeats all lacked serine,
threonine, and tyrosine, that is, the amino acids capable of ac-
cepting a phosphate group. The density of potential phospho-
rylation sites was high in the sequences from the dog.
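The reasoning that a sequence without serine, threonine, or tyrosine cannot contain a potential phosphorylation site can be checked directly. The sketch below lists candidate acceptor residues only; it does not score their sequence context as a dedicated predictor does. The function name and both example fragments are hypothetical.

```python
PHOSPHO_ACCEPTORS = frozenset("STY")  # serine, threonine, tyrosine


def phospho_acceptor_positions(protein: str) -> list:
    """1-based positions of residues able to accept a phosphate group."""
    return [i for i, res in enumerate(protein.upper(), start=1)
            if res in PHOSPHO_ACCEPTORS]


# Hypothetical fragment lacking S, T, and Y: no candidate positions.
print(phospho_acceptor_positions("PAPGLPPDPCGPNCAP"))
# Hypothetical fragment with S, T, and Y at positions 4, 10, and 16.
print(phospho_acceptor_positions("PAPSLPPDPTGPNCAY"))
```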
Binding motifs for Src homology 3 (SH3) domain-contain-
ing proteins were present in tandem repeats from all species,
with a particularly high density in the human tandem repeat
(Table 4). Motifs for binding to WW (named for the conserved
tryptophan–tryptophan residues) domains, that is, motifs con-
taining serine or threonine, were absent in the tandem repeats
from the bear, domestic cat, and humans. PDZ (an acronym of
the first three PDZ-containing proteins identified) domain bind-
ing motifs were relatively abundant in the tandem repeats from
the gray seal, but were lacking in those from the domestic dog and
humans. One or two forkhead-associated (FHA) domain binding
motifs were found in the tandem repeats from some of the
species. The motif PDAI, which contains the recognition for
PDZ domain-containing proteins, and PPDA could be identified
in several of the amino acid repeat patterns, including canids.
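Scanning for proline-containing motifs can be illustrated with the minimal PxxP core that is commonly associated with SH3 domain ligands. This is an assumption made for illustration: the motif definitions applied by the programs used in this study are more specific, and the sketch below is not their method. The function name and the example fragment are hypothetical.

```python
import re


def pxxp_positions(protein: str) -> list:
    """1-based start positions of PxxP cores, counting overlapping matches."""
    # The lookahead matches at every position without consuming characters,
    # so cores that overlap each other are all reported.
    return [m.start() + 1 for m in re.finditer(r"(?=P..P)", protein.upper())]


# Hypothetical fragment with PxxP cores starting at positions 3, 6, and 9.
print(pxxp_positions("PAPGLPPDPCGPNCAP"))
```

Dividing the number of hits by the sequence length gives a density that can be compared between tandem repeats of different sizes.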
DISCUSSION
In this study we have identified and characterized DRD4
exon III tandem repeats from a range of different mammals.
Several new findings were made. First, novel DRD4 exon III
tandem repeats were detected. Second, a more profound char-
TABLE 3. CHARACTERISTICS OF DRD4 EXON III TANDEM REPEATS IN DIFFERENT MAMMALS
Repetitive motif (consensus)a
Asiatic black bear
Dolphin and harbor porpoisee
Horse, zebra, onager, and donkey
aBased upon low stringency search conditions (+2, −3, −5). The positions of some of the repeat starts were shifted relative to others, implying that their consensus patterns are not
directly comparable. Codons are delimited by dots. If relevant, consensus patterns were split into their basic units (listed in different rows one above another).
bServes as a measure of the degree of identity between adjacent modules.
cGC-compositional strand bias was calculated as (C − G)/(C + G), where C and G denote the number of cytosine and guanine residues, respectively.
dPresence of repetitive motifs (separated or in tandem) consisting of at least four consecutive amino acids in the translated sequences.
eFor the sake of simplicity the tandem repeats from both cetacean species were classified as 18-bp repeat structures.
acterization of previously identified tandem repeats was pro-
vided. Third, we found links between the amino acid composi-
tion of the tandem repeats and the function of DRD4. All iden-
tified tandem repeats were GC-rich, a property typical of
classical minisatellites (Vergnaud and Denoeud, 2000), but dif-
fered profoundly otherwise.
While previous studies described the canine tandem repeat
as a composite structure consisting of modules of 39, 27, and
12 bp (Niimi et al., 2001; Ito et al., 2004), our description with
only two different module sizes, namely, 12 and 15 bp was sim-
pler. This mode of perceiving the tandem repeat in exon III of
the canine DRD4 gene may prove helpful to future studies of
A module size of 18 bp was found in the majority of the
FIG. 2. Phylogenetic tree of 18-bp consensus patterns from different mammals. The
analysis was carried out using MEGA version 3. The 18-bp mink pattern included in
this analysis consisted of two 9-bp consensus modules from this species. Another
18-bp pattern was constructed for the analysis using two 9-bp patterns from ferret.
Since the 9-bp patterns from these two species were present in the 27-bp pattern
from badger, the latter was not included in the analysis. Note the presence of a
phylogenetic group consisting of the cow, whale, and horse. Also, a group consisting
of the raccoon, cat, and mustelids could be distinguished. The seal and bear were
located in between these two groups. Slightly different results were obtained with
ClustalX, which suggested a shorter distance between the whale and horse than
between the whale and cow.
TABLE 4. STRUCTURAL DISORDER AND POTENTIAL FUNCTIONAL SITES IN AMINO ACID SEQUENCES
OF DRD4 EXON III TANDEM REPEATS FROM VARIOUS MAMMALIAN SPECIES
with a high
aNucleotide repeat regions were detected by TRF and translated.
bRegions suspected of being structurally disordered were identified and expressed as fractions of the total length of the
cNumbers of potential functional sites in the tandem repeats are listed. In parentheses their densities were calculated as num-
ber of sites per 10 nucleotides.
dData are for sequence AB030234. Similar results were obtained with the other sequences from the domestic dog and from the
eData are for sequence AB080627. Similar results were obtained with the other equine sequences.
fA single SH2 domain binding motif was identified.
Number and densities of potential functional sitesc
species included in this study, including the Asiatic bear, com-
mon raccoon, and domestic cat. Based upon analyses of amino
acid sequences a previous study failed to reveal a tandem re-
peat in these species (Inoue-Murayama et al., 2002). Most
likely, this reflects the accumulation of large numbers of non-
synonymous single nucleotide substitutions in the repeat mod-
ules, erasing the repeat structure in the polypeptide strands. The
tandem repeats from the gray seal and cow were composed of
36-bp modules, each consisting of two related 18-bp basic units.
Possibly, these 36-bp structures have evolved by simultaneous
duplication of two such basic modules. Similarly, the tandem
repeat in the dolphin may have evolved by duplication of an
18-bp basic module.
A DRD4 exon III tandem repeat has not been reported in the
domestic ferret, mink, European otter, and European badger
previously. We identified a tandem repeat composed of a 9-bp
basic unit in sequences from all four mustelid species. This 9-
bp basic unit could be the ancestral module. Later in
evolution, two or three neighboring blocks of this 9-bp unit might have
been subjected to duplication events in some mustelid species,
leading to the emergence of 18- and 27-bp repeat modules.
A tandem repeat composed of 18-bp modules ranging from
3 to 9 in number has previously been reported in the equine
species (Hasegawa et al., 2002). We identified a markedly
higher number of repeat modules in the same sequences. The
detection of two overlapping tandem repeats with different con-
sensus patterns in some of these sequences probably reflects a
polarity of mutation events resulting in the divergence of an
ancestral tandem repeat into two distinct but partially overlapping
segments. The lower degree of similarity between adjacent
copies in the upstream segment of the tandem repeat than in
the downstream segment suggests that expansion events have
taken place more recently in the latter of these two.
We confirmed the notion that rodents lack a tandem repeat
in exon III of their DRD4 gene (O’Malley et al., 1992; Fish-
burn et al., 1995). Apparently, this tandem repeat emerged af-
ter divergence of rodents from other mammalian lineages. Our
phylogenetic analysis showed that the 18-bp repeat modules
from closely related species displayed a higher degree of sim-
ilarity than they did with more distantly related species. How-
ever, it is important to emphasize that the 18-bp module size
was detected in several evolutionarily distinct lineages such as
the ungulate and carnivore lineages.
To our knowledge, this study is the first to report structural
disorder in DRD4. Unstructured regions have been detected in
other receptors previously, including the glucocorticoid recep-
tor (Dunker et al., 2002). Molecular recognition involving struc-
tural disorder in proteins has several advantages. For example,
it increases the binding diversity allowing a protein to interact
with numerous partners without loss of binding specificity. We
found that the entire amino acid sequence of the DRD4 exon
III tandem repeat exhibited high propensities for disorder in
some species, while the propensity for disorder was lower in
the other species. The finding that an amino acid repeat motif
was absent in the majority of the tandem repeats with the low-
est propensity for disorder associates the degree of conserva-
tion of the repeat structure in the polypeptide strand with the
amount of structural disorder.
The translated tandem repeats contained several motifs of
potential functional importance. Signal transduction is usually
dependent upon interaction between a variety of proteins, in-
cluding adapter proteins, which participate in the formation of
multiprotein complexes. Various functional sites are involved
in the formation of such complexes, for example, SH3 domains
and WW domains for binding to proline-rich sequences (Mayer,
2001; Zarrinpar et al., 2003), PDZ domains (Hung and Sheng,
2002) and FHA domains (Durocher and Jackson, 2002). Previ-
ous observations suggested that the repeat sequence in the hu-
man DRD4 has a modulatory effect on the interaction with SH3
domain-containing proteins (Oldenhof et al., 1998). Also, the
various motifs in the tandem repeats from the other species
might be involved in the modulation of function of DRD4.
It is a distinct possibility that the marked interspecies differences
in the number of potential functional sites in the tandem repeats have
implications for the function of DRD4 and translate into interspecies
differences in receptor function.
In summary, we have identified and characterized a tandem
repeat in DRD4 exon III from different mammals. Marked in-
terspecies differences existed in the composition of this tandem
repeat, suggesting that it has been subjected to strong evolu-
tionary forces and represents a source for development of new
variants. Our observations create a basis for future studies of
the evolution of the DRD4 exon III tandem repeat, and they
raise the question of whether there is an association between
the architecture of this tandem repeat and the function of
DRD4 in different phylogenetic lineages.
ACKNOWLEDGMENTS
We thank Drs. Mikkel Schierup, Bioinformatics Research
Center, University of Aarhus, Denmark, and Rune Linding,
Samuel Lunenfeld Research Institute, Mt. Sinai Hospital,
Toronto, for much helpful advice.
REFERENCES
ASGHARI, V., SANYAL, S., BUCHWALDT, S., PATERSON, A.,
JOVANOVIC, V., and VAN TOL, H.H. (1995). Modulation of in-
tracellular cyclic AMP levels by different human dopamine D4 re-
ceptor variants. J. Neurochem. 65, 1157–1165.
BENSON, G. (1999). Tandem repeats finder: A program to analyze
DNA sequences. Nucleic Acids Res. 27, 573–580.
BLOM, N., GAMMELTOFT, S., and BRUNAK, S. (1999). Sequence-
and structure-based prediction of eukaryotic protein phosphorylation
sites. J. Mol. Biol. 294, 1351–1362.
BOIS, P., and JEFFREYS, A.J. (1999). Minisatellite instability and
germline mutation. Cell. Mol. Life Sci. 55, 1636–1648.
DUNKER, A.K., BROWN, C.J., LAWSON, J.D., IAKOUCHEVA,
L.M., and OBRADOVIC, Z. (2002). Intrinsic disorder and protein
function. Biochemistry 41, 6573–6582.
DUROCHER, D., and JACKSON, S.P. (2002). The FHA domain.
FEBS Lett. 513, 58–66.
FISHBURN, C.S., CARMON, S., and FUCHS, S. (1995). Molecular
cloning and characterisation of the gene encoding the murine D4
dopamine receptor. FEBS Lett. 361, 215–219.
HASEGAWA, T., SATO, F., and ISHIDA, N. (2002). Determination
and variability of nucleotide sequences for D4 dopamine receptor
genes (DRD4) in genus Equus. J. Equine Sci. 13, 57–62.
HOLMES, J., PAYTON, A., BARRETT, J., HARRINGTON, R.,
MCGUFFIN, P., OWEN, M., OLLIER, W., WORTHINGTON, J.,
GILL, M., KIRLEY, A., et al. (2002). Association of DRD4 in chil-
dren with ADHD and comorbid conduct problems. Am. J. Med.
Genet. 114, 150–153.
HUNG, A.Y., and SHENG, M. (2002). PDZ domains: Structural mod-
ules for protein complex assembly. J. Biol. Chem. 277, 5699–5702.
IAKOUCHEVA, L.M., BROWN, C.J., LAWSON, J.D.,
OBRADOVIC, Z., and DUNKER, A.K. (2002). Intrinsic disorder in
cell-signaling and cancer-associated proteins. J. Mol. Biol. 323,
INOUE-MURAYAMA, M., TAKENAKA, O., and MURAYAMA, Y.
(1998). Origin and divergence of tandem repeats of primate D4
dopamine receptor genes. Primates 39, 217–224.
INOUE-MURAYAMA, M., MATSUURA, N., MURAYAMA, Y.,
TSUBOTA, T., IWASAKI, T., KITAGAWA, H., and ITO, S. (2002).
Sequence comparison of the dopamine receptor D4 exon III repeti-
tive region in several species of the order Carnivora. J. Vet. Med.
Sci. 64, 747–749.
ITO, H., NARA, H., INOUE-MURAYAMA, M., SHIMADA, M.K.,
KOSHIMURA, A., UEDA, Y., KITAGAWA, H., TAKEUCHI, Y.,
MORI, Y., MURAYAMA, Y., et al. (2004). Allele frequency dis-
tribution of the canine dopamine receptor D4 gene exon III and I in
23 breeds. J. Vet. Med. Sci. 66, 815–820.
KLUGER, A.N., SIEGFRIED, Z., and EBSTEIN, R.P. (2002). A meta-
analysis of the association between DRD4 polymorphism and nov-
elty seeking. Mol. Psychiatry 7, 712–717.
KUMAR, S., TAMURA, K., and NEI, M. (2004). MEGA3: Integrated
software for molecular evolutionary genetics analysis and sequence
alignment. Brief. Bioinform. 5, 150–163.
LINDING, R., RUSSELL, R.B., NEDUVA, V., and GIBSON, T.J.
(2003). GlobPlot: Exploring protein sequences for globularity and
disorder. Nucleic Acids Res. 31, 3701–3708.
LIVAK, K.J., ROGERS, J., and LICHTER, J.B. (1995). Variability of
dopamine D4 receptor (DRD4) gene sequence within and among non-
human primate species. Proc. Natl. Acad. Sci. USA 92, 427–431.
MATSUMOTO, M., HIDAKA, K., TADA, S., TASAKI, Y., and YA-
MAGUCHI, T. (1995). Polymorphic tandem repeats in dopamine D4
receptor are spread over primate species. Biochem. Biophys. Res.
Commun. 207, 467–475.
MAYER, B.J. (2001). SH3 domains: Complexity in moderation. J. Cell
Sci. 114, 1253–1263.
MILLET, B., CHABANE, N., DELORME, R., LEBOYER, M.,
LEROY, S., POIRIER, M.F., BOURDEL, M.C., MOUREN-SIME-
ONI, M.C., ROUILLON, F., LOO, H., et al. (2003). Association be-
tween the dopamine receptor D4 (DRD4) gene and obsessive-com-
pulsive disorder. Am. J. Med. Genet. 116B, 55–59.
NIIMI, Y., INOUE-MURAYAMA, M., MURAYAMA, Y., ITO, S., and
IWASAKI, T. (1999). Allelic variation of the D4 dopamine recep-
tor polymorphic region in two dog breeds, Golden retriever and
Shiba. J. Vet. Med. Sci. 61, 1281–1286.
NIIMI, Y., INOUE-MURAYAMA, M., KATO, K., MATSUURA, N.,
MURAYAMA, Y., ITO, S., MOMOI, Y., KONNO, K., and
IWASAKI, T. (2001). Breed differences in allele frequency of the
dopamine receptor D4 gene in dogs. J. Hered. 92, 433–436.
OBENAUER, J.C., CANTLEY, L.C., and YAFFE, M.B. (2003). Scan-
site 2.0: Proteome-wide prediction of cell signaling interactions us-
ing short sequence motifs. Nucleic Acids Res. 31, 3635–3641.
OLDENHOF, J., VICKERY, R., ANAFI, M., OAK, J., RAY, A.,
SCHOOTS, O., PAWSON, T., VON ZASTROW, M., and VAN
TOL, H.H. (1998). SH3 binding domains in the dopamine D4 re-
ceptor. Biochemistry 37, 15726–15736.
O’MALLEY, K.L., HARMON, S., TANG, L., and TODD, R.D. (1992).
The rat dopamine D4 receptor: Sequence, gene structure, and demon-
stration of expression in the cardiovascular system. New Biol. 4,
PUNTERVOLL, P., LINDING, R., GEMÜND, C., CHABANIS-
DAVIDSON, S., MATTINGSDAL, M., CAMERON, S., MARTIN,
D. M. A., AUSIELLO, G., BRANNETTI, B., COSTANTINI, A., et
al. (2003). ELM server: A new resource for investigating short func-
tional sites in modular eukaryotic proteins. Nucleic Acids Res. 31,
SCHOOTS, O., and VAN TOL, H.H. (2003). The human dopamine D4
receptor repeat sequences modulate expression. Pharmacogenom. J.
TARAZI, F.I., and BALDESSARINI, R.J. (1999). Brain dopamine D(4)
receptors: Basic and clinical status. Int. J. Neuropsychopharmacol.
THOMPSON, J.D., GIBSON, T.J., PLEWNIAK, F., JEANMOUGIN,
F., and HIGGINS, D.G. (1997). The ClustalX windows interface:
Flexible strategies for multiple sequence alignment aided by quality
analysis tools. Nucleic Acids Res. 25, 4876–4882.
TOMPA, P. (2003). Intrinsically unstructured proteins evolve by repeat
expansion. Bioessays 25, 847–855.
VAN CRAENENBROECK, K., CLARK, S.D., COX, M.J., OAK, J.N.,
LIU, F., and VAN TOL, H.H. (2005). Folding efficiency is rate-lim-
iting in dopamine D4 receptor biogenesis. J. Biol. Chem. 280,
VERGNAUD, G., and DENOEUD, F. (2000). Minisatellites: Mutabil-
ity and genome architecture. Genome Res. 10, 899–907.
WRIGHT, P., and DYSON, H. (1999). Intrinsically unstructured pro-
teins: Re-assessing the protein structure–function paradigm. J. Mol.
Biol. 293, 321–331.
ZARRINPAR, A., BHATTACHARYYA, R.P., and LIM, W.A. (2003).
The structure and function of proline recognition domains. Sci STKE.
Address reprint requests to:
Henrik Berg Rasmussen, Ph.D.
Research Institute of Biological Psychiatry
H:S Sct. Hans Hospital
DK-4000 Roskilde, Denmark
Received for publication February 3, 2005; received in revised
form March 23, 2005; accepted June 6, 2005.