Signature of recent historical events in the European Y-chromosomal STR haplotype distribution.
ABSTRACT Previous studies of human Y-chromosomal single-nucleotide polymorphisms (Y-SNPs) established a link between the extant Y-SNP haplogroup distribution and the prehistoric demography of Europe. By contrast, our analysis of seven rapidly evolving Y-chromosomal short tandem repeat loci (Y-STRs) in over 12,700 samples from 91 different locations in Europe reveals a signature of more recent historic events, not previously detected by other genetic markers. Cluster analysis based upon molecular variance yields two clearly identifiable sub-clusters of Western and Eastern European Y-STR haplotypes, and a diverse transition zone in central Europe, where haplotype spectra change more rapidly with longitude than with latitude. This and other observed patterns of Y-STR similarity may plausibly be related to particular historical incidents, including, for example, the expansion of the Franconian and Ottoman Empires. We conclude that Y-STRs may be capable of resolving male genealogies to an unparalleled degree and could therefore provide a useful means to study local population structure and recent demographic history.
ORIGINAL INVESTIGATION
Lutz Roewer · Peter J. P. Croucher · Sascha Willuweit ·
Tim T. Lu · Manfred Kayser · Rüdiger Lessig ·
Peter de Knijff · Mark A. Jobling · Chris Tyler-Smith ·
Michael Krawczak
Signature of recent historical events in the European
Y-chromosomal STR haplotype distribution
Received: 13 May 2004/ Accepted: 13 September 2004/Published online: 20 January 2005
© Springer-Verlag 2005
Abstract Previous studies of human Y-chromosomal
single-nucleotide polymorphisms (Y-SNPs) established
a link between the extant Y-SNP haplogroup distribution
and the prehistoric demography of Europe.
By contrast, our analysis of seven rapidly evolving
Y-chromosomal short tandem repeat loci (Y-STRs)
in over 12,700 samples from 91 different locations in
Europe reveals a signature of more recent historic
events, not previously detected by other genetic
markers. Cluster analysis based upon molecular variance
yields two clearly identifiable sub-clusters of
Western and Eastern European Y-STR haplotypes,
and a diverse transition zone in central Europe, where
haplotype spectra change more rapidly with longitude
than with latitude. This and other observed patterns
of Y-STR similarity may plausibly be related to particular
historical incidents, including, for example, the
expansion of the Franconian and Ottoman Empires.
We conclude that Y-STRs may be capable of resolving
male genealogies to an unparalleled degree and could
therefore provide a useful means to study local population
structure and recent demographic history.
Introduction
The population dynamic processes that created the
subtle gradation pattern of European culture, and
eventually shaped the continent’s genetic structure, are
unparalleled in world history. It is widely accepted that
extant Europeans have their earliest roots in the scattered
Palaeolithic hunter-gatherer communities of around
40,000–43,000 years ago (Boyd and Silk 1997).
Some 10,000 years before present (YBP), the rise of
agriculture in the Near East led to the migration into
Europe of a rapidly expanding farming population, although
the extent of concomitant genetic change has
been a matter of some dispute (Ammerman and Cavalli-Sforza
1984; Chikhi et al. 2002). Archaeological evidence
suggests that the oldest rural villages in Europe
developed on the western coast of the Aegean Sea and in
L. Roewer and P.J.P. Croucher contributed equally to this paper.
L. Roewer · S. Willuweit
Institute of Legal Medicine, Humboldt-University,
Berlin, Germany
P. J. P. Croucher
First Department of Medicine,
Christian-Albrechts-University, Kiel,
Germany
P. J. P. Croucher · T. T. Lu · M. Krawczak (✉)
Institute of Medical Informatics and Statistics,
Christian-Albrechts-University, Brunswiker Strasse 10,
Kiel, Germany
E-mail: krawczak@medinfo.uni-kiel.de
Tel.: +49-431-5973200
Fax: +49-431-5973193
M. Kayser
Max-Planck-Institute for Evolutionary Anthropology,
Leipzig, Germany
M. Kayser
Department of Forensic Molecular Biology,
Erasmus University, Rotterdam, The Netherlands
R. Lessig
Institute of Legal Medicine,
University of Leipzig, Germany
P. de Knijff
Forensic Laboratory for DNA Research,
Leiden University, The Netherlands
M. A. Jobling
Department of Genetics,
University of Leicester, UK
C. Tyler-Smith
The Wellcome Trust Sanger Institute,
Hinxton, UK
Hum Genet (2005) 116: 279–291
DOI 10.1007/s00439-004-1201-z
Crete around 7,000 YBP. From there, agriculture spread
to most of the Balkan Peninsula and unfolded west-
wards via the Vardar-Danube-Rhine corridor and along
the northern coast of the Mediterranean.
For thousands of years the South-East of Europe
was the most developed part of the continent, whilst
north of the Alps non-imperial and rather autono-
mous communities predominated. With the decline of
the Roman Empire, around 2,000 YBP, the Teutons
started to expand westwards, and the amalgamation of
the Roman and Germanic civilisations marked the
origin of the extant ‘‘occidental’’ European popula-
tions (Banniard 1989). By contrast, the eastern part of
Europe always provided a gateway for migration and
invasion by nomadic populations, from the Kurgan
expansion around 7,000 YBP to the east-west move-
ments of the Scythians, Mongols, Huns, Avars, Alanes
and Magyars in historical times. Originating from the
upper Dnjepr, the Slavs had settled on the southern
shores of the Baltic Sea and replaced the Germanic
speakers in Hungary and the Balkans during the sixth
century AD (Gimbutas 1971). Their expansion was
only halted by the new European powerhouse, the
Franconian Empire. The political divide between these
two cultures was marked by the rivers Elbe, Danube
and Save, a split that has influenced Europe ever since
and is still obvious today.
The population genetic consequences of the Euro-
pean colonisation and re-colonisation during Palaeo-
lithic and Neolithic times have been addressed by a
number of studies that utilised slowly evolving single-
nucleotide polymorphisms (SNPs) on the Y chromo-
some (Rosser et al. 2000; Semino et al. 2000). Owing
to its large number of polymorphic sites and its par-
ticular sensitivity to genetic drift (Seielstad et al. 1998;
Kayser et al. 2001), the non-recombining part of the Y
chromosome is especially useful for the investigation
of population movements. Biallelic polymorphisms
have proven useful for the analysis of prehistoric
events, since the ancestral and derived sequence vari-
ants have had enough time to evolve independently.
However, in order to investigate the impact of politi-
cal, religious and cultural incidents in historical times,
such as the so-called ‘‘Making of Europe’’ between
950 and 1350 AD (Bartlett 1994), greater resolution is
required to distinguish the relevant patrilineal geneal-
ogies. The ability of hypervariable Y-chromosomal
short tandem repeat (Y-STR) haplotypes to discrimi-
nate between even closely related or co-localised male
populations has been demonstrated before for Ger-
mans and Dutch (Roewer et al. 1996), for the Baltic
populations (Lessig et al. 2001), for Central England
and North Wales (Weale et al. 2002) and for Poland
and Germany (Ploski et al. 2002). In the present
study, we genotyped over 12,700 Europeans at seven
Y-STR loci and assessed whether the geographical
haplotype distribution of these fast-evolving, male-
specific markers indeed reflects recent history. Samples
were derived from 91 different populations, spread
across the culturally most diverse regions of the con-
tinent and representing the most extensive survey of
human Y-chromosomal diversity undertaken in any
population or group of populations to date.
Material and methods
DNA samples
DNA samples were obtained from 12,727 “white”
European males through 91 local recruitment units
(Fig. 1). Nearly all individuals were ascertained via a
judicial or private paternity case, and care was taken
within the confines of general forensic practice that no
closely related males or males of non-local origin were
included in the study. The samples therefore represent
an unbiased cross-section of that part of the respective
male population that would have been liable to a
paternity dispute. All samples were logged in the Y-STR
haplotype reference database (YHRD), maintained at
the Institute of Legal Medicine, Humboldt-University of
Berlin, Germany, and made publicly available via the
Internet. The names of the samples and meta-samples
(see legend to Fig. 1) were chosen pragmatically without
any prior causal or explanatory (e.g. linguistic) rela-
tionship in mind.
Y-STR and Y-SNP genotyping
All samples were genotyped for six tetranucleotide
Y-STRs (DYS19, DYS389I, DYS389II, DYS390,
DYS391, DYS393) and one trinucleotide Y-STR
(DYS392), following published protocols (Kayser et al.
1997). In accordance with recommendations by the
International Society of Forensic Genetics (Gill et al.
2001), Y-STR alleles were designated according to the
number of variable repeats included. Consistent allele
designation was assured by the use of allelic ladders in
all participating laboratories. Note that, in contrast to
the common practice in forensic databases, our allele
designation at DYS389I and DYS389II refers to the
repeat number at individual loci, and not the repeat
numbers revealed by the multiplex genotyping method
employed (Rolf et al. 1998). Prior to inclusion in the
study, all laboratories had to pass a quality test that
involved blind genotyping of five control samples.
Additional information about markers, laboratories and
sampling procedures is available at the YHRD web site
and from the Forensic Laboratory for DNA Research in
Leiden, The Netherlands.
Genotyping of Y-chromosomal SNPs in the Baltic
and Dutch samples allowed the assignment of each
individual chromosome to one of the most common
European Y-SNP haplogroups. Multiplex PCR and
SNaPshot minisequencing reactions were carried out as
described (Lessig et al. 2004). All products were analyzed
on ABI Prism 310 or 3100 Avant Genetic Analyzers.
Spatial autocorrelation analysis
Spatial autocorrelation analysis of haplotypes was per-
formed using Moran’s I index (Sokal and Oden 1978).
Parameter I measures the correlation between observa-
tions of the same type, made at locations of a given
geographical distance. Here, we used the repeat number
of each Y-STR locus as the primary observation. The
pair-wise great circle distances (GCD) between haplo-
type origins (i.e. recruitment units) were determined
using Cartesian coordinates obtained from the 2004
version of the CIA World Factbook and other sources in
the public domain. We chose not to employ II, an allele
frequency-based adaptation of Moran’s I to genetic
analyses suggested by Barbujani (1987), since STR re-
peat numbers evolving through single-step mutation
represent suitable quantities for analysis on their own.
Autocorrelation was evaluated over 11 distance classes,
including one class for haplotypes found in the same
location (i.e. GCD equal to zero) and ten additional,
equally frequent distance classes. Moran’s I values were
assessed for statistical significance using a randomisa-
tion test that permuted samples over locations in 100
replications.
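The calculation described above can be illustrated with a minimal Python sketch. It is not the software used in the study: the function names, the haversine approximation with a mean Earth radius of 6,371 km, the two-sided permutation P value and the explicit distance bounds passed by the caller are all assumptions made for illustration. The input is the repeat number at a single Y-STR locus for each haplotype, together with the (latitude, longitude) of its recruitment unit.

```python
import math
import random

def great_circle_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Haversine great-circle distance between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlam / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(min(1.0, a)))

def morans_i(values, coords, d_min, d_max):
    """Moran's I over all ordered pairs i != j whose distance lies in [d_min, d_max]."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    denom = sum(d * d for d in dev)
    num, weight_sum = 0.0, 0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            dist = great_circle_km(*coords[i], *coords[j])
            if d_min <= dist <= d_max:  # binary weight: pair belongs to this class
                num += dev[i] * dev[j]
                weight_sum += 1
    if weight_sum == 0 or denom == 0:
        return float("nan")
    return (n / weight_sum) * num / denom

def permutation_p(values, coords, d_min, d_max, reps=100, seed=0):
    """Two-sided randomisation P value: repeat numbers are shuffled over locations."""
    rng = random.Random(seed)
    observed = abs(morans_i(values, coords, d_min, d_max))
    shuffled = list(values)
    hits = 0
    for _ in range(reps):
        rng.shuffle(shuffled)
        if abs(morans_i(shuffled, coords, d_min, d_max)) >= observed:
            hits += 1
    return (hits + 1) / (reps + 1)

# Toy data (hypothetical): two haplotypes each at two locations.
repeats = [10, 10, 11, 11]
places = [(52.5, 13.4), (52.5, 13.4), (40.4, -3.7), (40.4, -3.7)]
same_location_i = morans_i(repeats, places, 0.0, 0.0)
distant_i = morans_i(repeats, places, 1.0, 5000.0)
```

In this toy example the same-location class gives a strongly positive index and the distant class a strongly negative one, which is the qualitative pattern expected under isolation by distance.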
Analysis of molecular variance
The genetic relationship between different populations
was assessed by means of ΦST, an analogue of Wright's
FST that takes the evolutionary distance between individual
haplotypes into account (Excoffier et al. 1992;
Excoffier and Smouse 1994). Estimates of ΦST were obtained
using the Arlequin software (Schneider et al. 2000)
and tested for statistical significance by means of randomisation
(1,000 replicates per comparison). Several
local population comparisons yielded exceptionally low
ΦST values, suggesting that the respective samples were
genetically indistinguishable. Population samples were,
therefore, recursively clustered into meta-samples, based
upon ΦST. In each clustering step, that pair of populations
or clusters that yielded the minimum ΦST value was
grouped together. Clustering was performed in a two-tiered
fashion, in that groupings were first confined to
within one and the same country, and grouping continued
until no further insignificant clustering (i.e. P>0.05 for
the respective ΦST) was possible within any country. This
procedure led to the definition of regional meta-samples
(Fig. 1) which were more comparable to samples from
countries with only one recruitment unit. Meta-samples
Fig. 1 Geographical origin of 91 European male DNA samples.
The colour coding of meta-samples, formed by clustering samples
within a country according to minimum ΦST, is given in the inset.
Black circles mark singleton samples which were not included in
any meta-sample. Meta-samples include “Spain” (Andalucia,
Aragon, Asturias, Caceres, Cantabria, Catalonia, Galicia, Madrid,
Valencia), “Portugal” (North, Central and South Portugal,
Madeira), “South Holland” (Limburg, Zeeland), “North Holland”
(Friesland, Groningen, Leiden), “Sweden” (Blekinge, Gotland,
Ostergoetland, Skaraborg, Stockholm, Uppsala, Vaermland),
“Norway” (North, Central, East, West and South Norway, Oslo),
“West Germany” (Cologne, Düsseldorf, Freiburg, Hamburg,
Mainz, Münster, Stuttgart), “North-East Germany” (Berlin,
Rostock), “South-East Germany” (Chemnitz, Greifswald, Leipzig,
Magdeburg, Munich), “Central Italy” (Lombardy, Marche, Tuscany,
Umbria), “West Italy” (Latium, Liguria, Puglia), and
“Poland” (Bydgoszcz, Gdansk, Krakow, Lublin, Warsaw, Wroclaw)
and the remaining individual samples formed the basis of
a second, exhaustive round of ΦST-based clustering.
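The within-country tier of this procedure can be sketched as a greedy loop. The sketch below is one possible reading of the description, not the authors' implementation: at each step it merges, among all pairs whose difference is not significant (P > 0.05), the pair with the smallest distance, and it stops when no such pair remains. The `distance` callback is a hypothetical stand-in that must return the pair-wise genetic distance together with its randomisation P value; in practice these would come from an AMOVA package. The function name `cluster_insignificant` and the toy distance are likewise assumptions.

```python
from itertools import combinations

def cluster_insignificant(samples, distance, alpha=0.05):
    """Greedily pool samples whose pair-wise distance is not significant.

    samples:  dict mapping sample name -> list of observations (haplotypes)
    distance: callable(obs_a, obs_b) -> (distance_value, p_value)  [hypothetical]
    Returns a list of clusters, each a sorted list of sample names.
    """
    clusters = [([name], list(data)) for name, data in samples.items()]
    while len(clusters) > 1:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            value, p = distance(clusters[a][1], clusters[b][1])
            # Only insignificant pairs are eligible; keep the minimum distance.
            if p > alpha and (best is None or value < best[0]):
                best = (value, a, b)
        if best is None:
            break  # every remaining comparison is significant
        _, a, b = best
        merged = (clusters[a][0] + clusters[b][0], clusters[a][1] + clusters[b][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)] + [merged]
    return [sorted(names) for names, _ in clusters]

def toy_distance(x, y):
    """Illustrative stand-in only: difference of means with a fake P value."""
    diff = abs(sum(x) / len(x) - sum(y) / len(y))
    return diff, (1.0 if diff < 1.0 else 0.0)

toy_samples = {"A": [1, 1, 1], "B": [1, 2, 1], "C": [5, 5, 5]}
toy_clusters = cluster_insignificant(toy_samples, toy_distance)
```

Applying the function separately to the samples of each country would reproduce the first tier; the second, exhaustive round differs in that merging continues regardless of significance, which is recorded for each grouping instead of being used as a stopping rule.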
Multidimensional scaling analysis
In order to graphically visualise the Y-STR genetic
landscape of Europe, principal coordinates were identified
by subjecting the pair-wise ΦST estimates between all
91 samples to a multidimensional scaling analysis
(MDS) using SPSS (SPSS, Chicago Ill., USA). Individual
solutions for one through six dimensions were iterated
until the improvement in stress was less than
0.0001. The optimum dimensionality was then determined
from a “scree” test (Table 1). A clear “elbow”
was observed for the three-dimensional solution, with
higher-dimensional solutions not providing a substantial
decrease in Kruskal's stress value, S. The R² values
(Table 1) indicate that 96.4% of the variance in the data
can be accounted for by the three-dimensional solution.
The resulting coordinates were then used to generate
three independent maps, one for each dimension, in
which MDS coordinates were interpolated across
Cartesian coordinates by inverse distance squared
weighting with the 12 nearest neighbour samples, using
the GRASS (Geographical Resources Analysis Support
System) software. The European land mass was masked
from oceans and seas using the GTOPO30 (30-arc sec-
onds topographical) dataset, publicly available at the US
Land Processes Distributed Active Archive Center. For
better resolution, some sample locations were moved to
the centre of the respective country of origin, especially
when only one location was included for that country
(e.g. Great Britain, Finland, Hungary etc.).
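The interpolation step can be illustrated by a short Python sketch of inverse distance squared weighting over the nearest neighbours. This is not the GRASS routine itself: the function name, the use of plain Euclidean distance on the map coordinates and the rule of returning the sample value when the query point coincides with a sample location are assumptions.

```python
def idw_interpolate(x, y, samples, k=12, power=2.0):
    """Inverse-distance-weighted value at (x, y).

    samples: list of (sx, sy, value) tuples, e.g. one MDS coordinate per location.
    k:       number of nearest neighbours used (12 in the text above).
    power:   distance exponent (2 gives inverse distance squared weighting).
    """
    nearest = sorted(samples, key=lambda s: (s[0] - x) ** 2 + (s[1] - y) ** 2)[:k]
    num = den = 0.0
    for sx, sy, value in nearest:
        d2 = (sx - x) ** 2 + (sy - y) ** 2
        if d2 == 0.0:
            return value  # query point sits exactly on a sample location
        weight = 1.0 / d2 ** (power / 2.0)
        num += weight * value
        den += weight
    return num / den

# Toy example (hypothetical values): two sample locations on a line.
toy_points = [(0.0, 0.0, 1.0), (2.0, 0.0, 3.0)]
midpoint_value = idw_interpolate(1.0, 0.0, toy_points)
near_first_value = idw_interpolate(0.5, 0.0, toy_points)
```

Evaluating the function on a regular grid, once per MDS dimension, would yield one interpolated map per dimension.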
The first dimension of the MDS revealed a strong
east-west gradient for the Y-STR genetic structure. In
order to corroborate this finding more systematically,
Spearman rank correlation coefficients were calculated
for each of the 91 original samples between pair-wise
ΦST and the longitudinal and latitudinal geographical
distance, respectively, to all other samples. Singletons
and meta-samples were also subjected to a pseudo-admixture
analysis relative to the “Western Europe”
and “Eastern Europe” sub-clusters that become clearly
identifiable in the ΦST-based cluster analysis (boxed in
Fig. 2). Each haplotype that occurred in at least one of
the two fringe sub-clusters was labelled either ‘‘Western’’
or ‘‘Eastern’’, depending upon where it was relatively
more frequent. All haplotypes that were one-step
neighbours of an Eastern or a Western haplotype of at
least 1% frequency in the respective sub-cluster were
assigned the same origin. All other European samples
Table 1 Multidimensional scaling and “scree” test of pair-wise ΦST
values (S Kruskal's stress value, R² coefficient of determination)
No. of dimensions    S        R²
1                    0.211    0.887
2                    0.136    0.944
3                    0.106    0.964
4                    0.094    0.968
5                    0.084    0.972
6                    0.076    0.975
Fig. 2 Clustering by minimum Y-STR-based ΦST of 45 male
European samples and meta-samples. The significance of
individual groupings is indicated by vertical bars below
the dendrogram (top line randomisation P<0.05, bottom
line P<0.001)
and meta-samples were then characterised in terms of
the relative proportion of the fringe haplotypes.
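The labelling rule of the pseudo-admixture analysis can be sketched as follows. Several details are not specified in the text and are assumptions of this sketch: haplotypes with equal relative frequency in both fringe sub-clusters are left unlabelled, only haplotypes not yet labelled can inherit a label from a one-step neighbour, and a haplotype neighbouring anchors of both origins stays unclassified. Function and variable names are hypothetical, and the toy haplotypes use three loci instead of seven for brevity.

```python
def is_one_step(a, b):
    """True if two haplotypes differ by one repeat unit at exactly one locus."""
    diffs = [abs(x - y) for x, y in zip(a, b) if x != y]
    return diffs == [1]

def label_haplotypes(west, east, candidates, min_freq=0.01):
    """Assign 'Western' or 'Eastern' labels.

    west, east: dict haplotype -> count in the respective fringe sub-cluster
    candidates: haplotypes seen elsewhere that may inherit a label
    """
    n_w, n_e = sum(west.values()), sum(east.values())
    labels = {}
    for h in set(west) | set(east):
        fw, fe = west.get(h, 0) / n_w, east.get(h, 0) / n_e
        if fw != fe:  # assumption: ties remain unlabelled
            labels[h] = "Western" if fw > fe else "Eastern"
    # Anchors: labelled haplotypes with at least 1% frequency in their own sub-cluster.
    anchors = []
    for h, lab in labels.items():
        freq = west.get(h, 0) / n_w if lab == "Western" else east.get(h, 0) / n_e
        if freq >= min_freq:
            anchors.append((h, lab))
    for h in candidates:
        if h in labels:
            continue
        origins = {lab for anchor, lab in anchors if is_one_step(h, anchor)}
        if len(origins) == 1:  # assumption: conflicting neighbours stay unclassified
            labels[h] = origins.pop()
    return labels

# Toy counts (hypothetical).
west_counts = {(14, 13, 16): 90, (15, 13, 16): 10}
east_counts = {(16, 13, 17): 80, (14, 13, 16): 20}
others = [(14, 12, 16), (16, 13, 18), (20, 20, 20)]
toy_labels = label_haplotypes(west_counts, east_counts, others)
```

The relative proportions reported for a sample then follow from counting how many of its chromosomes carry a haplotype labelled Western, Eastern or neither.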
Assessment of local haplotype diversity variation
Under an infinite sites model (ISM) of mutation, the
number of different haplotypes, Sn, expected in a population
sample of size n is related to the mutation rate μ
and the effective population size N via Ewens' sampling
formula (Ewens 1972)

\[ E(S_n) = \sum_{i=0}^{n-1} \frac{\theta}{\theta + i}, \tag{1} \]

where θ=Nμ for haploid genomes. When the mutational
process generating Y-STR haplotypes is approximated
by an ISM (Helgason et al. 2000), adoption
of μ=2×10⁻² for all seven loci combined (Kayser
et al. 2000) allows the estimation of effective population
sizes from n and Sn, solving Ewens' sampling formula
for θ. Furthermore, for any two populations with constant
population sizes N1 and N2, respectively, and a
constant number M of migrants per generation between
them, assuming that each migrant introduces a new
haplotype into the other population leads to expected
values, E(Sn,k), that correspond to

\[ \theta_k = N_k \left( \mu + \frac{M}{N_k} \right), \qquad k = 1, 2. \tag{2} \]

For the combined population, E(Sn,c) is such that

\[ \theta_c = (N_1 + N_2)\,\mu \tag{3} \]

if the two populations are completely panmictic. If not,
then θc as obtained from Sn,c using formula 1 is an unbiased
estimate of the right-hand side of formula 3 only if
n1/n2=N1/N2, i.e. if the two sample sizes are proportional to
the population sizes. However, even under complete
population separation (i.e. M=0), simulation has shown
that the respective bias is not very serious for a wide range
of ni values since the curve that relates θc to n1 and n2 is
usually flat around n1/n2=N1/N2 (data not shown).
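A minimal numerical sketch of these estimation steps is given below; it is an illustration under stated assumptions, not the authors' code. Formula 1 is read as E(Sn) = Σ θ/(θ+i) for i = 0, ..., n−1 and is inverted for θ by bisection, which works because the sum increases monotonically with θ. Formula 2 is read as θk = Nk·μ + M and formula 3 as θc = (N1+N2)·μ, so that adding the two instances of formula 2 and subtracting formula 3 gives M = (θ1 + θ2 − θc)/2, and then Nk = (θk − M)/μ. Function names and the bisection bounds are assumptions.

```python
def expected_haplotypes(theta, n):
    """Expected number of distinct haplotypes in a sample of size n (formula 1)."""
    return sum(theta / (theta + i) for i in range(n))

def solve_theta(n, s_n, lo=1e-9, hi=1e7, tol=1e-9):
    """Invert formula 1 for theta by bisection; needs 1 < s_n < n."""
    if not 1 < s_n < n:
        raise ValueError("s_n must lie strictly between 1 and n")
    while hi - lo > tol * max(1.0, lo):
        mid = (lo + hi) / 2.0
        if expected_haplotypes(mid, n) < s_n:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def solve_migration(theta1, theta2, theta_c, mu):
    """Solve formulae 2 and 3 for (N1, N2, M) given the three theta estimates."""
    m = (theta1 + theta2 - theta_c) / 2.0
    return (theta1 - m) / mu, (theta2 - m) / mu, m

# Round trip on a hypothetical theta, using the combined mutation rate 2e-2.
MU = 2e-2
s_expected = expected_haplotypes(50.0, 179)
theta_back = solve_theta(179, s_expected)
n_effective = theta_back / MU
```

As a consistency check on the algebra, θ values of 100.36, 47.96 and 104.32 (hypothetical inputs chosen for illustration) return N1 = 3,918, N2 = 1,298 and M = 22 when passed to `solve_migration` with μ = 2×10⁻².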
Results
Populations, haplotypes and genetic diversity
The Y-chromosomal STRs DYS19, DYS389I, DYS389II,
DYS390, DYS391, DYS392 and DYS393 were analysed
in 12,727 males from 91 European populations
(Fig. 1). In the total sample, the markers showed between
eight (DYS391, DYS393) and 11 (DYS390, DYS392)
different alleles. However, of the seven million different
theoretically possible haplotypes, only 2,489 were actually
observed. More than half of these (1,397/2,489=56.1%)
were unique to an individual male. Fifteen haplotypes,
comprising 3,088 males (24.3%), occurred in 40 or more
populations; another 160 haplotypes, representing 4,692
individuals (36.9%), were found in 10–39 populations.
The most frequent haplotype (“14-13-16-24-11-13-13”, in
the above marker order) accounted for 661 males (5.2%)
and was observed in 80 populations. Its 14 one-step
neighbours comprised another 1,368 individuals (10.7%),
with observed haplotype counts ranging from 24 to 303.
Whilst the highest haplotype diversity h (Nei 1987) was
observed in Vienna, where all 66 haplotypes were different
(h=1.00), particularly low values of h (i.e. h<0.95) were
noted in Albania and Finland.
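For reference, haplotype diversity in the sense of Nei (1987) is commonly computed as h = n/(n−1) · (1 − Σ pi²), where pi is the relative frequency of the i-th haplotype in a sample of n chromosomes. The sketch below assumes this standard unbiased form (the text does not spell out the formula) and uses hypothetical toy data; it shows why a sample in which all 66 haplotypes differ yields h = 1.00.

```python
from collections import Counter

def haplotype_diversity(haplotypes):
    """Unbiased haplotype diversity h = n/(n-1) * (1 - sum of squared frequencies)."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("at least two chromosomes are required")
    counts = Counter(haplotypes)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)

# 66 chromosomes, all with different (hypothetical) haplotype labels.
all_different = haplotype_diversity([f"hap{i}" for i in range(66)])
# Two haplotypes at equal frequency in a sample of four.
two_types = haplotype_diversity(["a", "a", "b", "b"])
```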
When population samples were iteratively collapsed
within countries, a total of 12 meta-samples emerged
before all ΦST values became significant at the 5% level
(Fig. 1). This regrouping involved 58 original samples;
the remaining 33 data sets were henceforth treated as
singletons. The second, exhaustive round of clustering by
minimum ΦST clearly reflected the geographical actuality
of Europe (Fig. 2). All Western populations, with the
exception of metropolitan Paris, formed a clearly identifiable
sub-cluster (“Western Europe”). This sub-cluster
also included Emilia Romagna from Italy, which in turn
strongly resembled the French sample from the Alsace
(ΦST=−0.0010, P=0.507). Most notably, the Spanish
and Portuguese samples were not found to be significantly
different (ΦST=0.0014, P=0.078). At the other
extreme, all Slavs except Bulgaria formed one cluster
(“Eastern Europe”), where only Croatia (ΦST=0.0061)
and the two Baltic samples (ΦST=0.0157) differed from
the rest at the 0.1% significance level. Bulgaria belonged
to a different sub-cluster, which also incorporated
Romania (ΦST=0.0001, P=0.417), and which extended
to Greece and Hungary at the 1% significance level
(ΦST=0.0055, P=0.013 for all four samples combined).
Sicily and “West-Italy” together with the two Turkish
samples clearly complemented this “Balkan-Danube”
grouping to form a coherent “Southern European” sub-cluster.
Whilst the two southern Dutch populations
(“South Holland”) were included in “Western Europe”,
the three northern samples (“North Holland”) formed a
“Friesian” sub-cluster together with Denmark
(ΦST=−0.0044, P=0.795). The Friesians were part of a
more comprehensive sub-cluster, including all francophone
populations, West Germany, Vienna and Sweden,
which may be loosely called “West-Central Europe”. By
contrast, the other German and Austrian samples as well
as Northern-Italian Veneto link in with Norway and
eventually Albania and Estonia to group as
“East-Central Europe”. Finally, Finland can be regarded as a
sub-cluster on its own which associated with the non-Slavic
populations (ΦST=0.0460, P<0.001) before both
connected with the “Eastern European” sub-cluster
(ΦST=0.0744, P<0.001).
Genetic and geographic distances
Spatial autocorrelation analysis yielded results that were
consistent with an ‘‘isolation by distance’’ model of
haplotype divergence (Fig. 3). Thus, Y-STR repeat
numbers were most similar for haplotypes sampled in
the same location (I=0.0929). At less than 1,000 km, the
autocorrelation index decreased sharply with distance,
reaching a plateau of negative I values thereafter and
decreasing only slightly beyond 2,000 km. All indices
were significantly different from zero (randomisation
P<0.01), including I=−0.0018 for the distance class
centred at 940 km.
Inspection of Fig. 1 suggests that the similarity of Y-
STR haplotypes decays much more rapidly along an
east-west than a north-south gradient, at least in central
Europe. This notion was formally corroborated by a
MDS analysis of all pair-wise ΦST values. The first
dimension, accounting for almost 89% of the variance
(Table 1), clearly shows a decomposition of the Euro-
pean Y-STR genetic structure into three major compo-
nents (Fig. 4a), closely corresponding to the ‘‘Western’’,
‘‘Central’’ and ‘‘Eastern’’ sub-clusters of Fig. 2. The first
dimension also highlights the genetic peculiarity of
metropolitan Paris and Vienna, Finland, and the two
Balkan-Slavic samples of Slovenia and Croatia in rela-
tion to their respective surroundings. The degree of east-
west stratification of the European Y-STR haplotype
spectrum was quantified by Spearman rank correlation
analysis between the latitudinal and longitudinal distances,
respectively, and pair-wise ΦST (Fig. 5). For 81
samples, the correlation was stronger with longitude
than with latitude, and the few populations showing a
notably reversed effect were from the fringe of the
Fig. 4a–c Multidimensional scaling analysis of pair-wise Y-STR-based
ΦST between 91 male European samples. Displayed are the
first three dimensions (a–c) which together account for 96.4% of
the variance. Sample locations are marked in black; colour coding
is on an arbitrary “rainbow” scale that allocates yellow and
magenta to the opposite extremes, via green and blue
Fig. 3 Spatial autocorrelation analysis of European Y-STR haplotypes.
Moran's I index was calculated from the repeat number at
each Y-STR locus, using the Great Circle Distance between the
respective recruitment units as a measure of pair-wise geographical
distance between haplotypes
continent (Fig. 5). Furthermore, whilst only five samples
showed a negative correlation with longitude, namely
Emilia Romagna (I), Vaesterbotten (S), Finland, Esto-
nia and Northern Norway, the same was true for 12
samples with latitude. The second dimension of the
MDS analysis revealed more subtle structural features,
such as, for example, the distinction between the Turkish
and non-Turkish samples in ‘‘Southern Europe’’ and the
divide between the two Dutch meta-samples. The third
dimension eventually depicted an underlying north-
south gradient that is usually seen in Y-SNP studies of
European populations (Rosser et al. 2000; Semino et al.
2000). However, since the second and third dimensions
of the Y-STR MDS accounted for less than 10% of the
variance (Table 1), the major geographic structuring
associated with the two types of markers must be sub-
stantially different.
Pseudo-admixture analysis
When compared with the peripheral ‘‘Western Europe’’
and ‘‘Eastern Europe’’ sub-clusters, the other 28 sam-
ples/meta-samples exhibited prominent geographical
gradients in terms of their Y-STR haplotype spectra
(Table 2). The proportion of Western European haplo-
types was naturally highest in ‘‘Western Europe’’ itself
(91%), with a cline-like decrease as one moves east and
north from Belgium (66%) and North Holland (61%) to
Romania (25.0%), Estonia (14%), “Eastern Europe”
Table 2 Pseudo-admixture analysis of Y-STR haplotypes in European
populations (n total number of haplotypes, NA not applicable).
Entries for the haplotype groups are counts, with the frequency of
the haplotype group (%) in parentheses
Sample/meta-sample     n         Western       Eastern       None
Western Europe         2,529     2,300 (91)    229 (9)       NA
Finland                399       39 (10)       266 (67)      94 (23)
Estonia                133       19 (14)       85 (64)       29 (22)
Albania                101       34 (34)       54 (53)       13 (13)
Styria (A)             65        19 (29)       16 (25)       30 (46)
Norway                 300       93 (31)       151 (50)      56 (19)
North-East Germany     752       268 (35)      321 (43)      163 (22)
Veneto (I)             120       62 (52)       18 (15)       40 (33)
South-East Germany     1,404     613 (44)      551 (39)      240 (17)
Tyrol (A)              229       100 (44)      88 (38)       41 (18)
Sweden                 667       233 (35)      304 (46)      130 (19)
Vaesterbotten (S)      41        13 (32)       20 (48)       8 (20)
West Germany           1,287     631 (49)      418 (32)      238 (19)
Vienna (A)             66        19 (29)       14 (21)       33 (50)
North Holland          179       110 (61)      43 (24)       26 (15)
Denmark                63        35 (56)       17 (27)       11 (17)
Paris (F)              109       51 (46)       17 (16)       41 (38)
Central Italy          559       316 (57)      80 (14)       163 (29)
Berne (CH)             91        47 (52)       24 (26)       20 (22)
Belgium                125       83 (66)       22 (18)       20 (16)
Lausanne (CH)          108       64 (60)       22 (20)       22 (20)
West Italy             373       193 (52)      75 (20)       105 (28)
Sicily (I)             199       70 (35)       36 (18)       93 (47)
Hungary                118       35 (30)       62 (52)       21 (18)
Greece                 101       45 (44)       27 (27)       29 (29)
Bulgaria               122       34 (28)       65 (53)       23 (19)
Romania                102       25 (24)       58 (57)       19 (19)
Bulgarian Turks        61        15 (25)       30 (49)       16 (26)
Turkey                 158       54 (34)       46 (29)       58 (37)
Eastern Europe         2,166     228 (11)      1,938 (89)    NA
Total                  12,727    5,848 (46)    5,097 (40)    1,782 (14)
Fig. 5 Longitudinal and latitudinal extent of relatedness among
European Y-STR haplotypes. Each dot represents one of 91
recruitment units and depicts the Spearman correlation coefficient
of the pair-wise FSTvalues between that unit and all other samples
with the respective geographical distances. Horizontal axis corre-
lation with latitudinal distances (north–south), vertical axis
correlation with longitudinal distances (east–west)
285
Page 8
(11%) and Finland (10%). The proportion of Eastern
European haplotypes showed a similar trend in the re-
verse direction. Some samples were characterised by a
relatively high proportion of haplotypes (>30%) that
were not classifiable as either ‘‘Eastern’’ or ‘‘Western’’,
including Vienna (50%), Sicily (47%), Styria (46%),
Paris (38%), Turkey (37%) and Veneto (33%).
Local variation in the Dutch and Baltic Y-STR
haplotype spectra
Since the effective population size estimates for the North and South
Holland meta-samples were neither additive nor identical (Table 3), the
two sub-populations cannot have been either completely separated or
panmictic. Instead, they must have experienced some level of recent
migration, and solving formulae 2 and 3 with the respective θ values
included yielded N1=3,918, N2=1,298 and M=22. These estimates were
verified by 10,000 simulations of an ISM with migration, using the very
same N1, N2 and M values. The mean S_n values obtained in simulated
samples of the original sizes (i.e. n1=179, n2=96 and n3=n1+n2=275)
were 100.3±6.1, 52.3±4.2 and 136.5±7.8, and were therefore very close
to the actual values (Table 3). As a consequence, it may be concluded
that the Dutch Y-STR data are best explained by a threefold higher
effective population size in North than in South Holland and by
migration rates of m1=22/3,918=0.56% (in and out of the North) and
m2=22/1,298=1.69% (in and out of the South). On the other hand, over
the last, say, 100 generations, the Dutch population can be assumed to
have been large enough for genetic drift to have had only a minor
effect upon its genetic structure. Under this assumption, the change in
frequency of a given allele or haplotype can be modelled in the two
sub-populations as

$$|f_1(t) - f_2(t)| = (1 - m_1 - m_2)^t \cdot |f_1(0) - f_2(0)| \qquad (4)$$

For the Dutch population, this implies that any difference in allele or
haplotype frequency would have decreased by 0.56+1.69=2.25% in one
generation, and by 100·[1−(1−0.0225)^100]=89.7% in 100 generations.

Under the same ISM with migration, we obtained N1=4,172, N2=5,772 and
M=53 for the two Baltic samples/meta-samples from Estonia (‘‘1’’) and
Lithuania/Latvia (‘‘2’’), respectively (Table 3). This implies that the
migration rates between the two Baltic sub-regions must have been of a
similar order (m1=0.0127, m2=0.0092) as those between the Dutch
sub-populations.
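The migration arithmetic of this section can be retraced with a few lines of code. The following is a minimal sketch, assuming that each migration rate is simply M divided by the respective effective population size and that equation 4 applies exactly as printed; the function names are illustrative and do not come from the study.

```python
def migration_rate(migrants, effective_size):
    """Per-generation migration rate m = M / N."""
    return migrants / effective_size


def remaining_fraction(m1, m2, generations):
    """Fraction of an initial frequency difference left after t generations (equation 4)."""
    return (1.0 - m1 - m2) ** generations


# Dutch estimates quoted in the text: N1=3,918, N2=1,298, M=22
dutch_m1 = migration_rate(22, 3918)
dutch_m2 = migration_rate(22, 1298)

# Baltic estimates quoted in the text: N1=4,172, N2=5,772, M=53
baltic_m1 = migration_rate(53, 4172)
baltic_m2 = migration_rate(53, 5772)

# The text rounds the Dutch rates to 0.56% and 1.69% before applying equation 4.
dutch_loss_percent = 100 * (1 - remaining_fraction(0.0056, 0.0169, 100))

print(f"Dutch:  m1={dutch_m1:.2%}, m2={dutch_m2:.2%}")
print(f"Baltic: m1={baltic_m1:.4f}, m2={baltic_m2:.4f}")
print(f"Difference lost after 100 generations: {dutch_loss_percent:.1f}%")
```

Because the decay is geometric, most of an initial frequency difference disappears within the first few dozen generations at these rates, which is the basis of the argument that a surviving difference must be of recent origin.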
Discussion
The genealogy of European Y chromosomes is charac-
terised by substantial differentiation (Rosser et al. 2000;
Semino et al. 2000) and, as expounded upon in more
detail below, our study suggests that the observed pat-
tern of cline-like similarity between Y-chromosomal
STR haplotypes may partly reflect demographic events
that occurred in historical time. Recent male demogra-
phy would indeed be best assessed by the analysis of Y-
chromosomal polymorphisms since the fourfold smaller
effective population size of male-to-male transmission
renders Y-chromosomal markers much more sensitive to
genetic drift and population bottlenecks than their
autosomal counterparts. In principle, similar arguments
should also apply to mitochondrial (mtDNA) markers,
but little geographical structuring has been detected in
Europe by means of mtDNA analysis (Richards et al.
2002). This has led to the suggestion that mtDNA is
either exceptionally selection-sensitive or that female
gene flow has been particularly high in Europe (Barbu-
jani and Chikhi 2000).
Y-chromosomal short tandem repeats, in particular,
are capable of resolving population strata into individ-
ual genealogies that would otherwise be inseparable
(Roewer et al. 1996; Lessig et al. 2001; Ploski et al. 2002;
Weale et al. 2001, 2002). The power of the Y-STR ap-
proach is, however, critically dependent upon careful
sample collection, especially in regions with neighbour-
ing populations that have undergone mutual transitions.
Our analysis has made opportunistic use of the Y-STR
haplotype reference database (YHRD), a survey of male
genetic variation in Europe that was initiated in 1997 by
the Forensic Y Chromosome Research Group. The
major purpose of the database was to serve as reference
material for the presentation of male DNA profiles in
court (Roewer et al. 2000). Since then, the repository has
been growing continuously and expanded beyond its
original scope. As of August 2004, YHRD contained
over 24,000 Y-STR haplotypes from over 230 popula-
tions world-wide. The history of the database and its
originally intended use in forensic practice imply that
sampling has been neither systematic nor comprehen-
sive. However, it is questionable whether a merely aca-
demic study of neutral population genetic variation on
the present scale would have ever been possible to
implement. Furthermore, the use of validated genotyping procedures in
quality-controlled forensic laboratories has served to ensure the
highest credibility for each and every haplotype that entered into
YHRD (Roewer et al. 2001).

Table 3 Patterns of Y-STR haplotype diversity in the Dutch and Baltic
populations (n total number of haplotypes, S_n number of different
haplotypes, θ population parameter, N effective population size)

Sample              n      S_n    θ        N
North Holland       179    103    100.2    5,010
South Holland       96     53     47.8     2,390
Combined            275    135    104.3    5,215
Estonia             133    93     136.0    6,800
Latvia/Lithuania    296    171    168.0    8,400
Combined            429    229    198.9    9,945
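The entries of Table 3 can be cross-checked against one another. The sketch below assumes that the population parameter was obtained from the expected number of distinct types under the sampling theory of Ewens (1972), E[S_n] = Σ θ/(θ+i) for i = 0, ..., n−1; this is an assumption, since the estimating formulae are not reproduced in this part of the paper. The helper names are hypothetical, and the factor 0.02 linking θ to N is read off the table rather than taken from the text.

```python
def expected_haplotypes(n, theta):
    """Expected number of distinct haplotypes in a sample of size n (Ewens 1972)."""
    return sum(theta / (theta + i) for i in range(n))


def solve_theta(n, s_n, lo=1e-6, hi=1e6):
    """Find the theta for which expected_haplotypes(n, theta) equals s_n."""
    # The expectation increases monotonically with theta, so bisection suffices.
    for _ in range(100):
        mid = (lo + hi) / 2
        if expected_haplotypes(n, mid) < s_n:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# (sample, n, S_n, theta, N) as listed in Table 3
TABLE3 = [
    ("North Holland", 179, 103, 100.2, 5010),
    ("South Holland", 96, 53, 47.8, 2390),
    ("Combined (Dutch)", 275, 135, 104.3, 5215),
    ("Estonia", 133, 93, 136.0, 6800),
    ("Latvia/Lithuania", 296, 171, 168.0, 8400),
    ("Combined (Baltic)", 429, 229, 198.9, 9945),
]

for name, n, s_n, theta, n_eff in TABLE3:
    fitted = solve_theta(n, s_n)
    # Assumption: N = theta / 0.02, which is the ratio seen in every table row.
    print(f"{name:18s} theta(table)={theta:6.1f} theta(fitted)={fitted:6.1f} "
          f"N(table)={n_eff:5d} theta/0.02={theta / 0.02:7.0f}")
```

Under these assumptions the fitted and tabulated θ values agree closely for all six rows, which suggests, but does not prove, that this is how the table was derived.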
In contrast to previous, SNP-based, studies of
European male genetics (Rosser et al. 2000; Semino et al.
2000), our analysis has utilised a much higher level of
inter-individual variability, and the identification of
previously unrecognised male strata in Europe confirms
the utility of this approach. However, the high variability
of Y-STR profiles is a direct consequence of the
10^5- to 10^6-fold higher likelihood of meiotic repeat length
change than single base-pair substitution (Kayser et al.
2000). High mutation rates may homogenise haplotypes
of different descent (de Knijff 2000), an inherent draw-
back of STR haplotypes that could only be compensated
for in our study by extensive sampling. The large size of
most regional samples and meta-samples ensured that all
locally common patrilines were represented by haplo-
type classes that contain profiles separated by one or
only a few single-step mutations.
Our study provides clear evidence for a major genetic
division of European males into Slavic-speaking eastern
and Romance language-speaking western populations,
separated by a central European block of Germanic-
and Italian-speaking populations. The Western and
Eastern European geographical clusters are reminiscent
of the well-established distribution of Y-SNP haplo-
groups P(xR1a) and R1a (Rosser et al. 2000). Indeed,
preliminary data by Kittler et al. (2003) indicate that the
ancestral Y-STR haplotypes were ‘‘14-13-16-24-11-13-
13’’ for P(xR1a) and ‘‘16(?)-13-17-25-10-11-13’’ for R1a.
Since the respective frequencies of these two haplotypes
and their one-step neighbours were 30.6 and 0.6% in
‘‘Western Europe’’, as opposed to 4.4 and 20.5% in
‘‘Eastern Europe’’, the Y-STR-based and Y-SNP-based
power of discrimination between the two sub-clusters
would have been equivalent. Furthermore, the signifi-
cantly negative autocorrelation in Y-STR repeat number
observed for haplotypes located more than 1,000 km
apart (Fig. 3) is suggestive of a continent-wide cline
between at least two different ancient lineages, possibly
corresponding to P(xR1a) and R1a. It is important to
stress, however, that the autocorrelation was based on
allele length, and not frequencies, so that we prefer to
interpret Fig. 3 in broader terms as simply representing
‘‘isolation-by-distance’’.
Our data revealed a rapid change in haplotype spec-
tra along east-west gradients, but a relative constancy
over more than 1,000 km in a north-south direction in
central parts of the continent. This may be indicative of
the fact that, when the expanding Slavic and Franconian
spheres of influence met in medieval central Europe, the
only way for male lineages to expand further may have
been in a northerly or southerly direction. Interestingly,
in our pseudo-admixture analysis, even the male lineages within
contemporary Germany turned out to be notably different in the
‘‘North-East’’ (n=752; 43% Eastern vs 35% Western), the ‘‘South-East’’
(n=1,404; 39% Eastern vs 44% Western) and the ‘‘West’’ (n=1,287; 32%
Eastern vs 49% Western). When samples are stratified in this way, the
difference between the three Y-STR haplotype spectra is highly
significant (χ²=33.945, 2 df, P<0.001). The area covered by the former
German Democratic Republic overlaps substantially with the medieval
homeland of Slavic (i.e. Wendish) peoples, including the Sorbs,
Pomeranians, Wagrians, Obotrites and Rani. This geographical
coincidence would explain the obvious preservation of ‘‘Slavic’’
haplotypes in eastern Germany far better than, for example, the
settlement of eastern European World War II refugees, since the latter
were mostly Germans anyway.
In the Netherlands, a remarkable subdivision became
apparent in that the two southern samples (n=96) were
included in the ‘‘Western Europe’’ sub-cluster whereas
the three northern samples (n=179) formed part of a
‘‘Friesian’’ sub-cluster, together with Denmark. The
significant difference between the two Dutch regions was
exclusively due to haplotypes ‘‘14-13-16-24-10-13-13’’
and ‘‘14-13-16-24-11-13-13’’. Their combined frequency
was 16/179=0.089 in the North and 23/96=0.240 in the
South (χ²=11.583, 1 df, P<0.001). If the two haplotypes,
which are also the most frequent ones in Spanish males,
are excluded from both meta-samples, pair-wise FST
drops from a highly significant 0.0183 (P=0.003) to an
insignificant 0.0069 (P=0.086). In the light of the 1–2%
background migration implied by the general overlap in
Y-STR haplotype spectra, however, such an isolated and
focused frequency difference must have had a fairly re-
cent origin for it not to have been eradicated. One
possible explanation for this coincidence would be a
war-related instance from the 15th and 16th centuries
when the south of the Netherlands came under Bur-
gundian and Spanish control much earlier than the
northern provinces. The latter formed the Union of
Utrecht in 1579 and gained independence from Spain
earlier than the southern Union of Arras (Merriman
1996).
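The German and Dutch frequency comparisons can be retraced from the published counts. The following sketch computes a Pearson χ² statistic without continuity correction. It assumes that the German test used only the Eastern and Western counts of Table 2 (a 3×2 table, which is consistent with the 2 df reported) and that the Dutch test contrasted the two named haplotypes with all other haplotypes; both are assumptions, and the function name is illustrative.

```python
def pearson_chi2(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df


# Western and Eastern haplotype counts from Table 2
german = [
    [268, 321],  # North-East Germany
    [613, 551],  # South-East Germany
    [631, 418],  # West Germany
]

# Carriers of the two named haplotypes versus all other males
dutch = [
    [16, 179 - 16],  # North Holland
    [23, 96 - 23],   # South Holland
]

german_chi2, german_df = pearson_chi2(german)
dutch_chi2, dutch_df = pearson_chi2(dutch)
print(f"Germany:     chi2={german_chi2:.3f}, df={german_df}")
print(f"Netherlands: chi2={dutch_chi2:.3f}, df={dutch_df}")
```

Where SciPy is available, `scipy.stats.chi2_contingency(table, correction=False)` should return the same statistic together with the P value.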
Our considerations about the origin of Dutch Y-STR differences have
been based upon various simplifying assumptions, the validity of which
is not, however, critical for our main conclusion. First, we have
adopted an ISM which is known to overestimate the number of STR alleles
(or haplotypes) generated by step-wise mutation (Shriver et al. 1993).
In part, this can be corrected for by adopting smaller θ values in the
ISM; i.e. smaller population sizes and/or a smaller mutation rate.
However, even if we assume μ=1·10^−2 for the seven Y-STRs combined,
this leads to N1=7,836, N2=2,596, M=22, and m1=0.0028, m2=0.0085. Such
a level of background migration is still incompatible with the
persistence over more than 100 generations of an 8.9 vs 24.0% allele
frequency difference for a Y-SNP, and even more so for a Y-STR. Second,
we have neglected migration from outside the Netherlands. Since this
can be allowed for by increasing the mutation rate, the effective
population sizes would have to be even smaller and the relative
background migration even higher than before.
Similarly, we have assumed that every migrant intro-
duces a new haplotype. Since this is clearly untrue,
even under an ISM, the actual level of migration must
have been even higher than estimated in order to
achieve the observed effects upon h; i.e. an apparent
increase in size over that expected if the two regions
had been completely separated. Third, we assumed
constant migration and a constant population size,
whereas in reality the two populations would have
undergone exponential expansion. This simplifying
assumption would have biased the interpretation of the
haplotype pattern in favour of our hypothesis only if
the two regions had experienced a lot of migration
occurring between them when populations were small,
and became completely separated only more recently. If
anything, European history suggests the opposite to be
true. Finally, it can be shown that the difference be-
tween the two Dutch sub-regions is not detectable with
Y-SNPs. SNP haplotype data were available for 171
(North) and 87 (South) of the males included in the Y-STR
study, respectively, and the pair-wise FST value
obtained with the discriminating Y-SNPs resolving the
relevant European haplogroups was 0.0041 (P=0.209).
In western Europe, Y-STRs identified a large coher-
ent population pool, covering the Iberian Peninsula,
parts of Italy, France (Alsace) and the British Isles,
which coincides with the Franconian Empire. Being the
heiress of multiethnic Rome, ‘‘Latin Europe’’ (i.e. the
part of Europe that was originally Roman Catholic ra-
ther than Greek orthodox or non-Christian) formed a
zone where strong shared cultural features were as
important as geographical contrasts (Bartlett 1994).
Italian- and Germanic-speaking populations were inte-
grated by adopting a common culture that was further
consolidated by a strong political system. The eastern
part of Europe, also identified by Y-STRs as being
homogeneous in terms of its male genetic make-up, has a
completely different history. This part of the continent
was significantly influenced by various waves of immigration
by nomadic Asian populations (Rosser et al. 2000;
Wells et al. 2001). Nevertheless, in the vast and hardly
structured territory covered by our analysis, with no
borders to the east, the Slavic language was clearly a
strong uniting element for diverse cultures. The coloni-
sation attempts of western civilisations appear to have
left no significant genetic traces in eastern Europe and,
during the 20th century, genocide and resettlement have
both served to homogenise the (formerly more varied)
genetic landscape even further (Ploski et al. 2002).
In the Baltics, our analysis confirmed that Estonian male lineages are
comparatively close to both the Central Europeans and the Finnish, but
differ strikingly from the Lithuanian and Latvian Y-STR haplotype
spectra (Lessig et al. 2001). The difference between the two
samples/meta-samples is partly attributable to haplotypes
‘‘14-14-16-23-11-14-14’’ and ‘‘14-14-16-24-11-14-14’’ which occurred
with frequencies of 14/129=0.109 in Estonia and 4/296=0.014 in the
other two countries (Fisher’s exact two-sided P=3.1·10^−5). These two
haplotypes are also the most frequent ones among Finnish males. With
one exception (‘‘R1a1’’ on a Latvian chromosome), they were found to be
associated with Y-SNP haplogroup ‘‘N3’’ in the Baltic samples.
Haplogroup ‘‘N3’’ is indeed more frequent in Estonia (49/129=0.380)
than in Latvia and Lithuania (80/296=0.270; χ²=5.103, 1 df, P=0.024)
but not when the above haplotypes are excluded (35/115=0.304 vs
77/292=0.264; χ²=0.648, 1 df, P>0.4). Interestingly, the most frequent
Estonian haplotype is ‘‘14-12-16-22-10-11-13’’, with a frequency of
9/129=0.070 versus 1/296=0.003 in Latvia and Lithuania (Fisher’s exact
two-sided P=1.3·10^−4). This haplotype is the third most frequent one
in Germany, and all but one copy in the Baltic samples were found to
reside on SNP haplogroup ‘‘I’’. Nevertheless, the Y-STR haplotype
frequency difference is not explicable in terms of a generally higher
frequency of haplogroup ‘‘I’’ in Estonians. Excluding the ten Germanic
Y-STR haplotypes in question, the frequencies of haplogroup ‘‘I’’ are
virtually identical (15/120=0.125 in Estonia and 33/295=0.112 in
Latvia/Lithuania). The most likely explanation for the substantial
division exhibited by the Baltic male lineages is the linguistic
barrier that separates the Lithuanians and Latvians, who speak two of
the three Baltic languages, and the Estonians, whose language is part
of the Finno-Ugric group. The similarity of the Lithuanian and Latvian
Y-chromosomal gene pool, in turn, is potentially explicable by the fact
that a large proportion of present-day Latvia came under
Lithuanian-Polish control in 1561 and remained so until the first
partition of Poland in 1772 (Gieysztor et al. 1979).
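The two Fisher's exact tests quoted for the Baltic haplotypes can be reproduced from the counts alone. The following is a minimal sketch using only the Python standard library; it assumes the common two-sided definition in which all tables no more probable than the observed one are summed, which may differ in detail from the software used in the study. The function name is illustrative.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denominator = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x carriers in the first row
        return comb(row1, x) * comb(row2, col1 - x) / denominator

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of all tables that are no more likely than the observed one
    return sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )


# Carriers vs non-carriers of the two "Finnish" haplotypes: Estonia, then Latvia/Lithuania
p_finnish = fisher_exact_two_sided(14, 129 - 14, 4, 296 - 4)

# Carriers vs non-carriers of "14-12-16-22-10-11-13": Estonia, then Latvia/Lithuania
p_germanic = fisher_exact_two_sided(9, 129 - 9, 1, 296 - 1)

print(f"P (two N3-associated haplotypes) = {p_finnish:.2e}")
print(f"P (most frequent Estonian haplotype) = {p_germanic:.2e}")
```

Python's exact integer arithmetic keeps the binomial coefficients free of overflow, so no log-space computation is needed for samples of this size.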
The sub-cluster broadly described as ‘‘Balkan-Dan-
ube’’ comprises Greek, Romanian, Bulgarian and
Hungarian males, i.e. members of populations who
speak languages from very different families. The ‘‘Bal-
kan-Danube’’ sub-cluster is also a relatively short dis-
tance away from the ‘‘Turks’’ (Fig. 2) and relatively
distant to the eastern (Slavic) samples. This pattern of
genetic similarity may seem counterintuitive at first
glance since the geographical region of origin of the
‘‘Balkan-Danube’’ meta-sample overlaps considerably
with the territory occupied by the southern Slavs after
their separation from the eastern and western Slavic
populations in the fifth and sixth centuries (Gimbutas
1971). In the 16th century, however, the same territory
was ruled by the Turks during the Ottoman expansion
and, 1,000 years earlier, Finno-Ugric-speaking (Magy-
ars) and Turkic-speaking people (e.g. the Proto-Bul-
garians) from western Asia had already invaded and
settled in the Danube basin and the Balkans, and had
consequently amalgamated with the southern Slavic
populations present. Another important factor that di-
vides the Slavic world has been religious orientation, and
even today one of the sharpest cultural divisions in the
Slavic world is that between people converted to
Christianity by the Franks and the Greeks, respectively
(Bartlett 1994). More than anywhere else, the genetic
record of the ‘‘Balkan-Danube’’ region can thus be read
as a palimpsest of repeatedly ‘‘overwritten’’ historical
processes (Jobling et al. 2004).
The appearance of singletons which are clearly
genetically different from the surrounding populations is
a result of sampling either from highly admixed and
heterogeneous metropolitan populations (Paris, Vienna),
or from populations on the fringes of Europe
that have a very distinct history (Finns, Estonians, Irish,
Albanians). For the latter category, the most important
force creating genetic individuality would have been
patrilocality, a socially instituted practice whereby a
newly-married couple live with or near the family of the
husband. The importance of social (family) structure in
influencing the population genetic diversity of patrilocal
societies has been described in tribal populations (Oota
et al. 2001), but can also be observed in Europe (Se-
ielstad et al. 1998).
Many of the population structures revealed in the
present study could potentially be related to recent his-
torical events. Irrespective of whether these links are
true, they would have been less likely to be discernible by
Y-SNP analyses, which usually mark much earlier waves
of population movement (see the Dutch and Baltic
examples above). Single-nucleotide substitutions are
known to occur at a rate of approximately 2–5·10^−9 per
generation (Cooper and Krawczak 1993) implying that
common selectively neutral SNPs are likely to have
arisen before the major expansion of the human popu-
lation. With their demic diffusion throughout the world,
Y-SNPs could therefore only create local genetic struc-
tures in populations that had become particularly iso-
lated. With Y-STRs, by contrast, the rapid mutation
process constantly generates new genetic variation that
allows genetic structures to change more easily. Fur-
thermore, even with their substantial variation in mar-
ker-specific mutation rates, Y-STRs usually manifest so
many meiotic mutations in evolutionarily related hapl-
otypes that the actual choice of markers is fairly
inconsequential to the inferred nature of the genealogi-
cal process. In this respect, Y-STRs represent a much
more robust population genetic tool than Y-SNPs,
which may have substantially different time-depths and
can thus create layers of incongruent maps (Jobling and
Tyler-Smith 2003).
In summary, we have shown that Y-STRs are capable
of resolving male genealogies in Europe to an unparal-
leled degree. Although it is inherently difficult to prove
by Y-STR analysis alone whether a particular genetic
division is of recent or prehistoric origin, or whether it
would have been detectable by Y-SNPs as well, our analysis
suggests that Y-STRs should be considered as the markers
of choice for studies of local population structure and
recent demographic history. Therefore, sampling in
YHRD is currently concentrating upon an expansion of
the database to Eurasia as well as to the Americas in
order to facilitate investigation of migration processes
both to and from Europe that occurred during the past
millennium.
Electronic database information
Y-STR haplotype reference database (YHRD): http://www.yhrd.org;
Forensic Laboratory for DNA Research, Leiden:
http://www.humgen.nl/fldo/; Geographic Resources Analysis Support
System: grass.itc.it; US Land Processes Distributed Active Archive
Center: edcdaac.usgs.gov
Acknowledgements The authors wish to thank Wulf Schiefenhövel,
Andechs, for helpful comments and criticisms, and Brian Fulfrost
of the UCSC GIS Technology Lab for valuable advice on geo-
spatial data interpolation and visualisation.
Publication of this paper is being undertaken on behalf of the
Forensic Y-Chromosome Research Group:
M. Aler (1), A. Alonso (2), C. Alves (3), M. Alù (4), A. Amorim (3),
K. Anslinger (5), E. Arroyo (6), A. Asmundo (7), C. Augustin (8),
D. Ballard (9), L. Barbarii (10), G. Bäßler (11), A. Betz (11),
G. Bläß (12), E. Bosch (13), W. Branicki (14), A. Brehm (15),
M. Brion (16), L. Buscemi (17), L. Caenazzo (18), A. Cagliá (19),
E. Carnevali (20), E. Carra (21), A. Carracedo (16), K. Crainic (22),
Z. de Battisti (23), D. Dermengiu (24), T. Dobosz (25),
R. Dominici (26), B.M. Dupuy (27), A.T. Fernandes (15), I. Furac (28),
S. Füredi (29), J. García (2), C. Gehrig (30), M. Gené (31),
B. Gornjak Pogorelc (32), B. Glock (33), L. Gusmão (3),
M. Hedman (34), K. Hedrich (35), J. Henke (36), L. Henke (36),
M. Hidding (37), C. Hohoff (38), G. Holmlund (39), B. Hoste (40),
H.-J. Kärgel (35), C. Keyser-Traqui (41), M. Klintschar (42),
S. Kravchenko (43), I. Kremensky (44), T. Kupiec (14), M.V. Lareu (16),
B. Legrand (22), E. Liebeherr (12), S.A. Limborska (45),
L.A. Livshits (43), A.M. López-Parra (6), M. Lorente (46),
B. Ludes (41), P. Martín (2), B. Martinez-Jarreta (47), M.S. Mesa (6),
D.M. Monies (48), M. Nagy (49), P. Nievas (47), S. Noerby (50),
M. Nowak (51), K.S. Parreira (16), W. Parson (52), V. Pascali (19),
R. Pawlowski (14, 53), A. Piccinini (54), R. Ploski (51),
M. Poetsch (55), C. Previderé (56), B. Reichenpfader (57),
U. Ricci (58), C. Robino (59), B. Rolf (5), A. Sajantila (34),
A. Salas (16), P. Sánchez-Diz (16), U. Schmidt (60), C. Schmitt (37),
P.M. Schneider (61), I. Skitsa (62), D. Syndercombe Court (9),
R. Szibor (63), A. Tagliabracci (17), J. Teifel-Greding (64),
K. Thiele (65), M.L.G. Uzielli (58), N. von Wurmb-Schwark (66),
R. Wegener (67), M. Wozniak (68), B. Zaharova (44),
M.T. Zarrabeitia (69), I. Zupanic Pajnic (32)

(1) Laboratory of Forensic Genetics, Department of Legal Medicine, University of Valencia, Spain
(2) National Institute of Toxicology, Biology Services, Ministry of Justice, Madrid, Spain
(3) Institute of Pathology and Molecular Immunology, University of Porto, Portugal
(4) Department of Morphological and Forensic Sciences, University of Modena and Reggio Emilia, Italy
(5) Institute of Legal Medicine, Ludwig-Maximilians-University, Munich, Germany
(6) Laboratory of Forensic Biology, Universidad Complutense de Madrid, Spain
(7) Institute of Legal Medicine, University of Messina, Italy
(8) Institute of Legal Medicine, University of Hamburg, Germany
(9) Department of Haematology, Queen Mary School of Medicine and Dentistry, London, UK
(10) National Institute of Legal Medicine, Bucharest, Romania
(11) State Criminal Police Office Baden-Württemberg, Stuttgart, Germany
(12) State Criminal Police Office, Berlin, Germany
(13) Unitat de Biologia Evolutiva, Facultat de Ciències de la Salut i de la Vida, Universitat Pompeu Fabra, Barcelona, Spain
(14) Institute of Forensic Research, Kraków, Poland
(15) Centre of Biological and Geological Sciences, University of Madeira, Funchal, Portugal
(16) Institute of Legal Medicine, University of Santiago de Compostela, Spain
(17) Institute of Legal Medicine, University of Ancona, Italy
(18) Institute of Legal Medicine, University of Padova, Italy
(19) Institute of Legal Medicine, Catholic University, Rome, Italy
(20) Institute of Legal Medicine, University of Perugia, Italy
(21) Department of Cellular Biology and Development, University of Palermo, Italy
(22) Department of Legal Medicine, University René Descartes, Paris, France
(23) Institute of Legal Medicine, University of Verona, Italy
(24) Department of Legal Medicine, University of Medicine and Pharmacy, Bucharest, Romania
(25) Institute of Forensic Medicine, Medical University, Wroclaw, Poland
(26) Section of Legal Medicine, Department of Experimental Biomedicine, University of Pisa, Italy
(27) Institute of Forensic Medicine, University of Oslo, Norway
(28) Department of Forensic Medicine and Criminology, University of Zagreb, Croatia
(29) Hungarian National Police, Budapest, Hungary
(30) Institute of Legal Medicine, University of Geneva, Switzerland
(31) Forensic Genetics Laboratory, University of Barcelona, Spain
(32) Institute of Forensic Medicine, University of Ljubljana, Slovenia
(33) Clinical Department for Blood Group Serology, University of Vienna, Austria
(34) Department of Forensic Medicine, University of Helsinki, Finland
(35) State Criminal Police Office Sachsen-Anhalt, Magdeburg, Germany
(36) Institut für Blutgruppenforschung, Cologne, Germany
(37) Institute of Legal Medicine, University of Cologne, Germany
(38) Institute of Legal Medicine, University of Münster, Germany
(39) The National Board of Forensic Medicine, Department of Forensic Genetics, Linköping, Sweden
(40) National Institute of Criminalistics and Criminology, Brussels, Belgium
(41) Institute of Legal Medicine, University of Strasbourg, France
(42) Institute of Legal Medicine, Martin-Luther-University, Halle-Wittenberg, Germany
(43) Department of Human Genomics, Institute of Molecular Biology and Genetics, Kiev, Ukraine
(44) Laboratory of Molecular Pathology, University Hospital of Obstetrics and Gynaecology, Sofia, Bulgaria
(45) Institute of Molecular Genetics, RAS, Moscow, Russia
(46) Department of Legal Medicine, University of Granada, Spain
(47) Department of Forensic Medicine, University of Zaragoza, Spain
(48) Institute of Forensic Medicine, Medical University, Lublin, Poland
(49) Institute of Legal Medicine, Humboldt-University, Berlin, Germany
(50) Institute of Forensic Medicine, University of Copenhagen, Denmark
(51) Department of Forensic Medicine, Medical Academy Warsaw, Poland
(52) Institute of Legal Medicine, University of Innsbruck, Austria
(53) Institute of Forensic Medicine, Medical University of Gdansk, Poland
(54) Institute of Legal Medicine, University of Milan, Italy
(55) Institute of Legal Medicine, Ernst-Moritz-Arndt-University, Greifswald, Germany
(56) Department of Legal Medicine and Public Health, University of Pavia, Italy
(57) Institute of Legal Medicine, University of Graz, Austria
(58) Centre of Medical Genetics and Molecular Medicine, University of Florence, Italy
(59) Laboratory of Criminalistic Sciences, University of Torino, Italy
(60) Institute of Legal Medicine, Albert-Ludwigs-University, Freiburg, Germany
(61) Institute of Legal Medicine, Johannes-Gutenberg-University, Mainz, Germany
(62) Athens Legal Medicine Department, DNA Analysis Laboratory, Athens, Greece
(63) Institute of Legal Medicine, Otto-von-Guericke-University, Magdeburg, Germany
(64) State Criminal Police Office Bavaria, Munich, Germany
(65) Institute of Legal Medicine, University of Leipzig, Germany
(66) Institute of Legal Medicine, Christian-Albrechts-University, Kiel, Germany
(67) Institute of Legal Medicine, University of Rostock, Germany
(68) Institute of Forensic Medicine, University School of Medical Sciences, Bydgoszcz, Poland
(69) Unit of Legal Medicine, University of Cantabria, Santander, Spain
References
Ammerman AJ, Cavalli-Sforza LL (1984) The neolithic transition
and the genetics of populations in Europe. Princeton University
Press, Princeton
Banniard M (1989) Genèse culturelle de l’Europe - Ve au VIII
siècle. Le Seuil, Paris
Barbujani G (1987) Autocorrelation of gene frequencies under
isolation by distance. Genetics 117:777–782
Barbujani G, Chikhi L (2000) Genetic population structure of
Europeans inferred from nuclear and mitochondrial DNA
polymorphisms. In: Renfrew C, Boyle K (eds) Archaeogenetics:
DNA and the population prehistory of Europe. McDonald
Institute for Archaeological Research, Cambridge, pp 119–129
Bartlett R (1994) The making of Europe: conquest, colonization,
and cultural change, 950-1350. Princeton University Press,
Princeton
Boyd R, Silk JB (1997) How humans evolved. WW Norton, New
York
Chikhi L, Nichols RA, Barbujani G, Beaumont MA (2002) Y ge-
netic data support the Neolithic demic diffusion model. Proc
Natl Acad Sci U S A 99:11008–11013
Cooper DN, Krawczak M (1993) Human gene mutation. BIOS
Scientific Publishers, Oxford
Ewens WJ (1972) The sampling theory of selectively neutral alleles.
Theor Popul Biol 3:87–112
Excoffier L, Smouse PE (1994) Using allele frequencies and geo-
graphic subdivision to reconstruct gene trees within a species:
molecular variance parsimony. Genetics 136:343–359
Excoffier L, Smouse PE, Quattro JM (1992) Analysis of molecular
variance inferred from metric distances among mtDNA hapl-
otypes: application to human mitochondrial DNA restriction
data. Genetics 131:479–491
Gieysztor A, Kieniewicz S, Rostworowski E, Tazbir J, Wereszycki
H (1979) History of Poland. Polish Scientific Publishers, War-
saw
Gill P, Brenner C, Brinkmann B, Budowle B, Carracedo A, Jobling
MA, de Knijff P, Kayser M, Krawczak M, Mayr W, Morling
N, Olaisen B, Pascali VL, Prinz M, Roewer L, Schneider PM,
Sajantila A, Tyler-Smith C (2001) DNA commission of the
international society of forensic genetics: recommendations on
forensic analysis using Y chromosome STRs. Int J Legal Med
114:305–309
Gimbutas M (1971) The Slavs. Thames & Hudson, London
Helgason A, Sigurdadottir S, Nicholson J, Sykes B, Hill EW,
Bradley DG, Bosnes V, Gulcher JR, Ward R, Stefansson K
(2000) Estimating Scandinavian and Gaelic ancestry in the male
settlers of Iceland. Am J Hum Genet 67:697–717
Jobling MA, Tyler-Smith C (2003) The human Y chromosome: an
evolutionary marker comes of age. Nat Rev Genet 4:598–612
Jobling MA, Hurles M, Tyler-Smith C (2004) Human evolutionary
genetics. Garland Science, New York
Kayser M, Cagliá A, Corach D, Fretwell N, Gehrig C, Graziosi G,
Heidorn F, Herrmann S, Herzog B, Hidding M, Honda K,
Jobling M, Krawczak M, Leim K, Meuser S, Meyer E, Oest-
erreich W, Pandya A, Parson W, Piccinini A, Perez-Lezaun A,
Prinz M, Schmitt C, Schneider PM, Szibor R, Teifel-Greding J,
Weichhold G, de Knijff P, Roewer L (1997) Evaluation of Y
chromosomal STRs: a multicenter study. Int J Legal Med
110:125–133
Kayser M, Roewer L, Hedman M, Henke L, Henke J,
Brauer S, Krüger C, Krawczak M, Nagy M, Dobosz T,
Szibor R, de Knijff P, Stoneking M, Sajantila A (2000)
Characteristics and frequency of germline mutations at mi-
crosatellite loci from the human Y chromosome, as revealed
by direct observation in father/son pairs. Am J Hum Genet
66:1580–1588
Kayser M, Krawczak M, Excoffier L, Dieltjes P, Corach D, Pascali
V, Gehrig C, Bernini LF, Jespersen J, Bakker E, Roewer L, de
Knijff P (2001) Extensive analysis of chromosome Y microsat-
ellite haplotypes from globally dispersed human populations.
Am J Hum Genet 68:990–1018
Kittler R, Erler A, Brauer S, Stoneking M, Kayser M (2003)
Apparent intrachromosomal exchange on the human Y chro-
mosome explained by population history. Eur J Hum Genet
11:304–314
de Knijff P (2000) Messages through bottlenecks: on the combined
use of slow and fast evolving polymorphic markers on the hu-
man Y chromosome. Am J Hum Genet 67:1055–1061
Lessig R, Edelmann J, Krawczak M (2001) Population genetics of
Y-chromosomal microsatellites in Baltic males. Forensic Sci Int
118:153–157
Lessig R, Edelmann J, Zoledziewska M, Dobosz T, Fahr K,
Kostrzewa M (2004) SNP-genotyping on human Y-chromo-
some for forensic purposes—comparison of two different
methods. In: Progress in forensic genetics, vol 10, Elsevier,
Amsterdam, pp 334–336
Merriman J (1996) A history of modern Europe. WW Norton, New
York
Nei M (1987) Molecular evolutionary genetics. Columbia Univer-
sity Press, New York
Oota H, Settheetham-Ishida W, Tiwawech D, Ishida T, Stoneking
M (2001) Human mtDNA and Y-chromosome variation is
correlated with matrilocal versus patrilocal residence. Nat
Genet 29:20–21
Ploski R, Wozniak M, Pawlowski R, Monies DM, Branicki W,
Kupiec T, Kloostermann A, Dobosz T, Bosch E, Nowak E,
Lessig R, Jobling MA, Roewer L, Kayser M (2002) Homoge-
neity and distinctiveness of Polish paternal lineages revealed by
Y chromosome microsatellite haplotype analysis. Hum Genet
110:592–600
Richards M, Macaulay V, Torroni A, Bandelt HJ (2002) In search
of geographical patterns in European mitochondrial DNA. Am
J Hum Genet 71:1168–1174
Roewer L, Kayser M, Dieltjes P, Nagy M, Bakker E, Krawczak M,
de Knijff P (1996) Analysis of molecular variance (AMOVA) of
Y-chromosome specific microsatellites in two closely related
human populations. Hum Mol Genet 5:1029–1033 (Erratum in
Hum Mol Genet 6:828 1997)
Roewer L, Kayser M, de Knijff P, Anslinger K, Caglià A, Corach
D, Füredi S, Henke L, Hidding M, Kärgel H-J, Lessig R, Nagy
M, Pascali VL, Parson W, Rolf B, Schmitt C, Szibor R, Teifel-
Greding J, Krawczak M (2000) A new method for the evalua-
tion of matches in non-recombining genomes: application to Y-
chromosomal short tandem repeat (STR) haplotypes in Euro-
pean males. Forensic Sci Int 114:31–43
Roewer L, Krawczak M, Willuweit S, Nagy M, Alves C, Amorim
A, Anslinger K, Augustin C, Betz A, Bosch E, Caglià A, Carracedo
A, Corach D, Dekairelle AF, Dobosz T, Dupuy BM,
Füredi S, Gehrig C, Gusmao L, Henke J, Henke L, Hidding M,
Hohoff C, Hoste B, Jobling MA, Kärgel HJ, de Knijff P, Lessig
R, Liebeherr E, Lorente M, Martínez-Jarreta B, Nievas P,
Nowak M, Parson W, Pascali VL, Penacino G, Ploski R, Rolf
B, Sala A, Schmidt U, Schmitt C, Schneider PM, Szibor R,
Teifel-Greding J, Kayser M (2001) Online reference database of
European Y-chromosomal short tandem repeat (STR) haplo-
types. Forensic Sci Int 118:106–113
Rolf B, Meyer E, Brinkmann B, de Knijff P (1998) Polymorphism
at the tetranucleotide repeat locus DYS389 in 10 populations
reveals strong geographic clustering. Eur J Hum Genet 6:583–
588
Rosser ZH, Zerjal T, Hurles ME, Adojaan M, Alavantic D, Am-
orim A, Amos W, Armenteros M, Arroyo E, Barbujani G,
Beckman L, Bertranpetit J, Bosch E, Bradley DG, Brede G,
Cooper GC, Corte-Real H, de Knijff P, Decorte R, Dubrova
YE, Evgravov O, Gilissen A, Glisic S, Gölge M, Hill EW,
Jeziorovska A, Kalaydijeva L, Kayser M, Kravcenko SA,
Lavinha J, Livshits LA, Maria S, McElreavey K, Meitinger TA,
Melegh B, Mitchell RJ, Nicholson J, Norby S, Noveletto A,
Pandya A, Parik J, Patsalis PC, Pereira L, Peterlin B, Pielberg
G, Prata MJ, Previderè C, Rajczy K, Roewer L, Rootsi S,
Rubinsztein DC, Saillard J, Santos FR, Shlumukova M, Ste-
fanescu G, Sykes BC, Tolun A, Villems R, Tyler-Smith C, Jo-
bling MA (2000) Y-chromosomal diversity within Europe is
clinal and influenced primarily by geography rather than lan-
guage. Am J Hum Genet 67:1526–1543
Schneider S, Roessli D, Excoffier L (2000) Arlequin: A software for
population genetics data analysis, version 2.000. Genetics and
Biometry Lab, Dept. of Anthropology, University of Geneva
Seielstad MT, Minch E, Cavalli-Sforza LL (1998) Genetic evidence
for a higher female migration rate in humans. Nat Genet
20:278–280
Semino O, Passarino G, Oefner PJ, Lin AA, Arbuzova S, Beckman
LE, de Benedictis G, Francalacci P, Kouvatsi A, Limborska S,
Marcikiae M, Mika A, Mika B, Primorac D, Santachiara-
Benerecetti AS, Cavalli-Sforza LL, Underhill PA (2000) The
genetic legacy of Paleolithic Homo sapiens sapiens in extant
Europeans: a Y chromosome perspective. Science 290:1155–
1159
Shriver MD, Jin L, Chakraborty R, Boerwinkle E (1993) VNTR
allele frequency distributions under the stepwise mutation
model: a computer simulation approach. Genetics 134:983–993
Sokal RR, Oden NL (1978) Spatial autocorrelation in biology. 1.
Methodology. Biol J Linn Soc 10:199–228
Weale ME, Yepiskoposyan L, Jager RF, Hovhannisyan N,
Khudoyan A, Burbage-Hall O, Bradman N, Thomas MG
(2001) Armenian Y chromosome haplotypes reveal strong re-
gional structure within a single ethno-national group. Hum
Genet 109:659–674
Weale ME, Weiss DA, Jager RF, Bradman N, Thomas MG (2002)
Y chromosome evidence for Anglo-Saxon mass migration. Mol
Biol Evol 19:1008–1021
Wells RS, Yuldasheva N, Ruzibakiev R, Underhill PA, Evseeva I,
Blue-Smith J, Jin L, Su B, Pitchappan R, Shanmugalakshmi S,
Balakrishnan K, Read M, Pearson NM, Zerjal T, Webster MT,
Zholoshvili I, Jamarjashvili E, Gambarov S, Nikbin B, Dostiev
A, Aknazarov O, Zalloua P, Tsoy I, Kitaev M, Mirrakhimov
M, Chariev A, Bodmer WF (2001) The Eurasian heartland: a
continental perspective on Y-chromosome diversity. Proc Natl
Acad Sci U S A 98:10244–10249