
The Andalusian population from Huelva reveals a high diversification of Y-DNA paternal lineages from haplogroup E: Identifying human male movements within the Mediterranean space

Taylor & Francis
Annals of Human Biology
Annals of Human Biology, January–February 2010; 37(1): 86–107
ORIGINAL ARTICLE
The Andalusian population from Huelva reveals a high
diversification of Y-DNA paternal lineages from
haplogroup E: Identifying human male movements within
the Mediterranean space
B. AMBROSIO¹, J. M. DUGOUJON², C. HERNÁNDEZ¹, D. DE LA FUENTE¹, A. GONZÁLEZ-MARTÍN¹,
C. A. FORTES-LIMA¹, A. NOVELLETTO³, J. N. RODRÍGUEZ⁴ & R. CALDERÓN¹
¹Departamento de Zoología y Antropología Física, Facultad de Biología, Universidad Complutense,
Madrid 28040, Spain, ²Laboratoire d'Anthropologie, FRE 2960, Centre National de la Recherche
Scientifique (CNRS), Université Paul Sabatier, Toulouse 31073, France, ³Dipartimento di Biologia,
Università Tor Vergata, Rome 00133, Italy, and ⁴Servicio de Hematología, Hospital Juan Ramón
Jiménez, Huelva, 21005, Spain
(Received 22 April 2009; accepted 28 July 2009)
Abstract
Background: Gene flow among human populations is generally interpreted in terms of complex
patterns, with the observed gene frequencies being the consequence of the entire genetic and
demographic histories of the population.
Aims: This study performs a high-resolution analysis of the Y-chromosome haplogroup E in Western
Andalusians (Huelva province). The genetic information presented here provides new insights into
migration processes that took place throughout the Mediterranean space and tries to evaluate its impact
on the current genetic composition of the most southwestern population of Spain.
Subjects and methods: 167 unrelated males were previously typed for the presence/absence of the
Y-chromosome Alu polymorphism (YAP). The group of YAP (+) Andalusians was genotyped for 16
Y-SNPs and also characterized for 16 Y-STR loci.
Results: The distribution of E-M81 haplogroup, a Berber marker, was found at a frequency of 3% in our
sample. The distribution of M81 frequencies in Iberia seems to be not concordant with the regions
where Islamic rule was most intense and long-lasting. The study also showed that most of M78 derived
allele (6.6%) led to the V13* subhaplogroup. We also found the most basal and rare paragroup M78*
and others with V12 and V65 mutations. The lineage defined by M34 mutation, which is quite frequent
in Jews, was detected as well.
Conclusions: The haplogroup E among Western Andalusians revealed a complex admixture of genetic
markers from the Mediterranean space, with interesting signatures of populations from the Middle East
and the Balkan Peninsula and a surprisingly low influence by Berber populations compared to other
areas of the Iberian Peninsula.
Keywords: Y-SNPs, genealogical history, Mediterranean gene pool, Iberia, human migrations, source
populations
Correspondence: Prof. Rosario Calderón, Departamento de Zoología y Antropología Física, Facultad de Biología, Universidad
Complutense, Ciudad Universitaria, 28040 Madrid, Spain. E-mail: rcalfer@bio.ucm.es
ISSN 0301-4460 print/ISSN 1464-5033 online © 2009 Informa UK Ltd.
DOI: 10.3109/03014460903229155
Introduction
Andalusia, a large and relatively densely populated region of southern Spain, has a long
history shaped by migrations from different parts of the world at different times, including a
relatively recent long Islamic settlement. Despite its implications for the peopling of the
Iberian Peninsula and its relevance to the evolution of modern Homo sapiens, the genetic
composition of the Andalusian people has never been studied in depth.
Within Andalusia, the westernmost province of Huelva is of particular interest, due to its
geographic position. It is located on the western fringe of Europe, bordering Portugal and
the Atlantic Ocean, and is also near the Strait of Gibraltar, which has served as both a
genetic barrier and corridor at different times. Its population is of moderate size and has
experienced a slow demographic growth. Furthermore, gene flow from contemporary
immigrants has been minimal, and the main features of the autochthonous population
were preserved.
The Y chromosome contains the largest non-recombining region (NRY) of the human
genome. It is a haploid locus harbouring a great deal of information that has found wide
applications in a range of fields such as human evolutionary, forensic, and medical genetics
(Underhill et al. 2000; Shastry 2002; Jobling and Tyler-Smith 2003; Novelletto 2007;
Campbell and Tishkoff 2008). Current knowledge of the Y-tree topology of the NRY region could
be qualified as both refined and complex, and much has come from the ongoing identification
of new single-nucleotide polymorphisms (SNPs) and lineages in the human
population (de Knijff 2000; Hammer et al. 2000; Underhill et al. 2001; Jobling and
Tyler-Smith 2003; Cruciani et al. 2004, 2006, 2007; Behar et al. 2006; Sims et al.
2007; Karafet et al. 2008). The importance of the molecular and evolutionary characteristics
of SNPs becomes even more evident if we consider that the NRY region, just like the
mitochondrial DNA (mtDNA) genome, has a small effective population size (approximately
one fourth that of autosomes) (Hartl and Clark 1989; Hammer et al. 1997) which enhances
the signal of inter-population divergence.
Many recent studies on Y chromosome and mtDNA variation have largely focused on
in-depth analyses of European, North African, and Western Asian populations, all of which have
known historical relationships within the Mediterranean area (Hammer et al. 1997; Rosser et al.
2000; Scozzari et al. 2001; Cruciani et al. 2002, 2004, 2007; Semino et al. 2004; Cinnioğlu et al.
2004; Roewer et al. 2005; Torroni et al. 2006 among others). Research into these topics is
providing interesting insights into demographic changes, migratory patterns, and admixture
episodes that have occurred during recent human evolution.
The polymorphic presence of an Alu element in the Y chromosome defines a deep-rooting
clade containing haplogroups D and E of the phylogenetic tree of the Y-chromosome
haplogroups (see the human Y-chromosomal haplogroup tree at
http://ycc.biosci.arizona.edu/ (Y-Chromosomal Consortium)). In a recently published paper,
Karafet et al. (2008) reported that haplogroup E is characterized by a high number of mutations
and is indeed one of the most mutationally diverse of the 20 major Y-chromosome clades. These
particularities make it especially apt for investigating recent human migrations. Furthermore, many
populations around the world have already been studied in search of this haplogroup and can
therefore be used for comparison purposes. While haplogroup D seems to be confined to
Asia, haplogroup E (mainly E1b1b1, formerly E3b, lineages), with its strong phylogeographic
structure, is more varied and appears to be highly frequent in Africa and moderately
so in southern Europe and in other regions of the Mediterranean, including the Levant
(Hammer et al. 1997, 2000; Underhill and Roseman 2001; Weale et al. 2003; Cruciani et al.
2004, 2007; Sims et al. 2007). E1b1a (formerly E3a) lineages, in contrast, are associated
with sub-Saharan Africa, and the Iberian Peninsula is one of the regions in the Mediterranean
area in which the two major monophyletic E subclades, E1b1b1 and E1b1a, exist,
albeit with varying frequencies between populations (Cruciani et al. 2004; Semino et al.
2004; Beleza et al. 2005; Neto et al. 2007).
The aim of this study was to perform a high-resolution analysis of haplogroup E in the
Andalusian population of Huelva and to compare the lineages observed with those in other
Iberian, European, North African, and more distant Mediterranean populations. The
genetic information presented here will provide a reliable background against which to
discuss new insights into migration processes that took place throughout the Mediterranean
space mainly during the period ranging from the protohistoric era, when the Tartessian
civilization flourished (from before the 11th to the 5th centuries BC), to the time of the rise and fall
of Islamic rule and to thus evaluate the impact of these processes on the current genetic
composition of the most southwestern population of Spain.
Materials and methods
Population samples and geographical sampling
Blood samples were collected by venepuncture into EDTA tubes by a group of doctors and
nurses from Hospital Juan Ramón Jiménez in Huelva city, accompanied by researchers (RC
and BA) from Universidad Complutense de Madrid. Participants were asked about the
origins of their parents and grandparents and their genetic relationships with other donors
contributing to this study. The sampling strategy was designed to be as representative as
possible and to include individuals from throughout the province (not including the city of
Huelva) and municipalities whose population size had remained more or less constant over
the last 2 centuries (http://www.ine.es). The municipalities chosen for sampling were
Aracena, El Repilado, El Cerro del Andévalo, La Puebla de Guzmán, Valverde del Camino,
Villablanca, and Niebla (Figure 1). Between 2004 and 2007, seven field trips were made to
collect blood samples from 302 unrelated, healthy autochthonous males and females from
45 demographic units located throughout the province. Written informed consent was
obtained from all individuals prior to their participation. Additional information on the
geography, history, demography, and archaeology of the Andalusian province of Huelva and
its population can be found in Calderón et al. (2006) and references therein.
Laboratory analysis: Polymorphisms and haplotyping
Genomic DNA was extracted using a standard proteinase-K digestion followed by
phenol–chloroform extraction and ethanol precipitation, with some modifications. A total of 167
unrelated males described above were initially analysed for the presence/absence of the
Y-chromosome Alu polymorphism (YAP) following the recommendations of Hammer et al. (1997).
Following a descending hierarchical order, we performed a high-resolution search of the
Y-chromosome binary E1b1b1 haplogroups, first characterizing the following Y-SNPs: M96, M35, M78,
M81, M123, M281 and V6. Individuals with the M78 derived state were also genotyped for
the binary markers V12, V13, V22, V27, V32 and V65 described in Cruciani et al. (2006,
2007). Those individuals positive for the M81 mutation were genotyped for the internal
lineages M107 and M165. Samples identified as M123 were also genotyped for M34.
Binary markers that were genotyped but not detected in our sample of YAP (+) Andalusian
males were V32, V27, V22, M107, M165, M281 and V6.
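The descending typing order described above can be pictured as a walk down a small marker tree. The following is a minimal sketch, not the authors' laboratory workflow: the `TREE` dictionary and the `terminal_marker` function are hypothetical names, and the nesting of V32 under V12 and of V27 under V13 is assumed from the Karafet et al. (2008) lineage names rather than stated in this section.

```python
# Hypothetical encoding of the marker hierarchy typed in this study.
# The placement of V32 below V12 and V27 below V13 is an assumption
# based on the Karafet et al. (2008) lineage names.
TREE = {
    "M96": ["M35"],
    "M35": ["M78", "M81", "M123", "M281", "V6"],
    "M78": ["V12", "V13", "V22", "V65"],
    "V12": ["V32"],
    "V13": ["V27"],
    "M81": ["M107", "M165"],
    "M123": ["M34"],
}


def terminal_marker(derived, root="M96"):
    """Return the most downstream marker observed in the derived state.

    `derived` is the set of markers typed as derived in one YAP(+) male.
    Returns None when the root marker itself is not derived.
    """
    if root not in derived:
        return None
    node = root
    while True:
        hits = [child for child in TREE.get(node, []) if child in derived]
        if not hits:
            return node  # no derived marker further downstream
        if len(hits) > 1:
            raise ValueError(f"conflicting derived states below {node}: {hits}")
        node = hits[0]


print(terminal_marker({"M96", "M35", "M78", "V13"}))  # V13
print(terminal_marker({"M96", "M35"}))                # M35, i.e. paragroup E-M35*
```

A chromosome whose walk stops at an internal marker such as M35 or M78 corresponds to a paragroup (E-M35*, E-M78*), since none of the typed downstream markers is derived.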
To genotype the Y binary markers, we followed standard protocols already described in
the literature using the following methods: polymerase chain reaction (PCR)-restriction
fragment length polymorphism (RFLP) analysis, SNP multiplexing, and direct sequencing.
Markers such as M96 and M35 were first amplified by duplex PCR and then by single base
extension using the SNaPshot multiplex kit (Applied Biosystems, Foster City, CA, USA)
described by Brion et al. (2005). M78, V12, V13, V22, V27 and V32 were amplified using
previously published primers (Underhill et al. 2001; Cruciani et al. 2006) and their
corresponding allelic states were diagnosed by means of the restriction enzymes AciI
(also for V13), BsgI, MmeI, PvuII and MnlI, respectively, following the protocol given
in Cruciani et al. (2006). Other markers such as V65, M81, M107, M165, M123, M34,
M281 and V6 were genotyped with published primers (Underhill et al. 2001; Cruciani et al.
2004) by sequencing both strands (BigDye Terminator kit v.3.1) using an ABI Prism 3730
DNA analyzer (Applied Biosystems).
Y chromosomes identified by the presence of biallelic polymorphisms (mutations) or SNPs
are called haplogroups or subhaplogroups if they are defined by a terminal mutation within a
given haplogroup. We used the nomenclature system proposed by Karafet et al. (2008) and
adopted by the Y Chromosomal Consortium to name haplogroups/subhaplogroups defined by
the presence of a binary polymorphism. We also consulted information provided by the
International Society of Genetic Genealogy (http://www.isogg.org/tree/).
Figure 1. The geographic distribution of the municipalities sampled within the Huelva province (Andalusia, Spain).
Y-microsatellite (Y-STR) markers
All the samples from our group of YAP(+) Andalusian males were also characterized for 16
short tandem repeat (STR) loci using the AmpFlSTR Yfiler PCR amplification kit (Applied
Biosystems). The loci were DYS456 [(AGAT)n], DYS389I/II [(TCTG)n(TCTA)n], DYS390
[(TCTA)n(TCTG)n], DYS458 [(GAAA)n], DYS19 [(TAGA)n], DYS385a/b [(GAAA)n],
DYS393 [(AGAT)n], DYS391 [(TCTA)n], DYS439 [(AGAT)n], DYS635 (Y GATA C4)
[(TATC)n], DYS392 [(TAT)n], Y GATA H4 [(TAGA)n], DYS438 [(TTTTC)n], DYS437
[(TCTA)n], DYS448 [(AGAGAT)n]. Alleles were detected using 5′-labelled fluorescent
primers, an ABI3100 capillary sequencer (Applied Biosystems), internal size standards,
and GeneMapper fragment analysis. In accordance with the recommendations of the
International Society of Forensic Genetics (ISFG) (Gill et al. 2001), Y-STR alleles were
named according to the number of variable repeats included. Alleles at DYS389II were
considered after subtracting the variation at DYS389I (Cooper et al. 1996).
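The DYS389 correction can be illustrated with a short sketch. The DYS389II amplicon contains the DYS389I repeats, so the DYS389I count is subtracted to obtain the variation specific to DYS389II. The function name `dys389_segments` is hypothetical, and the input values are the DYS389I and DYS389II alleles listed for haplotype H13 in Table II.

```python
def dys389_segments(dys389i, dys389ii):
    """Split a DYS389 typing into DYS389I and the DYS389II-specific repeats.

    DYS389II is scored over a fragment that includes DYS389I, so the
    independent component is the difference between the two allele calls.
    """
    if dys389ii < dys389i:
        raise ValueError("DYS389II includes DYS389I and cannot be shorter")
    return dys389i, dys389ii - dys389i


# Allele calls for haplotype H13 in Table II: DYS389I = 14, DYS389II = 30.
print(dys389_segments(14, 30))  # (14, 16)
```

Using the difference prevents a single mutation in the DYS389I segment from being counted twice when repeat variances or mutational steps are computed across loci.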
Data analysis
Haplogroup and haplotype diversity, as described by Nei (1987, p. 187), as well as sampling
variance were estimated using ARLEQUIN software (version 3.01) (Excoffier et al. 2005). The
combination of alleles at multiple SNPs defines an NRY haplogroup, whereas the
combination at multiple Y-STRs on a single Y chromosome defines a Y-STR haplotype (de Knijff
2000). E-M78 and E-M81 and their associated microsatellite haplotypes with frequencies of 5 in
a set of Mediterranean population samples taken from the literature, namely four from Northern
Africa (Egypt (n = 2), Algeria, and Tunisia), three from Southern Europe (Italy, Portugal and the
study sample) and one from Western Eurasia (Turkey), were used to infer mutational relationships
between haplotype sequences based on five Y-STR loci (DYS19, DYS390, DYS391, DYS392,
DYS393). The phylogenetic pattern was visualized using the Reduced Median (RM) network
algorithm (NETWORK 4.5 program; http://www.fluxus-engineering.com/) (Bandelt et al.
1999). Microsatellites were weighted proportionally to the inverse of the repeat variance for
each haplogroup to reduce network reticulations.
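As an illustration of the diversity statistic, the sketch below computes Nei's unbiased diversity, h = n/(n − 1) · (1 − Σ pᵢ²), from a list of haplogroup or haplotype labels. It is a minimal re-implementation, not ARLEQUIN's code, and the function name `nei_diversity` is hypothetical. The inputs are the counts given in the Results and in Table II.

```python
from collections import Counter


def nei_diversity(labels):
    """Nei's unbiased diversity: n / (n - 1) * (1 - sum of squared frequencies)."""
    n = len(labels)
    if n < 2:
        raise ValueError("need at least two chromosomes")
    counts = Counter(labels)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


# E-M81 haplotypes in Table II: H13 observed four times, H14 once.
e_m81 = ["H13"] * 4 + ["H14"]
print(round(nei_diversity(e_m81), 4))  # 0.4

# Haplogroup counts among the 20 YAP(+) males from Huelva.
e_m35 = (["M35*"] * 2 + ["M78*"] * 2 + ["V12"] + ["V13"] * 7
         + ["V65"] + ["M81"] * 5 + ["M34"] * 2)
print(round(nei_diversity(e_m35), 4))  # 0.8211
```

Both values agree with the point estimates reported in the Results (0.4000 for E-M81 haplotypes and 0.8211 for the E-M35 clade); the sampling variances are not reproduced here.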
The geographical variation for E subhaplogroups was analysed by hierarchical cluster
analysis (HCA) using the statistical program SPAD (Système Portable Pour l'Analyse de
Données; Lebart et al. 1984). HCA was performed on the basis of Euclidean distances and
Ward's linkage algorithm, and analysis of variance was used to evaluate distances between
clusters. We included data sets from 73 population samples in the literature: 52 from Europe
(17 from the Iberian mainland and two from Iberian islands), 13 from North Africa
(including eight from Morocco), four from the Middle East, and four from Western
Asia. Genetic information was based on population frequencies of the following E lineages:
E-M35*, E-M78, E-M81, E-M123+E-M34 and E*(xE1b1b1), which contained lineages
with interesting geographic variation patterns. Populations were denoted by the first two
letters used in country code top-level domains for Internet addresses
(http://www.iana.org/cctld/). The sample size for each population sample selected was ≥20.
In the present state of the literature, sample sizes of several interesting populations are lower
than 50 individuals, and thus the standard deviations can be high in relation to the sample
frequencies. However, those samples have been included in Table I in accordance with the great
majority of the published studies on the subject.
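The clustering step can be sketched in plain Python. This is an illustrative re-implementation of Ward's agglomeration on squared Euclidean distances using the Lance-Williams update, not the SPAD procedure used in the study. The names `FREQS`, `sq_euclid` and `ward_merges` are hypothetical, and only six of the 73 populations are included: those rows of Table I in which every subhaplogroup cell can be read unambiguously.

```python
from itertools import combinations

# Frequencies (%) of E-M35*, E-M78, E-M81, E-M123+E-M34 and E*(xE-M35)
# for six populations from Table I (a subset chosen for illustration).
FREQS = {
    "ESAH": [1.20, 6.59, 2.99, 1.20, 0.00],    # Andalusians (Huelva)
    "PTML": [0.90, 4.10, 5.60, 1.20, 0.70],    # Mainland Portugal
    "PTAZ": [1.30, 6.10, 3.70, 0.90, 0.90],    # Azores Islands
    "ITCL1": [1.30, 16.30, 1.30, 2.50, 1.30],  # Southern Italians (Calabria 1)
    "MABM": [3.40, 6.90, 72.40, 3.40, 3.40],   # Marrakesh (Berbers)
    "TN": [3.40, 15.50, 27.60, 5.20, 3.40],    # Tunisians
}


def sq_euclid(a, b):
    """Squared Euclidean distance between two frequency vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def ward_merges(data):
    """Agglomerate with Ward's criterion; return the list of merged member tuples."""
    clusters = {name: (name,) for name in data}  # cluster id -> members
    dist = {frozenset(p): sq_euclid(data[p[0]], data[p[1]])
            for p in combinations(data, 2)}
    merges = []
    while len(clusters) > 1:
        pair = min(dist, key=dist.get)           # closest pair of clusters
        i, j = sorted(pair)
        d_ij = dist.pop(pair)
        new_id = i + "+" + j
        ni, nj = len(clusters[i]), len(clusters[j])
        for k in clusters:
            if k in (i, j):
                continue
            nk = len(clusters[k])
            d_ki = dist.pop(frozenset((k, i)))
            d_kj = dist.pop(frozenset((k, j)))
            # Lance-Williams update for Ward's method on squared distances.
            dist[frozenset((k, new_id))] = (
                (ni + nk) * d_ki + (nj + nk) * d_kj - nk * d_ij
            ) / (ni + nj + nk)
        merges.append((clusters[i], clusters[j]))
        clusters[new_id] = clusters.pop(i) + clusters.pop(j)
    return merges


for left, right in ward_merges(FREQS):
    print(" + ".join(left), "|", " + ".join(right))
```

With these six rows, the first merge joins Huelva (ESAH) with the Azores (PTAZ), the pair with the smallest squared distance. The result for the full 73-population data set may differ, since cluster membership depends on every population included.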
Table I. Population frequencies of Y-chromosome haplogroup E and its subclades.
Hg E Frequency of E subhaplogroup (%)
Populations ACRN Region n No. % E-M35* E-M78 E-M81 E-M123+E-M34 E* (xE-M35) References
1. Spanish Basques 1 ESB1 IB 48 1 2.10 2.10 Semino et al. 2004
2. Spanish Basques 2 ESB2 IB 55 2 3.60 3.60 Cruciani et al. 2004
3. Spanish Basques 3 ESB3 IB 45 1 2.20 2.20 Underhill et al. 2000
4. Spanish Basques (Guipúzcoa) ESBG IB 74 1 1.35 1.35 Alonso et al. 2005
5. Pasiegos ESPA IB 56 24 42.90 1.80 41.10 Cruciani et al. 2004
6. Cantabrians ESCB IB 70 9 12.90 8.60 4.30 Flores et al. 2004
7. Asturians ESAT IB 90 12 13.30 10.00 2.20 1.10 Cruciani et al. 2004
8. Catalans ESCN IB 33 2 6.00 3.00 3.00 Semino et al. 2004
9. Andalusians (Huelva) ESAH IB 167 20 11.98 1.20 6.59 2.99 1.20 Present study
10. Andalusians (Cordoba) ESAC IB 27 3 11.10 3.70 7.40 Flores et al. 2004
11. Andalusians (Seville) ESAS IB 155 11 7.00 0.60 4.50 1.30 0.60 Flores et al. 2004
12. Andalusians 1 ESA1 IB 76 7 9.20 3.90 5.30 Semino et al. 2004
13. Andalusians 2 ESA2 IB 62 4 6.40 1.60 3.20 1.60 Cruciani et al. 2004
14. Tenerife ESTN IB 178 33 18.60 3.40 10.70 3.90 0.60 Flores et al. 2003
15. Mainland Portugal PTML IB 657 82 12.50 0.90 4.10 5.60 1.20 0.70 Neto et al. 2007
16. Northern Portuguese 1 PTN1 IB 50 7 14.00 2.00 4.00 4.00 4.00 Cruciani et al. 2004
17. Northern Portuguese 2 PTN2 IB 109 16 14.60 6.40 5.50 2.70 0.90 Flores et al. 2004
18. Southern Portuguese PTSU IB 49 9 18.30 4.10 12.20 2.00 Cruciani et al. 2004
19. Azores Islands PTAZ IB 319 42 12.90 1.30 6.10 3.70 0.90 0.90 Neto et al. 2007
20. French FR WEU 85 7 8.20 4.70 3.50 Cruciani et al. 2004
21. French Bearnais FRBN WEU 27 1 3.70 3.70 Semino et al. 2004
22. French Basques FRB1 WEU 45 0 0.00 Semino et al. 2004
23. Corsicans FRCO WEU 140 9 6.40 0.70 4.30 1.40 Cruciani et al. 2004
24. Sardinians 1 ITSD1 WEU 367 37 10.00 1.10 3.50 0.30 3.50 1.60 Cruciani et al. 2004
25. Sardinians 2 ITSD2 WEU 139 7 5.00 0.70 2.90 1.40 Semino et al. 2004
26. Danish DK WEU 35 1 2.90 2.90 Cruciani et al. 2004
27. Dutch NL WEU 34 0 0.00 Semino et al. 2004
28. Northern Italians ITN CEU 67 8 12.00 9.00 1.50 1.50 Cruciani et al. 2004
29. North-Central Italians ITNC CEU 56 6 10.70 10.70 Semino et al. 2004
30. Central Italians ITC CEU 89 12 13.40 11.20 2.20 Cruciani et al. 2004
31. Southern Italians ITS CEU 87 12 13.80 11.50 2.30 Cruciani et al. 2004
32. Southern Italians (Calabria 1) ITCL1 CEU 80 18 22.70 1.30 16.30 1.30 2.50 1.30 Semino et al. 2004
33. Southern Italians (Calabria 2) ITCL2 CEU 68 16 23.50 1.50 5.90 13.20 2.90 Semino et al. 2004
34. Southern Italians (Apulia) ITAP CEU 86 12 13.90 11.60 2.30 Semino et al. 2004
35. Sicilians 1 ITSY1 CEU 136 29 21.30 14.00 0.70 6.60 Cruciani et al. 2004
36. Sicilians 2 ITSY2 CEU 55 15 27.30 5.50 12.70 5.50 3.60 Semino et al. 2004
37. Mainland Croatia HRML EEU 108 6 5.60 5.60 Peričić et al. 2005
38. Herzegovinians BA EEU 141 12 8.50 8.50 Peričić et al. 2005
39. Serbians RS EEU 113 24 21.25 20.35 0.90 Peričić et al. 2005
40. Macedonians MK EEU 79 19 24.06 24.06 Peričić et al. 2005
41. Polish 1 PL1 EEU 99 4 4.00 4.00 Semino et al. 2004
42. Polish 2 PL2 EEU 38 1 2.60 2.60 Cruciani et al. 2004
43. Hungarians HU EEU 53 5 9.40 7.50 1.90 Semino et al. 2004
44. Estonians EE EEU 74 4 5.50 1.40 4.10 Cruciani et al. 2004
45. Russians RU EEU 42 0 0.00 Cruciani et al. 2004
46. Ukrainian UA EEU 93 8 8.60 7.50 1.10 Semino et al. 2004
47. Georgian GE EEU 41 0 0.00 Semino et al. 2004
48. Balkarian (Southern Caucasus) RUBK EEU 39 1 2.60 2.60 Semino et al. 2004
49. Bulgarians BG EEU 116 25 21.60 20.70 0.90 Cruciani et al. 2004
50. Albanians AL1 EEU 44 11 25.00 25.00 Semino et al. 2004
51. Northern Greeks (Macedonia) GRMA EEU 59 12 20.30 18.60 1.70 Semino et al. 2004
52. Greeks GR EEU 84 20 23.80 21.40 2.40 Semino et al. 2004
53. Turkish (Edirne) TRED EAS 52 8 15.38 11.54 3.85 Cinnioğlu et al. 2004
54. Turkish (Kars) TRKA EAS 82 12 14.63 8.54 6.10 Cinnioğlu et al. 2004
55. Turkish (Konya) TRKO EAS 90 8 8.89 1.11 1.11 6.67 Cinnioğlu et al. 2004
56. Turkish (Istanbul) TRIS EAS 81 13 16.05 7.41 4.94 3.70 Cinnioğlu et al. 2004
57. Iraqi IQ MDE 218 20 9.20 5.50 2.80 0.90 Semino et al. 2004
58. Lebanese LB MDE 42 8 19.10 11.90 2.40 4.80 Semino et al. 2004
59. Ashkenazim Jewish ILA MDE 77 14 18.20 1.30 5.20 11.70 Semino et al. 2004
60. Sephardim Jewish ILS MDE 40 12 30.00 2.50 12.50 5.00 10.00 Semino et al. 2004
61. Moroccan (Arabs) 1 MAA1 NAF 54 39 72.30 38.90 31.50 1.90 Cruciani et al. 2004
62. Moroccan (Arabs) 2 MAA2 NAF 49 37 75.50 42.90 32.60 Semino et al. 2004
63. Moroccan (Arabs) 3 MAA3 NAF 44 32 72.80 2.30 11.40 52.30 6.80 Semino et al. 2004
64. Moyen Atlas (Berbers) MABA NAF 69 60 86.90 10.10 71.00 5.80 Cruciani et al. 2004
65. Marrakesh (Berbers) MABM NAF 29 26 89.50 3.40 6.90 72.40 3.40 3.40 Cruciani et al. 2004
66. Morocco (Berbers) MAB NAF 64 55 85.90 10.90 68.70 6.30 Semino et al. 2004
67. Morocco (North-Central Berbers) MABN NAF 63 55 87.30 7.90 1.60 65.10 12.70 Semino et al. 2004
68. Morocco (Southern Berbers) MABS NAF 40 35 87.50 7.50 12.50 65.00 2.50 Semino et al. 2004
69. Saharawish EH NAF 29 24 82.70 75.90 6.80 Semino et al. 2004
70. Algerians DZ NAF 32 21 65.60 3.10 6.30 53.10 3.10 Semino et al. 2004
71. Tunisians TN NAF 58 32 55.10 3.40 15.50 27.60 5.20 3.40 Semino et al. 2004
72. Northern Egyptians EGN NAF 21 8 38.20 28.60 4.80 4.80 Cruciani et al. 2004
73. Southern Egyptians EGS NAF 34 6 17.60 17.60 Cruciani et al. 2004
Results
The clade E emerged as the second most common haplogroup in autochthonous Andalusians
from Huelva, with a high frequency (12%, 20/167 individuals) in comparison to
other Spanish populations. The corresponding value in neighbouring southern Portugal is
18% (Cruciani et al. 2004). Most of the Y chromosomes analysed in the Andalusian
population from Huelva belong to the R1b lineage (R-P25), with an incidence of 59.9%.
The E-M35 (E1b1b1) haplogroup variations observed in Huelva are shown in Figure 2,
and E subhaplogroup frequencies for the 73 populations selected for comparison are given in Table I.
The other important basal monophyletic subclade within haplogroup E, E1b1a, which is
particularly frequent in sub-Saharan Africa (Underhill et al. 2000; Cruciani et al. 2002), was
not observed in our population, although carriers of this subclade have been detected in
mainland Portugal and the Azores Islands with higher frequencies (0.2–0.9% of total Y
chromosomes) (Beleza et al. 2005; Gonçalves et al. 2005; Neto et al. 2007) than those
detected in other European populations.
A comparative analysis of the E haplogroups detected in the Andalusian population with
frequencies reported for the population data set in Table I revealed a number of interesting
findings. The E-M81 clade (five occurrences) accounted for 3% of the Y chromosomes
analysed in Huelva, all belonging to the paragroup E-M81* (E1b1b1b* lineage);
this frequency is similar to the mean value observed in Basques (Alonso et al. 2005) and
lower than that reported for the majority of Spanish (Adams et al. 2008; Capelli et al. 2009),
French (3.5%) and Portuguese populations (12% in mainland Portugal (Cruciani et al.
2004) and 4% in the Azores Islands (Neto et al. 2007)). Phylogeographic analysis of the
E-M81 lineages in Mediterranean populations has shown these lineages to be remarkably
frequent in Berbers (80% in Mozabites; 65–73% in Berbers from Morocco) (Cruciani et al.
2004; Semino et al. 2004) although their frequency declines sharply towards the north east
(Egypt 5%). E-M81 lineages are practically absent in Eastern Europe (Peričić et al. 2005)
and uncommon in Italy, with the exception of Sicily (5.5%) (Semino et al. 2004) where there
was an Islamic occupation that lasted for over two centuries (878–1091 AD). Presuming that
the presence of E-M81 in southwestern Europe is a signature of a North-African gene pool
shared also by Berbers, these migrants seem to have left a much smaller genetic imprint in
the male gene pool of Huelva than in other parts of Spain.
Figure 2. Phylogeny of the clade E-M35 (E1b1b1) with the 16 binary markers used in the genetic characterization of
the autochthonous population of the Huelva province. Lineage names are given according to YCC 2002
(Genome Res. 2002 12:339-348) and Karafet et al. 2008 (Genome Res. 2008 18:830-838); haplogroup
frequencies are shown as counts with relative frequencies (%) in parentheses.
Our study also showed that most of the Y chromosomes carrying the E-M78-derived allele
were further classified into the E-V13 subhaplogroup (E1b1b1a2). The frequency of E-V13 in
our study population was relatively high (seven occurrences, 4.2% of total) in comparison with
other Iberian populations. In North Africa, this subhaplogroup shows a mean frequency of
4.5%, as reported by Cruciani et al. (2007) (see Table I), and is indeed most common in
Albanians (32.30%), Macedonians, Greeks, and Bulgarians (15–18%) (Peričić et al. 2005). In
our analysis of the subclade E-M78, we also found two (1.2%) Y chromosomes in the most basal
and rare paragroup, E-M78*, one (0.60%) in E-V12, and, interestingly, a single occurrence of
E-V65 (0.60%). The average frequency of E-M78* has been estimated at 0.08% (Cruciani
et al. 2007) (see Table I) and the highest values have been registered in Egyptians from Gurna
(5.9%), followed by Moroccan Arabs (3.6%) and Sardinians (0.27%). No occurrences of this
subhaplogroup were found in any of the other 81 populations included in the analysis
by Cruciani et al. (2007). The subhaplogroup E-V12 has also been found in high frequencies
in Egyptians (ranging from 44% in the south to 6% in the north) and in Berbers (3.5%) and
Moroccan Jews (2%). The surprisingly high frequency observed in French Basques (6%, one
occurrence) seems to be due to the small size of the sample. The frequency of the
subhaplogroup E-V65 in the sample from Huelva coincides with the mean frequency reported
by Cruciani et al. (2007) in Table I. E-V65 is relatively frequent in Arabs from north Morocco
(29%) and Libya (20%) but less common in other groups from Morocco, and in Libyans,
Egyptians, Sardinians, and Sicilians.
Finally, E-M34 (E1b1b1c1*), a lineage internal to the E-M123 haplogroup, was found in
1.2% of the Huelvan sample (two occurrences). The E-M34 lineage has been found at
relatively high frequencies in Jews (10–12%) and in a sample from Calabria (13%). The
corresponding frequencies reported for Turkey and Tunisia are between 5% and 6%.
It is worth noting that two individuals (1.2% of total) in our study sample were found to
carry the derived state at M35 but not at any of the other known SNPs further downstream
(http://www.familytreedna.com/public/E3b/). The deep paragroup E-M35* (E1b1b1*, formerly
E3b*) is rare in Europe (~0.4%) but present in high frequencies in East Africa (8–17%)
(Cruciani et al. 2004). Other frequencies reported for this paragroup are 8% for Moroccan
Berbers, 5.5% for Sicilians, 3% for Algerians and Tunisians, 2.5% for Sephardic Jews, and
1.3% for Ashkenazi Jews (see references in Table I of present study).
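The percentages quoted in this section follow directly from the observed counts over the 167 males typed. The sketch below recomputes them; the binomial standard error is an addition for illustration (it is not reported in the text) and the names `COUNTS` and `freq_and_se` are hypothetical.

```python
from math import sqrt

N_MALES = 167  # males from Huelva typed for YAP

# Occurrences of each E lineage reported in the Results.
COUNTS = {"E-M35*": 2, "E-M78*": 2, "E-V12": 1, "E-V13": 7,
          "E-V65": 1, "E-M81": 5, "E-M34": 2}


def freq_and_se(count, n):
    """Relative frequency and its binomial standard error, both in percent."""
    p = count / n
    return 100 * p, 100 * sqrt(p * (1 - p) / n)


for name, count in COUNTS.items():
    pct, se = freq_and_se(count, N_MALES)
    print(f"{name}: {count}/{N_MALES} = {pct:.2f}% (SE {se:.2f})")

total = sum(COUNTS.values())
print(f"Haplogroup E: {total}/{N_MALES} = {100 * total / N_MALES:.2f}%")
```

The standard errors are of the same order as several of the frequencies themselves, which is why lineages observed once or twice are best read as presence/absence signals rather than precise estimates.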
Frequency and structure of Y-STR haplotypes associated with each E binary markers
detected in the study Andalusian population sample is shown in Table II. The analysis
initially revealed a high haplogroup diversity for the E-M35 clade (h = 0.8211 ± 0.06), with
the subclade E-M78 yielding a rather high level of SNP diversity (h = 0.5906 ± 0.15). The subclade
E-M78 also revealed a high level of internal Y-STR diversity (H = 0.9818 ± 0.05), with a
mean variance of 0.3933 ± 0.24. Conversely, the haplotype variability within the sister
subclade E-M81 was much lower (0.4000 ± 0.24). Four of the five males carrying the M81
marker had identical Y-haplotypes: DYS389I(14)-DYS389II(30)-DYS390(24)-DYS391
(9)-DYS392(11)-DYS393(13)-DYS385a/b(13,14)-DYS438(10)-DYS439(10). This finding
was accompanied by a very low mean variance in allele size (0.0444 ± 0.06). Within
E-M81* the fifth haplotype differed by only one mutational step at the DYS19 locus (13- to
14-repeat alleles). Analysis of variance at single Y-microsatellite loci showed that the
DYS390 locus yielded the highest value (0.70); the corresponding figure for DYS19 was
0.33, with the 13-repeat allele being the most frequent (15 of the 20 YAP+ chromosomes
detected in the study sample). An in-depth analysis of the relationship between
haplogroup E frequency and associated haplogroup/haplotype diversities within and between
large geographic areas in the Mediterranean would provide interesting insights into
population demographic histories.

Table II. Distribution of Y-chromosome haplotypes among the subhaplogroups E found in Huelva (Spain).

Haplotype  Haplogroup       DYS19  DYS389I  DYS389II  DYS390  DYS391  DYS392  DYS393  DYS385a,b  DYS438  DYS439  Frequency
H1         E1b1b1*-M35      14     13       30        22      9       11      14      12,13      10      11      1
H2         E1b1b1*-M35      14     13       30        22      9       11      14      13,13      10      11      1
H3         E1b1b1a*-M78     14     14       32        24      10      11      13      18,21      10      13      1
H4         E1b1b1a*-M78     15     14       31        25      11      11      13      16,20      10      12      1
H5         E1b1b1a1*-V12    13     13       31        24      11      11      13      16,17      10      12      1
H6         E1b1b1a2*-V13    13     13       30        24      10      11      12      17,19      10      12      1
H7         E1b1b1a2*-V13    13     13       30        24      10      11      13      16,17      10      14      1
H8         E1b1b1a2*-V13    13     13       30        24      10      11      13      16,18      10      12      1
H9         E1b1b1a2*-V13    13     13       30        24      10      11      13      16,18      10      13      2
H10        E1b1b1a2*-V13    13     13       30        25      10      11      13      16,18      10      14      1
H11        E1b1b1a2*-V13    13     13       31        23      10      11      13      17,18      10      12      1
H12        E1b1b1a4-V65     13     12       29        24      11      11      13      16,17      10      10      1
H13        E1b1b1b*-M81     13     14       30        24      9       11      13      13,14      10      10      4
H14        E1b1b1b*-M81     14     14       30        24      9       11      13      13,14      10      10      1
H15        E1b1b1c1*-M34    13     13       30        24      11      11      13      15,16      10      13      1
H16        E1b1b1c1*-M34    13     13       31        24      10      11      13      16,16      10      12      1
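As a cross-check on the diversity values quoted above, the following minimal sketch recomputes three of them from the Frequency column of Table II. It assumes, which this section does not state explicitly, that h and H are Nei's unbiased diversity, n/(n-1) × (1 - Σp_i²). The function name `nei_diversity` and the count lists are illustrative only, and the standard errors are not reproduced.

```python
def nei_diversity(counts):
    """Nei's unbiased diversity: n/(n-1) * (1 - sum of squared frequencies)."""
    n = sum(counts)
    if n < 2:
        raise ValueError("need at least two chromosomes")
    sum_sq = sum((c / n) ** 2 for c in counts)
    return n / (n - 1) * (1 - sum_sq)

# Counts read from Table II (assumption: one entry per distinct type).
# E-M35 clade, chromosomes per subhaplogroup: M35*, M78*, V12, V13, V65, M81, M34
e_m35_haplogroups = [2, 2, 1, 7, 1, 5, 2]
# E-M78 and derived lineages, chromosomes per Y-STR haplotype H3..H12 (H9 twice)
e_m78_haplotypes = [1, 1, 1, 1, 1, 1, 2, 1, 1, 1]
# E-M81, chromosomes per Y-STR haplotype: H13 (four copies), H14 (one copy)
e_m81_haplotypes = [4, 1]

for label, counts in [
    ("E-M35 haplogroup diversity", e_m35_haplogroups),
    ("E-M78 haplotype diversity", e_m78_haplotypes),
    ("E-M81 haplotype diversity", e_m81_haplotypes),
]:
    print(f"{label}: {nei_diversity(counts):.4f}")
```

Under this assumption the three values come out as 0.8211, 0.9818 and 0.4000, in agreement with the figures reported in the text.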
The RM network of E-M78 microsatellite haplotypes in a group of Mediterranean samples
available in the literature is shown in Figure 3. The population data set included a total of 36
distinct haplotypes. The structure of the haplotype network is highly diversified, which is to be
expected given the haplogroup substructure that it includes. The most common node (68 of
160 Y chromosomes) is represented by the modal haplotype 13-24-10-11-13 (for the loci
DYS19-DYS390-DYS391-DYS392-DYS393, respectively), shared by different populations
in mainland Italy, Turkey, and Portugal and to a lesser extent by southern Egyptians, Algerian
Arabs, and Andalusians from Huelva (four chromosomes) (for more details see the legend
to Figure 3). This node bears the highest number of connections with other haplotypes (n = 6),
strongly suggesting that it is the root, which is in turn the modal haplotype of the network
(Crandall and Templeton 1993). The longest lineage is composed of seven nodes, separated by
single mutational differences. Huelvan Y-haplotypes fall into three nodes (one specific copy in
each), which contain, in varying combinations, most of the populations noted above and other
neighbouring Mediterranean groups. Within this lineage, the third node (represented by the
Y-STR haplotype 14-24-10-11-13) is represented by males from northern and southern Egypt,
Italy, and Andalusia. We also detected one singleton (a haplotype represented by a single
individual) in the study sample. Haplotypes showing high frequencies are expected to be good
indicators of when a particular mutation originated, whereas those with low frequencies (rare
haplotypes), usually represented by a single individual, point to a recent evolutionary origin.
Indeed they occur preferentially at the tips of networks (Golding 1987; Excoffier and Langaney
1989). An RM E-M81 network for the same samples was also constructed but is not shown
here. The most common node for E-M81 (61 of 126 chromosomes) is represented by the
modal haplotype 13-24-9-11-13; this coincided with the four identical Y chromosomes
detected in Huelva and is shared in variable frequencies by most (n = 8) of the population
samples analysed.
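The node sizes and single-step connections described above can be illustrated with a small sketch. It is not the reduced-median algorithm used for Figure 3: it only tallies five-locus haplotypes and lists those that differ from the modal one by a single repeat. The data are the 11 E-M78 chromosomes from Huelva in Table II alone, not the 160 chromosomes of the published network, and the names `step_distance` and `huelva_e_m78` are hypothetical.

```python
from collections import Counter

# Locus order: DYS19, DYS390, DYS391, DYS392, DYS393
huelva_e_m78 = [
    (14, 24, 10, 11, 13),  # H3
    (15, 25, 11, 11, 13),  # H4
    (13, 24, 11, 11, 13),  # H5
    (13, 24, 10, 11, 12),  # H6
    (13, 24, 10, 11, 13),  # H7
    (13, 24, 10, 11, 13),  # H8
    (13, 24, 10, 11, 13),  # H9
    (13, 24, 10, 11, 13),  # H9 (second copy)
    (13, 25, 10, 11, 13),  # H10
    (13, 23, 10, 11, 13),  # H11
    (13, 24, 11, 11, 13),  # H12
]

def step_distance(a, b):
    """Sum of absolute repeat-count differences (stepwise mutation assumption)."""
    return sum(abs(x - y) for x, y in zip(a, b))

# Node size = number of chromosomes sharing a five-locus haplotype.
node_sizes = Counter(huelva_e_m78)
modal, modal_count = node_sizes.most_common(1)[0]

# Haplotypes exactly one mutational step away from the modal haplotype.
one_step = sorted(h for h in node_sizes if step_distance(h, modal) == 1)

print("modal haplotype:", "-".join(map(str, modal)), "copies:", modal_count)
for h in one_step:
    print("one step away:", "-".join(map(str, h)), "copies:", node_sizes[h])
```

With these data the modal haplotype is 13-24-10-11-13 with four copies, matching the four Huelva chromosomes placed in the most common node.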
Figure 4 portrays the results of HCA based on frequencies of the E-M35 haplogroup and
its main subclades within the data set of populations shown in Table I. Factors I and II
account for 97.81% of the variance, with a predominance of factor I (83.83%). The
multivariate analysis formed six distinct clusters, with E-M78 and E-M81 haplogroups
contributing significantly to the pattern observed. To support the number of clusters (six)
suggested by the dendrogram, we computed the inertia decomposition on the first two
factors: the quotient (inter-cluster inertia/total inertia) was 0.9565. This value is
consistent with the number of major ramifications shown by the tree, and it indicates that
six clusters explain a high percentage (96%) of the data variation. Clusters C1, C2, and C3
on the right side of the plot are formed by European, Middle Eastern, and western Eurasian
populations, whereas clusters C4 and C6 on the upper left and lower sides of the
bidimensional space, and C5 in an intermediate position, include populations almost
exclusively from northern Africa. The genetic map clearly shows the combined effect of
interactions between different evolutionary processes during the coalescence time of these
populations for the major E haplogroup.
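The inertia quotient mentioned above is the share of total dispersion that lies between cluster centroids rather than within clusters. The sketch below shows the calculation on made-up two-dimensional coordinates; the points, labels and the function name `inertia_ratio` are hypothetical, every point is given equal weight (the original analysis may have weighted populations differently), and the result is not meant to reproduce the reported 0.9565.

```python
def inertia_ratio(points, labels):
    """Between-cluster inertia divided by total inertia, equal point weights."""
    dim = len(points[0])
    n = len(points)
    # Global centroid of all points.
    centroid = [sum(p[d] for p in points) / n for d in range(dim)]
    total = sum(sum((p[d] - centroid[d]) ** 2 for d in range(dim)) for p in points)

    # Group points by cluster label.
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)

    # Between-cluster inertia: cluster size times squared distance
    # from the cluster centroid to the global centroid.
    between = 0.0
    for members in clusters.values():
        m = len(members)
        c = [sum(p[d] for p in members) / m for d in range(dim)]
        between += m * sum((c[d] - centroid[d]) ** 2 for d in range(dim))
    return between / total

# Toy factor scores (hypothetical): two tight, well-separated clusters.
toy_points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
toy_labels = ["A", "A", "B", "B"]
print(round(inertia_ratio(toy_points, toy_labels), 4))  # 100/104, about 0.9615
```

A ratio close to 1 means that most of the variation is captured by the partition, which is the sense in which the text reads the value for six clusters.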
Andalusians from Huelva (ESAH) are included in C2, which is distinguished by a high mean
cluster value for the lineage E-M123+E-M34 with respect to the overall mean (test
value = 3.35, p < 0.001) and a low value for E-M81 (test value = -2.15, p < 0.02). The 18
populations that shape this cluster are Asturians from northern Spain, most (9 of 11) of the
Italian population samples, three Turkish population samples, Bosnians, Hungarians,
Ukrainians, Sephardic Jews, and Lebanese. Our Andalusian sample is thus grouped with
populations from the middle and eastern areas of the Mediterranean basin, although the
topology of the HCA plot also reveals a genetic affinity of Huelva with several populations
located within cluster C1, such as Portuguese from the Azores Islands and northern mainland
Portugal, Andalusians (in general), Andalusians from Cordoba, and French. The genetic
topology of Huelva closely agrees with the results of an earlier analysis of GM immunoglobulin
allotypes on the same population sample (Calderón et al. 2006).

Figure 3. Reduced median (RM) microsatellite haplotype network for E-M78 haplogroup based on five Y-STR
loci (DYS19, DYS390, DYS391, DYS392, DYS393). Circles are proportional to the number of individuals sharing
that haplotype. Populations in the legend: Huelva, Italy, Portugal, Turkey, Algeria, North Egypt, South Egypt,
Tunisia. Mediterranean populations used to construct the network have been taken from the literature:
Tunisians, Egyptians from Northern and Southern, Arabs and Berbers from Algeria (Arredi et al. 2004); Turkish [...]
Cluster C1 is the most numerous cluster and contains mostly European populations,
including 15 of the 17 population samples analysed in Spain. Also included in this cluster are
Turks from Konia, Ashkenazi Jews, and Iraqis. The populations in C1 are distinguished by
low frequencies of all the E-M35 lineages considered (test value range, -1.79 to -5.36; p
value range, 0.037–0.000). Interestingly, haplogroup E is absent from other populations in
this cluster, such as the Georgians, the Russians, the Dutch, and the French Basques, and is
very infrequent in Spanish Basques, Danes, Poles, and people from the Balkans. These
findings suggest the existence of a prehistoric migration corridor following the arching plains
of Europe stretching from Caucasus to the Basque area, as has been postulated by Calderón
(2000) and Calderón et al. (1998).

Figure 4. (a) Hierarchical cluster analysis (HCA) of 73 European, Western Asian, Middle Eastern and Northern
African populations based on Y-chromosomal diversity of E-M35 subhaplogroups (M35, M78, M81, M123+M34
and E*(xE-M35)). (b) A zoom plot of clusters C1, C2 and C3. Population codes are as in Table I.
Axes: Factor 1 (83.83%) and Factor 2 (13.98%). Clusters shown: C1 to C6.
Most of the samples in cluster C3, which is characterized by high frequencies of the M78
haplogroup (test value =4.76, p<0.001), are from the Balkan region (Greeks, Albanians,
Serbs, Macedonians, and Bulgarians) but there were also two Egyptian samples and a
sample from Calabria (Italy), which historically formed part of Magna Graecia.
Cluster C4, which has the highest E-M78 frequencies (test value = 5.37, p < 0.001), can be
found in the lower part of the plot. This cluster is made up of two Moroccan Arab
populations who, in contrast to neighbouring Maghreb populations, would have originated
from somewhere in Arabia or the Fertile Crescent and had relatively little contact with
Berber populations.
Cluster C5, on the contrary, is characterized by high frequencies of E-M81 and the
paragroup E-M35* (test value > 2.5, p < 0.01) and is made up of three Maghrebi populations
(Algerians, Tunisians, and Arabs from Morocco) and, surprisingly, a sample from the
Pasiega region in Cantabria, northern Spain. The anomalous presence of this northern
Spanish population in this cluster, which has no known historical or demographic
justification, strongly suggests the need to repeat analyses in this population. While there is a
strong Arab component in the north African populations in C5, it is not as strong as that
observed in C4.
Finally, cluster C6 has the highest frequencies of E-M81, E-M35* and E*(xE-M35) (test
value range, 3.93–7.19; p = 0.000). This cluster is formed by six populations, including five
Moroccan Berber populations and a Saharawi population. Cluster C6, therefore, is the group
that best defines the current Berber population.
Discussion
The geographic position of Huelva, located at the southwestern fringe of Europe, and the
fact that haplogroup E, the focus of the present study, appears to have originated in East
Africa and is not frequent in this part of Europe prompt us to consider that the native
population of Huelva received these genes from elsewhere. In this section, we will explore
different hypotheses that might shed some light on who these populations were and where
they came from.
The times to the most recent common ancestors (TMRCA) of the main haplogroups in
the E clade observed in Huelva, E-M78 and E-M81, were estimated at 18.6–4.3 thousand
years ago (kya) (Cruciani et al. 2004, 2007). The most frequent haplogroups in Huelva,
E-V13 and E-M81, emerged after the Younger Dryas (~12 kya), with E-M81 emerging close
to the beginning of the Neolithic age (5.6 kya). The estimated time of coalescence for E-V65
is more recent (~4.5 kya), and would have coincided with the beginning of the Early Bronze
Age in Iberia. The TMRCA for the paragroups E-M35* and E-M78* would be some time
before the Last Glacial Maximum (~18 kya). It should be noted that alleles (mutations) must
be present in high frequencies in order to be transferred effectively (i.e. in notable
frequencies) by migrants to other populations. This is why effective gene flow will only
occur quite some time after the TMRCA. Consequently, migration linked to the movement
of males with E-M35 lineages should be situated within a relatively recent evolutionary
period. In this context, it should be noted that TMRCA estimates are dependent on many
assumptions about mutational processes and population structures (Hein et al. 2005).
The wide but non-homogeneous spatial distribution pattern of E-M81 chromosomes in
Iberia does not seem to be concordant with the regions in which Islamic occupation was
most intense and prolonged (Lopez-Davalillo 2000; Martinez-Ruiz et al. 2003). This would
strengthen the hypothesis that migratory movements took place between Maghreb and Iberia
prior to the Islamic occupation, and that other important movements within the peninsula
occurred later (Calderón 2006). The Islamic occupation of western Andalusia lasted from
711 to 1262, and 5 years later Portugal and Castile (Spain) agreed on the southern border
dividing their kingdoms. The Berbers, who arrived in several waves, were the most
numerous of all the migrant populations that arrived in Iberia during Islamic rule. It has
been estimated that around one-third of the 300 000 Berbers that arrived during these years
did so in the eighth century (Mackay 1977), a time when the total population of the
peninsula has been estimated at between 6 and 7 million (Dupaquier 1997). The Berbers
tended to settle in the mountainous areas of the peninsula and, interestingly, most of them
were men of reproductive age, many of whom came with their wives, also Berbers in many
cases. Some of the descendants of these early occupiers were to return to Maghreb between
1264 and 1609. There are several relevant historical contexts preceding the Islamic
occupation that are likely to have had a considerable impact on population dynamics in
the Mediterranean area and in the Iberian Peninsula in particular. These were (i) the
existence of a Berber gene flow associated with the Carthaginian period, which included a
portion of north African natives; (ii) the establishment of the Roman Empire on both sides of
the Strait of Gibraltar, which considerably improved communication and safety throughout
the Empire; and (iii) the considerable and lasting difference in population sizes between
Europe and North Africa (always higher in Europe), which would explain the low North
African contribution to the gene pool in Europe, although the Berbers did leave a genetic
imprint in numerous locations in Europe due to the remarkably high frequency of E-M81 in
this population.
The levels of E-V13* detected in our study population could perhaps be explained by
contact with populations that travelled by sea from areas under Greek
control during the protohistoric period, when the Kingdom of Tartessos gained strategic
importance thanks to its extraordinarily rich deposits of copper, silver, and tin. Abundant
remains of the Tartessian culture have been found in the area of Huelva, in particular
remains corresponding to the period after the eighth century BC, marked by intense contacts
with civilizations from the eastern Mediterranean (Almagro-Basch et al. 1974;
Fernandez-Jurado et al. 1997). Herodotus, in 1.163, reports that the Phocaeans were the first of the
Greeks who performed long voyages, and it was they who made the Greeks acquainted with
the Adriatic and with Tyrrhenia, with Iberia, and the city of Tartessos. The vessel which they
used in their voyages was the long penteconter (50-oared ship). On their arrival at Tartessos,
the King Arganthonius took a liking to them. He begged the Phocaeans to quit Ionia and
settle in whatever part of his country they liked (Placido 1999). The existence of trade
relations across the Mediterranean Sea therefore seems to be a more plausible explanation
for this gene flow into Huelva than the theory of population movements following the long,
winding river waters connecting the southern Balkans and north-central Europe, as Peričić
et al. (2005) and others have suggested. Nonetheless, a small proportion of the E-V13
Y-chromosomes found in Huelva might have arrived much later, through the Visigoths, who
travelled to Hispania after 411 AD from the northern region of the Black Sea and the
Balkans.
The presence of Y chromosomes E-M78* and E-V12* might be due to an Egyptian gene
flow, which is supported by historical evidence. Around 742 AD, a large army led by the
Syrian Balch arrived in the Iberian Peninsula to suppress a Berber rebellion against the
Arabs. The army was made up of several contingents (chunds) from different Islamized
regions of the Levant and Egypt. Following the defeat of the Berbers, the troops were
separated according to their regions of origin and sent to different parts of the south of the
peninsula. Part of the Egyptian contingent settled in the district of Beja (today a Portuguese
town bordering Huelva; Ajbar Machmuâ, translated into Spanish by Lafuente Alcántara.
Madrid 1867; Ibn-Al-Jatib cited in Dozy, Recherches, I3, 78 Leyden 1881).
The V65 marker detected should also be considered a signature of the Arabs. Because
Arabs also settled in Maghreb after the Islamic conquest of this region, the E-V65
chromosomes in Huelva might have come directly from the Middle East, without necessarily
having passed through an intermediate North African reproductive stage. It has been
estimated that the first influx of Arabs into the Iberian Peninsula numbered 30 000–40 000,
a relatively small number and substantially inferior to that of the Berbers. This first
group of Arabs consisted of two rival tribes: the Qaysi, from the north of the Arabian
Peninsula, who eventually settled in the province of Huelva, and the Yameni from the south.
Arabs, unlike Berbers, settled mostly in cities and chose their wives from among the Visigothic
nobility, avoiding marriage with Berbers. Many were rich and powerful and, interestingly,
they practised polygamy, which would have had a multiplier effect on Y-chromosome
lineages. By way of example, the 10 emirs and caliphs that governed Cordoba between 756
and 1013 had at least 143 sons (who did not die prematurely) who had male offspring
ranging from 40 to none (Vallvé 1977).
Following the Reconquista of western Andalusia, a considerable proportion of the Muslim
population, who had lived there for over 5 centuries, left these lands for Granada or
Morocco. Later, during the 14th and 15th centuries, about 400 000 Muslims (~4.5% of the
total population) abandoned the peninsula, and in 1609, with the passing of the decree to
expel Moriscos (Spanish Muslims), many were sent to North Africa (Morocco and Algeria),
a region considerably less populated than the Iberian Peninsula (Lopez-Davalillo 2000).
A portion of the present Moroccan and Algerian Arab and Berber male gene pool would thus
have remained for a long period of time in the Iberian Peninsula. As far as E-M35 is
concerned, populations carrying this mutation from East Africa would presumably have
migrated to southwestern Andalusia after an intermediary settlement period alongside
Maghreb Berbers during the Islamic occupation. The frequencies of this mutation observed
in Jewish populations could be the result of previous contact with populations from the
Middle East and West Africa or later links with Berbers in North Africa.
Finally, the Jewish settlements in Andalusia could explain the frequency of the E-M34
subclade in Huelva. Tartessos is mentioned numerous times in ancient writing sources
(see Myro 1999 for a list of citations), indicating the existence of close contact between both
extremes of the Mediterranean dating back many centuries. In Andalusia, there were Jewish
communities at least as far back as the time of the Roman Empire (García Iglesias 1978).
Several requisites must be satisfied for a source population to contribute noticeably to
frequencies of a particular genetic marker in a recipient population. Firstly, the frequencies
of the marker in the source population and the size of the migrating population in relation to
that of the recipient population must be high enough in order not to excessively dilute the
gene flow. When testing such a hypothesis, it is thus necessary to analyse the demographic
sizes of both populations and the relative and absolute frequencies of the marker in question.
Because human genetic diversity mainly seems to consist of frequency clines (Serre and
Pääbo 2004), substantial initial differences in frequencies are generally found between
sources and recipients when populations are separated by large distances or abrupt barriers.
Gene frequencies only reach high enough levels to produce an effective admixture (i.e. a
frequency that lasts and is easy to detect) in the recipient population a long time after the
mutation event occurred in the source population. Occasional, motivated group migrations
are therefore more plausible and genetically effective than small, persistent gene flows.
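The dilution argument above can be made concrete with a simple one-pulse mixing calculation. This is only an illustrative sketch, not a model used in the study: it assumes a single migration event, equal reproductive contribution of migrants and residents, and no drift or selection. The function name and the source frequency of 0.6 are hypothetical; the population sizes echo the orders of magnitude quoted in the Discussion (about 100 000 migrants against roughly 6.5 million residents).

```python
def admixed_frequency(f_source, f_recipient, n_migrants, n_recipients):
    """Marker frequency after one migration pulse (simple proportional mixing)."""
    if n_migrants < 0 or n_recipients < 0 or n_migrants + n_recipients == 0:
        raise ValueError("population sizes must be non-negative and not both zero")
    # Fraction of the mixed population that descends from the migrants.
    m = n_migrants / (n_migrants + n_recipients)
    return (1 - m) * f_recipient + m * f_source

# Hypothetical inputs: marker at 60% in the source, absent in the recipient.
f_after = admixed_frequency(
    f_source=0.6, f_recipient=0.0,
    n_migrants=100_000, n_recipients=6_500_000,
)
print(f"frequency after admixture: {f_after:.4f}")
```

Even with a marker that is very common in the source, a migrant fraction of a few percent leaves only a low frequency in the recipient population, which is why both the demographic sizes and the marker frequencies have to be considered together.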
A recipient population can receive a particular gene directly from the source population or
through intermediate populations which inherited it from the source group. In such a case,
however, effective gene flow is seriously compromised by the time required for the
movement to be completed and the mutation frequencies that would occur in the
intermediate populations. This reduces the number of theoretically possible migrations to just a
few plausible ones. Migrations that occur in several stages have several characteristics.
Firstly, a considerable amount of time is required at each stage for admixture to occur and to
have an effect on the spread of the gene in the next stage; and secondly, there is a progressive
reduction of gene frequencies at each intermediate stage (population) governed by the
decreasing power law, $f_i = k^i f_o$, where $f_o$ is the marker frequency in the original source
population, $f_i$ the marker frequency at each stage $i$, and $k$ the roughly constant fraction of
migrants at each stage. For example, with $k = 0.25$ and $i = 3$, $f_3 \approx 0.0156 f_o$. Consequently, just
a few stages are necessary to reduce gene frequency in the final recipient population to a level
of close to zero, regardless of the frequency of the marker in the original source population. It
would therefore seem logical to reject hypotheses involving migrations that occur in several
stages because the final effect in the recipient population would be negligible.
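The stage-by-stage decline described above is easy to tabulate. The sketch below evaluates the power law for a marker starting at a given source frequency and counts how many stages it takes to fall below a chosen detection threshold; the function names and the threshold of 1% are illustrative assumptions, not values from the study.

```python
def staged_frequency(f_o, k, stages):
    """Marker frequency after `stages` migration stages: f_i = k**i * f_o."""
    if not 0 < k < 1:
        raise ValueError("k must be a fraction between 0 and 1")
    return (k ** stages) * f_o

def stages_until_below(f_o, k, threshold):
    """Smallest number of stages after which the frequency drops below threshold."""
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    i = 0
    while staged_frequency(f_o, k, i) >= threshold:
        i += 1
    return i

# Worked example from the text: k = 0.25, three stages.
print(staged_frequency(1.0, 0.25, 3))       # 0.015625, i.e. about 0.0156 * f_o
# Hypothetical detection threshold of 1% with the marker fixed in the source.
print(stages_until_below(1.0, 0.25, 0.01))  # 4
```

With a quarter of each population moving on at every stage, even a marker fixed in the source falls below 1% after four stages, which illustrates why multi-stage routes are treated as implausible in the text.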
Populations that have not experienced direct gene flow might have similar proportions of a
particular genetic marker if a source population, which might even be now extinct, had sent
the same proportion of migrants in two opposite directions but in what was considered a
single migratory movement. In such a case, an allele would have virtually travelled twice as
far as the distance covered by the migrants. The Mediterranean space is characterized by
relatively short distances and an absence of major geographic barriers (maximum sea spans
are 3800 km from east to west and 900 km from north to south and the land distance
following the north African coastline is 5740 km) (Hofrichter 2004). This means that it
could be crossed without great difficulties by foot, horse, or rudimentary ships. The main
restriction to human movement would have been the presence of hostile humans. Until
recent times, sea travel would have been the most rapid and safest mode of transport.
Because we generally know less about ancient, prehistoric migratory processes than we do
about more recent processes, it is more difficult to reject hypotheses regarding earlier
processes, and easier to discover caveats in later ones, particularly when there are
well-documented migrations and information on what motivated the movement (trade, wars, mating
searches). Historic movements, however, must not be ignored, and relevant parameters
relating to known recent migrations must be estimated and added to hypotheses regarding
these movements.
Many of the genetic traits that characterize human populations everywhere are the result of
migratory processes that have shaped the general peopling of the world. Basques, for
example, are different because they have not been effective gene sources and have a
weak signature from East Asia (Calderón et al. 1998; Hellenthal et al. 2008); Berbers are
different because they lived in moderately small numbers in a long, narrow area bordered by
the sea and the desert and have a very high frequency of certain genetic markers (e.g.
haplogroup E) which has been transmitted in low frequencies to many European popula-
tions; and Andalusians from Huelva are different in that they have a considerable proportion
of gene markers from the opposite Eastern edge of the Mediterranean.
Conclusions
Our analysis of the Y-chromosome haplogroup E in the native population of Huelva revealed
a complex admixture of genetic markers from the Mediterranean space, with interesting
signatures of populations from the Middle East and the Balkan Peninsula and a surprisingly
low influence by Berber populations compared to other areas of the Iberian Peninsula. These
particular traits can, plausibly, be explained by protohistoric and other documented
historical movements against the backdrop of the Tartessian civilization, the rise and fall
of the Roman Empire, and the different migrations associated with the expansion and
decline of Islam during the Middle Ages. We believe that an explanation based on
prehistoric movements is less plausible. As a result of the magnitude of these migratory
movements, Huelva occupies a central position on the Mediterranean genetic map, despite
its location at the western edge of the Mediterranean Basin.
Acknowledgements
We express our sincere thanks to the people of Huelva who generously donated blood
samples to contribute to this study, and also to Dr A. Fernández-Jurado from the
Haematology Department and Dr E. Prado and Dr D. Fernández from the Blood
Transfusion Center at Hospital Juan Ramón Jiménez in Huelva for their invaluable help
in organizing the fieldwork to collect samples, and to Dr P. Cuesta from Complutense
Computer Center for his help with statistical analysis. This research was supported by grants
from the Spanish Ministry of Education and Science (Investigation Projects
BOS2002-01677 and CGL2006-04749/BOS) awarded to RC and from the Italian Ministero
dell'Istruzione, dell'Università e della Ricerca (MIUR-PRIN 2007) awarded to AN.
References
Adams SM, Bosch E, Balaresque PL, Ballereau SJ, Lee AC, Arroyo E, López-Parra AM, Aler M, Grifo MSG, Brion
M, Carracedo A, Lavinha J, Martínez-Jarreta B, Quintana-Murci L, Picornell A, Ramon M, Skorecki K, Behar
DM, Calafell F, Jobling MA. 2008. The genetic legacy of religious diversity and intolerance: Paternal lineages of
Christians, Jews, and Muslims in the Iberian Peninsula. Am J Hum Genet 83:725–736.
Ajbar Machmuâ. 1867. Crónica anónima del siglo XI. Colección de obras arábigas de historia y geografía, vol. I.
Madrid: Real Academia de la Historia.
Almagro-Basch M, del Amo M, Beltrán A. 1974. Huelva: Prehistoria y Antigüedad. Madrid: Editorial Nacional.
Alonso S, Flores C, Cabrera V, Alonso A, Martín P, Albarrán C, Izagirre N, de la Rúa C, García O. 2005. The place
of the Basques in the European Y-chromosome diversity landscape. Eur J Hum Genet 13:1293–1302.
Arredi B, Poloni ES, Paracchini S, Zerjal T, Fathallah DM, Makrelouf M, Pascali VL, Novelletto A, Tyler-Smith
C. 2004. A predominantly neolithic origin for Y-chromosomal DNA variation in North Africa. Am J Hum Genet
75:338–345.
Bandelt HJ, Forster P, Röhl A. 1999. Median-joining networks for inferring intraspecific phylogenies. Mol Biol
Evol 16:37–48.
Behar DM, Metspalu E, Kivisild T, Achilli A, Hadid Y, Tzur S, Pereira L, Amorim A, Quintana-Murci L, Majamaa
K, Herrnstadt C, Howell N, Balanovsky O, Kutuev I, Pshenichnov A, Gurwitz D, Bonne-Tamir B, Torroni A,
Villems R, Skorecki K. 2006. The matrilineal ancestry of Ashkenazi Jewry: Portrait of a recent founder event. Am J
Hum Genet 78:487–497.
Beleza S, Gusmão L, Amorim A, Carracedo A, Salas A. 2005. The genetic legacy of western Bantu migrations.
Hum Genet 117:366–375.
Brion M, Sobrino B, Blanco-Verea A, Lareu MV, Carracedo A. 2005. Hierarchical analysis of 30 Y-chromosome
SNPs in European populations. Int J Legal Med 119:10–15.
Calderón R. 2000. Population and peopling in the Mediterranean world. Int J Anthropol 15:271–278.
Calderón R. 2006. Gene flow in the Iberian Peninsula. Lecture at 15th Congress of the European Anthropological
Association (EAA). Man and Environment: Trends and Challenges in Anthropology, Budapest (Hungary).
Calderón R, Vidales C, Peña JA, Perez-Miranda A, Dugoujon JM. 1998. Immunoglobulin allotypes (GM and KM)
in Basques from Spain: Approach to the origin of the Basque population. Hum Biol 70:667–698.
Calderón R, Ambrosio B, Guitard E, Gonzalez-Martin A, Aresti U, Dugoujon JM. 2006. The genetic position of
Andalusians from Huelva in relation to other European and North-African populations: A study based on GM
and KM allotypes. Hum Biol 78:663–679.
Campbell MC, Tishkoff SA. 2008. African genetic diversity: Implications for human demographic history, modern
human origins and complex disease mapping. Ann Rev Genom Hum Genet 9:403–433.
Capelli C, Onofri V, Brisighelli F, Boschi I, Scarnicci F, Masullo M, Ferri G, Tofanelli S, Tagliabracci A, Gusmao
L, Amorim A, Gatto F, Kirin M, Merlitti D, Brion M, Verea AB, Romano V, Cali F, Pascali V. 2009. Moors and
Saracens in Europe: Estimating the medieval North African male legacy in southern Europe. Eur J Hum Genet
17:842–852.
Capelli C, Redhead N, Romano V, Calì F, Lefranc G, Delague V, Megarbane A, Felice AE, Pascali VL, Neophytou
PI, Poulli Z, Novelletto A, Malaspina P, Terrenato L, Berebbi A, Fellous M, Thomas MG, Goldstein DB. 2006.
Population structure in the Mediterranean basin: A Y chromosome perspective. Ann Hum Genet 70:207–225.
Cinnioğlu C, King R, Kivisild T, Kalfoğlu E, Atasoy S, Cavalleri GL, Lillie AS, Roseman CC, Lin AA, Prince K,
Oefner PJ, Shen P, Semino O, Cavalli-Sforza LL, Underhill PA. 2004. Excavating Y-chromosome haplotype
strata in Anatolia. Hum Genet 114:127–148.
Cooper G, Amos W, Hoffman D, Rubinsztein DC. 1996. Network analysis of human Y microsatellite haplotypes.
Hum Mol Genet 11:1759–1766.
Crandall KA, Templeton AR. 1993. Empirical test of some predictions from coalescent theory with applications to
intraspecific phylogeny reconstruction. Genetics 134:959–969.
Cruciani F, Santolamazza P, Shen P, Macaulay V, Moral P, Olckers A, Modiano D, Holmes S, Destro-Bisol G,
Coia V, Wallace DC, Oefner PJ, Torroni A, Cavalli-Sforza LL, Scozzari R, Underhill PA. 2002. A back migration
from Asia to Sub-Saharan Africa is supported by high-resolution analysis of human Y-chromosome haplotypes.
Am J Hum Genet 70:1197–1214.
Cruciani F, La Fratta R, Santolamazza P, Sellitto D, Pascone R, Moral P, Watson E, Guida V, Colomb EB,
Zaharova B, Lavinha J, Vona G, Aman R, Calì F, Akar N, Richards M, Torroni A, Novelletto A, Scozzari R.
2004. Phylogeographic analysis of haplogroup E3b (E-M215) Y chromosomes reveals multiple migratory events
within and out of Africa. Am J Hum Genet 74:1014–1022.
Cruciani F, La Fratta R, Torroni A, Underhill PA, Scozzari R. 2006. Molecular dissection of the Y chromosome
haplogroup E-M78 (E3b1a): A posteriori evaluation of a microsatellite-network-based approach through six new
biallelic markers. Hum Mutat 27:831–832.
Cruciani F, La Fratta R, Trombetta B, Santolamazza P, Sellitto D, Colomb EB, Dugoujon JM, Crivellaro F,
Benincasa T, Pascone R, Moral P, Watson E, Melegh B, Barbujani G, Fuselli S, Vona G, Zagradisnik B, Assum
G, Brdicka R, Kozlov AI, Efremov GD, Coppa A, Novelletto A, Scozzari R. 2007. Tracing past human male
movements in northern/eastern Africa and Western Eurasia: New clues from Y-chromosomal haplogroups
E-M78 and J-M12. Mol Biol Evol 24:1300–1311.
de Knijff P. 2000. Messages through bottlenecks: On the combined use of slow and fast evolving polymorphic
markers on the human Y chromosome. Am J Hum Genet 67:1055–1061.
Dupâquier J. 1997. Des origines aux prémices de la revolution démographique. In: Bardet JP and Dupâquier J, eds.
Histoire des Populations de l'Europe (pp. 26–38). Paris: Librairie Arthème Fayard.
Excoffier L, Langaney A. 1989. Origin and differentiation of human mitochondrial DNA. Am J Hum Genet
44:73–85.
Excoffier L, Laval G, Schneider S. 2005. Arlequin (version 3.0): An integrated software package for population
genetics data analysis. Evol Bioinform Online 1:47–50.
Fernandez-Jurado J, García-Sanz C, Bufete P. 1997. De Tartessos a Onuba. Huelva: Diputación de Huelva.
García Iglesias L. 1978. Los judíos en la España Antigua. Madrid: Ediciones Cristiandad.
Gill P, Brenner C, Brinkmann B, Budowle B, Carracedo A, Jobling MA, de Knijff P, Kayser M, Krawczak M, Mayr
WR, Morling N, Olaisen B, Pascali V, Prinz M, Roewer L, Schneider PM, Sajantila A, Tyler-Smith C. 2001.
DNA Commission of the International Society of Forensic Genetics: Recommendations on forensic analysis
using Y-chromosome STRs. Forensic Sci Int 124:5–10.
Golding GB. 1987. The detection of deleterious selection using ancestors inferred from a phylogenetic history.
Genet Res 49:71–82.
Gonçalves R, Freitas A, Branco M, Rosa A, Fernandes AT, Zhivotovsky LA, Underhill PA, Kivisild T, Brehm A.
2005. Y-chromosome lineages from Portugal, Madeira and Açores record elements of Sephardim and Berber
ancestry. Ann Hum Genet 69:443–454.
Hammer MF, Spurdle AB, Karafet T, Bonner MR, Wood ET, Novelletto A, Malaspina P, Mitchell RJ, Horai S, Jenkins
T, Zegura SL. 1997. The geographic distribution of human Y chromosome variation. Genetics 145:787–805.
Hammer M, Redd AJ, Wood ET, Bonner MR, Jarjanazi H, Karafet T, Santachiara-Benerecetti S, Oppenheim A,
Jobling MA, Jenkins T, Ostrer H, Bonne-Tamir B. 2000. Jewish and Middle Eastern non-Jewish populations
share a common pool of Y-chromosome biallelic haplotypes. Proc Natl Acad Sci 97:6769–6774.
Hartl DL, Clark AG. 1989. Principles of population genetics. Sunderland, MA: Sinauer Associates.
Hein J, Schierup J, Wiuf C. 2005. Gene genealogies, variation and evolution. A primer in coalescent theory. Oxford:
Oxford University Press.
Hellenthal G, Auton A, Falush D. 2008. Inferring human colonization history using a copying model. PLOS
Genetics 4: doi:10.1371/journal.pgen.1000078.
Hofrichter R. 2004. El mar Mediterráneo. Fauna, flora, ecología, vol. I. Barcelona: Ediciones Omega.
Jobling MA, Tyler-Smith C. 2003. The human Y chromosome: An evolutionary marker comes of age. Nat Rev
Genet 4:598–612.
Karafet TM, Mendez FL, Meilerman MB, Underhill PA, Zegura SL, Hammer MF. 2008. New binary
polymorphisms reshape and increase resolution of the human Y chromosomal haplogroup tree. Genome Res
18:830–838.
Lebart L, Morineau A, Warwick KM. 1984. Multivariate descriptive statistical analysis: Correspondence analysis
and related techniques for large matrices. New York: Wiley and Sons.
Lopez-Davalillo Larrea J. 2000. Atlas histórico de España y Portugal. Desde el Paleolítico hasta el siglo XX. Madrid:
Editorial Síntesis.
Mackay A. 1977. La España de la Edad Media: desde la frontera hasta el imperio 1000–1500. Madrid: Ediciones
Cátedra.
Martinez-Ruiz E, Maqueda C, Montero S, Ladero MA, Ladero MF, Olivera C, Cantera S. 2003. Atlas Histórico de
España I. Madrid: Ediciones Istmo.
Myro MM. 1999. Los enigmas de Tarteso: apéndices documentales. In: Alvar J and Blázquez JM, eds. Los enigmas
de Tarteso (pp. 201–214). Madrid: Ediciones Cátedra.
Nei M. 1987. Molecular Evolutionary Genetics. New York: Columbia University Press.
Neto D, Montiel R, Bettencourt C, Santos C, Prata MJ, Lima M. 2007. The African contribution to the present-day
population of the Azores Islands (Portugal): Analysis of the Y chromosome haplogroup E. Am J Hum Biol
19:854–860.
Novelletto A. 2007. Y chromosome variation in Europe: Continental and local processes in the formation of the
extant gene pool. Ann Hum Biol 34:139–172.
Peričić M, Lauc LB, Klaric IM, Rootsi S, Janicijevic B, Rudan I, Terzic R, Colak I, Kevesic A, Popovic D, Sijacki A,
Behluli I, Dorevic D, Efremovska L, Bajec DD, Stefanovic BD, Villems R, Rudan P. 2005. High-resolution
phylogenetic analysis of southeastern Europe traces major episodes of paternal gene flow among Slavic
populations. Mol Biol Evol 22:1964–1975.
Placido D. 1999. La imagen griega de Tarteso. In: Alvar J and Blázquez JM, eds. Los enigmas de Tarteso (pp. 81–89).
Madrid: Ediciones Cátedra.
Roewer L, Croucher PJ, Willuweit S, Lu TT, Kayser M, Lessig R, de Knijff P, Jobling MA, Tyler-Smith C,
Krawczak M. 2005. Signature of recent historical events in the European Y-chromosomal STR haplotype
distribution. Hum Genet 116:279–291.
Rosser Z, Zerjal T, Hurles ME, Adojaan M, Alavantic D, Amorim A, et al. 2000. Y-chromosomal diversity in Europe
is clinal and influenced primarily by geography, rather than by language. Am J Hum Genet 67:1526–1543.
Scozzari R, Cruciani F, Pangrazio A, Santolamazza P, Vona G, Moral P, Latini V, Varesi L, Memmi MM, Romano
V, de Leo G, Gennarelli M, Jaruzelska J, Villems R, Parik J, Macaulay V, Torroni A. 2001. Human
Y-chromosome variation in the Western Mediterranean area: Implications for the peopling of the region.
Hum Immunol 62:871–884.
Semino O, Magri C, Benuzzi G, Lin AA, Al-Zahery N, Battaglia V, Maccioni L, Triantaphyllidis C, Shen P, Oefner
PJ, Zhivotovsky LA, King R, Torroni A, Cavalli-Sforza LL, Underhill PA, Santachiara-Benerecetti AS. 2004.
Origin, diffusion, and differentiation of Y-chromosome haplogroups E and J: Inferences on the neolithization of
Europe and later migratory events in the Mediterranean area. Am J Hum Genet 74:1023–1034.
Serre D, Pääbo S. 2004. Evidence for gradients of human genetic diversity within and among continents. Genome
Res 14:1679–1685.
Shastry BS. 2002. SNP alleles in human disease and evolution. J Hum Genet 47:561–566.
Sims LM, Garvey D, Ballantyne J. 2007. Sub-populations within the major European and African derived
haplogroups R1b3 and E3a are differentiated by previously phylogenetically undefined Y-SNPs. Hum Mutat
28:97.
Torroni A, Achilli A, Macaulay V, Richards M, Bandelt HJ. 2006. Harvesting the fruit of the human mtDNA
tree. Trends Genet 22:339–345.
Underhill P, Shen P, Lin AA, Jin L, Passarino G, Yang WH, Kauffman E, Bonné-Tamir B, Bertranpetit J,
Francalacci P, Ibrahim M, Jenkins T, Kidd JR, Mehdi SQ, Seielstad MT, Wells RS, Piazza A, Davis RW,
Feldman MW, Cavalli-Sforza LL, Oefner PJ. 2000. Y chromosome sequence variation and the history of human
populations. Nat Genet 26:358–361.
Underhill P, Passarino G, Lin AA, Shen P, Mirazón Lahr M, Foley RA, Oefner PJ, Cavalli-Sforza LL. 2001. The
phylogeography of Y chromosome binary haplotypes and the origins of modern human populations. Ann Hum
Genet 65:43–62.
Underhill PA, Roseman CC. 2001. The case for an African rather than an Asian origin of human Y-chromosome
YAP insertion. In: Jin L, Seielstad M and Xiao C, eds. Genetic, Linguistic and Archaeological Perspectives on
Human Diversity in Southeast Asia: Recent Advances in Human Biology. vol. 8 (pp. 43–56). River Edge, New
Jersey: World Scientific.
Vallvé J. 1977. Sobre demografía y sociedad en Al-Andalus (siglos VIII–XI). Al-Andalus 42:323–340.
Weale M, Shah T, Jones AL, Greenhalgh J, Wilson JF, Nymadawa P, Zeitlin D, Connell BA, Bradman N, Thomas
MG. 2003. Rare deep-rooting Y chromosome lineages in humans: Lessons for phylogeography. Genetics
165:229–234.
... Population sampling and population samples Sampling strategy was planned by members of this team and details of this step have been reported previously [27][28][29][30][31]. Western and eastern Andalusians from Huelva and Granada provinces, respectively, were selected and reasons for this decision are described in the abovementioned references. ...
... The population sample studied here is a subset of a larger one used in our ongoing project on Andalusian population structure, in the frame of the Mediterranean space. Genetic data derived from autosomal and uniparental markers, based on these samples, have already been published [27][28][29][30][31][32][33]. ...
... Granada sample (n = 249) was reduced for technical reasons to 223 individuals when typing the array of microsatellites. Alleles of microsatellites were designated based on the number of variable repeats included [29]. In addition, all Andalusian male samples (n = 416) were genotyped for the Y-Alu polymorphism (YAP element DYS287), 12f2 polymorphism, microsatellites DYS413 and DYS445 and also for 49 Y-SNPs (Single Nucleotide Polymorphisms). ...
Article
A sample of 416 males from western and eastern Andalusia has been jointly analyzed for surnames and Y-chromosome haplogroups and haplotypes. The observed number of different surnames was 222 (353 when the second surname of the Spanish system of naming is considered). The great majority of recorded surnames have a Castilian-Leonese origin, while Catalan or Basque surnames have not been found. A few Arab-related surnames appear but none discernible of Sephardic-Jewish descent. Low correlation among surnames with different population frequencies and Y-chromosome markers, at different levels of genetic resolution, has been observed in Andalusia. This finding could be explained mainly by the very low rate of monophyletic surnames because of the historical process of surname ascription and the resulting high frequencies of the most common Spanish surnames. The introduction of surnames in Spain during the Middle Ages coincided with the Reconquest of the territories under Islamic rule, and Muslims and Jews progressively adopted the present male line surname system. Sampled surnames and Y-chromosome lineages fit well a power-law distribution and observed isonymy is very close to that of the general population. Besides, our data and results show that the reliability of the isonymy method should be questioned because of the high rate of polyphyletic surnames, even in small geographic regions and autochthonous populations. Random isonymy would be consistently dependent on the most common surname frequencies in the population.
... The presence of this haplogroup could be due to certain subhaplogroups, such as E1b1b-M81 or E1b1b-M34. The subhaplogroup E1b1b-M81, characteristic of North African regions, may be associated with the genetic flow between Maghreb and Iberia favored by Carthage expansion [8,[42][43][44][45]. The centuries of Muslim domination of the Iberian Peninsula and South Italy in the Middle Ages must also be considered. ...
... On the other hand, the subhaplogroup E1b1b-M34, which is widespread among Jews, is also found in Spain. This is consistent with the Jewish presence, deep-rooted in the Iberian Peninsula since the Roman period [8,42,44]. ...
Article
The remarkable geographical situation of the Mediterranean region, located between Europe, Africa, and Asia, with numerous migratory routes, has made this area a crucible of cultures. Studying the Y-chromosome variability is a very performant tool to explore the genetic ancestry and evaluate scenarios that may explain the current Mediterranean gene pool. Here, six Mediterranean populations, including three Balearic Islands (Ibiza, Majorca, and Minorca) and three Southern Italian regions (Catanzaro, Cosenza, and Reggio di Calabria) were typed using 23 Y-STR loci and up to 39 Y-SNPs and compared to geographically targeted key reference populations to explore their genetic relationship and provide an overview of Y-chromosome variation across the Mediterranean basin. Pairwise RST genetic distances calculated with STRs markers and Y-haplogroups mirror the West to East geographic distribution of European and Asian Mediterranean populations, highlighting the North-South division of Italy, with a higher Eastern Mediterranean component in Southern Italian populations. In contrast, the African populations from the Southern coast of the Mediterranean clustered separately. Overall, these results support the notion that migrations from Magna Graecia or the Byzantine Empire, which followed similar Neolithic and post-Neolithic routes into Southern Italy, may have contributed to maintaining and/or reinforcing the Eastern Mediterranean genetic component in Southern Italian populations.
... Data from the Western European populations were obtained from the Y chromosome haplotype reference database (http://www.yhrd.org/), including Spanish [10] and French (accession numbers YA003268, YA003131, and YA003011) populations, as well as a sample from the Spanish Huelva province [11]. We also analyzed Basque populations from the Spanish provinces of Alava, Vizcaya, and Guipuzcoa (accession numbers YA003672, YA003673, and YA003674) [12]; the Western USA (accession numbers YA003675, YA003676, and YA003677) [13]; mestizo group (this study); and the Amerindian "Mayas" group [14]. ...
Article
Spinocerebellar ataxia type 7 (SCA7) is a neurodegenerative disorder characterized by progressive cerebellar ataxia associated with macular degeneration that leads, in the majority of patients, to loss of autonomy and blindness. The cause of the disease has been identified as (CAG)n repeat expansion in the coding sequence of the ATXN7 gene on chromosome 3p21.1. SCA7 is one of the least common genetically verified autosomal dominant cerebellar ataxias found worldwide; however, we previously identified the Mexican population showing high prevalence of SCA7, suggesting the occurrence of a common founder effect. In this study, haplotype analysis using four SCA7 gene-linked markers revealed that all 72 SCA7 carriers studied share a common haplotype, A-254-82-98, for the intragenic marker 3145G/A and centromeric markers D3S1287, D3S1228, and D3S3635, respectively. This multiloci combination is uncommon in healthy relatives and Mexican general population, suggesting that a single ancestral mutation is responsible for all SCA7 cases in this population. Furthermore, genotyping using 17 short tandem repeat markers from the non-recombining region of the Y chromosome and further phylogenetic relationship analysis revealed that Mexican patients possess the Western European ancestry, which might trace the SCA7 ancestral mutation to that world region.
... The samples were Saudi Arabia (two cases), Palestine (two cases) and Andalusia (one case). For six out of the nine alleles reported above, the Iberian and Moroccan samples fell into different frequency belts, supporting discrepant allele assemblages for North Africa and extreme South-Western Europe (Ambrosio et al., 2010), despite evidence of historical gene flow (Achilli et al., 2005; Hernández et al., 2017). ...
Article
Background: Tetranucleotide Short Tandem Repeats (STRs) for human identification and common use in forensic cases have recently been used to address the population genetics of the North-Eastern Mediterranean area. However, to gain confidence in the inferences made using STRs, this kind of analysis should be challenged with changes in three main aspects of the data, i.e. the sizes of the samples, their distance across space and the genetic background from which they are drawn. Aim: To test the resilience of the gradients previously detected in the North-Eastern Mediterranean to the enlargement of the surveyed area and population set, using revised data. Subjects and methods: STR genotype profiles were obtained from a publicly available database (PopAffilietor databank) and a dataset was assembled including >7000 subjects from the Arabian Peninsula to Scandinavia, genotyped at eight loci. Spatial principal component analysis (sPCA) was applied and the frequency maps of the nine alleles which contributed most strongly to sPC1 were examined in detail. Results: By far the greatest part of diversity was summarised by a single spatial principal component (sPC1), oriented along a SouthEast-to-NorthWest axis. The alleles with the top 5% squared loadings were TH01(9.3), D19S433(14), TH01(6), D19S433(15.2), FGA(20), FGA(24), D3S1358(14), FGA(21) and D2S1338(19). These results confirm a clinal pattern over the whole range for at least four loci (TH01, D19S433, FGA, D3S1358). Conclusions: Four of the eight STR loci (or even alleles) considered here can reproducibly capture continental arrangements of diversity. This would, in principle, allow for the exploitation of forensic data to clarify important aspects in the formation of local gene pools.
... Thus, the westernmost extreme of the Mediterranean likely did not represent a true physical barrier to gene flow between both continents. The patterns of variation in the Y-chromosome between western and eastern Andalusians, based on 416 males, have also been investigated for a set of Y-Short Tandem Repeats (Y-STRs) and Y-SNPs ([53][54][55], Calderón et al., unpublished data) in combination with mtDNA analyses ([18,19] and present study). In general, for both uniparental markers, Andalusians exhibit a typical western European genetic background, with peak frequencies of mtDNA Hg H and Y-chromosome Hg R1b1b2-M269 (45% and 60%, respectively). ...
Article
Background The structure of haplogroup H reveals significant differences between the western and eastern edges of the Mediterranean, as well as between the northern and southern regions. Human populations along the westernmost Mediterranean coasts, which were settled by individuals from two continents separated by a relatively narrow body of water, show the highest frequencies of mitochondrial haplogroup H. These characteristics permit the analysis of ancient migrations between both shores, which may have occurred via primitive sea crafts and early seafaring. We collected a sample of 750 autochthonous people from the southern Iberian Peninsula (Andalusians from Huelva and Granada provinces). We performed a high-resolution analysis of haplogroup H by control region sequencing and coding SNP screening of the 337 individuals harboring this maternal marker. Our results were compared with those of a wide panel of populations, including individuals from Iberia, the Maghreb, and other regions around the Mediterranean, collected from the literature. Results Both Andalusian subpopulations showed a typical western European profile for the internal composition of clade H, but eastern Andalusians from Granada also revealed interesting traces from the eastern Mediterranean. The basal nodes of the most frequent H sub-haplogroups, H1 and H3, harbored many individuals of Iberian and Maghrebian origins. Derived haplotypes were found in both regions; haplotypes were shared far more frequently between Andalusia and Morocco than between Andalusia and the rest of the Maghreb. These and previous results indicate intense, ancient and sustained contact among populations on both sides of the Mediterranean. Conclusions Our genetic data on mtDNA diversity, combined with corresponding archaeological similarities, provide support for arguments favoring prehistoric bonds with a genetic legacy traceable in extant populations. 
Furthermore, the results presented here indicate that the Strait of Gibraltar and the adjacent Alboran Sea, which have often been assumed to be an insurmountable geographic barrier in prehistory, served as a frequently traveled route between continents. Electronic supplementary material The online version of this article (doi:10.1186/s12863-017-0514-6) contains supplementary material, which is available to authorized users.
... African maternal genes have left imprints both in southern Iberia and the Atlantic façade of the Peninsula [8,12]. Comparatively, markers are less abundant for paternal genes although we have found visible traces of the North African E-M81 Y-chromosome lineage (referred to as the "Berber marker") in the Andalusian gene pool [13,14]. A recently published study aimed at analyzing the geographic distribution of autosomal immunoglobulin genes at the GM locus across the Mediterranean highlighted the relatively high frequency (4% on average) of the sub-Saharan GM*1,17 5* haplotype in the Andalusian and neighboring Iberian Atlantic regions compared to other Mediterranean European populations [15]. ...
Article
Determining the timing, identity and direction of migrations in the Mediterranean Basin, the role of "migratory routes" in and among regions of Africa, Europe and Asia, and the effects of sex-specific behaviors of population movements have important implications for our understanding of the present human genetic diversity. A crucial component of the Mediterranean world is its westernmost region. Clear features of transcontinental ancient contacts between North African and Iberian populations surrounding the maritime region of Gibraltar Strait have been identified from archeological data. The attempt to discern origin and dates of migration between close geographically related regions has been a challenge in the field of uniparental-based population genetics. Mitochondrial DNA (mtDNA) studies have been focused on surveying the H1, H3 and V lineages when trying to ascertain north-south migrations, and U6 and L in the opposite direction, assuming that those lineages are good proxies for the ancestry of each side of the Mediterranean. To this end, in the present work we have screened entire mtDNA sequences belonging to U6, M1 and L haplogroups in Andalusians (from Huelva and Granada provinces) and Moroccan Berbers. We present here pioneer data and interpretations on the role of NW Africa and the Iberian Peninsula regarding the time of origin, number of founders and expansion directions of these specific markers. The estimated entrance of the North African U6 lineages into Iberia at 10 ky correlates well with other L African clades, indicating that U6 and some L lineages moved together from Africa to Iberia in the Early Holocene. Still, founder analysis highlights that the high sharing of lineages between North Africa and Iberia results from a complex process continued through time, impairing simplistic interpretations. In particular, our work supports the existence of an ancient, frequently denied, bridge connecting the Maghreb and Andalusia.
... Interestingly, similar findings on the predominance of embryonal tumours were reported from the nearby countries of Jordan, Morocco, Saudi Arabia and Syria, possibly implicating environmental and ethnic backgrounds as potential reasons for this distinct distribution [37][38][39][40]. Therefore, one could speculate that these discrepancies could be attributed to the fact that Mediterranean populations share common genetic traits, given the historical population admixtures in the region [41]. Belarus is ethnically discrete from all other countries presenting an excess of embryonal tumours, although a recent analysis indicates early gene flow from the Middle-East region [42]. ...
Article
Abstract Aim Following completion of the first 5-year nationwide childhood (0–14 years) registration in Greece, central nervous system (CNS) tumour incidence rates are compared with those of 12 registries operating in 10 Southern–Eastern European countries. Methods All CNS tumours, as defined by the International Classification of Childhood Cancer (ICCC-3) and registered in any period between 1983 and 2014 were collected from the collaborating cancer registries. Data were evaluated using standard International Agency for Research on Cancer (IARC) criteria. Crude and age-adjusted incidence rates (AIR) by age/gender/diagnostic subgroup were calculated, whereas time trends were assessed through Poisson and Joinpoint regression models. Results 6062 CNS tumours were retrieved with non-malignant CNS tumours recorded in eight registries; therefore, the analyses were performed on 5191 malignant tumours. Proportion of death certificate only cases was low and morphologic verification overall high; yet five registries presented >10% unspecified neoplasms. The male/female ratio was 1.3 and incidence decreased gradually with age, apart from Turkey and Ukraine. Overall AIR for malignant tumours was 23 per 10^6 children, with the highest rates noted in Croatia and Serbia. A statistically significant AIR increase was noted in Bulgaria, whereas significant decreases were noted in Belarus, Croatia, Cyprus and Serbia. Although astrocytomas were overall the most common subgroup (30%) followed by embryonal tumours (26%), the latter was the predominant subgroup in six registries. Conclusion Childhood cancer registration is expanding in Southern–Eastern Europe. The heterogeneity in registration practices and incidence patterns of CNS tumours necessitates further investigation aiming to provide clues in aetiology and direct investments into surveillance and early tumour detection.
... Recent results obtained by our group on the composition and mtDNA variation in Andalusia [31] reveal high internal complexity and a distinctive influence of U6 and L haplogroups in both the eastern and western sides of the region. Investigating Y-chromosome variability in the same sample sets, we also observe interpopulation genetic differentiation, though to a lesser extent ([32,33], Calderón et al. manuscript in preparation). Both selected territories, Granada and Huelva, are differentiated by their different, complex histories. ...
Article
Andalusia has been the most populated region of Spain since Antiquity, and it has a rich history of contacts across the Mediterranean. Earlier studies highlighted the relatively high frequency of the sub-Saharan haplotype GM*1,17 5* in western Andalusia (Huelva province, n = 252) and the neighbouring Atlantic regions. In this work, we provide new data on the GM/KM markers in eastern Andalusia (n = 195), Granada province, where the African haplotype GM*1,17 5* has a relatively high frequency (0.044). The most frequent GM haplotypes in Andalusia are also the most common in Europe. Taken together, these data constitute a new contribution to the genetic diversity of the southern part of the Iberian Peninsula. In addition, we compare the structure of our Iberian populations with that of 41 Mediterranean populations. GM haplotype variation in the Mediterranean reflects intense and complex interactions between North African and southern European populations and further underlines that the African influence in the Iberian Peninsula is not homogeneous.
Article
The Y chromosome has been widely explored for the study of human migrations. Due to its paternal inheritance, the Y chromosome polymorphisms are helpful tools for understanding the geographical distribution of populations all over the world and for inferring their origin, which is really useful in forensics. The remarkable historical context of Europe, with numerous migrations and invasions, has turned this continent into a melting pot. For this reason, it is interesting to study the Y chromosome variability and how it has contributed to improving our knowledge of the distribution and development of European male genetic pool as it is today. The analysis of Y lineages in Europe shows the predominance of four haplogroups, R1b-M269, I1-M253, I2-M438 and R1a-M420. However, other haplogroups have been identified which, although less frequent, provide significant evidence about the paternal origin of the populations. In addition, the study of the Y chromosome in Europe is a valuable tool for revealing the genetic trace of the different European colonizations, mainly in several American countries, where the European ancestry is mostly detected by the presence of the R1b-M269 haplogroup. Therefore, the objective of this review is to compile the studies of the Y chromosome haplogroups in current European populations, in order to provide an outline of these haplogroups which facilitate their use in forensic studies.
Thesis
El polimorfismo del gen APOE en Andalucía. Nuevas aportaciones al conocimiento de la composición e historia genética de las poblaciones mediterráneas.
Article
Arlequin ver 3.0 is a software package integrating several basic and advanced methods for population genetics data analysis, like the computation of standard genetic diversity indices, the estimation of allele and haplotype frequencies, tests of departure from linkage equilibrium, departure from selective neutrality and demographic equilibrium, estimation of parameters from past population expansions, and thorough analyses of population subdivision under the AMOVA framework. Arlequin 3 introduces a completely new graphical interface written in C++, a more robust semantic analysis of input files, and two new methods: a Bayesian estimation of gametic phase from multi-locus genotypes, and an estimation of the parameters of an instantaneous spatial expansion from DNA sequence polymorphism. Arlequin can handle several data types like DNA sequences, microsatellite data, or standard multilocus genotypes. A Windows version of the software is freely available on http://cmpg.unibe.ch/software/arlequin3.
Article
During the past few years the DNA commission of the International Society of Forensic Genetics has published a series of documents providing guidelines and recommendations concerning the application of DNA polymorphisms to the problems of human identification. This latest report addresses a relatively new area, namely Y-chromosome polymorphisms, with particular emphasis on short tandem repeats (STRs). This report addresses nomenclature, use of allelic ladders, population genetics and reporting methods.
Article
The Mediterranean region, formed with parts of Africa, Asia and Europe, is unique in our planet. It has received the passage and settlement of various hominid migrations from the Pleistocene and the three main races have met there since remote ages. Its archeology and history are better known than those in other regions of the world. The area, with a fairly uniform climate, was unified for several centuries under the Roman Empire, which restricted the access to outside populations, favouring relationships among inside ones. Genetic evidence suggests that most of the evolutionary history of humans is quite recent and intertwined. A paradigm can be found in Mediterranean populations. The understanding of what happened in this region will be decisive to know the evolution of the entire human species. An overall genetic study on the whole Mediterranean region, with its European part as well as its Asian and African ones, has not been carried out yet and such work should be drawn up soon.
Article
Y-chromosome variation was analyzed in a sample of 1127 males from the Western Mediterranean area by surveying 16 biallelic and 4 multiallelic sites. Some populations from Northeastern Europe and the Middle East were also studied for comparison. All Y-chromosome haplotypes were included in a parsimonious genealogic tree consisting of 17 haplogroups, several of which displayed distinct geographic specificities. One of the haplogroups, HG9.2, has some features that are compatible with a spread into Europe from the Near East during the Neolithic period. However, the current distribution of this haplogroup would suggest that the Neolithic gene pool had a major impact in the eastern and central part of the Mediterranean basin, but very limited consequences in Iberia and Northwestern Europe. Two other haplogroups, HG25.2 and HG2.2, were found to have much more restricted geographic distributions. The first most likely originated in the Berbers within the last few thousand years, and allows the detection of gene flow to Iberia and Southern Europe. The latter haplogroup is common only in Sardinia, which confirms the genetic peculiarity and isolation of the Sardinians. Overall, this study demonstrates that the dissection of Y-chromosome variation into haplogroups with a more restricted geographic distribution can reveal important differences even between populations that live at short distances, and provides new clues to their past interactions.