The genome-wide structure of the Jewish people

LETTERS
Doron M. Behar¹,²*, Bayazit Yunusbayev²,³*, Mait Metspalu²*, Ene Metspalu², Saharon Rosset⁴, Jüri Parik², Siiri Rootsi², Gyaneshwer Chaubey², Ildus Kutuev²,³, Guennady Yudkovsky¹,⁵, Elza K. Khusnutdinova³, Oleg Balanovsky⁶, Ornella Semino⁷, Luisa Pereira⁸,⁹, David Comas¹⁰, David Gurwitz¹¹, Batsheva Bonne-Tamir¹¹, Tudor Parfitt¹², Michael F. Hammer¹³, Karl Skorecki¹,⁵ & Richard Villems²

Contemporary Jews comprise an aggregate of ethno-religious communities whose worldwide members identify with each other through various shared religious, historical and cultural traditions¹,². Historical evidence suggests common origins in the Middle East, followed by migrations leading to the establishment of communities of Jews in Europe, Africa and Asia, in what is termed the Jewish Diaspora³–⁵. This complex demographic history imposes special challenges in attempting to address the genetic structure of the Jewish people⁶. Although many genetic studies have shed light on Jewish origins and on diseases prevalent among Jewish communities, including studies focusing on uniparentally and biparentally inherited markers⁷–¹⁶, genome-wide patterns of variation across the vast geographic span of Jewish Diaspora communities and their respective neighbours have yet to be addressed. Here we use high-density bead arrays to genotype individuals from 14 Jewish Diaspora communities and compare these patterns of genome-wide diversity with those from 69 Old World non-Jewish populations, of which 25 have not previously been reported. These samples were carefully chosen to provide comprehensive comparisons between Jewish and non-Jewish populations in the Diaspora, as well as with non-Jewish populations from the Middle East and north Africa. Principal component and structure-like analyses identify previously unrecognized genetic substructure within the Middle East. Most Jewish samples form a remarkably tight subcluster that overlies Druze and Cypriot samples but not samples from other Levantine populations or paired Diaspora host populations. In contrast, Ethiopian Jews (Beta Israel) and Indian Jews (Bene Israel and Cochini) cluster with neighbouring autochthonous populations in Ethiopia and western India, respectively, despite a clear paternal link between the Bene Israel and the Levant. These results cast light on the variegated genetic architecture of the Middle East, and trace the origins of most Jewish Diaspora communities to the Levant.

Recently, the capacity to obtain whole-genome genotypes with the use of array technology has provided a robust tool for elucidating fine-scale population structure and aspects of demographic history¹⁷–²³. This approach, initially used to account for population stratification in genome-wide association studies, identified genome-wide patterns of variation that distinguished between Ashkenazi Jews and non-Jews of European descent⁷,¹¹,¹²,¹⁴–¹⁶. Similarly, a large-scale survey of autosomal microsatellites found that samples from four Jewish communities clustered close to each other and intermediate between non-Jewish Middle Eastern and European populations¹⁰.

Illumina 610K and 660K bead arrays were used to genotype 121 samples from 14 Jewish communities. The results were compared with 1,166 individuals from 69 non-Jewish populations (Supplementary Note 1 and Supplementary Table 1), with particular attention to neighbouring or 'host' populations in corresponding geographic regions. These results were also integrated with analyses of genotype data from about 8,000 Y chromosomes and 14,000 mitochondrial DNA (mtDNA) samples (Supplementary Note 6 and Supplementary Tables 4 and 5). Several questions were then addressed: What are the locations of the various Jewish communities in a global genetic variation context? What are the features of the Middle Eastern (Supplementary Fig. 1) population genetic substructure? What are the genetic distances between contemporary Jewish communities, their Diaspora neighbours and Middle Eastern populations? Can the genetic origin of Jews be pinpointed within the Middle East?

The EIGENSOFT package²⁴ was used to identify the principal components (PCs) of autosomal variation in our Old World sample set (Fig. 1 and Supplementary Fig. 2a). This analysis places the studied samples along two well-established geographic axes of global genetic variation¹⁸,¹⁹,²²: PC1 (sub-Saharan Africa versus the rest of the Old World) and PC2 (east versus west Eurasia). Focusing on the Middle Eastern populations in the PC1–PC2 plot (Fig. 1b) reveals more geographically refined groupings. Populations of the Caucasus, flanked by Cypriots, form an almost uninterrupted rim that separates the bulk of Europeans from Middle Eastern populations. Bedouins, Jordanians, Palestinians and Saudi Arabians are located in close proximity to each other, which is consistent with a common origin in the Arabian Peninsula²⁵, whereas the Egyptian, Moroccan, Mozabite Berber, and Yemenite samples are located closer to sub-Saharan populations (Fig. 1a and Supplementary Fig. 2a).

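The kind of computation behind such a PCA can be illustrated with a minimal sketch. It assumes genotypes coded as 0, 1 or 2 copies of one allele and uses the per-SNP centring and scaling described for eigenanalysis in ref. 24; it is not a reproduction of smartpca, and the function names and the power-iteration solver are illustrative choices, not part of EIGENSOFT.

```python
import math
import random


def normalize_genotypes(genotypes):
    """Centre each SNP on its mean and scale by sqrt(p * (1 - p)).

    `genotypes` is a list of individuals, each a list of 0/1/2 allele counts.
    Monomorphic SNPs carry no information and are dropped.
    """
    n = len(genotypes)
    columns = []
    for j in range(len(genotypes[0])):
        col = [row[j] for row in genotypes]
        mean = sum(col) / n
        p = mean / 2.0  # estimated allele frequency
        if p <= 0.0 or p >= 1.0:
            continue
        scale = math.sqrt(p * (1.0 - p))
        columns.append([(g - mean) / scale for g in col])
    if not columns:
        raise ValueError("no polymorphic SNPs left after filtering")
    return [[c[i] for c in columns] for i in range(n)]


def covariance(rows):
    """Individual-by-individual covariance matrix of the normalized data."""
    n, m = len(rows), len(rows[0])
    return [[sum(a * b for a, b in zip(rows[i], rows[k])) / m
             for k in range(n)] for i in range(n)]


def top_eigenvector(matrix, iterations=500, seed=0):
    """Leading eigenvalue and eigenvector by power iteration (illustrative)."""
    rng = random.Random(seed)
    n = len(matrix)
    v = [rng.gauss(0.0, 1.0) for _ in range(n)]
    for _ in range(iterations):
        w = [sum(matrix[i][k] * v[k] for k in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    mv = [sum(matrix[i][k] * v[k] for k in range(n)) for i in range(n)]
    eigenvalue = sum(a * b for a, b in zip(v, mv))
    return eigenvalue, v
```

Each entry of the returned eigenvector is the PC1 coordinate of one individual; individuals from differentiated groups are expected to fall on opposite sides of zero. Real data sets with hundreds of thousands of SNPs would call for a dedicated linear-algebra library rather than this pure-Python loop.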
*These authors contributed equally to this work.
¹Molecular Medicine Laboratory, Rambam Health Care Campus, Haifa 31096, Israel. ²Estonian Biocentre and Department of Evolutionary Biology, University of Tartu, Tartu 51010, Estonia. ³Institute of Biochemistry and Genetics, Ufa Research Center, Russian Academy of Sciences, Ufa 450054, Russia. ⁴Department of Statistics and Operations Research, School of Mathematical Sciences, Tel Aviv University, Tel Aviv 69978, Israel. ⁵Rappaport Faculty of Medicine and Research Institute, Technion–Israel Institute of Technology, Haifa 31096, Israel. ⁶Research Centre for Medical Genetics, Russian Academy of Medical Sciences, Moscow 115478, Russia. ⁷Dipartimento di Genetica e Microbiologia, Università di Pavia, Pavia 27100, Italy. ⁸Instituto de Patologia e Imunologia Molecular da Universidade do Porto (IPATIMUP), Porto 4200-465, Portugal. ⁹Faculdade de Medicina, Universidade do Porto, Porto 4200-319, Portugal. ¹⁰Institute of Evolutionary Biology (CSIC-UPF), CEXS-UPF-PRBB and CIBER de Epidemiología y Salud Pública, Barcelona 08003, Spain. ¹¹Department of Human Molecular Genetics and Biochemistry, Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv 69978, Israel. ¹²Department of the Languages and Cultures of the Near and Middle East, Faculty of Languages and Cultures, School of Oriental and African Studies (SOAS), University of London, London WC1H 0XG, UK. ¹³ARL Division of Biotechnology, University of Arizona, Tucson, Arizona 85721, USA.
doi:10.1038/nature09103
©2010 Macmillan Publishers Limited. All rights reserved

Most Jewish samples, other than those from Ethiopia and India, overlie non-Jewish samples from the Levant (Fig. 1b). The tight cluster comprising the Ashkenazi, Caucasus (Azerbaijani and Georgian), Middle Eastern (Iranian and Iraqi), north African (Moroccan) and Sephardi (Bulgarian and Turkish) Jewish communities, as well as Samaritans, strongly overlaps Israeli Druze and is centrally located on the principal component analysis (PCA) plot when compared with Middle Eastern, European Mediterranean, Anatolian and Caucasus non-Jewish populations (Fig. 1). This Jewish cluster consists of samples from most Jewish communities studied here, which together cover more than 90% of the current world Jewish population⁵; this is consistent with an ancestral Levantine contribution to much of contemporary Jewry. A compact cluster of Yemenite Jews, which is also located within an assemblage of Levantine samples, overlaps primarily with Bedouins but also with Saudi individuals (Fig. 1b). In contrast, Ethiopian and Indian Jews are located close to those from neighbouring host populations (Fig. 1c,d). Ethiopian Jews clustered with Semitic-speaking rather than Cushitic-speaking Ethiopians. See Supplementary Note 2 for a discussion of the assignment of samples representing the Belmonte and Uzbek (Bukharan) Jewish communities.

To glean further details of Levantine genetic structure, we repeated PCA on a restricted set of samples from west Eurasia (Fig. 2, Supplementary Fig. 3 and Supplementary Note 2) and inspected lower-ranked PCs in the Old World context (Supplementary Fig. 2b, c; PC1 versus PC3 and PC4). These analyses reveal three distinct Near Eastern Jewish subclusters: the first group is located between Middle Eastern and European populations and consists of Ashkenazi, Moroccan and Sephardi Jews. The second group, comprising the Middle Eastern and Caucasus Jewish communities, is positioned within the large conglomerate of non-Jewish populations of the region. The third group contains only a tight cluster of Yemenite Jews.

Figure 1 | PCA of high-density array data. a, Scatter plot of Old World individuals, showing the first two principal components. Each ring corresponds to one individual and the colour indicates the region of origin (for the full figure see Supplementary Fig. 2). b–d, A series of magnifications showing samples from Europe and the Middle East (b), Ethiopia (c) and south Asia (d). Each letter code (Supplementary Table 1) corresponds to one individual, and the colour indicates the geographic region of origin. In b, a polygon surrounding all of the individual samples belonging to a group designation highlights several population groups.
[Figure 1 image not reproduced. Axes: Eigenvector 1 and Eigenvector 2. Colour key: Sub-Saharan Africa; North and east Africa; Middle East; Europe; Jewish communities; South Asia; East Asia. Sample codes given in the panel keys: InJ, Iranian Jews; IqJ, Iraqi Jews; SJ, Sephardi Jews; AJ, Ashkenazi Jews; UJ, Uzbekistani Jews; MJ, Moroccan Jews; AzJ, Azerbaijani Jews; GJ, Georgian Jews; YJ, Yemenite Jews; SbJ, Belmonte Jew; Leb, Lebanese; Syr, Syrians; Yem, Yemenites; EJ, Ethiopian Jews; Jor, Jordanians; Mor, Moroccans; EtT, Tigray Ethiopians; EtO, Oromo Ethiopians; EtA, Amhara Ethiopians; Ptn, Pathan; Blo, Balochi; Bur, Burusho; Sin, Sindhi; Ind, India (southern); IcJ, Cochini Jews; IbJ, Mumbai Jews.]

After elucidation of these groupings by PCA, we turned to structure-like analysis²⁶ with the algorithm ADMIXTURE²⁷ to assign individuals proportionally to hypothetical ancestral populations (Supplementary Note 3). Initially, all Jewish samples were analysed jointly with 25 novel reference populations (Supplementary Note 1) in combination with the Human Genome Diversity Panel¹⁸ samples representing Africa, the Middle East, Europe, and central, south and east Asia (Fig. 3 and Supplementary Fig. 4). This analysis significantly refines and reinforces the previously proposed partitioning of Old World population samples into continental groupings¹⁸,¹⁹ (Supplementary Fig. 4 and Supplementary Note 4). We note that membership of a sample in a component that is predominant in, but not restricted to, a specific geographic region is not sufficient to infer its genetic origins. Membership in several genetic components can imply either a shared genetic ancestry or a recent admixture of sampled individuals¹⁸,²⁸. An illustrative example at K = 8 (Fig. 3 and Supplementary Note 3) is the pattern of membership of Ashkenazi, Caucasus (Azerbaijani and Georgian), Middle Eastern (Iranian and Iraqi), north African (Moroccan), Sephardi (Bulgarian and Turkish) and Yemenite Jewish communities in the light-green and light-blue genetic components, which is similar to that observed for Middle Eastern non-Jewish populations, suggesting a shared regional origin of these Jewish communities. This inference is consistent with historical records describing the dispersion of the people of ancient Israel throughout the Old World¹–⁴. Our conclusion favouring common ancestry over recent admixture is further supported by the fact that our sample contains individuals that are known not to be admixed in the most recent one or two generations. It is also evident that among the Ashkenazi, Moroccan and Sephardi Jewish communities the dark-blue component dominating European populations is more substantial than the corresponding proportion of this component among the Middle Eastern Jewish communities (Fig. 3). For the Indian and Ethiopian Jewish communities the dark-green and light-brown genetic components are consistent with corresponding membership of their respective host populations (Fig. 3). ADMIXTURE was also run on the west Eurasian subset of the Old World sample, which highlights differentiation between the Middle East and Europe (Supplementary Fig. 4b). Here, comparison between the ADMIXTURE-derived component patterns for Sephardi and Ashkenazi Jews shows that the former have only slightly greater similarity to the pattern observed for Middle Eastern populations than do the latter.

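The idea of assigning an individual proportionally to K hypothetical ancestral populations can be made concrete with a small sketch of the likelihood model that underlies structure-like methods such as the one in ref. 27: each genotype is treated as two draws of an allele whose frequency is the ancestry-weighted mix of the component frequencies. The toy frequencies, the function names and the grid search below are assumptions made for illustration; ADMIXTURE itself estimates both the ancestry fractions and the component frequencies with a far more efficient optimizer.

```python
import math


def individual_allele_freqs(q, p):
    """Expected allele frequency at each SNP: f_j = sum_k q[k] * p[k][j]."""
    return [sum(q[k] * p[k][j] for k in range(len(q)))
            for j in range(len(p[0]))]


def log_likelihood(genotypes, q, p):
    """Binomial log-likelihood (constant term dropped) of 0/1/2 genotypes.

    `q` holds the ancestry fractions of one individual (summing to 1) and
    `p[k][j]` the allele frequency of SNP j in component k, strictly
    between 0 and 1.
    """
    total = 0.0
    for g, f in zip(genotypes, individual_allele_freqs(q, p)):
        total += g * math.log(f) + (2 - g) * math.log(1.0 - f)
    return total


def best_two_way_fraction(genotypes, p, steps=100):
    """Grid search over the fraction of component 0 when K = 2 (toy solver)."""
    best_a, best_ll = 0.0, -math.inf
    for i in range(steps + 1):
        a = i / steps
        ll = log_likelihood(genotypes, [a, 1.0 - a], p)
        if ll > best_ll:
            best_a, best_ll = a, ll
    return best_a
```

A stacked-column plot of the kind shown in Fig. 3 is simply the estimated fractions `q` drawn for every individual side by side.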
Genetic relationships between our population samples were then explored with the measure of allele sharing distances (ASDs)²⁹. Table 1 provides genetic distances between each Jewish community and its corresponding host population, all Jewish communities, west Eurasian Jewish communities, their respective Jewish group inferred from the PCA, and non-Jewish Levantine populations. The Ashkenazi, Sephardi, Moroccan, Iranian, Iraqi, Azerbaijani and Uzbekistani Jewish communities have the lowest ASD values when compared with their PCA-based inferred Jewish sub-cluster (Fig. 3 and Supplementary Figs 2c and 3). In all except the Sephardi Jewish community, this ASD difference is statistically significant (P < 0.01, bootstrap t-test). ASD values between Ashkenazi, Sephardi and Caucasus Jewish populations and their respective hosts are lower than those between each Jewish population and non-Jewish populations from the Levant. This might be the result of a bias inherent in our calculations as a result of the genetically more diverse non-Jewish populations of the Levant. The Ethiopian and Indian Jewish communities show the lowest ASD values when compared with their host population (Supplementary Tables 2 and 3 and Supplementary Note 5).

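As a rough illustration of an allele sharing distance, the sketch below scores each SNP as 0, 1 or 2 according to how many alleles two individuals fail to share, averages over SNPs, and then averages over all cross-group pairs. This is one common formulation in the spirit of ref. 29; the exact scaling and the handling of missing genotypes used for Table 1 are described in Supplementary Note 5 and are not reproduced here, so the choices below (no rescaling, missing calls encoded as `None` and skipped) are assumptions.

```python
from itertools import product


def asd_pair(a, b):
    """Mean per-SNP distance between two individuals with 0/1/2 genotypes.

    Per-SNP distance is |a - b|: 0 if both alleles are shared, 1 if one is
    shared, 2 if none is shared. SNPs with a missing call (None) are skipped.
    """
    diffs = [abs(x - y) for x, y in zip(a, b)
             if x is not None and y is not None]
    if not diffs:
        raise ValueError("no SNPs genotyped in both individuals")
    return sum(diffs) / len(diffs)


def asd_between_groups(group_a, group_b):
    """Average ASD over every pair with one individual from each group."""
    distances = [asd_pair(a, b) for a, b in product(group_a, group_b)]
    return sum(distances) / len(distances)
```

For example, `asd_between_groups(community, hosts)` and `asd_between_groups(community, levant)` would give two values that can be compared in the way the Hosts and Levant columns of Table 1 are compared.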
Although uniparental markers⁸,⁹ (Supplementary Note 6) are limited in their capacity to uncover genetic substructure within the Middle East, they do provide important insights into sex-specific processes that are not unambiguously evident from the autosomal data alone. For example, Y-chromosome data point to a unique paternal genetic link between the Bene Israel community and the Levant, whereas the absence of sub-Saharan African maternal lineages in Yemenite and Moroccan Jews (in contrast to their hosts) suggests limited maternal gene flow.

Figure 2 | PCA of west Eurasian high-density array data. Plot of kernel densities (Supplementary Note 2) for each population sample (n > 10) was estimated on the basis of PC1 and PC2 coordinates in Supplementary Fig. 3. Individuals from these samples were plotted by using PC1 and PC2 coordinates and were overlaid with the plot of kernel density.
[Figure 2 image not reproduced. Axes: Eigenvector 1 (eigenvalue = 6.1) and Eigenvector 2 (eigenvalue = 2.8). Populations labelled: Adygei, Lezgins, Armenians, Georgians, Chuvashi, French Basque, Sardinians, Spaniards, French, Russians, Romanians, Hungarians, Lithuanians, Orcadians, Iranians, Saudis, Bedouins, Syrians, Jordanians, Druze, Turks, Cypriots, Palestinians, Ashkenazi Jews, Iraqi Jews, Yemenite Jews, Sephardi Jews, Moroccan Jews. Letter codes: o, Azerbaijani Jews; g, Georgian Jews; L, Lebanese; S, Samaritans; T, Tuscans; J, Iranian Jews; B, Belorussians; U, Uzbekistani Jews; SbJ, Sephardi Belmonte.]

Figure 3 | Population structure inferred by ADMIXTURE analysis. Each individual is represented by a vertical (100%) stacked column of genetic components proportions shown in colour for K = 8. The Jewish communities are labelled in colour and bold. T and B further specify Sephardi Jews from Turkey and Bulgaria, respectively. Populations introduced for the first time in this study and analysed together with the Human Genome Diversity Panel¹⁸ data are marked with an asterisk.
[Figure 3 image not reproduced. Region headings: Africa; Middle East; Europe; Central, south and east Asia. Population labels: Biaka Pygmies, Mbuti Pygmies, San, Bantu, Yoruba, Mandenkas, *Ethiopian Jews, *Ethiopians, Mozabites, *Moroccans, *Moroccan Jews, *Egyptians, *Saudis, *Yemenese, *Yemenite Jews, Bedouins, Palestinians, *Syrians, *Jordanians, Druze, *Lebanese, *Samaritans, *Turks, *Iraqi Jews, *Iranian Jews, *Iranians, *Armenians, *Georgians, *Georgian Jews, *Azerbaijani Jews, Adygei, *Lezgins, *Sephardi Jews (T, B), *Ashkenazi Jews, Orcadians, French Basque, French, *Spaniards, Tuscans, Sardinians, *Cypriots, *Romanians, *Hungarians, *Lithuanians, *Belorussians, Russians, *Chuvashs, *Uzbeks, *Uzbekistani Jews, Uygurs, Hazara, Burusho, Pathan, Brahui, Balochi, Sindhi, Makrani, *South Indians, *Mumbai Jews, *Cochini Jews, Yakuts, Cambodians, Dai, Lahu, Miaozu, She, Han, Tujia, Naxi, Yizu, Tu, Xibo, Oroqen, Mongols, Daur, Hezhen, Japanese.]

Our PCA, ADMIXTURE and ASD analyses, which are based on genome-wide data from a large sample of Jewish communities, their non-Jewish host populations, and novel samples from the Middle East, are concordant in revealing a close relationship between most contemporary Jews and non-Jewish populations from the Levant. The most parsimonious explanation for these observations is a common genetic origin, which is consistent with an historical formulation of the Jewish people as descending from ancient Hebrew and Israelite residents of the Levant. This inference underscores the significant genetic continuity that exists among most Jewish communities and contemporary non-Jewish Levantine populations, despite their long-term residence in diverse regions remote from the Levant and isolation from one another. This study further uncovers genetic structure that partitions most Jewish samples into Ashkenazi–north African–Sephardi, Caucasus–Middle Eastern, and Yemenite subclusters (Fig. 2). There are several mutually compatible explanations for the observed pattern: a splintering of Jewish populations in the early Diaspora period, an underappreciated level of contact between members of each of these subclusters, and low levels of admixture with Diaspora host populations. Equally interesting are the inferences that can be gleaned from more distant Diaspora communities, such as the Ethiopian and Indian Jewish communities. Strong similarities to their neighbouring host populations may have resulted from one or more of the following: large-scale introgression, asymmetrical sex-biased gene flow, or religious and cultural diffusion during the process of becoming one of the many and varied Jewish communities.

METHODS SUMMARY
Blood or buccal samples were collected with informed consent from unrelated volunteers who self-identified as members of one of the Jewish communities or non-Jewish populations studied here (Supplementary Note 1). The term 'Old World' refers to populations of the Eastern Hemisphere, specifically Europe, Asia and Africa. Whenever the term Jewish is not part of the population designation, this refers to a non-Jewish population. DNA samples chosen for the biparental analysis were genotyped on Illumina 610K or 660K bead arrays and showed a genotyping success rate of more than 97%. Data management and quality control were aided by PLINK 1.05 (ref. 30). For comparison, the relevant populations from the Illumina 650K-based data set of the Human Genome Diversity Panel, excluding relatives¹⁸, were included in our analysis. After identification of the intersection of genotypes from the various bead arrays, quality control (QC) and linkage disequilibrium (LD) pruning, a total of 226,839 autosomal single nucleotide polymorphisms (SNPs) remained for further analysis. PCA of autosomal variation was performed with the smartpca program of the EIGENSOFT package²⁴ (Supplementary Note 2). Samples were modelled as comprising a mixture of major genetic components using the structure-like ADMIXTURE program²⁷, and the inferred genetic membership of each individual from this analysis was studied (Supplementary Notes 3 and 4). ASD²⁹ between groups was assessed, and a bootstrap procedure to determine the significance of differences in ASD between pairs of populations was adapted (Supplementary Note 5). Our uniparental data were merged with previously reported data sets for Y-chromosome and mtDNA analysis (Supplementary Note 6). A matrix of Y-chromosome and mtDNA haplogroup frequencies was constructed, and PCA was performed in the R environment (using the function princomp).
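
The bootstrap step mentioned above can be pictured with a generic sketch. The procedure actually used is specified in Supplementary Note 5 and is not reproduced here; the version below simply assumes that per-SNP distance contributions are available for two comparisons (for example community versus hosts and community versus Levant), resamples SNPs with replacement, and forms a t-like statistic from the bootstrap standard error. The function name, the resampling unit and the default of 1,000 replicates are all illustrative assumptions.

```python
import math
import random
import statistics


def bootstrap_asd_difference(per_snp_a, per_snp_b, n_boot=1000, seed=0):
    """Bootstrap the difference in mean per-SNP distance between two comparisons.

    Returns (observed difference, bootstrap standard error, difference / SE).
    SNPs are the resampling unit here, which is an assumption of this sketch.
    """
    if len(per_snp_a) != len(per_snp_b):
        raise ValueError("both comparisons must cover the same SNPs")
    rng = random.Random(seed)
    n = len(per_snp_a)
    per_snp_diff = [a - b for a, b in zip(per_snp_a, per_snp_b)]
    observed = sum(per_snp_diff) / n
    replicates = []
    for _ in range(n_boot):
        # Draw n SNP indices with replacement and recompute the mean difference.
        sample = [per_snp_diff[rng.randrange(n)] for _ in range(n)]
        replicates.append(sum(sample) / n)
    se = statistics.stdev(replicates)
    t_like = observed / se if se > 0 else math.copysign(math.inf, observed)
    return observed, se, t_like
```

A large absolute value of the returned statistic indicates that the two mean distances differ by much more than the resampling noise. Because neighbouring SNPs are correlated, a real analysis would typically resample blocks of SNPs or rely on the LD-pruned marker set.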
Full Methods and any associated references are available in the online version of
the paper at www.nature.com/nature.
Received 9 December 2009; accepted 21 April 2010.
Published online 9 June 2010.
Table 1 | Genetic distances (ASD) between Jewish, Levantine and Diaspora host populations

| Jewish community | Host population | Hosts | Levant* | All Jews | West Eurasian Jews† | Jewish cluster‡ |
|---|---|---|---|---|---|---|
| Ashkenazi | Europe§ | 0.236 | 0.239‖ | 0.240 | 0.236 | 0.235 |
| Sephardi | Spain | 0.236 | 0.238 | 0.239 | 0.236 | 0.235 |
| Moroccan | Morocco | 0.246 | 0.239 | 0.240 | 0.237 | 0.236 |
| Georgian | Georgia | 0.234 | 0.238 | 0.239 | 0.236 | 0.236 |
| Azerbaijani | Lezgin | 0.238 | 0.240 | 0.241 | 0.238 | 0.237 |
| Iranian | Iran | 0.239 | 0.239 | 0.240 | 0.237 | 0.236 |
| Iraqi | Syria, Iran | 0.238 | 0.238 | 0.239 | 0.236 | 0.236 |
| Uzbekistani | Uzbekistan | 0.243 | 0.238 | 0.239 | 0.236 | 0.235 |
| Bene Israel | India (Mumbai) | 0.240 | 0.245 | 0.245 | 0.243 | 0.241 |
| Cochini | India (Kerala) | 0.238 | 0.247 | 0.247 | 0.245 | 0.241 |
| Ethiopian | Ethiopia¶ | 0.245 | 0.253 | 0.255 | 0.254 | |
| Yemenite | Yemen | 0.243 | 0.238 | 0.240 | 0.237 | |

*Levant populations included Bedouin, Cypriots, Druze, Jordanians, Lebanese, Palestinians, Samaritans and Syrians.
†All Jewish populations excluding Ethiopian and Indian Jews.
‡Jewish communities in the same cluster as obtained from the PCA analysis (Supplementary Fig. 3) are indicated by bold, italic or underlined type under the heading Jewish community.
§Russians, Romanians, Hungarians, Belorussians, French and Lithuanians.
‖Significance throughout the table: italic entries are significantly bigger than ASD from hosts (that is, further away), bold entries are significantly smaller than ASD from hosts; see Supplementary Table 3 for details.
¶Amhara, Oromo and Tigray.

1. Ben-Sasson, H. H. A History of the Jewish People (Harvard Univ. Press, 1976).
2. De Lange, N. Atlas of the Jewish World (Phaidon Press, 1984).
3. Mahler, R. A History of Modern Jewry (Schocken, 1971).
4. Stillman, N. A. Jews of Arab Lands: A History and Source Book (Jewish Publication Society of America, 1979).
5. Della Pergola, S. in Papers in Jewish Demography 1997 (eds Della Pergola, S. & Even, J.) 11–33 (The Hebrew University of Jerusalem, 1997).
6. Cavalli-Sforza, L. L., Menozzi, A. & Piazza, A. in The History and Geography of Human Genes 4 (Princeton Univ. Press, 1994).
7. Bauchet, M. et al. Measuring European population stratification with microarray genotype data. Am. J. Hum. Genet. 80, 948–956 (2007).
8. Behar, D. M. et al. Counting the founders: the matrilineal genetic ancestry of the Jewish Diaspora. PLoS ONE 3, e2062 (2008).
9. Hammer, M. F. et al. Jewish and Middle Eastern non-Jewish populations share a common pool of Y-chromosome biallelic haplotypes. Proc. Natl Acad. Sci. USA 97, 6769–6774 (2000).
10. Kopelman, N. M. et al. Genomic microsatellites identify shared Jewish ancestry intermediate between Middle Eastern and European populations. BMC Genet. 10, 80 (2009).
11. Need, A. C., Kasperaviciute, D., Cirulli, E. T. & Goldstein, D. B. A genome-wide genetic signature of Jewish ancestry perfectly separates individuals with and without full Jewish ancestry in a large random sample of European Americans. Genome Biol. 10, R7 (2009).
12. Olshen, A. B. et al. Analysis of genetic variation in Ashkenazi Jews by high density SNP genotyping. BMC Genet. 9, 14 (2008).
13. Ostrer, H. A genetic profile of contemporary Jewish populations. Nature Rev. Genet. 2, 891–898 (2001).
14. Price, A. L. et al. Discerning the ancestry of European Americans in genetic association studies. PLoS Genet. 4, e236 (2008).
15. Seldin, M. F. et al. European population substructure: clustering of northern and southern populations. PLoS Genet. 2, e143 (2006).
16. Tian, C. et al. Analysis and application of European genetic substructure using 300 K SNP information. PLoS Genet. 4, e4 (2008).
17. Abdulla, M. A. et al. Mapping human genetic diversity in Asia. Science 326, 1541–1545 (2009).
18. Li, J. Z. et al. Worldwide human relationships inferred from genome-wide patterns of variation. Science 319, 1100–1104 (2008).
19. Jakobsson, M. et al. Genotype, haplotype and copy-number variation in worldwide human populations. Nature 451, 998–1003 (2008).
20. Novembre, J. et al. Genes mirror geography within Europe. Nature 456, 98–101 (2008).
21. Reich, D., Thangaraj, K., Patterson, N., Price, A. L. & Singh, L. Reconstructing Indian population history. Nature 461, 489–494 (2009).
22. Biswas, S., Scheinfeldt, L. B. & Akey, J. M. Genome-wide insights into the patterns and determinants of fine-scale population structure in humans. Am. J. Hum. Genet. 84, 641–650 (2009).
23. Tishkoff, S. A. et al. The genetic structure and history of Africans and African Americans. Science 324, 1035–1044 (2009).
24. Patterson, N., Price, A. L. & Reich, D. Population structure and eigenanalysis. PLoS Genet. 2, e190 (2006).
25. Hourani, A. A History of the Arab Peoples (Faber & Faber, 1991).
26. Weiss, K. M. & Long, J. C. Non-Darwinian estimation: my ancestors, my genes' ancestors. Genome Res. 19, 703–710 (2009).
27. Alexander, D. H., Novembre, J. & Lange, K. Fast model-based estimation of ancestry in unrelated individuals. Genome Res. 19, 1655–1664 (2009).
28. Rasmussen, M. et al. Ancient human genome sequence of an extinct Palaeo-Eskimo. Nature 463, 757–762 (2010).
29. Gao, X. & Martin, E. R. Using allele sharing distance for detecting human population stratification. Hum. Hered. 68, 182–191 (2009).
30. Purcell, S. et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet. 81, 559–575 (2007).
Supplementary Information is linked to the online version of the paper at
www.nature.com/nature.
Acknowledgements We thank the individuals who provided DNA samples for this
study, including the National Laboratoryfor the Genetics of Israeli Populations; Mari
Nelis, Georgi Hudjashov and Viljo Soo for conducting the autosomal genotyping;
Lauri Anton for computational help. R.V. and D.M.B. thank the European
Commission, Directorate-General for Researchfor FP7 Ecogene grant 205419. R.V.
thanks the European Union, Regional Development Fund through a Centre of
Excellence in Genomics grant and the Swedish Collegium for Advanced Studies for
support during the initial stage of this study. E.M. and Si.R. thank the Estonian
Science Foundation for grants 7858 and 7445, respectively. K.S. thanks the Arthur
and Rosalinde Gilbert Foundation fund of the American Technion Society. Sa.R.
thanks the European Union for Marie Curie International Reintegration grant
CT-2007-208019, and the Israeli Science Foundation for grant 1227/09. IPATIMUP
is an Associate Laboratory of the Portuguese Ministry of Science, Technology and
Higher Education and is partly supported by Fundação para a Ciência e a Tecnologia,
the Portuguese Foundation for Science and Technology.
Author Contributions D.M.B. and R.V. conceived and designed the study. B.B.T.,
D.C., D.G., D.M.B., E.K.K., G.C., I.K., L.P., M.F.H., O.B., O.S., T.P. and R.V. provided
DNA samples to this study. E.M., J.P. and G.Y. screened and prepared the samples
for the autosomal genotyping. D.M.B., E.M., G.C., M.F.H. and Si.R. generated and
summarized the database for the uniparental analysis. B.Y., M.M. and Sa.R. designed
and applied the modelling methodology and statistical analysis. T.P. provided expert
input regarding the relevant historical aspects. B.Y., D.M.B., K.S., M.F.H., M.M., R.V.
and Sa.R. wrote the paper. B.Y., D.M.B. and M.M. contributed equally to the paper.
All authors discussed the results and commented on the manuscript.
Author Information The array data described in this paper are deposited in the
Gene Expression Omnibus under accession number GSE21478. Reprints and
permissions information is available at www.nature.com/reprints. The authors
declare no competing financial interests. Readers are welcome to comment on the
online version of this article at www.nature.com/nature. Correspondence and
requests for materials should be addressed to D.M.B. (behardm@usernet.com),
K.S. (skorecki@tx.technion.ac.il) or R.V. (rvillems@ebc.ee).
METHODS
Sample collection. All samples reported here were derived from a buccal swab or
blood cells collected with informed consent in accordance with protocols
approved by the National Human Subjects Review Committee in Israel and
Institutional Review boards of the participating research centres. Participants
were recruited during scheduled archaeogenetics lectures addressing the general
public, genealogical societies, heritage centres and the scientific community.
Each volunteer reported ancestry by providing information on the origin of all
four grandparents. Samples were also obtained from the National Laboratory for
the Genetics of Israeli Populations (http://www.tau.ac.il/medicine/NLGIP/).
Comparative data sets for the uniparental and biparental analyses were
assembled from the literature as summarized in Supplementary Note 1 and
Supplementary Tables 1, 4 and 5.
Genotyping autosomal markers. Illumina 610K or 660K bead arrays were used
for genotyping with standard protocols, and Bead Studio software was used to
assign genotypes. PLINK 1.05 (ref. 30) was used to perform data management
and QC operations. Samples and SNPs with success rates of less than 97% were
excluded. A total of 475 novel samples were analysed, 121 of which were from 14
Jewish communities representing most of the known geographic range of Jews
during the past 100 years. The other 354 samples were chosen from 27 non-
Jewish populations to enable paired analysis with the Jewish sample set. For
comparison, relevant populations were further included (Supplementary
Table 1) from the Illumina 650K-based data set of the Human Genome
Diversity Panel after excluding relatives as in ref. 18. Because background LD can distort both PCA (ref. 24) and structure-like analysis (ref. 27) results, one member of any pair of SNPs in strong LD (r² > 0.4) in windows of 200 SNPs (sliding the window by 25 SNPs at a time) was removed using indep-pairwise in PLINK. After identifying the intersection of genotypes from the two types of bead array (Illumina 610K and 660K), QC and LD pruning, a total of 226,839 autosomal SNPs were chosen for all autosomal analyses.
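The pruning rule described above (windows of 200 SNPs, a step of 25 SNPs, r² > 0.4) can be illustrated with a short Python sketch. It assumes a complete genotype matrix coded as 0/1/2 allele counts; the function name `ld_prune` and the rule of dropping the later SNP of each correlated pair are assumptions made for this illustration and do not reproduce PLINK's internal algorithm.

```python
import numpy as np

def ld_prune(genotypes, window=200, step=25, r2_threshold=0.4):
    """Greedy pairwise LD pruning over sliding windows (illustrative only).

    genotypes: (n_samples, n_snps) array of 0/1/2 allele counts, no missing data.
    Returns the sorted indices of the SNPs that are retained.
    """
    genotypes = np.asarray(genotypes, dtype=float)
    n_snps = genotypes.shape[1]
    keep = np.ones(n_snps, dtype=bool)
    for start in range(0, n_snps, step):
        idx = np.arange(start, min(start + window, n_snps))
        idx = idx[keep[idx]]              # only SNPs still retained
        if idx.size < 2:
            continue
        # Squared Pearson correlation between genotype columns in the window.
        # Monomorphic SNPs give NaN, which never exceeds the threshold.
        with np.errstate(invalid="ignore", divide="ignore"):
            r2 = np.corrcoef(genotypes[:, idx], rowvar=False) ** 2
        for a in range(idx.size):
            if not keep[idx[a]]:
                continue
            for b in range(a + 1, idx.size):
                # Assumption: drop the later SNP of each high-LD pair.
                if keep[idx[b]] and r2[a, b] > r2_threshold:
                    keep[idx[b]] = False
    return np.flatnonzero(keep)
```

With PLINK itself, the same three parameters correspond to the arguments of `--indep-pairwise` (window size, step size, r² threshold).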
Principal component analysis. PC analysis was performed with the smartpca program of the EIGENSOFT package (ref. 24). To express the relative importance of the top two eigenvectors in the resulting PC plot, two axes were scaled by a factor equal to the square root of the corresponding eigenvalue (Supplementary Note 2). Our analysis was repeated for the entire set of populations and for the subset of west Eurasian populations (Supplementary Table 1). The R environment was used to perform PCA (using the function princomp) and plot the results for all analyses of uniparental data.
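The axis scaling described above can be sketched as follows. This is a minimal NumPy illustration, not the smartpca implementation: the per-SNP normalisation is only in the spirit of ref. 24, and the function name `scaled_pcs` is hypothetical.

```python
import numpy as np

def scaled_pcs(genotypes, n_components=2):
    """Return PC coordinates scaled by sqrt(eigenvalue), plus the eigenvalues.

    genotypes: (n_samples, n_snps) array of 0/1/2 allele counts, no missing data.
    """
    g = np.asarray(genotypes, dtype=float)
    p = g.mean(axis=0) / 2.0                  # allele frequency per SNP
    polymorphic = (p > 0) & (p < 1)           # drop SNPs with zero variance
    g, p = g[:, polymorphic], p[polymorphic]
    # Assumed normalisation: centre each SNP and divide by sqrt(p(1-p)).
    x = (g - 2 * p) / np.sqrt(p * (1 - p))
    cov = x @ x.T / x.shape[1]                # sample-by-sample covariance
    eigvals, eigvecs = np.linalg.eigh(cov)    # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    eigvals = np.clip(eigvals[order], 0, None)
    eigvecs = eigvecs[:, order]
    # Scale each unit-length eigenvector by the square root of its eigenvalue.
    return eigvecs * np.sqrt(eigvals), eigvals
```

Because each eigenvector has unit length, the spread of the samples along each plotted axis is then proportional to the square root of that axis's eigenvalue.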
Structure-like analysis. The recently introduced structure-like approach was applied as implemented in the program ADMIXTURE (ref. 27) (Supplementary Notes 3 and 4). ADMIXTURE was run on our global and west Eurasian data sets 100 times in parallel at K = 2 to K = 10 (using random seeds). Convergence between independent runs at the same K was monitored by comparing the resulting log-likelihood scores (LLs). The minimal variation in LLs (less than 1 LL unit) within a fraction (10%) of runs with the highest LLs was assumed to be a reasonable proxy for inferring convergence (ref. 28). In the global data set, convergence was observed in the case of all explored K values (K = 2 to K = 10). Results from runs at all values of K are shown rather than restricting the reader to one chosen K (Supplementary Note 3). To focus on population structure in the relevant regions of the Middle East and Europe we performed analyses on a data set restricted to west Eurasian samples. In this analysis, convergence was reached at K = 2 to K = 5, K = 7 and K = 8. Only K = 4 was highlighted in Supplementary Fig. 5 because components appearing at higher values of K were predominantly restricted to a single population and were therefore less informative for our purposes. Judging from the distribution of LLs of the converged K values, the maximum-likelihood solutions with LLs very close to the highest LLs were also the most frequent solutions (except for K = 6 of the global data set). One run from the top LLs fraction of each converged K (from global and west Eurasian data set) was plotted with Excel (Supplementary Fig. 4a, b).
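The convergence criterion described above (less than 1 LL unit of variation within the top 10% of runs) can be expressed as a small check. This is a sketch under stated assumptions: the function name `runs_converged` is hypothetical, and the log-likelihoods are assumed to have already been collected from the independent runs at a single K.

```python
import math

def runs_converged(log_likelihoods, top_fraction=0.10, max_spread=1.0):
    """Return True if the best `top_fraction` of runs differ by < max_spread LL units."""
    if len(log_likelihoods) < 2:
        raise ValueError("need at least two runs to assess convergence")
    ranked = sorted(log_likelihoods, reverse=True)
    # Number of runs in the top fraction (at least two, so a spread exists).
    n_top = max(2, math.ceil(top_fraction * len(ranked)))
    top = ranked[:n_top]
    return (top[0] - top[-1]) < max_spread
```

For 100 runs at a given K, the check compares the best and the tenth-best log-likelihood.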
Allele sharing distances. ASD was used for measuring genetic distances between populations. ASD is less sensitive to small sample size than the Fixation Index (FST) and other measures (ref. 29), and more appropriate for our goal of measuring genetic distances between groups regardless of their internal diversity. Standard errors of ASD values were calculated with a bootstrap approach, accounting for variance resulting from both sample selection and site selection. ASDs between individual Jewish populations and population groups representing a geographic region or ethnic group were calculated. In each case, the population under consideration was removed from all groupings with which it was compared. To test significance of differences in pairs of ASD values in each row in Table 1, a bootstrap approach was used (Supplementary Note 5 and Supplementary Tables 2 and 3).
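A between-population ASD and a bootstrap standard error of the kind described above can be sketched as follows. The per-SNP distance used here (0, 1 or 2 alleles not shared, that is, the absolute difference of 0/1/2 genotype codes), the resampling scheme and the function names are assumptions for illustration; the procedure actually used is described in Supplementary Note 5.

```python
import numpy as np

def asd_between(pop_a, pop_b):
    """Mean allele sharing distance over all cross-population pairs and SNPs.

    pop_a, pop_b: (n_individuals, n_snps) arrays of 0/1/2 allele counts.
    """
    a = np.asarray(pop_a, dtype=float)[:, None, :]
    b = np.asarray(pop_b, dtype=float)[None, :, :]
    # |g1 - g2| equals 2 minus the number of alleles shared identical by state.
    return float(np.abs(a - b).mean())

def bootstrap_se(pop_a, pop_b, n_boot=200, seed=0):
    """Bootstrap standard error, resampling both individuals and sites."""
    rng = np.random.default_rng(seed)
    a, b = np.asarray(pop_a), np.asarray(pop_b)
    n_sites = a.shape[1]
    stats = []
    for _ in range(n_boot):
        ia = rng.integers(0, a.shape[0], a.shape[0])   # resample individuals in A
        ib = rng.integers(0, b.shape[0], b.shape[0])   # resample individuals in B
        sites = rng.integers(0, n_sites, n_sites)      # resample SNPs
        stats.append(asd_between(a[ia][:, sites], b[ib][:, sites]))
    return float(np.std(stats, ddof=1))
```

The broadcasted difference holds one value per pair of individuals per SNP, so for genome-wide data the SNPs would need to be processed in chunks.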
Genotyping uniparental markers. Our data from the Y chromosome and mtDNA were combined with previously published data sets from populations of interest (Supplementary Note 6). Markers were chosen to match the phylogenetic level of resolution achieved in previously reported data sets. A total of 8,210 samples were assembled for Y-chromosome analysis (Supplementary Table 4). Genotypes for these sites were determined by using multiple techniques, such as allele-specific PCR, TaqMan, KASPar and direct sequencing. A total of 13,919 samples were assembled for mtDNA analysis (Supplementary Table 5).
doi:10.1038/nature09103
... We assembled a dataset comprehensive of 2,662 present-day individuals from the Mediterranean basin, Europe and Africa genotyped with different Illumina SNP arrays [39,48,61,64–72] (Table S4). Only SNPs and individuals with a missingness rate lower than 1% were retained (--geno and --mind flags equal to 0.01). ...
... Table S4. [65,72] and present-day people inhabiting Georgia [65,72] were used. Table S2. ...
Article
Full-text available
Southern Italy was characterised by a complex prehistory that started with different Palaeolithic cultures, later followed by the Neolithization and the demic dispersal from the Pontic-Caspian Steppe during the Bronze Age. Archaeological and historical evidences point to a link between Southern Italians and the Balkans still present in modern times. To shed light on these dynamics, we analysed around 700 South Mediterranean genomes combined with informative ancient DNAs. Our findings revealed high affinities of South-Eastern Italians with modern Eastern Peloponnesians, and a closer affinity of ancient Greek genomes with those from specific regions of South Italy than modern Greek genomes. The higher similarity could be associated with a Bronze Age component ultimately originating from the Caucasus with high Iranian and Anatolian Neolithic ancestries. Furthermore, extremely differentiated allele frequencies among Northern and Southern Italy revealed putatively adapted SNPs in genes involved in alcohol metabolism, nevi features and immunological traits.
... To conduct the PC and ADMIXTURE analyses, we combined the newly generated genotypes with the data from previous studies (Li et al. 2008; Behar et al. 2010, 2013; Rasmussen et al. 2010; Metspalu et al. 2011; Yunusbayev et al. 2012; Fedorova et al. 2013; Raghavan et al. 2013). Individuals with missing genotypes greater than 1.5% were excluded from the combined dataset. ...
... To confirm the ethnic identity of the patients, we compared their genetic components with different populations of the world using a PC analysis (Patterson et al. 2006) and performed a global ancestry inference with ADMIXTURE (Alexander et al. 2009). For the PC and ADMIXTURE analyses, we combined the genotypes of the patients with the data from previous studies (Li et al. 2008; Behar et al. 2010, 2013; Rasmussen et al. 2010; Metspalu et al. 2011; Yunusbayev et al. 2012; Fedorova et al. 2013; Raghavan et al. 2014). Figure 1 shows that all individuals homozygous for c.-23 + 1G > A were clustered according to their ethnic identity and geographic origin (Fig. 1b). ...
Article
Full-text available
Mutations in the GJB2 gene are known to be a major cause of autosomal recessive deafness 1A (OMIM 220290). The most common pathogenic variants of the GJB2 gene have a high ethno-geographic specificity in their distribution, being attributed to a founder effect related to the Neolithic migration routes of Homo sapiens. The c.-23 + 1G > A splice site variant is frequently found among deaf patients of both Caucasian and Asian origins. It is currently unknown whether the spread of this mutation across Eurasia is a result of the founder effect or if it could have multiple local centers of origin. To determine the origin of c.-23 + 1G > A, we reconstructed haplotypes by genotyping SNPs on an Illumina OmniExpress 730 K platform of 23 deaf individuals homozygous for this variant from different populations of Eurasia. The analyses revealed the presence of common regions of homozygosity in different individual genomes in the sample. These data support the hypothesis of the common founder effect in the distribution of the c.-23 + 1G > A variant of the GJB2 gene. Based on the published data on the c.-23 + 1G > A prevalence among 16,177 deaf people and the calculation of the TMRCA of the modified f2-haplotypes carrying this variant, we reconstructed the potential migration routes of the carriers of this mutation around the world. This analysis indicates that the c.-23 + 1G > A variant in the GJB2 gene may have originated approximately 6000 years ago in the territory of the Caucasus or the Middle East then spread throughout Europe, South and Central Asia and other regions of the world.
... The genetic heritage of Jewish populations has been deeply scrutinized at the population level as well as for the medical implications, using uniparental and autosomal markers (Hammer et al., 2000; Ostrer, 2001; Bauchet et al., 2007; Adams et al., 2008; Behar et al., 2008; Olshen et al., 2008; Kopelman et al., 2009; Elhaik, 2013) and more recently through genome-wide approaches (Seldin et al., 2006; Atzmon et al., 2010; Behar et al., 2010; Campbell et al., 2012; Velez et al., 2012; Ostrer and Skorecki, 2013). ...
... New data from recombining genetic markers in the line of Behar et al. (2010), as well as from classical genealogical studies, will surely contribute decisively to explain how this was achieved. At any rate, the DNA evidence gathered so far adds a new facet to the already recognized astonishing cultural resistance of these communities: not only they have kept a sense of belonging throughout centuries of persecution but they also succeeded in maintaining a genetic heritage of their own. ...
... Population stratification can be an issue in the analysis of population-based genetic data, including WES, particularly for association studies (21–24). Population structures have been widely determined by GWSA (25,26) in European (27), African (28,29), Asian (30), Jewish (31), Mexican (32), and other populations (33). These analyses are mostly based on principal component analysis (PCA) (34), which can also be used to confirm or reveal the ethnicity of an individual patient (or his or her parents). ...
... In addition to our data, we also used genome-wide autosomal marker data of populations from the Human Genome Diversity Project whole genomes to increase overlap (genotyped on Illumina 650Y platform) and also considered two datasets (genotyped on Illumina 610K platforms) obtained from the open genotype database of the Estonian Biocentre detailed in Supplementary Table S3 (Cavalli-Sforza, 2005; Rosenberg et al., 2005; Behar et al., 2010; Yunusbayev et al., 2012). We performed on these data the same quality control as described above in the case of TLH and TLS. ...
Article
Full-text available
Genome-wide genotype data from 48 carefully selected population samples of Transylvania-living Szeklers and non-Szekler Hungarians were analyzed by comparative analysis. Our analyses involved contemporary Hungarians living in Hungary, other Europeans, and Eurasian samples counting 530 individuals altogether. The source of the Szekler samples was the commune of Korond, Transylvania. The analyzed non-Szekler Hungarian samples were collected from villages with a history dating back to the era of the Árpád Dynasty. Population structure by principal component analysis and ancestry analysis also revealed a great within-group similarity of the analyzed Szeklers and non-Szekler Transylvanian Hungarians. These groups also showed similar genetic patterns with each other. Haplotype analyses using identity-by-descent segment discovering tools showed that average pairwise identity-by-descent sharing is similar in the investigated populations, but the Korond Szekler samples had higher average sharing with the Hungarians from Hungary than non-Szekler Transylvanian Hungarians. Average sharing results showed that both groups are isolated compared to other Europeans, and pointed out that the non-Szekler Transylvanian Hungarian inhabitants of the investigated Árpád Age villages are more isolated than investigated Szeklers from Korond. This was confirmed by our autozygosity analysis as well. Identity-by-descent segment analyses and 4-population tests also confirmed that these Hungarian-speaking Transylvanian ethnic groups are strongly related to Hungarians living in Hungary.
... We next investigated the genetic relationship of the Emiratis to worldwide populations by combining our samples with the 1000G data and with published data from Sub-Saharan Africa, the Middle East, Europe, the Caucasus, and South Asia (Behar et al. 2010; Pagani et al. 2015; Sudmant et al. 2015; The 1000 Genomes Project Consortium et al. 2015; Yunusbayev et al. 2015; Pagani et al. 2016) (supplementary tables S1-S5, Supplementary Material online). This allowed us to understand the current genetic landscape in the Emirates and how it is related to neighboring populations. ...
Article
Full-text available
The indigenous population of the United Arab Emirates (UAE) has a unique demographic and cultural history. Its tradition of endogamy and consanguinity is expected to produce genetic homogeneity and partitioning of gene pools while population movements and intercontinental trade are likely to have contributed to genetic diversity. Emiratis and neighbouring populations of the Middle East have been underrepresented in the population genetics literature with few studies covering the broader genetic history of the Arabian Peninsula. Here, we genotyped 1,198 individuals from the seven Emirates using 1.7 million markers and by employing haplotype-based algorithms and admixture analyses we reveal the fine-scale genetic structure of the Emirati population. Shared ancestry and gene flow with neighbouring populations display their unique geographic position while increased intra- vs inter-Emirati kinship and sharing of uniparental haplogroups, reflect the endogamous and consanguineous cultural traditions of the Emirates and their tribes.
... Finally, our study did not include some ethnicities, such as East Asians. Yet, the genetic ancestry of the population was heterogeneous, and was shown to be related by genotyping to contemporary Middle East, North African, European and other Western populations [48,49]. The strengths of this study include a nationwide screening setting that applied to both sexes, measured BP, weight and height within a narrow age range during a period of over four decades. ...
Article
Full-text available
Background: Elevated blood pressure among adolescents has been shown to be associated with future adverse cardiovascular outcomes and early onset diabetes. Most data regarding systolic and diastolic blood pressure trends are based on surveys of selected populations within 10–20-year periods. The goal of this study was to characterize the secular trend of blood pressure given the rising prevalence of adolescent obesity. Methods: This nationwide population-based study included 2,785,515 Israeli adolescents (41.6% females, mean age 17.4 years) who were medically evaluated and whose weight, height and blood pressure were measured, prior to mandatory military service between 1977 and 2020. The study period was divided into 5-year intervals. Linear regression models were used to describe the P for trend along the time intervals. Analysis of covariance was used to calculate means of blood pressure adjusted for body mass index. Results: During the study period, the mean body mass index increased by 2.1 and 1.6 kg/m² in males and females, respectively (P for trend < 0.001 in both sexes). The mean diastolic blood pressure decreased by 3.6 mmHg in males and by 2.9 mmHg in females (P < 0.001 in both sexes). The mean systolic blood pressure increased by 1.6 mmHg in males and decreased by 1.9 mmHg in females. These trends were also consistent when blood pressure values were adjusted to body mass index. Conclusion: Despite the increase in body mass index over the last four decades, diastolic blood pressure decreased in both sexes while systolic blood pressure increased slightly in males and decreased in females.
... In order to explore the ancestry of the analyzed individuals, an explorative analysis was carried out using a PCA and Admixture approach (Alexander et al., 2009). We merged the genotype data with publicly available genome-wide datasets (Behar et al., 2010; Behar et al., 2013; Kushniarevich et al., 2015; Ongaro et al., 2019; Tambets et al., 2018; Tamm et al., 2019; Yunusbayev et al., 2012; Yunusbayev et al., 2015) using PLINK 1.9, a widely used program for research in population genetics (Chang et al., 2015). After merging, SNPs and individuals characterized by less than 3% and 5% of missing data were retained for a total of 87,181 SNPs, and 516 individuals. ...
Article
Full-text available
Recessive dystrophic epidermolysis bullosa (RDEB) is a rare genodermatosis caused by mutations in the gene coding for type VII collagen (COL7A1). More than 800 different pathogenic mutations in COL7A1 have been described to date; however, the ancestral origins of many of these mutations have not been precisely identified. In this study, 32 RDEB patient samples from the Southwestern United States, Mexico, Chile, and Colombia carrying common mutations in the COL7A1 gene were investigated to determine the origins of these mutations and the extent to which shared ancestry contributes to disease prevalence. The results demonstrate both shared European and American origins of RDEB mutations in distinct populations in the Americas and suggest the influence of Sephardic ancestry in at least some RDEB mutations of European origins. Knowledge of ancestry and relatedness among RDEB patient populations will be crucial for the development of future clinical trials and the advancement of novel therapeutics.
Article
Mutations in the BRCA1 and BRCA2 genes increase the risk for various cancers including breast, ovarian, prostate, pancreas and melanoma. Identifying BRCA1/2 mutation carriers enables risk assessment, surveillance, early detection and risk reduction. In certain Israeli sub-populations recurring and founder mutations have been identified and for these, testing for founder mutations is simple, efficient and cost-effective. Founder mutations in the Jewish Ethiopian population have not been described. We report here the identification of a recurring BRCA2 mutation in the Ethiopian Jewish population; c.5159C>A; p.Ser1720Ter, which has only been described once before in this population. In addition, in another family of the same origin we found the BRCA2 c.7579delG; p.Val2527Ter mutation that has been previously described in two different Jewish Ethiopian families. In Israel genetic testing is performed in a sequential stepwise manner, first testing a panel of predominant mutations and if negative further testing by gene sequencing is offered. Recently it has been decided to expand the founder mutation panel to include mutations which have been found in two or more separate families. This new panel will include the BRCA2 c.7579delG; p.Val2527Ter mutation, and we recommend that the BRCA2 c.5159C>A; p.Ser1720Ter mutation should also be added to the new predominant mutation panel.
Chapter
At the beginning of 2020, the world’s Jewish population was estimated at 14,787,200—an increase of 92,400 (0.63%) over the 2019 revised estimate of 14,694,800. The world’s total population increased by 0.92% in 2019. The rate of increase of world Jewry hence amounted to two thirds of that of the total population. The Jewish population was highly concentrated in two countries: Israel (46% of the world total) and the US (39%). Nine percent lived in Europe, 5% in other North America and Latin America, and 1% in other continents. Steady demographic increase in Israel was matched by stagnation or decline elsewhere, explained by low birth rates, frequent intermarriage, identificational drift, aging, and emigration. Most Jews are increasingly found in a handful of developed and democratic countries, with tens of communities currently below sufficient critical mass needed to sustain viable community institutions. This chapter carefully reviews different approaches to Jewish population definitions and the highly variable availability and reliability of data sources. The critically important Jewish-Arab population balance in Israel and Palestine is analyzed. Estimates are provided for 102 countries with at least 100 Jews each, along with vignettes on the 14 largest Jewish populations each with 40,000 Jews or more—Israel, the US, France, Canada, the United Kingdom, Argentina, Russia, Germany, Australia, Brazil, South Africa, Hungary, Ukraine, and Mexico.
Article
Understanding the genetic structure of human populations is of fundamental interest to medical, forensic and anthropological sciences. Advances in high-throughput genotyping technology have markedly improved our understanding of global patterns of human genetic variation and suggest the potential to use large samples to uncover variation among closely spaced populations. Here we characterize genetic variation in a sample of 3,000 European individuals genotyped at over half a million variable DNA sites in the human genome. Despite low average levels of genetic differentiation among Europeans, we find a close correspondence between genetic and geographic distances; indeed, a geographical map of Europe arises naturally as an efficient two-dimensional summary of genetic variation in Europeans. The results emphasize that when mapping the genetic basis of a disease phenotype, spurious associations can arise if genetic structure is not properly accounted for. In addition, the results are relevant to the prospects of genetic ancestry testing; an individual’s DNA can be used to infer their geographic origin with surprising accuracy—often to within a few hundred kilometres.
Article
Asia harbors substantial cultural and linguistic diversity, but the geographic structure of genetic variation across the continent remains enigmatic. Here we report a large-scale survey of autosomal variation from a broad geographic sample of Asian human populations. Our results show that genetic ancestry is strongly correlated with linguistic affiliations as well as geography. Most populations show relatedness within ethnic/linguistic groups, despite prevalent gene flow among populations. More than 90% of East Asian (EA) haplotypes could be found in either Southeast Asian (SEA) or Central-South Asian (CSA) populations and show clinal structure with haplotype diversity decreasing from south to north. Furthermore, 50% of EA haplotypes were found in SEA only and 5% were found in CSA only, indicating that SEA was a major geographic source of EA populations.