How many ancestors does a Briton have?
by Ian J Heath Independent Researcher, Canterbury, UK
version 0.5c, April 9, 2022
Abstract
Professor Wachter’s 1978 seminal paper “Ancestors at the Norman Conquest”
estimated the ancestors of an indigenous Briton. As far as I can ascertain, there has
been limited further progress on the core question of numbers of ancestors since then.
This paper introduces new methods and the accompanying spreadsheet for estimating
ancestors. The results are considerably lower than Wachter’s 1978 estimates. It also
estimates all non-extinct persons in each generation, i.e. the total ancestors of everyone
today. It then builds on these estimates to estimate numbers of cousins of any degree.
Introduction
This study will help you understand how many ancestors you had long ago, even as far back
as the first millennium. How could we estimate that? Well, you have two parents
and each of these had two parents themselves, and so on. So as we go back through the
generations, your ancestors double every generation. That means that after 10
generations you have 2^10, which is 1024, ancestors, and after 30 generations you have
2^30, approximately one billion, ancestors. But that’s greater than the world population
back then! How can that be? The explanation is “Pedigree Collapse”. Not all of your
doubled-up ancestors are distinct. Many of them occur several times over in your
pedigree.
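As a quick check of the doubling arithmetic above, the following minimal sketch counts pedigree slots while ignoring Pedigree Collapse. The function name is illustrative only and is not taken from the accompanying spreadsheet.

```python
def naive_ancestor_slots(generations: int) -> int:
    """Pedigree slots n generations back if every ancestor were distinct."""
    return 2 ** generations


for n in (10, 30):
    # 10 generations -> 1024 slots; 30 generations -> roughly one billion slots
    print(n, naive_ancestor_slots(n))
```

The count for 30 generations already exceeds any plausible historical population, which is why the slots cannot all be filled by distinct people.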
We first summarise our results on the numbers of ancestors and then explain “Pedigree
Collapse” and our method of estimating ancestors by iterating back through the
generations. The same method estimates all non-extinct persons in each generation, i.e. the
total ancestors of everyone today. We then use these estimates to estimate the
number of cousins of any degree.
Finally, we consider the “Localism” effect: an ancestor’s parents are more likely
than other parents to be local to the ancestor. This is simulated in NetLogo using
localism by parish. However, this made surprisingly little difference.
Summary of the Ancestor Results
The following plot results from running the model back 35 generations to 917AD.
[Figure: Numbers of British Ancestors of Brit by year, with curves for Ancestors, Parents and NonExtincts; vertical axis from 0 to 2,500,000.]
Note: The plot is cut off at 2.5 million in order to focus on the blue Ancestors curve.
Initially the number of Ancestors (of “Brit”, a representative Briton) is a minuscule percentage of the
population, less than 1% as far back as 1517AD. It then takes off rapidly to peak in
1217AD before falling down again and finally flattening out at 596,428 in the 1st
millennium. This peaking at 1217AD reflects the population peak predating the Black
Death population trough in 1427AD. The Black Death population trough shows up
clearly for Parents (i.e. all spouses in fertile marriages) and NonExtincts (i.e. all
ancestors of everyone today) but not for Ancestors. The trough has no discernible
effect on Ancestors because, by that point, Ancestors had only built up to
16% of the parent population limit, which was still too low to have a
discernible effect.
The flattening of Parents in the 1st millennium is due to our assumption of a constant
630,000 parents per generation (equating to a population size of 1,667,000). As a
result of this constant limit, both Ancestors and NonExtincts saturate and flatten out at
78% of Parents. The theoretical reason for this saturation at 78% is explained in the
section on Ancestors estimation. This 78% saturation level is sensitive to the
immigration level but not to the stabilised population level. So it is the most general
answer that can be given to the overall question of “How many ancestors?” and would
apply to any bounded geography.
[Figures: pedigree diagrams of Brit with half-sibling parents (one common grandparent) and with half-1st cousin parents.]
Genealogical Coalescence (i.e. Pedigree Collapse)
This occurs when a person, for example Brit, has one or more coalescent ancestors.
The simplest example of this is when Brit’s parents are incestuous half-siblings. Both
parents have a common parent (who bore them by different partners). So Brit has 1
coalescent grandparent and consequently has just 3 grandparents. There are 2
separate paths, shown in dark red, from Brit to his coalescent ancestor.
If Brit’s parents are incestuous full-siblings instead then Brit would have just 2
grandparents (both coalescent).
If Brit’s parents are half-1st cousins, he has just 7 great-grandparents, including 1
coalescent great-grandparent.
[Figure: pedigree diagram of Brit with 1st cousin parents.]
If Brit’s parents are full-cousins, he has 6 great-grandparents, including 2 coalescent
great-grandparents.
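The counts in these examples can be checked with a small sketch that walks a pedigree and collects the distinct ancestors in a given generation. The dictionary layout and all names below are illustrative assumptions, not part of the paper's spreadsheet or NetLogo model.

```python
def distinct_ancestors(pedigree, person, generations):
    """Return the set of distinct ancestors `generations` steps above `person`.

    `pedigree` maps each person to a (parent, parent) tuple; people without an
    entry are treated as founders whose parents are unknown.
    """
    current = {person}
    for _ in range(generations):
        current = {p for child in current for p in pedigree.get(child, ())}
    return current


# Half-sibling parents: Father and Mother share one parent, "Common".
half_siblings = {
    "Brit": ("Father", "Mother"),
    "Father": ("Common", "A"),
    "Mother": ("Common", "B"),
}

# Full 1st cousin parents: grandparents G1 and G3 are full siblings.
first_cousins = {
    "Brit": ("Father", "Mother"),
    "Father": ("G1", "G2"),
    "Mother": ("G3", "G4"),
    "G1": ("X", "Y"),
    "G3": ("X", "Y"),
    "G2": ("P", "Q"),
    "G4": ("R", "S"),
}

print(len(distinct_ancestors(half_siblings, "Brit", 2)))   # 3 grandparents
print(len(distinct_ancestors(first_cousins, "Brit", 3)))   # 6 great-grandparents
```

Using a set for each generation is what makes a coalescent ancestor count once, however many paths lead to them.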
Estimating numbers of Ancestors in each generation
So how can genealogical coalescence be taken into account in calculating the number of
Brit’s distinct ancestors in each generation? The key is the realisation that the
ancestors are limited to the population pool in each generation. So we can calculate the
expected number of ancestors in each generation by the following probability formula:
$$\text{number of distinct picks from a pool} = \mathrm{PoolSize} \times \left(1 - \left(1 - \frac{1}{\mathrm{PoolSize}}\right)^{\mathrm{picks}}\right) \tag{1}$$
where $\mathrm{picks}$ is the number of random picks from a fixed pool of size PoolSize.
This follows because the probability that an individual member of the pool is not picked is:
$$\left(1 - \frac{1}{\mathrm{PoolSize}}\right)^{\mathrm{picks}}$$
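The “distinct picks from a pool” formula can be sanity-checked against a direct simulation. The sketch below is only illustrative: the function names, the pool size of 1000 and the number of trials are assumptions chosen for the example, not values from the paper.

```python
import math
import random


def expected_distinct_picks(pool_size: float, picks: float) -> float:
    """Formula (1): expected distinct members hit by `picks` random picks.

    pool_size must exceed 1. log1p/expm1 keep the result accurate when the
    pool is very large and 1/pool_size is tiny.
    """
    return -pool_size * math.expm1(picks * math.log1p(-1.0 / pool_size))


def simulated_distinct_picks(pool_size: int, picks: int, trials: int, seed: int = 1) -> float:
    """Average number of distinct members over repeated draws with replacement."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        total += len({rng.randrange(pool_size) for _ in range(picks)})
    return total / trials


print(expected_distinct_picks(1000, 1000))               # formula (1)
print(simulated_distinct_picks(1000, 1000, trials=200))  # Monte Carlo average
```

With as many picks as pool members, both figures come out at roughly 63% of the pool, which is the familiar 1 − 1/e proportion.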
We apply this formula iteratively, generation by generation, using the PoolSize for each
generation. PoolSize is the number of Parents in each generation in Great Britain
(excluding Scotland for now).
As in (Wachter, 1978), we assume a generational interval of 30 years, based on an
average maternity age of 29 and paternity age of 33 (and allowing for a slight decrease
in earlier generations).
$\mathrm{Gen}_n$ is generation n, counting backwards in time from $\mathrm{Gen}_0$, the current generation.
$\mathrm{Parents}_n$ is the expected number of parents in $\mathrm{Gen}_n$ (actually the number of spouses in
fertile marriages, based on an assumed average infertility rate of 10%). For $\mathrm{Gen}_0$, this
was all persons who married in 1961-1990 and were expected to bear children (on
average) 6 years later, i.e. 1967-1996.
$\mathrm{Ancestors}_n$ is the number of Brit’s British ancestors in $\mathrm{Gen}_n$
(note: immigrants are excluded).
As $\mathrm{Ancestors}_{n+1}$ are the parents of $\mathrm{Ancestors}_n$, we can estimate them iteratively. In each
generation, we make $\mathrm{picks}$ picks from the pool of all Parents, where $\mathrm{picks}$ is the number
of non-immigrant parents of Ancestors in the later generation. So, employing our
“distinct picks from a pool” formula (1), we get:
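$$\mathrm{Ancestors}_n = \mathrm{Parents}_n \times \left(1 - \left(1 - \frac{1}{\mathrm{Parents}_n}\right)^{\mathrm{picks}_n}\right) \tag{2}$$
where
$$\mathrm{picks}_n = 2 \times \mathrm{NonImmigrantAncestors}_{n-1}$$
and
$$\mathrm{NonImmigrantAncestors}_{n-1} = \mathrm{Ancestors}_{n-1} \times \left(1 - \frac{\mathrm{Immigrant\%}}{100}\right)$$
and
$$\mathrm{Ancestors}_0 = 1 \quad \text{(i.e. Brit himself)}$$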
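One generation step of equation (2) might be sketched as follows. This is a minimal illustration under the stated formulas, not the spreadsheet's actual implementation; the function names are assumptions, and the inputs in the usage lines are taken from the detailed results table later in the paper.

```python
import math


def distinct_picks(pool_size: float, picks: float) -> float:
    """Formula (1), evaluated stably for very large pools."""
    return -pool_size * math.expm1(picks * math.log1p(-1.0 / pool_size))


def next_ancestors(prev_ancestors: float, parents: float, immigrant_pct: float = 3.0) -> float:
    """Equation (2): Ancestors_n from Ancestors_(n-1) and Parents_n."""
    picks = 2.0 * prev_ancestors * (1.0 - immigrant_pct / 100.0)
    return distinct_picks(parents, picks)


def ancestors_series(parents_by_gen, immigrant_pct: float = 3.0):
    """Iterate equation (2); parents_by_gen[n] is Parents_n and Ancestors_0 = 1."""
    ancestors = [1.0]
    for parents in parents_by_gen[1:]:
        ancestors.append(next_ancestors(ancestors[-1], parents, immigrant_pct))
    return ancestors


# Gen 17 -> Gen 18, using Ancestors_17 = 74,952 and Parents_18 = 993,995
print(round(next_ancestors(74_952, 993_995)))  # close to the 135,272 tabulated for Gen 18
```

Applied literally from Gen 0, this step gives about 1.94 for Gen 1, whereas the results table lists 2, so the spreadsheet may treat the first generation differently.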
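Adjusting for remarriages and infertile marriages, we get:
$$\mathrm{Parents} = 2 \times \mathrm{Marriages} \times \left(1 - \frac{\mathrm{\%Remarriages}}{100}\right) \times \left(1 - \frac{\mathrm{\%MarriagesInfertile}}{100}\right)$$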
Note 1: We have assumed that all parents were married, as was mostly true in earlier
more-religious times. Whilst this has become less so in recent times, the numbers are
much less significant anyway, as there is almost no Ancestor coalescence before their
numbers start to build up in the 18th century.
Note 2: In determining the %Remarriages input data, half-remarriages are estimated and
counted only half as much as full-remarriages.
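The Parents adjustment can be sketched as a one-line function. The function name is illustrative, and the usage line takes its inputs (500,000 marriages, 15% remarried, 10% infertile) from the early generations of the results table.

```python
def parents_from_marriages(marriages: float, remarriage_pct: float,
                           infertile_pct: float = 10.0) -> float:
    """Spouses in fertile marriages: two per marriage, less remarriages and infertile marriages."""
    return (2.0 * marriages
            * (1.0 - remarriage_pct / 100.0)
            * (1.0 - infertile_pct / 100.0))


# 500,000 marriages with 15% remarried and 10% infertile
print(round(parents_from_marriages(500_000, 15.0)))  # 765000, as tabulated for Gen 31 onwards
```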
The NonExtincts in each generation is the total number of British ancestors of all
Britons today, i.e. the ancestors of everyone in $\mathrm{Gen}_0$, the current generation.
NonExtincts is useful as it enables us to derive Cousins of all degrees from Ancestors.
As $\mathrm{NonExtincts}_{n+1}$ are the parents of $\mathrm{NonExtincts}_n$, we can estimate them iteratively also.
In each generation, we make $\mathrm{picks}$ picks from the pool of all Parents, where $\mathrm{picks}$ is the
number of non-immigrant not-necessarily-distinct parents of NonExtincts in the later
generation. So, employing our “distinct picks from a pool” formula (1) again, we get:
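$$\mathrm{NonExtincts}_n = \mathrm{Parents}_n \times \left(1 - \left(1 - \frac{1}{\mathrm{Parents}_n}\right)^{\mathrm{picks}_n}\right)$$
where
$$\mathrm{picks}_n = 2 \times \mathrm{NonExtincts}_{n-1} \times \left(1 - \frac{\mathrm{Immigrant\%}}{100}\right)$$
and for $\mathrm{Gen}_0$:
$$\mathrm{NonExtincts}_0 = \mathrm{Parents}_0$$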
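The NonExtincts recursion has the same shape as the Ancestors one, only with a different starting value. A minimal sketch of one step, with an illustrative function name and inputs taken from the results table, is:

```python
import math


def next_non_extincts(prev_non_extincts: float, parents: float,
                      immigrant_pct: float = 3.0) -> float:
    """NonExtincts_n from NonExtincts_(n-1) and Parents_n via formula (1)."""
    picks = 2.0 * prev_non_extincts * (1.0 - immigrant_pct / 100.0)
    # (1 - 1/parents)^picks computed via log1p/expm1 for numerical stability
    return -parents * math.expm1(picks * math.log1p(-1.0 / parents))


# Gen 1 -> Gen 2, using NonExtincts_1 = 15,416,756 and Parents_2 = 14,573,260
print(round(next_non_extincts(15_416_756, 14_573_260)))  # close to the 12,701,458 tabulated for Gen 2
```

Because the picks here start from the whole of Parents in Gen 0 rather than from one person, NonExtincts is near its ceiling from the outset, while Ancestors only approaches it many generations back.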
For Parents in each Gen, we use:
Marriages by Gen,
%Remarriages by Gen, and
an overall %MarriagesInfertile average for all generations (user-modifiable, default = 10%).
For Ancestors and NonExtincts in each Gen, we use:
Parents, and
an overall Immigrant% average for all generations (default = 3%).
GenYears:
International Society of Genetic Genealogy: Generation length
Its conclusions show GenYears has remained remarkably constant, as far back as the medieval period.
Marital ages:
ONS - Marriages in England and Wales
Marriages for E&W:
1871-1961: ONS - Annual UK figures for births, deaths, marriages etc
1541-1871: “CAMPOP PopEsts.EPHFR.xlsx” data extracted from Wrigley et al (1997),
kindly supplied by The Cambridge Group for the History of Population and Social Structure.
Remarriage%:
1841-1951: ONS - Age and previous marital status at marriage
1541-1851: (Wrigley & Schofield, 1989) p 259
%MarriagesInfertile:
(Wrigley et al, 1997) p 384, table 7.11 “Entry sterility: batchelor/spinster completed marriages”
Immigrant% 1851-1911:
VisionofBritain - Persons Born in the several parts of the UK and elsewhere
The ancestor results were summarised at the start of this paper.
The only other estimates of Ancestors to compare with, to my knowledge, are in
(Wachter, 1978), and ours are considerably lower than those. This can be partly
explained by the following:
1. PoolSize is Parents and not the whole population.
2. Immigration is taken into account, which cumulatively reduces the picks in each
generation by 3%.
The detailed Ancestor and NonExtincts figures from the spreadsheet follow:
Gen#  Gen year  Marriages  %Remarried  Parents  Ancestors  NonExtincts  Ancestors as % of Parents
0 1967 11,045,925 8.72 18,149,823 1 18,149,823 0.00%
1 1937 10,765,634 8.72 17,689,270 2 15,416,756 0.00%
2 1907 8,741,190 7.38 14,573,260 3.88 12,701,458 0.00%
3 1877 6,402,076 8.68 10,523,583 7.53 9,511,377 0.00%
4 1847 4,766,844 9.11 7,798,578 15 7,066,699 0.00%
5 1817 3,112,089 9.70 5,058,402 28 4,721,907 0.00%
6 1787 2,211,846 10.29 3,571,697 55 3,296,896 0.00%
7 1757 1,796,616 10.88 2,882,137 107 2,568,854 0.00%
8 1727 1,557,547 11.47 2,482,111 207 2,148,803 0.01%
9 1697 1,292,386 12.06 2,045,850 401 1,779,199 0.02%
10 1667 1,203,037 12.64 1,891,658 778 1,586,588 0.04%
11 1637 1,339,099 13.23 2,091,407 1,509 1,611,369 0.07%
12 1607 1,235,352 13.82 1,916,279 2,925 1,541,317 0.15%
13 1577 1,158,028 14.41 1,784,058 5,666 1,450,238 0.32%
14 1547 1,101,507 15 1,685,306 10,957 1,367,864 0.65%
15 1517 842,659 15 1,289,268 21,083 1,124,661 1.64%
16 1487 719,918 15 1,101,475 40,150 949,522 3.65%
17 1457 657,588 15 1,006,110 74,952 844,859 7.45%
18 1427 649,670 15 993,995 135,272 802,894 13.61%
19 1397 706,478 15 1,080,911 233,000 825,075 21.56%
20 1367 792,849 15 1,213,059 377,358 888,850 31.11%
21 1337 1,044,416 15 1,597,956 587,306 1,054,814 36.75%
22 1307 1,372,791 15 2,100,370 879,394 1,307,552 41.87%
23 1277 1,493,844 15 2,285,581 1,202,088 1,532,235 52.59%
24 1247 1,427,854 15 2,184,617 1,433,389 1,624,285 65.61%
25 1217 1,296,865 15 1,984,203 1,495,617 1,578,804 75.38%
26 1187 1,103,467 15 1,688,305 1,385,559 1,413,158 82.07%
27 1157 917,004 15 1,403,016 1,196,471 1,204,205 85.28%
28 1127 784,708 15 1,200,603 1,026,916 1,029,073 85.53%
29 1097 652,411 15 998,189 862,535 863,102 86.41%
30 1067 575,238 15 880,114 748,641 748,806 85.06%
31 1037 500,000 15 765,000 650,410 650,458 85.02%
32 1007 500,000 15 765,000 617,995 618,012 80.78%
33 977 500,000 15 765,000 605,400 605,407 79.14%
34 947 500,000 15 765,000 600,220 600,223 78.46%
35 917 500,000 15 765,000 598,041 598,042 78.18%
36 887 500,000 15 765,000 597,116 597,116 78.05%
37 857 500,000 15 765,000 596,721 596,722 78.00%
38 827 500,000 15 765,000 596,553 596,553 77.98%
39 797 500,000 15 765,000 596,481 596,481 77.97%
40 767 500,000 15 765,000 596,450 596,450 77.97%
41 737 500,000 15 765,000 596,437 596,437 77.97%
As explained in the results summary, the flattening of Parents in the 1st millennium led
to the subsequent flattening of Ancestors at 78% of Parents. This section will show the
theoretical reason for this and why it is always 78% for a 3% immigration level,
regardless of the population level.
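Assume L is the flattened saturation level for the ratio Ancestors/Parents.
Then L is a stationary point of equation (2), and so satisfies:
$$L = 1 - \left(1 - \frac{1}{\mathrm{Parents}}\right)^{\mathrm{Parents} \times L \times p}$$
where p = picks per parent = $2 \times \left(1 - \frac{\mathrm{Immigrant\%}}{100}\right)$
Now, by the well-known limit:
$$\left(1 - \frac{1}{\mathrm{Parents}}\right)^{\mathrm{Parents}} \to e^{-1} \quad \text{as } \mathrm{Parents} \to \infty$$
So when Parents is large, L satisfies the equation:
$$L = 1 - e^{-p \times L}$$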
Solving this by fixed-point iteration for a range of immigration levels gives:
Immigration%   Rounded Saturation Level   For Parents greater than
0%             80%                        33
1%             79%                        74
2%             79%                        30
3%             78%                        54
4%             78%                        25
5%             77%                        39
The final column shows the minimum number of Parents to give the correct rounded
percentage saturation level. Clearly, in any realistic population, Parents would greatly
exceed this.
To 4 decimal places, the solution for 3% immigration is 77.9643%, which matches well
with 77.9644% at Gen45 (617AD) in our spreadsheet.
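The fixed-point iteration for the large-Parents saturation equation can be sketched in a few lines. The function name, starting value and tolerance are assumptions made for this illustration.

```python
import math


def saturation_level(immigrant_pct: float, tol: float = 1e-12, max_iter: int = 1000) -> float:
    """Solve L = 1 - exp(-p * L) by fixed-point iteration, with p = 2 * (1 - Immigrant%/100)."""
    p = 2.0 * (1.0 - immigrant_pct / 100.0)
    level = 0.5  # any positive start works; L = 0 is a trivial fixed point to avoid
    for _ in range(max_iter):
        new_level = 1.0 - math.exp(-p * level)
        if abs(new_level - level) < tol:
            return new_level
        level = new_level
    return level


for pct in range(0, 6):
    print(f"{pct}% immigration -> saturation {100 * saturation_level(pct):.4f}%")
```

The iteration converges because the slope of the right-hand side at the positive solution is below 1 (about 0.43 for 3% immigration), so each pass shrinks the error.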
How many Cousins does Brit have?
Now that we have estimates of Ancestors, we can estimate cousins, not just 1st cousins but
cousins of any degree. To keep it simple, we only estimate non-removed cousins, which
makes a significant difference as it reduces the candidate population for cousins by
about 60%. We also restrict cousins to potential parents in the same generation as
Brit, i.e. to those married 1961-1990.
We start by estimating nth kin, the total descendants of all of Brit’s ancestors in $\mathrm{Gen}_n$.
For example:
Brit’s 0th kin descend from either of Brit’s parents, including Brit himself and his
full/half siblings.
Brit’s 1st kin descend from any of Brit’s grandparents, including Brit, his siblings
and his full/half 1st cousins.
Brit’s 2nd kin descend from any of Brit’s great-grandparents, including Brit, his
siblings, 1st cousins and his full/half 2nd cousins.
Using nth kin, we can estimate the expected number of nth cousins as:
nth cousins = nth kin − (n-1)th kin
This follows since the nth cousins are all nth kin, and the (n-1)th kin are the nth kin that are
more closely related than the nth cousins.
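$$n\text{th kin} = \mathrm{Parents}_0 \times \left(1 - \left(1 - \frac{\mathrm{Ancestors}_n}{\mathrm{NonExtincts}_n}\right)^{\text{ancestor couples} \in \mathrm{Gen}_n}\right)$$
where
$$\text{ancestor couples} \in \mathrm{Gen}_n = \frac{\mathrm{Ancestors}_n}{2}, \text{ approximately}$$
since
$$\text{probability that a parent in } \mathrm{Gen}_n \text{ is an ancestor of Brit} = \frac{\mathrm{Ancestors}_n}{\mathrm{NonExtincts}_n}$$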
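The kin formula and the cousin difference can be sketched as below. The function name is illustrative, and the inputs are the unrounded Ancestors and the NonExtincts figures from the detailed results table (Gen 1 to Gen 3), so this is a check of the arithmetic rather than the spreadsheet's own code.

```python
import math

PARENTS_0 = 18_149_823  # Parents in Gen 0, from the results table


def kin_from_generation(parents0: float, ancestors_n: float, non_extincts_n: float) -> float:
    """Expected kin descended from Brit's ancestors in one generation."""
    couples = ancestors_n / 2.0           # ancestor couples, approximately
    prob = ancestors_n / non_extincts_n   # chance that a parent is an ancestor of Brit
    # 1 - (1 - prob)^couples, computed stably for very small probabilities
    return -parents0 * math.expm1(couples * math.log1p(-prob))


kin_gen1 = kin_from_generation(PARENTS_0, 2, 15_416_756)
kin_gen2 = kin_from_generation(PARENTS_0, 3.88, 12_701_458)
kin_gen3 = kin_from_generation(PARENTS_0, 7.53, 9_511_377)

print(round(kin_gen1, 2))          # about 2.35
print(round(kin_gen3 - kin_gen2))  # about 43, the cousins added by going one generation further back
```

Subtracting successive kin totals isolates the relatives who first become kin at that generation, which is the cousin count for that degree.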
Note: To keep things simple, we will only show the results for no migration.
[Figure: Number of British Cousins of Brit by cousin degree (0 to 8), showing nth cousins alongside the comparison with 23andme; vertical axis from 0 to 700,000.]
Gen#  Gen year  Ancestors  NonExtincts  coalescent Kin  Cousin degree  nth cousins  Comparison with 23andme
0 1967 1 18,149,823 1 self 1
1 1937 2 15,416,756 2.35 siblings 0 1.35 1.5
2 1907 4 12,701,458 10.76 1st cousins 1 8.40 7.5
3 1877 8 9,511,377 54 2nd cousins 2 43 38
4 1847 15 7,066,699 274 3rd cousins 3 220 190
5 1817 28 4,721,907 1,542 4th cousins 4 1,268 940
6 1787 55 3,296,896 8,312 5th cousins 5 6,770 4,700
7 1757 107 2,568,854 40,113 6th cousins 6 31,801 23,000
8 1727 207 2,148,803 179,775 7th cousins 7 139,662 120,000
9 1697 401 1,779,199 802,825 8th cousins 8 623,050 590,000
As you can see, these cousin estimates accord reasonably well with those of 23andme.
Taking into account the Localism Effect
Throughout this study we have assumed complete panmixia, i.e. completely random
mating. Consequently our ancestor estimates might be too high as we would expect
localism to reduce the mating circles and hence the ancestral pool.
We explored the effect of such localism by developing a stochastic simulation in
NetLogo. This performs a “Pooled” simulation, i.e. one pooled by parish. The model can be accessed
and run interactively in your browser at this link:
http://modelingcommons.org/browse/one_model/5383
On opening this link you will
see the Info tab with general information about the model. Click the “Run in NetLogo
Web” tab and follow the instructions.
Surprisingly, this Localism simulation shows only minor suppression of the ancestor
numbers. The greatest difference is in Gen21 (1337AD), when the Pooled Ancestors are
83% of the unPooled.
In our NetLogo World, patches represent parishes or the uninhabited green space
(coloured green) between them. Parishes that hold any of Brit’s Ancestors are
coloured white and those without ancestors are coloured black. Initially the central
patch is the only white one and it holds just one ancestor (Brit). As the population falls in
earlier generations, parishes are replaced with uninhabited green space. Eventually, by
around Gen23, Brit’s Ancestors are scattered across all the remaining parishes and there
are no black patches left.
The tricky part was to model the gravitational selection of the same/nearby parish for
each parent of an ancestor. To do this, we assume that 50% of parents come from the
same parish (as suggested by demographic data), and that the other 50% are scattered
using proximity-weighted randomisation. The parish of each parent is selected by
randomly picking a parish within a proximity-weighted random radius of the child’s
parish.
You can refer to the “Info” tab of the model for detail on how this works.
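The parish selection described above might look something like the following sketch. It is not the NetLogo model's code: the square grid of parishes, the exponentially distributed radius and its mean, and all names are assumptions made purely for illustration, with only the 50% same-parish share taken from the text.

```python
import random


def pick_parent_parish(child_xy, grid_size, rng, p_same=0.5, mean_radius=2.0):
    """Pick a parish for one parent of an ancestor living at `child_xy`.

    Assumptions (not from the paper): parishes sit on a grid_size x grid_size
    grid, and the search radius is exponentially distributed with mean
    `mean_radius`, so nearby parishes are favoured over distant ones.
    """
    if rng.random() < p_same:
        return child_xy  # parent comes from the child's own parish
    radius = max(1, round(rng.expovariate(1.0 / mean_radius)))
    x, y = child_xy
    candidates = [
        (cx, cy)
        for cx in range(max(0, x - radius), min(grid_size, x + radius + 1))
        for cy in range(max(0, y - radius), min(grid_size, y + radius + 1))
        if (cx, cy) != child_xy and (cx - x) ** 2 + (cy - y) ** 2 <= radius ** 2
    ]
    return rng.choice(candidates) if candidates else child_xy


rng = random.Random(42)
print([pick_parent_parish((5, 5), 11, rng) for _ in range(5)])
```

Drawing the radius first and then a parish uniformly within it is one simple way to get proximity weighting; the model's Info tab describes the weighting it actually uses.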
#Parishes:
1831-present: VisionofBritain nCube: Total Population
1560-1820: TheClergyDatabase Advanced Search
%ParentHere in 19th-20th centuries:
Migration and Mobility in Britain from the Eighteenth to the Twentieth Centuries,
Pooley and Turnbull (1996)
%ParentInSameCounty in 1881 (used to weight proximity):
Southall, H.R. and Ell, P. Great Britain Historical Database: Census Data: Migration
Statistics, 1851-1951
References
CAMPOP The Cambridge Group for the History of Population and Social Structure for
marriage data kindly supplied on request.
International Society of Genetic Genealogy: Cousin statistics
International Society of Genetic Genealogy: Generation length
International Society of Genetic Genealogy: Pedigree Collapse
Brenna M. Henn, Lawrence Hon, J. Michael Macpherson, Nick Eriksson, Serge Saxonov,
Itsik Pe'er, Joanna L. Mountain. Cryptic Distant Relatives Are Common in Both
Isolated and Cosmopolitan Genetic Samples
https://doi.org/10.1371/journal.pone.0034267
Derrida, B., S. C. Manrubia, and D. H. Zanette. 1999. Statistical properties of genealogical
trees. Phys. Rev. Lett. 82:1987-1990.
Derrida, B., S. C. Manrubia, and D. H. Zanette. 2000a. On the genealogy of a population of
biparental individuals. J. Theor. Biol. 203:303-315.
Derrida, B., S. C. Manrubia, and D. H. Zanette. 2000b. Distributions of repetitions of
ancestors in genealogical trees. Physica A. 281:1-16.
Falconer, D. S., 1989 Introduction to Quantitative Genetics, Ed. 3
Kaplanis et al. Quantitative analysis of population-scale family trees with millions of
relatives, March 2018, Science, DOI 10.1126/science.aam9309
Kelleher et al. Spread of pedigree versus genetic ancestry in spatially distributed
populations, November 2015, Theoretical Population Biology 108, DOI
10.1016/j.tpb.2015.10.008
Murphy, Mike (2004) Tracing Very Long-Term Kinship Networks Using SOCSIM.
Demographic Research, Volume 10, Article 7, pages 171-196
Pattison, John E. Estimating inbreeding in large, semi-isolated populations: Effects of
varying generation lengths and of migration, July 2007, American Journal of Human
Biology 19(4):495-510
Pattison, John E. Estimating Inbreeding in Human Populations over Historic Times,
December 2008. Conference: 22nd Annual Conference of the Australasian Society for
Human Biology, At: Adelaide, South Australia
Pattison, John E. An Attempt to Integrate Previous Localized Estimates of Human
Inbreeding for the Whole of Britain, December 2016, Human Biology 88(4):264-274
Pooley, Colin G. and Turnbull, Jean (1996) Migration and mobility in Britain from the
eighteenth to the twentieth centuries. Local Population Studies, 57. pp. 50-71. ISSN
0143-2974
Smith, M. T. (2001) Estimates of cousin marriage and mean inbreeding in the United
Kingdom from 'birth briefs', Journal of biosocial science., 33 (1) pp. 55-66.
Speed, D; Balding, DJ; (2015) Relatedness in the post-genomic era: is it still
useful? Nature Reviews Genetics, 16
Wachter, K. W. (1978) Ancestors at the Norman Conquest. In: Statistical Studies of
Historical Social Structure, pp. 153–161. Edited by K. W. Wachter, E. A. Hammel & P.
Laslett. Academic Press, New York.
K. W. Wachter, D. Blackwell and E. A. Hammel (1997) Testing the Validity of Kinship
Microsimulation. Mathematical Computer Modelling Vol. 26, No. 6, pp. 89-104
Wrigley & Schofield (1989) The Population History of England 1541-1871
Wrigley et al (1997) English Population History from Family Reconstitution 1580-1837