ABSTRACT: Computing the genetic relationship between two humans is important to studies in genetics, genomics, genealogy, and forensics. Relationship algorithms may be sensitive to noise, such as that arising from sequencing errors or imperfect reference genomes. We developed an algorithm for estimation of genetic relationship by averaged blocks (GRAB) that is designed for whole-genome sequencing (WGS) data. GRAB segments the genome into blocks, calculates the fraction of blocks sharing identity, and then uses a classification tree to infer 1st- to 5th-degree relationships and unrelated individuals. We evaluated GRAB on simulated and real sequenced families, and compared it with other software. GRAB achieves similar performance, and does not require knowledge of population background or phasing. GRAB can be used in workflows for identifying unreported relationships, validating reported relationships in family-based studies, and detecting sample-tracking errors or duplicate inclusion. The software is available at familygenomics.systemsbiology.net/grab.
PLoS ONE 01/2014; 9(2):e85437. · 3.73 Impact Factor
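The block-averaging idea in the GRAB abstract above (segment the genome into blocks, then compute the fraction of blocks sharing identity) can be illustrated with a minimal sketch. This is not GRAB's implementation: the function name `block_sharing_fraction`, the 0/1/2 alternate-allele-count genotype coding, the fixed block size counted in variants, and the rule that a block "shares identity" when it contains no opposite-homozygote site are all assumptions made for illustration.

```python
from typing import Sequence


def block_sharing_fraction(g1: Sequence[int], g2: Sequence[int], block_size: int) -> float:
    """Fraction of blocks in which two individuals show no opposite homozygotes.

    Hypothetical illustration only. Genotypes are assumed to be coded as
    alternate-allele counts (0, 1, or 2) at the same ordered variant sites.
    """
    if len(g1) != len(g2):
        raise ValueError("genotype vectors must have equal length")
    if block_size <= 0:
        raise ValueError("block_size must be positive")

    n_blocks = 0
    n_shared = 0
    for start in range(0, len(g1), block_size):
        a = g1[start:start + block_size]
        b = g2[start:start + block_size]
        n_blocks += 1
        # Assumed criterion: a 0-vs-2 site means no allele can be shared
        # at that site, so the block is counted as not sharing identity.
        if all(abs(x - y) < 2 for x, y in zip(a, b)):
            n_shared += 1
    return n_shared / n_blocks if n_blocks else 0.0
```

In a workflow of the kind the abstract describes, a fraction like this would be one input to a classifier that separates relationship degrees; the thresholds used by such a classifier are not given in the abstract and are not guessed at here.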
ABSTRACT: The determination of the relationship between a pair of individuals is a fundamental application of genetics. Previously, we and others have demonstrated that identity-by-descent (IBD) information generated from high-density single-nucleotide polymorphism (SNP) data can greatly improve the power and accuracy of genetic relationship detection. Whole-genome sequencing (WGS) marks the final step in increasing genetic marker density by assaying all single-nucleotide variants (SNVs), and thus has the potential to further improve relationship detection by enabling more accurate detection of IBD segments and more precise resolution of IBD segment boundaries. However, WGS introduces new complexities that must be addressed in order to achieve these improvements in relationship detection. To evaluate these complexities, we estimated genetic relationships from WGS data for 1490 known pairwise relationships among 258 individuals in 30 families along with 46 population samples as controls. We identified several genomic regions with excess pairwise IBD in both the pedigree and control datasets using three established IBD methods: GERMLINE, fastIBD, and ISCA. These spurious IBD segments produced a 10-fold increase in the rate of detected false-positive relationships among controls compared to high-density microarray datasets. To address this issue, we developed a new method to identify and mask genomic regions with excess IBD. This method, implemented in ERSA 2.0, fully resolved the inflated cryptic relationship detection rates while improving relationship estimation accuracy. ERSA 2.0 detected all 1st through 6th degree relationships, and 55% of 9th through 11th degree relationships in the 30 families. We estimate that WGS data provides a 5% to 15% increase in relationship detection power relative to high-density microarray data for distant relationships.
Our results identify regions of the genome that are highly problematic for IBD mapping and introduce new software to accurately detect 1st through 9th degree relationships from whole-genome sequence data.
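The idea of identifying genomic regions with excess IBD, as described in the abstract above, can be sketched as follows. This is not the method implemented in ERSA 2.0, whose details the abstract does not give: the function name `excess_ibd_bins`, the fixed-width binning, and the z-score cutoff are assumptions chosen only to make the concept concrete.

```python
from statistics import mean, pstdev
from typing import Iterable, List, Tuple


def excess_ibd_bins(
    segments: Iterable[Tuple[int, int]],
    bin_size: int,
    n_bins: int,
    z_cutoff: float = 3.0,
) -> List[int]:
    """Return indices of bins covered by unusually many IBD segments.

    Hypothetical illustration only. `segments` holds half-open (start, end)
    base-pair intervals on one chromosome, pooled over all sample pairs.
    """
    counts = [0] * n_bins
    for start, end in segments:
        first = start // bin_size
        last = min((end - 1) // bin_size, n_bins - 1)
        for b in range(first, last + 1):
            counts[b] += 1

    mu = mean(counts)
    sd = pstdev(counts)
    if sd == 0:
        return []  # uniform coverage: nothing stands out
    # Assumed rule: flag bins whose count lies more than z_cutoff
    # standard deviations above the mean count across bins.
    return [i for i, c in enumerate(counts) if (c - mu) / sd > z_cutoff]
```

Segments overlapping the flagged bins could then be excluded before relationship estimation. With few bins the attainable z-score is bounded, so a cutoff would need to be tuned to the bin count in any real use.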
ABSTRACT: A mutation in presenilin 1 (E280A) causes early-onset Alzheimer's disease. Understanding the origin of this mutation will inform medical genetics.
We sequenced the genomes of 102 individuals from Antioquia, Colombia. We applied identity-by-descent analysis to identify regions of common ancestry. We estimated the age of the E280A mutation and the local ancestry of the haplotype harboring this mutation.
All affected individuals share a minimal haplotype of 1.8 Mb containing E280A. We estimate a time to most recent common ancestor of E280A of 10 (95% credible interval, 7.2-12.6) generations. We date the de novo mutation event to 15 (95% credible interval, 11-25) generations ago. We infer a western European geographic origin of the shared haplotype.
The age and geographic origin of E280A are consistent with a single founder dating from the time of the Spanish Conquistadors, who began colonizing Colombia during the early 16th century.
Alzheimer's & dementia: the journal of the Alzheimer's Association 11/2013; · 5.90 Impact Factor