GRR: graphical representation of relationship errors.
ABSTRACT A graphical tool for verifying assumed relationships between individuals in genetic studies is described. GRR can detect many common errors using genotypes from many markers. AVAILABILITY: GRR is available at http://bioinformatics.well.ox.ac.uk/GRR.
BIOINFORMATICS APPLICATIONS NOTE
Vol. 17 no. 8 2001
GRR: graphical representation of relationship errors
Gonçalo R. Abecasis, Stacey S. Cherny, W. O. C. Cookson and
Lon R. Cardon
Wellcome Trust Centre for Human Genetics, University of Oxford, Roosevelt Drive,
Oxford OX3 7RZ, UK
Received on February 1, 2001; revised on March 27, 2001; accepted on March 28, 2001
Summary: A graphical tool for verifying assumed relationships
between individuals in genetic studies is described.
GRR can detect many common errors using genotypes
from many markers.
Availability: GRR is available at http://bioinformatics.well.ox.ac.uk/GRR.
Contact: email@example.com; firstname.lastname@example.org
Many large scale linkage and association studies have
been conducted and their popularity is increasing. Simple,
efficient, quality control procedures are essential to
the successful completion of these studies. A common
problem in genetic studies is the misspecification of
relationships between DNA samples (Ott, 1991). Misspecification
of relationships can lead to inaccurate or
biased results and it is therefore important to verify all
assumed relationships.
The effects of relationship misspecification are varied.
In studies using family data, problems such as non-paternity
and the mislabeling of monozygotic (MZ) twins
as non-twin full sibs, as well as sample mix-ups, can lead
to mistaken inferences about allele sharing. For example,
MZ twins will always share more alleles than other
sibling pairs while, on average, half-siblings will share
fewer alleles than full-siblings. In larger pedigrees, the
potential for relationship misspecification is greater and
the detection of these problems is even harder.
In studies using samples of unrelated individuals, such
as association and pharmacogenetic applications, the presence
of related individuals can lead to a misleading inference
about statistical significance. For example, although
it may seem notable that all individuals who respond
to a certain drug share a certain genotype, the finding is
less striking if some of the individuals are related.
The correct genetic relationship between any two
individuals defines an expected pattern of allele sharing
between them. The details of this pattern can be complex,
and will depend on the exact type of relationship,
marker characteristics, population history and inbreeding.
Statistics for verifying relationships through patterns of
allele sharing have been proposed, with various degrees
of sophistication, computing time requirements and
assumptions (Boehnke and Cox, 1997; Goring and Ott,
1997; Broman and Weber, 1998; Epstein et al., 2000;
McPeek and Sun, 2000). Here we describe a simple,
general approach for verifying that individuals with the
same specified relationship have similar patterns of allele
sharing. Unlike other approaches, our method does not
require specification of allele frequencies or any other
population parameter. It is expected to be robust to a small
level of random errors in the data and applicable to large
inbred samples. In addition to relationship misspecification,
our method can detect some other problems such as
sample duplications and switches.
The method is defined as follows: first, classify each
pair of individuals according to their assumed relationship
(such as sib-pairs, parent–offspring pairs, unrelated
individuals, etc.). Second, calculate the mean (µij) and
variance (σij) of identical-by-state allele sharing over a
number of polymorphic loci for each pair of individuals,
i and j. If the sample is homogeneous, we expect each
group to display a characteristic pattern of allele sharing.
For example, sib-pairs will be expected to share more alleles
on average than unrelated individuals, while parent–offspring
pairs (which share at least one chromosome) are
expected to show less variability in allele sharing than sib-pairs
(which may share zero, one or two chromosomes).
A convenient way to identify individuals with patterns of
allele sharing inconsistent with their specified relationship
is to colour code and plot these mean-variance statistics
(Fig. 1).
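
The second step of the method can be sketched in a few lines of Python. This is only an illustration, not GRR's own code: the function names, the representation of a genotype as a pair of allele labels, the use of None for a missing genotype, and the choice of the population variance are all assumptions made for the example.

```python
from statistics import mean, pvariance


def ibs_count(g1, g2):
    """Number of alleles (0, 1 or 2) shared identical-by-state at one marker.

    Each genotype is an unordered pair of allele labels, e.g. (1, 2).
    """
    direct = (g1[0] == g2[0]) + (g1[1] == g2[1])
    crossed = (g1[0] == g2[1]) + (g1[1] == g2[0])
    return max(direct, crossed)


def ibs_mean_variance(genos_i, genos_j):
    """Mean and variance of IBS sharing for individuals i and j.

    Only markers typed in both individuals are used; a missing genotype
    is represented as None (an assumption of this sketch).
    """
    shared = [ibs_count(gi, gj)
              for gi, gj in zip(genos_i, genos_j)
              if gi is not None and gj is not None]
    if not shared:
        raise ValueError("no markers typed in both individuals")
    return mean(shared), pvariance(shared)


# Hypothetical genotypes at four markers.
parent = [(1, 2), (3, 3), (1, 4), (2, 2)]
child = [(1, 3), (3, 5), (4, 4), (2, 6)]
twin_a = [(1, 2), (3, 3), (1, 4), None]
twin_b = [(2, 1), (3, 3), (4, 1), (2, 2)]

print(ibs_mean_variance(parent, child))   # one allele shared at every marker
print(ibs_mean_variance(twin_a, twin_b))  # two alleles shared at every typed marker
```

No allele frequencies enter the calculation, which matches the property claimed for the method above. In this toy data the parent–offspring pair shares exactly one allele at every marker, so its variance is zero; with real data, parent–offspring pairs can also share two alleles by state at some markers.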
The figure presents typical results for a genome scan in
a non-inbred sample. Several distinct clusters are present:
unrelated individuals have the lowest average sharing and
high variance (coloured in blue); half-siblings have higher
sharing on average (coloured in green) and full-siblings
have even higher sharing (coloured in red); parent–offspring
pairs have a similar degree of allele sharing to
sib-pairs but with lower variance (coloured in yellow). All
other relative pairs are grouped together and not displayed
by default. Note that some sibling and full-sibling pairs
have been misclassified and appear in other clusters. A
single point lies apart from these clusters (in a right-hand
corner of the plot) and corresponds to a pair of identical twins.
Fig. 1. Sample screen shot. Features described in text.
© Oxford University Press 2001
To ensure that outlier points are easily identifiable, GRR
implements an outlier rating scheme and places likely
outliers on top of less interesting points. This scheme
is implemented by calculating the mean and variance
of each allele-sharing statistic within each relationship
group. Then each individual’s scores are standardized
to obtain Zµij and Zσij and assigned an outlier score. Higher rated points
are layered on top of lower rated points. Alternative
schemes for layering data points, such as the Mahalanobis
(1936) distance, can be selected by the user.
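
The layering scheme can be illustrated with a short Python sketch. The standardization within each relationship group follows the description above. The rule for combining the two standardized values into one score (here, the larger absolute value) is an assumption of this example, as are all function names and the toy data. A two-dimensional Mahalanobis distance is included as one possible alternative rating.

```python
from math import sqrt
from statistics import mean, pstdev


def zscores(values):
    """Standardize values within one relationship group."""
    m, s = mean(values), pstdev(values)
    return [0.0 if s == 0 else (v - m) / s for v in values]


def layering_order(pairs):
    """Return pair labels ordered from least to most outlying.

    pairs: list of (label, relationship, ibs_mean, ibs_variance).
    Plotting in the returned order leaves likely outliers on top.
    """
    groups = {}
    for p in pairs:
        groups.setdefault(p[1], []).append(p)
    scored = []
    for members in groups.values():
        z_mu = zscores([p[2] for p in members])
        z_var = zscores([p[3] for p in members])
        for p, zm, zv in zip(members, z_mu, z_var):
            score = max(abs(zm), abs(zv))  # ASSUMPTION: combination rule
            scored.append((score, p[0]))
    scored.sort(key=lambda item: item[0])
    return [label for _, label in scored]


def mahalanobis_2d(points):
    """Mahalanobis distance of each (x, y) point from the group centroid,
    using the population covariance matrix of the group."""
    n = len(points)
    mx = mean(x for x, _ in points)
    my = mean(y for _, y in points)
    sxx = sum((x - mx) ** 2 for x, _ in points) / n
    syy = sum((y - my) ** 2 for _, y in points) / n
    sxy = sum((x - mx) * (y - my) for x, y in points) / n
    det = sxx * syy - sxy ** 2
    if det == 0:
        raise ValueError("singular covariance matrix")
    return [sqrt((syy * (x - mx) ** 2
                  - 2 * sxy * (x - mx) * (y - my)
                  + sxx * (y - my) ** 2) / det)
            for x, y in points]


# Hypothetical sib-pairs; "s5" shares fewer alleles than the others.
pairs = [
    ("s1", "full-sib", 1.50, 0.40),
    ("s2", "full-sib", 1.52, 0.41),
    ("s3", "full-sib", 1.48, 0.39),
    ("s4", "full-sib", 1.51, 0.40),
    ("s5", "full-sib", 1.05, 0.42),
]
order = layering_order(pairs)
distances = mahalanobis_2d([(p[2], p[3]) for p in pairs])
print(order)
print(distances)
```

Unlike the per-axis score, the Mahalanobis distance accounts for correlation between the mean and the variance within a group.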
GRR recognizes standard genetic formats for genotype
and family structure data, including linkage and QTDT
format files (Ott, 1991; Abecasis et al., 2000). Interactive
features allow the user to select individual families and
inspect statistics for any pair of individuals by clicking the
appropriate plot area.
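
The first step of the method, classifying pairs by their assumed relationship, can be sketched from pedigree records. The example below assumes whitespace-separated records whose first four columns are family, individual, father and mother identifiers, with 0 for an unknown parent, as in linkage-format pedigree files. The function names and the reduced set of relationship labels are illustrative only and are not taken from GRR.

```python
from itertools import combinations


def read_pedigree(lines):
    """Map family -> {individual: (father, mother)} from pedigree records."""
    peds = {}
    for line in lines:
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed records
        fam, person, father, mother = fields[:4]
        peds.setdefault(fam, {})[person] = (father, mother)
    return peds


def relationship(a, b, parents):
    """Assumed relationship of two members of the same family."""
    pa, pb = parents[a], parents[b]
    if b in pa or a in pb:
        return "parent-offspring"
    shared = sum(1 for x, y in zip(pa, pb) if x != "0" and x == y)
    if shared == 2:
        return "full-sib"
    if shared == 1:
        return "half-sib"
    return "other"  # spouses and all more distant relatives


def classify_pairs(peds):
    """Label every pair; pairs from different families are assumed unrelated."""
    people = [(fam, pid) for fam, members in peds.items() for pid in members]
    labels = {}
    for (fa, a), (fb, b) in combinations(people, 2):
        if fa != fb:
            labels[(fa, a), (fb, b)] = "unrelated"
        else:
            labels[(fa, a), (fb, b)] = relationship(a, b, peds[fa])
    return labels


# Hypothetical pedigree: family 1 has two full sibs (3, 4) and a half sib (5).
records = [
    "1 1 0 0 1",
    "1 2 0 0 2",
    "1 3 1 2 1",
    "1 4 1 2 2",
    "1 5 1 6 1",
    "1 6 0 0 2",
    "2 1 0 0 1",
]
labels = classify_pairs(read_pedigree(records))
for pair, label in labels.items():
    print(pair, label)
```

The resulting labels are what would be colour coded in the plot, so any pair whose allele sharing falls in the wrong cluster stands out against its assumed relationship.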
This approach is simple to implement and can be
incorporated into many genetic analysis databases and
quality control protocols. The method performs efficiently
in genome scan linkage panels, although as few as 50
unlinked markers may be sufficient to verify first-degree
relationships in family samples or to verify that no close
relatives or gross stratification are present in samples of
unrelated individuals.
This research was supported by the Wellcome Trust and
by grant EY-12562 from the National Institutes of Health.
Abecasis,G.R., Cardon,L.R. and Cookson,W.O.C. (2000) A general
test of association for quantitative traits in nuclear families. Am.
J. Hum. Genet., 66, 279–292.
Boehnke,M. and Cox,N.J. (1997) Accurate inference of relation-
ships in sib-pair linkage studies. Am. J. Hum. Genet., 61, 423–
Broman,K.W. and Weber,J.L. (1998) Estimation of pairwise rela-
tionships in the presence of genotyping errors. Am. J. Hum.
Genet., 63, 1563–1564.
Epstein,M.P., Duren,W.L. and Boehnke,M. (2000) Improved infer-
ence of relationship for pairs of individuals. Am. J. Hum. Genet.,
Goring,H.H. and Ott,J. (1997) Relationship estimation in affected
sib pair analysis of late-onset diseases. Eur. J. Hum. Genet., 5,
Mahalanobis,P.C. (1936) On the generalized distance in statistics.
Proc. Natl Inst. Sci. India, 2, 49.
McPeek,M.S. and Sun,L. (2000) Statistical tests for detection of
misspecified relationships by use of genome-screen data. Am. J.
Hum. Genet., 66, 1076–1094.
Ott,J. (1991) Analysis of Human Genetic Linkage. Johns Hopkins
University Press, Baltimore.