Frazer, K., Ballinger, D., Cox, D., Hinds, D., Stuve, L., Gibbs, R. et al. A second generation human haplotype map of over 3.1 million SNPs. Nature 449, 851-861 (2007)

University of Cambridge, Cambridge, England, United Kingdom
Nature (Impact Factor: 42.35). 11/2007; 449(7164):851-61. DOI: 10.1038/nature06258
Source: PubMed

ABSTRACT We describe the Phase II HapMap, which characterizes over 3.1 million human single nucleotide polymorphisms (SNPs) genotyped in 270 individuals from four geographically diverse populations and includes 25-35% of common SNP variation in the populations surveyed. The map is estimated to capture untyped common variation with an average maximum r² of between 0.9 and 0.96 depending on population. We demonstrate that the current generation of commercial genome-wide genotyping products captures common Phase II SNPs with an average maximum r² of up to 0.8 in African and up to 0.95 in non-African populations, and that potential gains in power in association studies can be obtained through imputation. These data also reveal novel aspects of the structure of linkage disequilibrium. We show that 10-30% of pairs of individuals within a population share at least one region of extended genetic identity arising from recent ancestry and that up to 1% of all common variants are untaggable, primarily because they lie within recombination hotspots. We show that recombination rates vary systematically around genes and between genes of different function. Finally, we demonstrate increased differentiation at non-synonymous, compared to synonymous, SNPs, resulting from systematic differences in the strength or efficacy of natural selection between populations.
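
The "average maximum r²" statistic in the abstract can be illustrated with a minimal sketch. It assumes phased haplotypes coded 0/1 per biallelic SNP and uses the standard pairwise linkage disequilibrium measure r² = D² / (pA(1-pA)pB(1-pB)), where D = pAB - pA·pB. The function names and the toy data are hypothetical and are not taken from the paper, whose actual estimation procedure may differ.

```python
from typing import Sequence


def r_squared(a: Sequence[int], b: Sequence[int]) -> float:
    """Pairwise r^2 between two biallelic SNPs, one 0/1 allele per haplotype."""
    n = len(a)
    if n == 0 or n != len(b):
        raise ValueError("haplotype vectors must be non-empty and equal length")
    p_a = sum(a) / n                              # frequency of allele 1 at SNP A
    p_b = sum(b) / n                              # frequency of allele 1 at SNP B
    p_ab = sum(x * y for x, y in zip(a, b)) / n   # frequency of the 1-1 haplotype
    d = p_ab - p_a * p_b                          # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        # Assumption: r^2 is undefined for a monomorphic SNP; report 0 here.
        return 0.0
    return d * d / denom


def max_r_squared(target: Sequence[int], tags: Sequence[Sequence[int]]) -> float:
    """Best r^2 between one untyped SNP and any SNP in the typed (tag) set."""
    return max(r_squared(target, t) for t in tags)


def average_max_r_squared(targets: Sequence[Sequence[int]],
                          tags: Sequence[Sequence[int]]) -> float:
    """Mean over untyped SNPs of their best r^2 with the tag set."""
    return sum(max_r_squared(t, tags) for t in targets) / len(targets)


# Toy data (hypothetical): four haplotypes, one tag SNP, two untyped SNPs.
tags = [[0, 0, 1, 1]]
targets = [[0, 0, 1, 1], [0, 1, 0, 1]]
coverage = average_max_r_squared(targets, tags)
```

In this toy case the first untyped SNP is perfectly correlated with the tag and the second is uncorrelated with it, so the tag set captures only half of the variation on average. A value near 0.9 to 0.96, as reported in the abstract, would mean that most untyped common SNPs have a close proxy among the typed ones.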

  • Source
    • "Based on these findings, we decided to expand current knowledge of potentially damaging SNVs within the KLKs (as predicted by the functional SNV class and its impact on KLK structure) and investigate their frequencies in certain populations and their potential to result in clinical phenotypes. Single nucleotide polymorphisms (SNPs) are variations at a single nucleotide position and are considered to be the most frequent type of variation within the human genome, comprising about 0.1% of the human genome (Collins et al., 1998; International HapMap et al., 2007). Rare SNPs are present at a frequency above 1% in the general population, whereas common SNPs are present at a frequency above 5–10% (Kruglyak and Nickerson, 2001; Ladiges et al., 2004). "
    ABSTRACT: Kallikreins (KLKs) are a group of 15 serine proteases encoded by the KLK locus on chromosome 19. Certain single nucleotide variants (SNVs) within the KLK locus have been linked to human disease. Next-generation sequencing of large human cohorts enables reexamination of genomic variation at the KLK locus. We aimed to identify all KLK-related SNVs and examine their impact on gene regulation and function. To this end, we mined KLK SNVs across Ensembl and Exome Variant Server, with exome-sequencing data from 6503 individuals. PolyPhen-2-based prediction of damaging SNVs and population frequencies of these SNVs were examined. Damaging SNVs were plotted on protein sequence and structure. We identified 4866 SNVs, the largest number of KLK-related SNVs reported. Fourteen percent of noncoding SNVs overlapped with transcription factor binding sites. We identified 602 missense coding SNVs, among which 148 were predicted to be damaging. Nine missense SNVs were common (>1% frequency) and displayed significantly different frequencies between European-American and African-American populations. SNVs predicted to be damaging appeared to alter tertiary structure of KLK1 and KLK6. Similarly, these missense SNVs may affect KLK function, resulting in disease phenotypes. Our study represents a mine of information for those studying KLK-related SNVs and their associations with diseases.
    Biological Chemistry 09/2014; 395(9):1037-1050. DOI:10.1515/hsz-2014-0136 · 2.69 Impact Factor
  • Source
    • "Especially since the sequencing efficiency of modern NGS technologies outpaced the improvement of storage capacities, which directly leads to a growing economical issue, as storing and sharing the generated information is now bounded by the available storage and network resources (Kahn, 2011). The same technology led to the generation of comprehensive catalogs for genetic variations of the human (Durbin et al., 2010; Frazer et al., 2007) and other organisms as well (e.g. Auton et al., 2012; Keane et al., 2011). "
    ABSTRACT: Next generation sequencing (NGS) has revolutionized biomedical research in the last decade and led to a continuous stream of developments in bioinformatics addressing the need for fast and space-efficient solutions for analyzing NGS data. Often researchers need to analyze a set of genomic sequences which stem from closely related species or are indeed individuals of the same species. Hence the analyzed sequences are very similar. For analyses where local changes in the examined sequence induce only local changes in the results it is obviously desirable to examine identical or similar regions not repeatedly.
    Bioinformatics 07/2014; 30(24). DOI:10.1093/bioinformatics/btu438 · 4.62 Impact Factor
  • Source
    • "SNPnexus allows single queries using dbSNP identifiers or chromosomal regions for annotating known variants. SNPnexus summarizes (Table 2) any related information from genetic association studies of complex diseases and disorders available from the genetic association database (GAD) [18] and overlaps with genomic structural variability [19] [20] [21] [22] [23] [24] [25] [26] [27], dbSNP [28], and HapMap [29] genotype and allele frequency. "
    ABSTRACT: Over the past decade, a steady increase in the incidence of HPRT-related hyperuricemia (HRH) has been observed in Saudi Arabia. We examined all the nine exons of HPRT gene for mutations in ten biochemically confirmed hyperuricemia patients, including one female and three normal controls. In all, we identified 13 novel mutations in Saudi Arabian HPRT-related hyperuricemia patients manifesting different levels of uric acid. The Lys103Met alteration was highly recurrent and was observed in 50% of the cases, while Ala160Thr and Lys158Asn substitutions were found in two patients. Moreover, in 70% of the patients ≥2 mutations were detected concurrently in the HPRT gene. Interestingly, one of the patients that harbored Lys103Met substitution along with two frameshift mutations at codons 85 and 160 resulting in shortened protein demonstrated unusually high serum uric acid level of 738 μmol/L. Two of the seven point mutations that resulted in amino acid change (Lys103Met and Val160Gly) were predicted to be damaging by SIFT and Polyphen and were further analyzed for their protein stability and function by molecular dynamics simulation. The identified novel mutations in the HPRT gene may prove useful in the prenatal diagnosis and genetic counseling.
    BioMed Research International 07/2014; 2014. DOI:10.1155/2014/290325 · 2.71 Impact Factor