Genome-wide association studies: theoretical and practical concerns.

Juvenile Diabetes Research Foundation/Wellcome Trust Diabetes and Inflammation Laboratory, Cambridge Institute for Medical Research, University of Cambridge, Cambridge CB2 2XY, UK.
Nature Reviews Genetics (Impact Factor: 39.79). 03/2005; 6(2):109-18. DOI: 10.1038/nrg1522
Source: PubMed

ABSTRACT To fully understand the allelic variation that underlies common diseases, complete genome sequencing for many individuals with and without disease is required. This is still not technically feasible. However, recently it has become possible to carry out partial surveys of the genome by genotyping large numbers of common SNPs in genome-wide association studies. Here, we outline the main factors - including models of the allelic architecture of common diseases, sample size, map density and sample-collection biases - that need to be taken into account in order to optimize the cost efficiency of identifying genuine disease-susceptibility loci.


  • ABSTRACT: Substantial health disparities exist between African Americans and Caucasians in the United States. Copy number variations (CNVs) are one form of human genetic variation that has been linked with complex diseases and that often occurs at different frequencies in African American and Caucasian populations. Here, we aimed to investigate whether CNVs with differential frequencies can contribute to health disparities from the perspective of gene networks. We inferred network clusters from human gene/protein networks based on two different data sources. We then evaluated each network cluster for the occurrence of known pathogenic genes and of genes located in CNVs with different population frequencies, and used false discovery rates to rank the network clusters. This approach let us identify five clusters enriched with known pathogenic genes and with genes located in CNVs with different frequencies between African Americans and Caucasians. These clustering patterns predict two candidate causal genes located in four population-specific CNVs that play potential roles in health disparities.
    03/2015; 3:e677. DOI:10.7717/peerj.677
  • ABSTRACT: Several biomaterials have been widely used in the treatment of cancer. However, how these biomaterials alter gene expression is poorly understood. Identifying genes that are differentially expressed across varying biological conditions, or in response to different biomaterials, from microarray data is a typical multiple testing problem. In this paper, we focus on FDR control for large-scale multiple testing problems and, using our proposed statistics and resampling method, provide a powerful FDR-controlling procedure. Simulations show that our Fiducial estimator is more accurate and stable than five other traditional methods, with satisfactory FDR control. In particular, we propose a generally applicable estimate of the proposed procedure for identifying differentially expressed genes in microarray experiments. This microarray method consistently shows favorable performance over the existing methods. For example, in testing for differential expression between two breast cancer tumor types, the proposed procedure provides increases from 37% to 127% in the number of genes called significant at a false discovery rate of 3%.
    08/2011; 311-313:1661-1666. DOI:10.4028/
  • ABSTRACT: The Mexican Holstein (HO) industry has imported Canadian and US (CAN + USA) HO germplasm for use in two different production systems, the conventional (Conv) and the low income (Lowi) system. The objective of this work was to study the genetic composition and differentiation of Mexican HO cattle, considering the production system in which they perform and their relationship with the Canadian and US HO populations. The analysis included information from 149, 303, and 173 HO animals that were unrelated or of unknown pedigree, from the Conv, Lowi, and CAN + USA populations, respectively. Canadian and US Jersey (JE) and Brown Swiss (BS) genotypes (162 and 86, respectively) were used to determine whether Mexican HOs were hybridized with either of these breeds. After quality control filtering, a total of 6,617 out of 6,836 single nucleotide polymorphism markers were used. To describe the genetic diversity across the populations, principal component (PC), admixture composition, and linkage disequilibrium (LD; r²) analyses were performed. Through the PC analysis, HO × JE and HO × BS crossbreeding was detected in the Lowi system. The Conv system appeared to lie between the Lowi and CAN + USA populations. Admixture analysis differentiated between the genetic composition of the Conv and Lowi systems, and five ancestry groups associated with the sire's country of origin were identified. The minimum distance between markers to estimate a useful LD was found to be 54.5 kb for the Mexican HO populations. At this average distance, the persistence of phase across autosomes was 0.94 between the Conv and Lowi systems, 0.92 between Conv and CAN + USA, and 0.91 between Lowi and CAN + USA. The results supported a flow of germplasm among populations, with Conv being a source for Lowi and dependent on migration from CAN + USA. Mexican HO cattle in the Conv and Lowi populations share common ancestry with CAN + USA but have different genetic signatures.
    Frontiers in Genetics 01/2015; 6:7. DOI:10.3389/fgene.2015.00007