Increased accuracy of artificial selection by using the realized relationship matrix.

Biosciences Research Division, Department of Primary Industries Victoria, 1 Park Drive, Bundoora 3083, Australia.
Genetics Research (Impact Factor: 2). 03/2009; 91(1):47-60. DOI: 10.1017/S0016672308009981
Source: PubMed

ABSTRACT Dense marker genotypes allow the construction of the realized relationship matrix between individuals, with elements the realized proportion of the genome that is identical by descent (IBD) between pairs of individuals. In this paper, we demonstrate that by replacing the average relationship matrix derived from pedigree with the realized relationship matrix in best linear unbiased prediction (BLUP) of breeding values, the accuracy of the breeding values can be substantially increased, especially for individuals with no phenotype of their own. We further demonstrate that this method of predicting breeding values is exactly equivalent to the genomic selection methodology where the effects of quantitative trait loci (QTLs) contributing to variation in the trait are assumed to be normally distributed. The accuracy of breeding values predicted using the realized relationship matrix in the BLUP equations can be deterministically predicted for known family relationships, for example half sibs. The deterministic method uses the effective number of independently segregating loci controlling the phenotype that depends on the type of family relationship and the length of the genome. The accuracy of predicted breeding values depends on this number of effective loci, the family relationship and the number of phenotypic records. The deterministic prediction demonstrates that the accuracy of breeding values can approach unity if enough relatives are genotyped and phenotyped. For example, when 1000 full sibs per family were genotyped and phenotyped, and the heritability of the trait was 0.5, the reliability of predicted genomic breeding values (GEBVs) for individuals in the same full sib family without phenotypes was 0.82. These results were verified by simulation. A deterministic prediction was also derived for random mating populations, where the effective population size is the key parameter determining the effective number of independently segregating loci. 
If the effective population size is large, a very large number of individuals must be genotyped and phenotyped in order to accurately predict breeding values for unphenotyped individuals from the same population. If the heritability of the trait is 0.3 and N_e = 100, approximately 12474 individuals with genotypes and phenotypes are required in order to predict GEBVs of unphenotyped individuals in the same population with an accuracy of 0.7 [corrected].
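
The equivalence stated in the abstract, between BLUP with a realized relationship matrix and a marker-effects model with normally distributed effects, can be checked numerically. The following is a minimal sketch, not the authors' implementation. It assumes a marker-based estimate of the relationship matrix, G = WW'/m, where W holds genotypes standardized by known allele frequencies (one common convention, which may differ from the paper's IBD-based construction). It also assumes phenotypes with no fixed effects and a known variance ratio lambda = (1 - h2) / h2. All names (`gblup`, `snp_blup`, `solve`) and the simulated data are hypothetical.

```python
import random


def transpose(a):
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def solve(a, b):
    """Solve a x = b (b a vector) by Gaussian elimination with partial pivoting."""
    n = len(a)
    aug = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for i in range(n):
        pivot = max(range(i, n), key=lambda r: abs(aug[r][i]))
        aug[i], aug[pivot] = aug[pivot], aug[i]
        for r in range(i + 1, n):
            f = aug[r][i] / aug[i][i]
            for c in range(i, n + 1):
                aug[r][c] -= f * aug[i][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (aug[i][n] - sum(aug[i][c] * x[c] for c in range(i + 1, n))) / aug[i][i]
    return x


def gblup(W, y, lam):
    """Breeding values from G (G + lam I)^-1 y, with G = W W' / m."""
    n, m = len(W), len(W[0])
    G = [[v / m for v in row] for row in matmul(W, transpose(W))]
    V = [[G[i][j] + (lam if i == j else 0.0) for j in range(n)] for i in range(n)]
    alpha = solve(V, y)
    return [sum(G[i][j] * alpha[j] for j in range(n)) for i in range(n)]


def snp_blup(W, y, lam):
    """Breeding values W u_hat, with u_hat = (W'W + m lam I)^-1 W'y."""
    n, m = len(W), len(W[0])
    Wt = transpose(W)
    WtW = matmul(Wt, W)
    lhs = [[WtW[i][j] + (m * lam if i == j else 0.0) for j in range(m)] for i in range(m)]
    rhs = [sum(Wt[j][i] * y[i] for i in range(n)) for j in range(m)]
    u_hat = solve(lhs, rhs)
    return [sum(W[i][j] * u_hat[j] for j in range(m)) for i in range(n)]


# Simulated toy data (assumption: allele frequencies are known, not estimated).
random.seed(1)
n, m, h2 = 8, 20, 0.5
freqs = [random.uniform(0.2, 0.8) for _ in range(m)]
W = [[(sum(random.random() < p for _ in range(2)) - 2 * p) / (2 * p * (1 - p)) ** 0.5
      for p in freqs] for _ in range(n)]
y = [random.gauss(0, 1) for _ in range(n)]
lam = (1 - h2) / h2  # ratio of residual to additive genetic variance

gebv_g = gblup(W, y, lam)
gebv_m = snp_blup(W, y, lam)
max_diff = max(abs(a - b) for a, b in zip(gebv_g, gebv_m))
print(max_diff)  # differs from zero only by floating-point rounding
```

The two routes give the same predictions because W (W'W + m lam I)^-1 W' = G (G + lam I)^-1 when G = WW'/m. The relationship-matrix form solves an n x n system, whereas the marker form solves an m x m system, so the former is usually cheaper when markers outnumber individuals.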

  • ABSTRACT: The use of mixed models to determine narrow-sense heritability and related quantities such as SNP heritability has received much recent attention. Less attention has been paid to the inherent variability in these estimates. One approach for quantifying variability in estimates of heritability is a frequentist approach, in which heritability is estimated using maximum likelihood and its variance is quantified through an asymptotic normal approximation. An alternative approach is to quantify the uncertainty in heritability through its Bayesian posterior distribution. In this paper, we develop the latter approach, make it computationally efficient and compare it to the frequentist approach. We show theoretically that, for a sufficiently large sample size and intermediate values of heritability, the two approaches provide similar results. Using the Atherosclerosis Risk in Communities cohort, we show empirically that the two approaches can give different results and that the variance/uncertainty can remain large. Journal of Human Genetics advance online publication, 27 March 2014; doi:10.1038/jhg.2014.15.
    Journal of Human Genetics 03/2014; · 2.37 Impact Factor
  • ABSTRACT: Many existing cohorts contain a range of relatedness between genotyped individuals, either by design or by chance. Haplotype estimation in such cohorts is a central step in many downstream analyses. Using genotypes from six cohorts from isolated populations and two cohorts from non-isolated populations, we have investigated the performance of different phasing methods designed for nominally 'unrelated' individuals. We find that SHAPEIT2 produces much lower switch error rates in all cohorts compared to other methods, including those designed specifically for isolated populations. In particular, when large amounts of IBD sharing are present, SHAPEIT2 infers close to perfect haplotypes. Based on these results we have developed a general strategy for phasing cohorts with any level of implicit or explicit relatedness between individuals. First, SHAPEIT2 is run ignoring all explicit family information. We then apply a novel HMM method (duoHMM) to combine the SHAPEIT2 haplotypes with any family information to infer the inheritance pattern of each meiosis at all sites across each chromosome. This allows the correction of switch errors and the detection of recombination events and genotyping errors. We show that the method detects numbers of recombination events that align very well with expectations based on genetic maps, and that it infers far fewer spurious recombination events than Merlin. The method can also detect genotyping errors and infer recombination events in otherwise uninformative families, such as trios and duos. The detected recombination events can be used in association scans for recombination phenotypes. The method provides a simple and unified approach to haplotype estimation that will be of interest to researchers in the fields of human, animal and plant genetics.
    PLoS Genetics 04/2014; 10(4):e1004234. · 8.52 Impact Factor
  • ABSTRACT: The present study investigated the parameter settings for obtaining a simulated genome at steady state of allele frequency (mutation–drift equilibrium) and linkage disequilibrium (LD), and evaluated the impact of whether or not the simulated genome reached steady state of allele frequency and LD on the accuracy of genomic estimated breeding values (GEBVs). After 500 to 50 000 historical generations, the base population and subsequent seven generations were generated as recent populations. The allele frequency distribution of the last generations of the historical population and LD in the base population were calculated when varying the values of five parameters: initial minor allele frequency, mutation rate, effective population size, number of markers and chromosome length. The accuracies of GEBVs in the last generation of the recent population were calculated by genomic best linear unbiased prediction. The number of historical generations required to reach mutation–drift equilibrium depended on the initial allele frequency and mutation rate. Regardless of the parameters, LD reached a steady state before allele frequency distribution reached mutation–drift equilibrium. The accuracies of GEBVs largely reflect the extent of linkage disequilibrium with the exception of varying chromosome length, although there were no associations between the accuracies of GEBVs and allele frequency distribution.
    Animal Science Journal 06/2014; · 1.04 Impact Factor

