Genetics Selection Evolution

Published by BioMed Central
Online ISSN: 1297-9686
In a previous study in the Fleckvieh dual purpose cattle breed, we mapped a quantitative trait locus (QTL) affecting milk yield (MY1), milk protein yield (PY1) and milk fat yield (FY1) during first lactation to the distal part of bovine chromosome 5 (BTA5), but the confidence interval was too large for positional cloning of the causal gene. Our objective here was to refine the position of this QTL and to define the candidate region for high-throughput sequencing. In addition to those previously studied, new Fleckvieh families were genotyped, in order to increase the number of recombination events. Twelve new microsatellites and 240 SNP markers covering the most likely QTL region on BTA5 were analysed. Based on haplotype analysis performed in this complex pedigree, families segregating for the low frequency allele of this QTL (minor allele) were selected. Single- and multiple-QTL analyses using combined linkage and linkage disequilibrium methods were performed. Single nucleotide polymorphism haplotype analyses on representative family sires and their ancestors revealed that the haplotype carrying the minor QTL allele is rare and most probably originates from a unique ancestor in the mapping population. Analyses of different subsets of families, created according to the results of haplotype analysis and availability of SNP and microsatellite data, refined the previously detected QTL affecting MY1 and PY1 to a region ranging from 117.962 Mb to 119.018 Mb (1.056 Mb) on BTA5. However, the possibility of a second QTL affecting only PY1 at 122.115 Mb was not ruled out. This study demonstrates that targeting families segregating for a less frequent QTL allele is a useful method. It improves the mapping resolution of the QTL, which is due to the division of the mapping population based on the results of the haplotype analysis and to the increased frequency of the minor allele in the families. 
Consequently, we succeeded in refining the region containing the previously detected QTL to 1 Mb on BTA5. This candidate region contains 27 genes with unknown or partially known function(s) and is small enough for high-throughput sequencing, which will allow future detailed analyses of candidate genes.
Genetic parameters for meat quality traits and their relationships with body weight and breast development were estimated for a total of 420 male turkeys using REML. The birds were slaughtered in a commercial plant and the traits measured included pH at 20 min (pH20) and 24 h post-mortem (pHu) and colour of the breast and thigh meat. The heritabilities of the rate and the extent of the pH fall in the breast muscle were estimated at h2=0.21 +/- 0.04 and h2=0.16 +/- 0.04, respectively. Heritabilities ranging from 0.10 to 0.32 were obtained for the colour indicators in the breast muscle. A marked negative genetic correlation (rg=-0.80 +/- 0.10) was found between pH20 and lightness (L*) of breast meat, both traits corresponding to PSE indicators. The pH20 in the thigh muscle had a moderate heritability (h2=0.20 +/- 0.07) and was partially genetically related to pH20 in the breast muscle (rg=0.45 +/- 0.17). Body weight and breast yield were positively correlated with both initial and ultimate pH and negatively with the lightness of breast meat.
The attainment of a specific mature body size is one of the most fundamental differences among species of mammals. Moreover, body size seems to be the central factor underlying differences in traits such as growth rate, energy metabolism and body composition. An important proportion of this variability is of genetic origin. The goal of the genetic analysis of animal growth is to understand its "genetic architecture", that is the number and position of loci affecting the trait, the magnitude of their effects, allele frequencies and types of gene action. In this review, the different strategies developed to identify and characterize genes involved in the regulation of growth in the mouse are described, with emphasis on the methods developed to map loci contributing to the regulation of quantitative traits (QTLs).
Background Genomic selection is an appealing method to select purebreds for crossbred performance. In the case of crossbred records, single nucleotide polymorphism (SNP) effects can be estimated using an additive model or a breed-specific allele model. In most studies, additive gene action is assumed. However, dominance is the likely genetic basis of heterosis. Advantages of incorporating dominance in genomic selection were investigated in a two-way crossbreeding program for a trait with different magnitudes of dominance. Training was carried out only once in the simulation. Results When the dominance variance and heterosis were large and overdominance was present, a dominance model including both additive and dominance SNP effects gave substantially greater cumulative response to selection than the additive model. Extra response was the result of an increase in heterosis but at a cost of reduced purebred performance. When the dominance variance and heterosis were realistic but with overdominance, the advantage of the dominance model decreased but was still significant. When overdominance was absent, the dominance model was slightly favored over the additive model, but the difference in response between the models increased as the number of quantitative trait loci increased. This reveals the importance of exploiting dominance even in the absence of overdominance. When there was no dominance, response to selection for the dominance model was as high as for the additive model, indicating robustness of the dominance model. The breed-specific allele model was inferior to the dominance model in all cases and to the additive model except when the dominance variance and heterosis were large and with overdominance. However, the advantage of the dominance model over the breed-specific allele model may decrease as differences in linkage disequilibrium between the breeds increase. 
Retraining is expected to reduce the advantage of the dominance model over the alternatives, because in general, the advantage becomes important only after five or six generations post-training. Conclusion Under dominance and without retraining, genomic selection based on the dominance model is superior to the additive model and the breed-specific allele model to maximize crossbred performance through purebred selection.
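The difference between an additive model and a dominance model comes down to how each SNP genotype enters the regression. The abstract does not state the coding that was used, so the sketch below assumes one common parameterization: genotypes counted as 0, 1 or 2 copies of a reference allele, an additive covariate of -1/0/1 and a dominance covariate equal to 1 for heterozygotes. The function names and effect values are hypothetical.

```python
import numpy as np

def code_genotypes(genotypes):
    """Return additive and dominance covariates for genotypes coded 0, 1, 2."""
    g = np.asarray(genotypes)
    additive = g - 1                    # -1, 0, 1 for the three genotypes
    dominance = (g == 1).astype(int)    # 1 only for heterozygotes
    return additive, dominance

def genotypic_values(genotypes, a, d):
    """Sum additive (a) and dominance (d) SNP effects for each animal."""
    additive, dominance = code_genotypes(genotypes)
    return additive @ np.asarray(a) + dominance @ np.asarray(d)

# One animal, three SNPs; effect sizes are made up for illustration.
values = genotypic_values([[0, 1, 2]], a=[1.0, 2.0, 3.0], d=[0.5, 1.5, 0.25])
```

Setting every dominance effect to zero recovers the additive model, which is one way to see why the dominance model can be expected to be robust when there is no dominance.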
Linear marker order of the comprehensive map. The relative distance between the loci, indicated in cR(12'000), was estimated according to the RHMAXLIK multi-point analysis program option of the RHMAP package. The total map length is 2694.7 cR(12'000). The program option RH2PT was used to determine the retention frequencies in percent for all loci tested.
In this paper, we present a radiation hybrid framework map of BTA13 composed of nine microsatellite loci, six genes and one EST. The map has been developed using a recently constructed 12'000 rad bovine-hamster whole-genome radiation hybrid panel. Moreover, we present a comprehensive map of BTA13 comprising 72 loci, of which 45 are microsatellites, 20 are genes and seven are ESTs. The map has an estimated length of 2694.7 cR(12'000). The proposed order is in general agreement with published maps of BTA13. Our results only partially support previously published information of five blocks of conserved gene order between cattle and man. We found no evidence for the existence of an HSA20 homologous segment of coding DNA on BTA13 located centromeric of a confirmed HSA10 homologous region. The present map increases the marker density and the marker resolution on BTA13 and enables further insight into the evolutionary development of the chromosome as compared to man.
We present a gene-based RH map of the chicken microchromosome GGA14, known to have synteny conservations with human chromosomal regions HSA16p13.3 and HSA17p11.2. Microsatellite markers from the genetic map were used to check the validity of the RH map and additional markers were developed from chicken EST data to yield comparative mapping data. A high rate of intra-chromosomal rearrangements was detected by comparison to the assembled human sequence. Finally, the alignment of the RH map to the assembled chicken sequence showed a small number of discordances, most of which involved the same region of the chromosome spanning between 40.5 and 75.9 cR(6000) on the RH map.
Cryopreservation of three endangered Belgian sheep breeds required characterization of their intra-breed genetic diversity. It is assumed that the genetic structure of a livestock breed depends mostly on gene flow due to exchanges between herds. To quantify this relation, molecular data and analyses of the exchanges were combined for three endangered Belgian breeds. For each breed, between 91 and 225 sheep were genotyped with 19 microsatellites. Genetic differentiations between breeds and among herds within a breed were evaluated and the genetic structure of the breeds was described using Bayesian clustering (Structure). Exchanges of animals between 20, 46 and 95 herds according to breed were identified via semi-directed interviews and were analyzed using the concepts of network theory to calculate average degrees and shortest path lengths between herds. Correlation between the Reynolds' genetic distances and the shortest path lengths between each pair of herds was assessed by a Mantel test approach. Genetic differentiation between breeds was high (0.16). Overall Fst values among herds were high in each breed (0.17, 0.11 and 0.10). Use of the Bayesian approach made it possible to identify genetic groups of herds within a breed. Significant correlations between the shortest path lengths and the Reynolds' genetic distances were found in each breed (0.87, 0.33 and 0.41), which demonstrates the influence of exchanges between herds on the genetic diversity. Correlation differences between breeds could be explained by differences in the average degree of the animal exchange networks, which is a measure of the number of exchanges per herd. The two breeds with the highest average degree showed the lowest correlation. Information from the exchange networks was used to assign individuals to the genetic groups when molecular information was incomplete or missing to identify donors for a cryobank.
A fine-scale picture of the population genetic structure at the herd level was obtained for the three breeds. Network analysis made it possible to highlight the influence of exchanges on genetic structure and to complete or replace molecular information in establishing a conservation program.
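The two network measures named in the abstract, average degree and shortest path length, can be sketched as follows. This is a minimal illustration that assumes an undirected, unweighted exchange network; the abstract does not say whether direction or exchange volume was taken into account, and the herd labels and edges are hypothetical.

```python
from collections import deque

def average_degree(edges, herds):
    """Mean number of exchange links per herd in an undirected network."""
    degree = {h: 0 for h in herds}
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return sum(degree.values()) / len(herds)

def shortest_path_lengths(edges, herds, source):
    """Breadth-first search: number of links from source to each reachable herd."""
    neighbours = {h: set() for h in herds}
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    dist = {source: 0}
    queue = deque([source])
    while queue:
        current = queue.popleft()
        for nxt in neighbours[current]:
            if nxt not in dist:          # first visit gives the shortest distance
                dist[nxt] = dist[current] + 1
                queue.append(nxt)
    return dist

herds = ["A", "B", "C", "D"]                    # hypothetical herds
edges = [("A", "B"), ("B", "C"), ("C", "D")]    # hypothetical exchanges
paths_from_a = shortest_path_lengths(edges, herds, "A")
```

The pairwise path lengths obtained this way form the distance matrix that would then be compared with the genetic distance matrix in a Mantel test.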
Significance level of OPPV proviral load levels between PRNP genotypes
Number of sheep distributed among PRNP genotypes. Left = codon 171, Right = codon 143, y-axis = number of animals.
Odds ratio and 95% confidence interval for effect of PRNP genotype upon frequency of OPPV positive animals.
Adjusted mean log10 provirus levels and 95% confidence interval among PRNP genotypes used for statistical comparison.
Selective breeding of sheep for arginine (R) at prion gene (PRNP) codon 171 confers resistance to classical scrapie. However, other effects of 171R selection are uncertain. Ovine progressive pneumonia/Maedi-Visna virus (OPPV) may infect up to 66% of a flock; thus, any effect of 171R selection on OPPV susceptibility or disease progression could have a major impact on the sheep industry. Hypotheses that the PRNP 171R allele is 1) associated with the presence of OPPV provirus and 2) associated with higher provirus levels were tested in an Idaho ewe flock. OPPV provirus was found in 226 of 358 ewes by quantitative PCR. The frequency of ewes with detectable provirus did not differ significantly among the 171QQ, 171QR, and 171RR genotypes (p > 0.05). Also, OPPV provirus levels in infected ewes were not significantly different among codon 171 genotypes (p > 0.05). These results show that, in the flock examined, the presence of OPPV provirus and provirus levels are not related to the PRNP 171R allele. Therefore, a genetic approach to scrapie control is not expected to increase or decrease the number of OPPV infected sheep or the progression of disease. This study provides further support to the adoption of PRNP 171R selection as a scrapie control measure.
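An odds ratio with a 95% confidence interval, as mentioned in the figure caption above, can be computed from a 2 x 2 table of genotype class by infection status. The sketch below assumes the Wald interval on the log scale, which may not be the method the authors used, and the counts are hypothetical rather than taken from the study.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval for a 2 x 2 table.

    a, b: positive and negative animals in genotype class 1
    c, d: positive and negative animals in genotype class 2
    """
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)   # SE of log(odds ratio)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper

# Hypothetical counts, for illustration only.
estimate, lower, upper = odds_ratio_ci(10, 20, 5, 40)
```

An interval that contains 1 is consistent with no association between genotype and infection status at the chosen confidence level.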
Evolution of the annual number of animals karyotyped in the laboratory, and of the number of structural chromosomal rearrangements identified.
GTG-banding karyotype of a gilt carrier of the 13 / 17 Robertsonian translocation. NB: The rearrangement was initially identified in a boar; daughters were produced for experimental purposes. 
The chromosomal control of pig populations has been widely developed in France over the last ten years. By December 31st, 2006, 13,765 individuals had been karyotyped in our laboratory, 62% of these since 2002. Ninety percent were young purebred boars controlled before service in artificial insemination centres, and 3% were hypoprolific boars. So far, 102 constitutional structural chromosomal rearrangements (67 since 2002) have been described. Fifty-six were reciprocal translocations and 8 peri- or paracentric inversions. For the first time since the beginning of the programme and after more than 11,000 pigs had been karyotyped, one Robertsonian translocation was identified in 2005 and two others in 2006. The estimated prevalence of balanced structural chromosomal rearrangements in a sample of more than 7,700 young boars controlled before service was 0.47%. Twenty-one of the 67 rearrangements described since 2002 were identified in hypoprolific boars. All were reciprocal translocations. Twelve mosaics (XX/XY in 11 individuals, XY/XXY in one individual) were also diagnosed. Two corresponded to hypoprolific boars, and three to intersexed animals. The results presented in this communication would justify an intensification of the chromosomal control of French and, on a broader scale, European and North-American pig populations.
Contributing reviewers The Genetics Selection Evolution Editors-in-Chief would like to thank all of our reviewers who contributed to peer review for the journal in 2013.
Genomic selection has become a very important tool in animal genetics and is rapidly emerging in plant genetics. It holds the promise to be particularly beneficial to select for traits that are difficult or expensive to measure, such as traits that are measured in one environment and selected for in another environment. The objective of this paper was to develop three models that would permit multi-trait genomic selection by combining scarcely recorded traits with genetically correlated indicator traits, and to compare their performance to single-trait models, using simulated datasets. Three single nucleotide polymorphism (SNP) based models were used. Model G and BCπ0 assumed that contributed (co)variances of all SNP are equal. Model BSSVS sampled SNP effects from a distribution with large (or small) effects to model SNP that are (or are not) associated with a quantitative trait locus. For reasons of comparison, model A including pedigree but not SNP information was fitted as well. In terms of accuracies for animals without phenotypes, the models generally ranked as follows: BSSVS > BCπ0 > G >> A. Using multi-trait SNP-based models, the accuracy for juvenile animals without any phenotypes increased up to 0.10. For animals with phenotypes on an indicator trait only, accuracy increased up to 0.03 and 0.14, for genetic correlations with the evaluated trait of 0.25 and 0.75, respectively. When the indicator trait had a genetic correlation lower than 0.5 with the trait of interest in our simulated data, the accuracy was higher if genotypes rather than phenotypes were obtained for the indicator trait. However, when genetic correlations were higher than 0.5, using an indicator trait led to higher accuracies for selection candidates.
For different combinations of traits, the level of genetic correlation below which genotyping selection candidates is more effective than obtaining phenotypes for an indicator trait, needs to be derived considering at least the heritabilities and the numbers of animals recorded for the traits involved.
Three modes of genetic identity-by-descent between two outbred individuals at a single locus.
Features of the regression of genealogical coancestry f on molecular coancestry (fM) and molecular covariance (CovM)
Nine ways in which a pair of relatives can share genes identical by descent.
Features of the regression of genealogical coancestry f on estimators
Genetic relatedness or similarity between individuals is a key concept in population, quantitative and conservation genetics. When the pedigree of a population is available and assuming a founder population from which the genealogical records start, genetic relatedness between individuals can be estimated by the coancestry coefficient. If pedigree data is lacking or incomplete, estimation of the genetic similarity between individuals relies on molecular markers, using either molecular coancestry or molecular covariance. Some relationships between genealogical and molecular coancestries and covariances have already been described in the literature. We show how the expected values of the empirical measures of similarity based on molecular marker data are functions of the genealogical coancestry. From these formulas, it is easy to derive estimators of genealogical coancestry from molecular data. We include variation of allelic frequencies in the estimators. The estimators are illustrated with simulated examples and with a real dataset from dairy cattle. In general, estimators are accurate and only slightly biased. From the real data set, estimators based on covariances are more compatible with genealogical coancestries than those based on molecular coancestries. A frequently used estimator based on the average of estimated coancestries produced inflated coancestries and numerical instability. The consequences of unknown gene frequencies in the founder population are briefly discussed, along with alternatives to overcome this limitation. Estimators of genealogical coancestry based on molecular data are easy to derive. Estimators based on molecular covariance are more accurate than those based on identity by state. A correction considering the random distribution of allelic frequencies improves accuracy of these estimators, especially for populations with very strong drift.
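Molecular coancestry is usually defined as the probability that two alleles drawn at random, one from each individual, are identical by state. The sketch below follows that common definition and does not reproduce the estimators derived in the paper; the genotype encoding (one pair of allele labels per locus) and the example genotypes are assumptions.

```python
def molecular_coancestry(ind_a, ind_b):
    """Average identity-by-state coancestry over loci.

    Each individual is a list of (allele1, allele2) tuples, one per locus.
    """
    total = 0.0
    for (a1, a2), (b1, b2) in zip(ind_a, ind_b):
        # Compare each allele of one individual with each allele of the other.
        matches = (a1 == b1) + (a1 == b2) + (a2 == b1) + (a2 == b2)
        total += matches / 4
    return total / len(ind_a)

# Hypothetical two-locus genotypes.
x = [("A", "A"), ("A", "B")]
y = [("A", "B"), ("B", "B")]
f_xy = molecular_coancestry(x, y)
```

Because identity by state includes alleles that are alike without being identical by descent, this quantity is generally larger than the genealogical coancestry, which is why allele frequencies have to enter any estimator built on it.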
MHC class I and II molecules are immunoregulatory cell surface glycoproteins, which selectively bind to and present antigenic peptides to T-lymphocytes. Murine and human studies show that variable peptide binding affinity to MHC II molecules influences Th1/Th2 responses by inducing distinctive cytokine expression. To examine the biological effects of peptide binding affinity to bovine MHC (BoLA), various self peptides (BoLA-DQ and fibrinogen fragments) and non-self peptides from ovalbumin (OVA), as well as VP2 and VP4 peptides from foot and mouth disease virus (FMD-V) were used to (1) determine binding affinities to the BoLA-DRB3*2703 allele, previously associated with mastitis susceptibility and (2) determine whether peptide binding affinity influences T-lymphocyte function. Peptide binding affinity was determined by a competitive assay using high affinity biotinylated self-peptide incubated with purified BoLA-DRB3*2703 in the presence of various concentrations of competing peptides. The concentrations of non-self peptide required to inhibit self-peptide binding by 50% (IC50) were variable, ranging from 26.92 to > 320 microM. Peptide-specific T-lymphocyte function was determined by measuring DNA synthesis, cell division, and IFN-gamma production in cultures of mononuclear cells from a BoLA-DRB3*2703 homozygous cow. When compared to non-stimulated control cultures, differences in lymphocyte function were observed for all of the assessed parameters; however, peptide-binding affinity did not always account for the observed differences in lymphocyte function.
The prediction of identity by descent (IBD) probabilities is essential for all methods that map quantitative trait loci (QTL). The IBD probabilities may be predicted from marker genotypes and/or pedigree information. Here, a method is presented that predicts IBD probabilities at a given chromosomal location given data on a haplotype of markers spanning that position. The method is based on a simplification of the coalescence process, and assumes that the number of generations since the base population and the effective population size are known, although effective size may be estimated from the data. The probability that two gametes are IBD at a particular locus increases as the number of markers surrounding the locus with identical alleles increases. This effect is more pronounced when effective population size is high. Hence, as effective population size increases, the IBD probabilities become more sensitive to the marker data, which should favour finer scale mapping of the QTL. The IBD probability prediction method was developed for the situation where the pedigree of the animals was unknown (i.e. all information came from the marker genotypes), and the situation where, say T, generations of unknown pedigree are followed by some generations where pedigree and marker genotypes are known.
In order to study duck microsatellites, we constructed a library enriched for (CA)n, (CAG)n, (GCC)n and (TTTC)n. A total of 35 pairs of primers from these microsatellites were developed and used to detect polymorphisms in 31 unrelated Peking ducks. Twenty-eight loci were polymorphic and seven loci were monomorphic. A total of 117 alleles were observed from these polymorphic microsatellite markers, which ranged from 2 to 14 with an average of 4.18 per locus. The frequencies of the 117 alleles ranged from 0.02 to 0.98. The highest heterozygosity (0.97) was observed at the CAUD019 microsatellite locus and the lowest heterozygosity (0.04) at the CAUD008 locus, and 11 loci had heterozygosities greater than 0.50 (46.43%). The polymorphism information content (PIC) of 28 loci ranged from 0.04 to 0.88 with an average of 0.42. All the above markers were used to screen the polymorphism in other bird species. Two markers produced specific monomorphic products with the chicken DNA. Fourteen markers generated specific fragments with the goose DNA: 5 were polymorphic and 9 were monomorphic. But no specific product was detected with the peacock DNA. Based on sequence comparisons of the flanking sequence and repeat, we conclude that 2 chicken loci and 14 goose loci were true homologous loci of the duck loci. The microsatellite markers identified and characterized in the present study will contribute to the genetic map, quantitative traits mapping, and phylogenetic analysis in the duck and goose.
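Heterozygosity and polymorphism information content (PIC) are both functions of the allele frequencies at a locus. The abstract does not say whether the reported heterozygosities are observed or expected, so the sketch below assumes the expected value under Hardy-Weinberg proportions and uses the usual PIC formula, 1 - Σp_i² - ΣΣ 2p_i²p_j² with the double sum over i < j. The frequencies in the example are hypothetical.

```python
def expected_heterozygosity(freqs):
    """Expected heterozygosity: one minus the sum of squared allele frequencies."""
    return 1.0 - sum(p * p for p in freqs)

def pic(freqs):
    """Polymorphism information content for one locus."""
    homozygosity = sum(p * p for p in freqs)
    cross = 0.0
    for i in range(len(freqs)):
        for j in range(i + 1, len(freqs)):
            cross += 2 * freqs[i] ** 2 * freqs[j] ** 2   # uninformative matings
    return 1.0 - homozygosity - cross

# Hypothetical locus with two equally frequent alleles.
he_example = expected_heterozygosity([0.5, 0.5])
pic_example = pic([0.5, 0.5])
```

PIC is never larger than the expected heterozygosity, and the gap narrows as the number of alleles grows.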
Accuracy of selection, genetic variance, genetic level and inbreeding level for the basic scheme. Accuracy of selection (a), genetic variance (b), genetic level (c) and inbreeding level (d) for schemes when sib-testing was every year (EVERY-GEN; circles), every second year (EVERY-2GEN; triangles), only the first year (FIRST-GEN; cross) or the first three years (FIRST-3GEN; squares); Random selection (solid line); Ncand = 3000, h2 = 0.4, Nfamilies = 100, Ntested = 3000.
Accuracy of selection with different numbers of markers and sib-testing strategies. Accuracy of selection for schemes with 1000 (circles), 5000 (squares) or 10000 (triangles) markers when sib-testing was every year (EVERY-GEN; open) or only the first year (FIRST-GEN; filled); Ncand = 3000, h2 = 0.4, Nfamilies = 100, Ntested = 3000.
Genomic selection is a selection method where effects of dense genetic markers are first estimated in a test population and later used to predict breeding values of selection candidates. The aim of this paper was to investigate genetic gains, inbreeding and the accuracy of selection in a general genomic selection scheme for aquaculture, where the test population consists of sibs of the candidates. The selection scheme started after simulating 4000 generations in a Fisher-Wright population with a size of 1000 to create a founder population. The basic scheme had 3000 selection candidates, 3000 tested sibs of the candidates, 100 full-sib families, a trait heritability of 0.4 and a marker density of 0.5N(e)/M. Variants of this scheme were also analysed. The accuracy of selection in generation 5 was 0.823 for the basic scheme when the sib-testing was performed every generation. The accuracy was hardly reduced by selection, probably because the increased frequency of favourable alleles compensated for the Bulmer effect. When sib-testing was performed only in the first generation, in order to reduce costs, accuracy of selection in generation 5 dropped to 0.304, the main reduction occurring in the first generation. The genetic level in generation 5 was 6.35 sigma(a) when sib-testing was performed every generation, which was 72%, 12% and 9% higher than when sib-testing was performed only in the first generation, only in the first three generations or every second generation, respectively. A marker density above 0.5N(e)/M hardly increased accuracy of selection further. For the basic scheme, rates of inbreeding were reduced by 81% in these schemes compared to traditional selection schemes, due to within-family selection. Increasing the number of sibs to 6000 hardly affected the accuracy of selection, and increasing the number of candidates to 6000 increased genetic gain by 10%, mainly because of increased selection intensity. 
Various strategies were evaluated to reduce the amount of sib-testing and genotyping, but all resulted in loss of selection accuracy and thus of genetic gain. Rates of inbreeding were reduced by 81% in genomic selection schemes compared to traditional selection schemes for the parameters of the basic scheme, due to within-family selection.
Single marker %PCA (first two axes). The populations are labelled in their confidence ellipse (P = 0.95), within an envelope formed by the alleles (arrows). Figures are on the same scale as indicated by the mesh of the grid (d = 0.5). Eigenvalue percents are indicated for each axis. The colors are based on the most congruent differentiation in the reference scores.
Single marker coordinated %PCA (first two axes). The populations are labelled in their confidence ellipse (P = 0.95), within an envelope formed by the alleles (arrows). Figures are on the same scale as indicated by the mesh of the grid (d = 0.5). Variance percents are indicated for each axis. The colors are based on the most congruent differentiation in the reference scores.
Cohesion plots showing the differences between the reference typology (labels and arrows origin) and the coordinated single-marker analyses (normed scores) on the first two axes. The arrows represent the typological “mistakes” displayed by the markers. The longer an arrow is, the greater the mistake is. A common scale is used (d = 1) for all plots.
Diagrams of typological values components, in percentages, for the three reference structures, corresponding to (A) Africa-France separation (B) within Africa differentiation and (C) within France differentiation.
Working with weakly congruent markers means that consensus genetic structuring of populations requires methods explicitly devoted to this purpose. The method, which is presented here, belongs to the multivariate analyses. This method consists of different steps. First, single-marker analyses were performed using a version of principal component analysis, which is designed for allelic frequencies (%PCA). Drawing confidence ellipses around the population positions enhances %PCA plots. Second, a multiple co-inertia analysis (MCOA) was performed, which reveals the common features of single-marker analyses, builds a reference structure and makes it possible to compare single-marker structures with this reference through graphical tools. Finally, a typological value is provided for each marker. The typological value measures the efficiency of a marker to structure populations in the same way as other markers. In this study, we evaluate the interest and the efficiency of this method applied to a European and African bovine microsatellite data set. The typological value differs among markers, indicating that some markers are more efficient in displaying a consensus typology than others. Moreover, efficient markers in one collection of populations do not remain efficient in others. The number of markers used in a study is not a sufficient criterion to judge its reliability. "Quantity is not quality".
Chromosomal location of markers on chromosome 13 (CRIMAP estimations). 
In this study, the potential association of PrP genotypes with health and productive traits was investigated. Data were recorded on animals of the INRA 401 breed from the Bourges-La Sapinière INRA experimental farm. The population consisted of 30 rams and 852 ewes, which produced 1310 lambs. The animals were categorized into three PrP genotype classes: ARR homozygous, ARR heterozygous, and animals without any ARR allele. Two analyses differing in the approach considered were carried out. Firstly, the potential association of the PrP genotype with disease (Salmonella resistance) and production (wool and carcass) traits was studied. The data used included 1042, 1043 and 1013 genotyped animals for the Salmonella resistance, wool and carcass traits, respectively. The different traits were analyzed using an animal model, where the PrP genotype effect was included as a fixed effect. Association analyses do not indicate any evidence of an effect of PrP genotypes on traits studied in this breed. Secondly, a quantitative trait loci (QTL) detection approach using the PRNP gene as a marker was applied on ovine chromosome 13. Interval mapping was used. Evidence for one QTL affecting mean fiber diameter was found at 25 cM from the PRNP gene. However, a linkage between PRNP and this QTL does not imply unfavorable linkage disequilibrium for PRNP selection purposes.
Verification of analytic theory via MATLAB simulation.
The magnitude of selection-induced departures from Hardy-Weinberg proportions. Fsel is a function of allele frequency (p) and fitness dominance (k); negative values of Fsel indicate an excess of heterozygotes, while positive values of Fsel indicate a deficit of heterozygotes, the dashed line corresponds to Hardy-Weinberg proportions.
Sample size as a function of allele frequency and fitness dominance. Sample sizes (n) required to detect selection at a significance level of 0.05 and a power of 0.5 are plotted as a function of allele frequency and fitness dominance; scale on the y-axis is logarithmic; A) Weak selection (k = 0.99); B) Strong selection (k = 0.9); C) Unequal allele frequencies (p = 0.1 and q = 0.9); D) Equal allele frequencies (p = 0.5 and q = 0.5).
Viability selection influences the genotypic contexts of alleles and leads to quantifiable departures from Hardy-Weinberg proportions. One measure of these departures is Wright's inbreeding coefficient (F), where observed heterozygosity is compared with expected heterozygosity. Here, I extend population genetics theory to describe post-selection genotype frequencies in terms of post-selection allele frequencies and fitness dominance. The resulting equations correspond to non-equilibrium populations, allowing the following questions to be addressed: When selection is present, how large a sample size is needed to detect significant departures from Hardy-Weinberg? How do selection-induced departures from Hardy-Weinberg vary with allele frequencies and levels of fitness dominance? For realistic selection coefficients, large sample sizes are required and departures from Hardy-Weinberg proportions are small.
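Wright's inbreeding coefficient, as described above, compares observed with expected heterozygosity: F = 1 - H_obs / H_exp. A minimal sketch for a biallelic locus follows, with the expected heterozygosity taken as 2pq from the sample allele frequencies; the genotype counts are hypothetical.

```python
def wright_f(n_aa, n_ab, n_bb):
    """Wright's F from genotype counts at a biallelic locus."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)      # frequency of allele A in the sample
    h_obs = n_ab / n                     # observed heterozygosity
    h_exp = 2 * p * (1 - p)              # Hardy-Weinberg expectation
    return 1 - h_obs / h_exp

# Hypothetical counts: an excess and a deficit of heterozygotes.
f_excess = wright_f(20, 60, 20)
f_deficit = wright_f(30, 40, 30)
```

With these counts the first sample gives a negative F (heterozygote excess) and the second a positive F (heterozygote deficit), matching the sign convention in the figure caption.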
Distribution of EBVs for Australian Selection Index (ASI, a) and protein percentage (PPT, b), distribution of reliabilities of EBVs (c), and number of bulls within year of birth (d).
Partial least squares regression model optimization for Australian Selection Index using cross-validation. Shown is the mean prediction error (MSEP) in the training (MSEPtraining) data set, the average MSEP in the 5-fold cross-validation samples (MSEPCV), the proportion of EBV (VarEBV) and SNP variance (VarSNP) explained in the training data for models with an increasing number of latent components; the optimal prediction model includes the first 5 latent components, identified by the smallest MSEPCV.
Fit of models relating EBVs and predicted MBVs in the training data and in young bulls. To avoid cluttering predictions are plotted for a single fold of the cross-validation (CV) of the training data set and young bull cohort 1998; ASI: Australian Selection Index; PPT: protein percentage; FR-LS: fixed regression-least squares; RR-BLUP: random regression-BLUP; Bayes-R: Bayesian regression; SVR: support vector regression; PLSR: partial least squares regression.
Distribution of 7,372 SNP effects along the genome estimated by four methods. The right most 772 SNPs are unassigned to chromosomes; ASI: Australian Selection Index; PPT: protein percentage; FR-LS: fixed regression-least squares; RR-BLUP: random regression-BLUP; Bayes-R: Bayesian regression.
Genomic selection (GS) uses molecular breeding values (MBV) derived from dense markers across the entire genome for selection of young animals. The accuracy of MBV prediction is important for a successful application of GS. Recently, several methods have been proposed to estimate MBV. Initial simulation studies have shown that these methods can accurately predict MBV. In this study we compared the accuracies and possible bias of five different regression methods in an empirical application in dairy cattle. Genotypes of 7,372 SNP and highly accurate EBV of 1,945 dairy bulls were used to predict MBV for protein percentage (PPT) and a profit index (Australian Selection Index, ASI). Marker effects were estimated by least squares regression (FR-LS), Bayesian regression (Bayes-R), random regression best linear unbiased prediction (RR-BLUP), partial least squares regression (PLSR) and nonparametric support vector regression (SVR) in a training set of 1,239 bulls. Accuracy and bias of MBV prediction were calculated from cross-validation of the training set and tested against a test team of 706 young bulls. For both traits, FR-LS using a subset of SNP was significantly less accurate than all other methods, which used all SNP. Accuracies obtained by Bayes-R, RR-BLUP, PLSR and SVR were very similar for ASI (0.39-0.45) and for PPT (0.55-0.61). Overall, SVR gave the highest accuracy. All methods resulted in biased MBV predictions for ASI; for PPT, only RR-BLUP and SVR predictions were unbiased. A significant decrease in accuracy of prediction of ASI was seen in young test cohorts of bulls compared to the accuracy derived from cross-validation of the training set. This reduction was not apparent for PPT. Combining MBV predictions with pedigree-based predictions gave 1.05 to 1.34 times higher accuracies compared to predictions based on pedigree alone. The methods differ considerably in their computational requirements, with PLSR and RR-BLUP requiring the least computing time.
The four methods that use information from all SNP, namely RR-BLUP, Bayes-R, PLSR and SVR, generate similar accuracies of MBV prediction for genomic selection, and their use in the selection of immediate future generations in dairy cattle will be comparable. The use of FR-LS in genomic selection is not recommended.
Genomic prediction of breeding values involves a so-called training analysis that predicts the influence of small genomic regions by regression of observed information on marker genotypes for a given population of individuals. Available observations may take the form of individual phenotypes, repeated observations, records on close family members such as progeny, estimated breeding values (EBV) or their deregressed counterparts from genetic evaluations. The literature indicates that researchers are inconsistent in their approach to using EBV or deregressed data, and as to using the appropriate methods for weighting some data sources to account for heterogeneous variance. A logical approach to using information for genomic prediction is introduced, which demonstrates the appropriate weights for analyzing observations with heterogeneous variance and explains the need for and the manner in which EBV should have parent average effects removed, be deregressed and weighted. An appropriate deregression for genomic regression analyses is EBV/r2 where EBV excludes parent information and r2 is the reliability of that EBV. The appropriate weights for deregressed breeding values are neither the reliability nor the prediction error variance, two alternatives that have been used in published studies, but the ratio (1 - h2)/[(c + (1 - r2)/r2)h2] where c > 0 is the fraction of genetic variance not explained by markers. Phenotypic information on some individuals and deregressed data on others can be combined in genomic analyses using appropriate weighting.
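The deregression and weighting formulas stated in this abstract can be written out as a small sketch. The function and argument names are hypothetical; the arithmetic follows the two expressions given in the text, EBV/r2 and (1 - h2)/[(c + (1 - r2)/r2)h2], and assumes the EBV passed in already has parent average effects removed.

```python
def deregress_ebv(ebv_without_parent_info, r2):
    """Deregressed EBV: EBV / r2, where r2 is the reliability of that EBV."""
    if not 0.0 < r2 <= 1.0:
        raise ValueError("reliability r2 must be in (0, 1]")
    return ebv_without_parent_info / r2


def deregressed_weight(h2, r2, c):
    """Weight for a deregressed EBV: (1 - h2) / [(c + (1 - r2) / r2) * h2].

    h2 is the heritability, r2 the reliability, and c > 0 the fraction of
    genetic variance not explained by markers.
    """
    if not (0.0 < h2 < 1.0 and 0.0 < r2 <= 1.0 and c > 0.0):
        raise ValueError("expected 0 < h2 < 1, 0 < r2 <= 1 and c > 0")
    return (1.0 - h2) / ((c + (1.0 - r2) / r2) * h2)


if __name__ == "__main__":
    # Illustrative values only: h2 = 0.25, r2 = 0.5, c = 0.5.
    print(deregress_ebv(1.0, 0.5), deregressed_weight(0.25, 0.5, 0.5))
```

In this form the weight rises as the reliability r2 rises and falls as c increases, so records with little information, or markers that capture little genetic variance, count for less.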
Extinction of breeds threatens genetic diversity of livestock species. The need to conserve genetic diversity is widely accepted but involves in general two questions: (i) is the expected loss of diversity in a set of breeds within a defined future time horizon large enough to establish a conservation plan, and if so (ii) which breeds should be prioritised for such a conservation plan? The present study uses a marker assisted methodology to address these questions. The methodology combines core set diversity measures with a stochastic method for the estimation of expected future diversity and breed marginal diversities. The latter is defined as the change in the total diversity of all breeds caused by a one unit decrease in extinction probability of a particular breed. The stochastic method was validated by means of simulations. A large field data set consisting of 44 North Eurasian cattle breeds was analysed using simplified determined extinction probabilities. The results show that the expected loss of diversity in this set within the next 20 to 50 years is between 1 and 3% of the actual diversity, provided that the extinction probabilities which were used are approximately valid. If this loss is to be reduced, it is sufficient to include those three to five breeds with the highest marginal diversity in a conservation scheme.
cDNA clones from a pig granulosa cell cDNA library were isolated by differential hybridisation for follicle stimulating hormone (FSH) regulation in granulosa cells in a previous study. The clones that did not match any known sequence were studied for their expression in granulosa cells (treated or not with FSH) and in freshly isolated ovarian follicles, mainly by comparative RT-PCR analysis. These results give functional data on genes that may be implicated in follicular growth. These ESTs have been localised on the porcine genome, using a somatic cell hybrid panel, providing new type I markers on the porcine map and information on the comparative map between humans and pigs.
Association between the average number of alleles for each of the 52 populations and their mean genetic distance (MGD) from the remaining populations, based on the Cavalli-Sforza distance. The points are marked according to their chicken types.
In a project on the biodiversity of chickens funded by the European Commission (EC), eight laboratories collaborated to assess the genetic variation within and between 52 populations from a wide range of chicken types. Twenty-two di-nucleotide microsatellite markers were used to genotype DNA pools of 50 birds from each population. The polymorphism measures for the average, the least polymorphic population (inbred C line) and the most polymorphic population (Gallus gallus spadiceus) were, respectively, as follows: number of alleles per locus, per population: 3.5, 1.3 and 5.2; average gene diversity across markers: 0.47, 0.05 and 0.64; and proportion of polymorphic markers: 0.91, 0.25 and 1.0. These were in good agreement with the breeding history of the populations. For instance, unselected populations were found to be more polymorphic than selected breeds such as layers. Thus DNA pools are effective in the preliminary assessment of genetic variation of populations and markers. Mean genetic distance indicates the extent to which a given population shares its genetic diversity with that of the whole tested gene pool and is a useful criterion for conservation of diversity. The distribution of population-specific (private) alleles and the amount of genetic variation shared among populations supports the hypothesis that the red jungle fowl is the main progenitor of the domesticated chicken.
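As an illustration of one of the polymorphism measures reported above, gene diversity at a marker can be computed from estimated allele frequencies. The sketch assumes the usual definition of gene diversity as expected heterozygosity, one minus the sum of squared allele frequencies, with no small-sample correction; the abstract does not state which estimator was used, and the function names are hypothetical.

```python
def gene_diversity(allele_freqs):
    """Expected heterozygosity at one marker: 1 - sum of squared frequencies."""
    total = sum(allele_freqs)
    if abs(total - 1.0) > 1e-6:
        raise ValueError("allele frequencies must sum to 1")
    return 1.0 - sum(f * f for f in allele_freqs)


def average_gene_diversity(markers):
    """Mean gene diversity across markers for one population.

    markers is a list of per-marker allele-frequency lists, for example
    as estimated from a DNA pool.
    """
    return sum(gene_diversity(freqs) for freqs in markers) / len(markers)


if __name__ == "__main__":
    # Hypothetical frequencies: one polymorphic and one fixed marker.
    print(average_gene_diversity([[0.5, 0.5], [1.0]]))
```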
Phenotypic evolutions for the average daily gain in rabbits divergently selected for a high () or a low (•) body weight at 63 days, and in the control population (), (σ p = 4.89 g/d). 
The effects of selection for growth rate on weights and qualitative carcass and muscle traits were assessed by comparing two lines selected for live body weight at 63 days of age and a cryopreserved control population raised contemporaneously with generation 5 selected rabbits. The animals were divergently selected for five generations for either a high (H line) or a low (L line) body weight, based on their BLUP breeding value. Heritability (h2) was 0.22 for 63-d body weight (N = 4754). Growth performance and quantitative carcass traits in the C group were intermediate between the H and L lines (N = 390). Perirenal fat proportion (h2 = 0.64) and dressing out percentage (h2 = 0.55) ranked in the order L < H = C (from high to low). The weight and cross-sectional area of the Semitendinosus muscle, and the mean diameter of the constitutive myofibres were reduced in the L line only (N = 140). In the Longissimus muscle (N = 180), the ultimate pH (h2 = 0.16) and the maximum shear force reached in the Warner-Bratzler test (h2 = 0.57) were slightly modified by selection.
Chromosomal location of previously published and present QTL related to fatness. Three additional fatness QTL on GGA9 (AF9a), GGA11 (AFW7b) and GGA27 (AF9b) are not presented. AFx: abdominal fat weight adjusted for body weight at x weeks of age. AFWx: abdominal fat weight (raw data) at x weeks of age. SKx: skin weight adjusted to body weight at x weeks of age. The boxes encompass the confidence interval of the QTL; when it is unknown, a broken arrow replaces the box. Black boxes: the present results; striped boxes: fat-related trait adjusted for body weight; empty boxes: raw data. a Ikeobi et al. [8]; b Jennen et al. [10]; c McElroy et al. [21]; d Tatsuda et al. [29].
Quantitative trait loci (QTL) for abdominal fatness and breast muscle weight were investigated in a three-generation design performed by inter-crossing two experimental meat-type chicken lines that were divergently selected on abdominal fatness. A total of 585 F2 male offspring from 5 F1 sires and 38 F1 dams were recorded at 8 weeks of age for live body, abdominal fat and breast muscle weights. One hundred and twenty-nine microsatellite markers, evenly located throughout the genome and heterozygous for most of the F1 sires, were used for genotyping the F2 birds. In each sire family, those offspring exhibiting the most extreme values for each trait were genotyped. Multipoint QTL analyses using maximum likelihood methods were performed for abdominal fat and breast muscle weights, which were corrected for the effects of 8-week body weight, dam and hatching group. Isolated markers were assessed by analyses of variance. Two significant QTL were identified on chromosomes 1 and 5 with effects of about one within-family residual standard deviation. One breast muscle QTL was identified on GGA1 with an effect of 2.0 within-family residual standard deviations.
Genotype means, standard errors, P values and estimates of additive and dominance effects for SNP with significant trait associations
F-statistics and degrees of freedom for SNP-by-haplotype interaction term for a nested model of genotype effects
The purpose of this study was to evaluate the effects of eight single nucleotide polymorphisms (SNP), previously associated with meat and milk quality traits in cattle, in a population of 443 commercial Aberdeen Angus-cross beef cattle. The eight SNP, which were located within five genes: mu-calpain (CAPN1), calpastatin (CAST), leptin (LEP), growth hormone receptor (GHR) and acylCoA:diacylglycerol acyltransferase 1 (DGAT1), are included in various commercial tests for tenderness, fatness, carcass composition and milk yield/quality. A total of 27 traits were examined, 19 relating to carcass quality, such as carcass weight and fatness, one mechanical measure of tenderness, and the remaining seven were sensory traits, such as flavour and tenderness, assessed by a taste panel. An SNP in the CAPN1 gene, CAPN316, was significantly associated with tenderness measured by both the tenderometer and the taste panel as well as the weight of the hindquarter, where animals inheriting the CC genotype had more tender meat and heavier hindquarters. An SNP in the leptin gene, UASMS2, significantly affected overall liking, where animals with the TT genotype were assigned higher scores by the panellists. The SNP in the GHR gene was significantly associated with odour, where animals inheriting the AA genotype produced steaks with an intense odour when compared with the other genotypes. Finally, the SNP in the DGAT1 gene was associated with sirloin weight after maturation and fat depth surrounding the sirloin, with animals inheriting the AA genotype having heavier sirloins and more fat. The results of this study confirm some previously documented associations. Furthermore, novel associations have been identified which, following validation in other populations, could be incorporated into breeding programmes to improve meat quality.
Bacterial count per gram in the left pre-scapular node. 
An experimental population (1216 lambs from 30 sires) of the Inra401 sheep was created in an Inra flock to allow QTL detection for susceptibility to Salmonella infection, wool and carcass traits. The Inra401 is a sheep composite line developed from two breeds: Berrichon du Cher and Romanov. At 113 days of age on average, the lambs were inoculated intravenously with 10^8 Salmonella abortusovis Rv6 (vaccinal strain). They were slaughtered 10 days after the inoculation. Several traits were measured at inoculation and/or slaughtering to estimate the genetic resistance of the lambs to Salmonella infection: specific IgM and IgG1 antibody titres, body weight loss, spleen and pre-scapular node weights and counts of viable Salmonella persisting in these organs. This paper presents a quantitative analysis of the genetic variability of the traits related to salmonellosis susceptibility. The heritabilities of the traits varied between 0.10 and 0.64 (significantly different from zero). Thus, in sheep as well as in other species, the determinism of resistance to Salmonella infection is under genetic control. Moreover, the correlations between the traits are in agreement with the known immune mechanisms. The genetic variability observed should help QTL detection.
Schematic diagram of the host-parasite interaction model. Rectangular boxes indicate the fate of ingested protein, rounded boxes indicate host-parasite interactions and diamond boxes indicate key quantifiable parasite lifecycle stages. Dotted lines refer to the parasite lifecycle.
The wide range of genetic parameter estimates for production traits and nematode resistance in sheep obtained from field studies gives rise to much speculation. Using a mathematical model describing host-parasite interactions in a genetically heterogeneous lamb population, we investigated the consequence of: (i) genetic relationships between underlying growth and immunological traits on estimated genetic parameters for performance and nematode resistance, and (ii) alterations in resource allocation on these parameter estimates. Altering genetic correlations between underlying growth and immunological traits had large impacts on estimated genetic parameters for production and resistance traits. Extreme parameter values observed from field studies could only be reproduced by assuming genetic relationships between the underlying input traits. Altering preferences in the resource allocation had less pronounced effects on the genetic parameters for the same traits. Effects were stronger when allocation shifted towards growth, in which case worm burden and faecal egg counts increased and genetic correlations between these resistance traits and body weight became stronger. Our study has implications for the biological interpretation of field data, and for the prediction of selection response from breeding for nematode resistance. It demonstrates the profound impact that moderate levels of pleiotropy and linkage may have on observed genetic parameters, and hence on outcomes of selection for nematode resistance.
The efficiency of the French marker-assisted selection (MAS) was estimated by a simulation study. The data files of two different time periods were used: April 2004 and 2006. The simulation method used the structure of the existing French MAS: same pedigree, same marker genotypes and same animals with records. The program simulated breeding values and new records based on this existing structure and knowledge on the QTL used in MAS (variance and frequency). Reliabilities of genetic values of young animals (less than one year old) obtained with and without marker information were compared to assess the efficiency of MAS for evaluation of milk, fat and protein yields and fat and protein contents. Mean gains of reliability ranged from 0.015 to 0.094 and from 0.038 to 0.114 in 2004 and 2006, respectively. The larger number of animals genotyped and the use of a new set of genetic markers can explain the improvement of MAS reliability from 2004 to 2006. This improvement was also observed by analysis of information content for young candidates. The gain of MAS reliability with respect to classical selection was larger for sons of sires with genotyped progeny daughters with records. Finally, it was shown that when superiority of MAS over classical selection was estimated with daughter yield deviations obtained after progeny test instead of true breeding values, the gain was underestimated.
We investigated the joint evolution of neutral and selected genomic regions in three chicken lines selected for immune response and in one control line. We compared the evolution of polymorphism of 21 supposedly neutral microsatellite markers versus 30 microsatellite markers located in seven quantitative trait loci (QTL) regions. Divergence of lines was observed by factor analysis. Five supposedly neutral markers and 12 markers in the QTL regions showed Fst values greater than 0.15. However, the non-significant difference (P > 0.05) between matrices of genetic distances based on genotypes at supposedly neutral markers on the one hand, and at markers in QTL regions, on the other hand, showed that none of the markers in the QTL regions were influenced by selection. A supposedly neutral marker and a marker located in the QTL region on chromosome 14 showed temporal variations in allele frequencies that could not be explained by drift only. Finally, to confirm that markers located in QTL regions on chromosomes 1, 7 and 14 were under the influence of selection, simulations were performed using haplotype dropping along the existing pedigree. In the zone located on chromosome 14, the simulation results confirmed that selection had an effect on the evolution of polymorphism of markers within the zone.
High levels of androstenone and skatole in fat tissues are considered the primary causes of boar taint, an unpleasant odour and flavour of the meat from non-castrated male pigs. The aim of this article is to review our current knowledge of the biology and genetic control of the accumulation of androstenone and skatole in fat tissue. Two QTL mapping studies have shown the complexity of the genetic control of these traits. During the last ten years, several authors have taken a more physiological approach to investigate the involvement of genes controlling the metabolism of androstenone and skatole. Although some authors have claimed the identification of candidate genes, it is more appropriate to talk about target genes. This suggests that genes affecting androstenone and skatole levels will have to be sought for among specific or non-specific transcription factors interacting with these target genes.
The aim of this paper was to describe, and when possible compare, the multivariate methods used by the participants in the EADGENE WP1.4 workshop. The first approach was for class discovery and class prediction using evidence from the data at hand. Several teams used hierarchical clustering (HC) or principal component analysis (PCA) to identify groups of differentially expressed genes with a similar expression pattern over time points and infective agent (E. coli or S. aureus). The main result from these analyses was that HC and PCA were able to separate tissue samples taken at 24 h following E. coli infection from the other samples. The second approach identified groups of differentially co-expressed genes, by identifying clusters of genes highly correlated when animals were infected with E. coli but not correlated more than expected by chance when the infective pathogen was S. aureus. The third approach looked at differential expression of predefined gene sets. Gene sets were defined based on information retrieved from biological databases such as Gene Ontology. Based on these annotation sources the teams used either the GlobalTest or the Fisher exact test to identify differentially expressed gene sets. The main result from these analyses was that gene sets involved in immune defence responses were differentially expressed.
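The gene-set approach mentioned above can be sketched with a one-sided Fisher exact test for over-representation, which reduces to a hypergeometric tail probability. This is an illustrative standard-library implementation, not the code used by any workshop team; the function name, argument names and example counts are hypothetical.

```python
from math import comb


def gene_set_enrichment_p(n_total, n_in_set, n_de, n_overlap):
    """One-sided Fisher exact P-value for over-representation.

    n_total: genes on the array; n_in_set: genes annotated to the set;
    n_de: differentially expressed genes; n_overlap: DE genes in the set.
    Returns P(X >= n_overlap) under the hypergeometric null.
    """
    upper = min(n_in_set, n_de)
    denom = comb(n_total, n_de)
    tail = 0
    for k in range(n_overlap, upper + 1):
        # Ways to draw k set members and n_de - k non-members.
        tail += comb(n_in_set, k) * comb(n_total - n_in_set, n_de - k)
    return tail / denom


if __name__ == "__main__":
    # Hypothetical counts: 1000 genes, 50 in the set, 100 DE, 15 overlapping.
    print(gene_set_enrichment_p(1000, 50, 100, 15))
```

The P-values from many gene sets would normally be corrected for multiple testing before interpretation.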
The “fishtail” appearance of M-A plots for the raw data for slides 1–4. Lines are Loess curves for each of the 48 print-tips. Control spots were omitted. 
M-A plots after normalisation for ROSLIN team with a print-tip Loess correction for slides 1 to 4. 
Venn diagram for the lists of differentially expressed genes found for E. coli between times 0 and 24 h after infection at a 5% Benjamini and Hochberg threshold for IDL, INRA_J and PTP teams. Normalisation and analyses methods used by these teams are presented in Tables I and II.
Venn diagram for the 500 first differentially expressed genes found for E. coli between times 0 and 24 h after infection for IDL, INRA_J and PTP teams.
A large variety of methods has been proposed in the literature for microarray data analysis. The aim of this paper was to present techniques used by the EADGENE (European Animal Disease Genomics Network of Excellence) WP1.4 participants for data quality control, normalisation and statistical methods for the detection of differentially expressed genes in order to provide some more general data analysis guidelines. All the workshop participants were given a real data set obtained in an EADGENE funded microarray study looking at the gene expression changes following artificial infection with two different mastitis causing bacteria: Escherichia coli and Staphylococcus aureus. It was reassuring to see that most of the teams found the same main biological results. In fact, most of the differentially expressed genes were found for infection by E. coli between uninfected and 24 h challenged udder quarters. Very little transcriptional variation was observed for the bacteria S. aureus. Lists of differentially expressed genes found by the different research teams were, however, quite dependent on the method used, especially concerning the data quality control step. These analyses also emphasised a biological problem of cross-talk between infected and uninfected quarters which will have to be dealt with for further microarray studies.
Mean body weight (g ± standard error) of 10 rainbow trout clones (1–10) fed ad libitum then submitted to two periods of feed deprivation each followed by periods of re-feeding. The bold line represents the population mean body weight. The dotted line represents the expected population mean body weight if fish are not submitted to feed deprivation. 'A' corresponds to the first experimental period, i.e. when the genetic variability of residual feed intake is estimated. 'B' corresponds to the second experimental period, i.e. when the indirect criteria are tested.  
Correlations, for 10 rainbow trout clones (1–10), between residual feed intake (RFI) and different indirect criteria: A = weighted criteria; B = cumulative feed intake (CI). Weighted indirect criteria correspond to the sum of all types of G (growth rates) corrected by the weighting coefficients (see Tab. IV). Each square represents a clone.  
Little is known about the genetic basis of residual feed intake (RFI) variation in fish, since this trait is highly sensitive to environmental influences, and feed intake of individuals is difficult to measure accurately. The purpose of this work was (i) to assess the genetic variability of RFI estimated by an X-ray technique and (ii) to develop predictive criteria for RFI. Two predictive criteria were tested: loss of body weight during feed deprivation and compensatory growth during re-feeding. Ten heterozygous rainbow trout clones were used. Individual intake and body weight were measured three times at three-week intervals. Then, individual body weight was recorded after two cycles of a three-week feed deprivation followed by a three-week re-feeding. The ratio of the genetic variance to the phenotypic variance was found to be high to moderate for growth, feed intake, and RFI (VG/VP = 0.63 ± 0.11, 0.29 ± 0.11 and 0.29 ± 0.09, respectively). The index that integrates performances achieved during deprivation and re-feeding periods explained 59% of RFI variations. These results provide a basis for further studies on the origin of RFI differences and show that indirect criteria are good candidates for future selective breeding programs.
Example background plots. The top two images show the background for Cy5 and Cy3 in slide 9, and the bottom two images show the same for slide 10. 
MA-plots of slides 1, 5 and 6. These slides are examples of the three patterns displayed by the simulated data in the MA-space: positive correlation, negative correlation and a more pronounced non-linear correlation. 
Boxplots of M values (log 2 (cy5 / cy3)) across the 10 arrays for three nor- 
Microarrays allow researchers to measure the expression of thousands of genes in a single experiment. Before statistical comparisons can be made, the data must be assessed for quality and normalisation procedures must be applied, of which many have been proposed. Methods of comparing the normalised data are also abundant, and no clear consensus has yet been reached. The purpose of this paper was to compare those methods used by the EADGENE network on a very noisy simulated data set. With the a priori knowledge of which genes are differentially expressed, it is possible to compare the success of each approach quantitatively. Use of an intensity-dependent normalisation procedure was common, as was correction for multiple testing. Most variety in performance resulted from differing approaches to data quality and the use of different statistical tests. Very few of the methods used any kind of background correction. A number of approaches achieved a success rate of 95% or above, with relatively small numbers of false positives and negatives. Applying stringent spot selection criteria and elimination of data did not improve the false positive rate and greatly increased the false negative rate. However, most approaches performed well, and it is encouraging that widely available techniques can achieve such good results on a very noisy data set.
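Correction for multiple testing, mentioned above as a common step, can be illustrated with the Benjamini and Hochberg step-up procedure. The abstract does not say which correction each team applied, so this choice is an assumption made for illustration, and the function name is hypothetical.

```python
def benjamini_hochberg(pvalues, alpha=0.05):
    """Return a list of booleans: True where the hypothesis is rejected.

    Implements the step-up rule: find the largest rank k such that the
    k-th smallest P-value is <= k * alpha / m, then reject the k smallest.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank * alpha / m:
            max_rank = rank
    rejected = [False] * m
    for idx in order[:max_rank]:
        rejected[idx] = True
    return rejected


if __name__ == "__main__":
    # Hypothetical P-values for four genes.
    print(benjamini_hochberg([0.01, 0.04, 0.03, 0.5]))
```

On a simulated data set where the truly differentially expressed genes are known, the rejected list can be compared directly with the truth to count false positives and false negatives.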
Haplotype relationships indicate two central alleles. Each numbered circle represents a haplotype at the amino acid level. The frequency of each haplotype is depicted by circle size. Lines connect haplotypes related by a single amino acid substitution, and loops formed by haplotype connections may indicate historical recombination events. The two haplotypes represented by black circles are related to all the remaining haplotypes by single amino acid substitutions. 
Scrapie eradication efforts cost 18 million dollars annually in the United States and rely heavily upon PRNP genotyping of sheep. Genetic resistance might reduce goat scrapie and limit the risk of goats serving as a scrapie reservoir, so PRNP coding sequences were examined from 446 goats of 10 breeds, 8 of which had not been previously examined at PRNP. The 10 observed alleles were all related to one of two central haplotypes by a single amino acid substitution. At least five of these alleles (M142, R143, S146, H154, and K222) have been associated with increased incubation time or decreased odds of scrapie. To the best of our knowledge, neither S146 nor K222 has been found in any goats with scrapie, though further evaluation will be required to demonstrate true resistance. S146 was more common, present in several breeds at widely varying frequencies, while K222 was observed only in two dairy breeds at low frequency. Overall, this study provides frequency data on PRNP alleles in US goats, shows the pattern of relationships between haplotypes, and demonstrates segregation of multiple scrapie-associated alleles in several breeds not examined before at PRNP.
Approximate power of the design for the detection of QTL for production traits and reproduction traits, as a function of the QTL substitution effect. R_0.05 and R_0.01 are, respectively, the power for a 0.1 heritability reproduction trait considering a 5% or a 1% type I error. P_0.05 and P_0.01 are, respectively, the power for a 0.4 heritability production trait considering a 5% or a 1% type I error. 
A genome-wide scan was performed in Large White and French Landrace pig populations in order to identify QTL affecting reproduction and production traits. The experiment was based on a granddaughter design, including five Large White and three French Landrace half-sib families identified in the French porcine national database. A total of 239 animals (166 sons and 73 daughters of the eight male founders) distributed in eight families were genotyped for 144 microsatellite markers. The design included 51 262 animals recorded for production traits, and 53 205 litter size records were considered. Three production and three reproduction traits were analysed: average backfat thickness (US_M) and live weight (LWGT) at the end of the on-farm test, age of candidates adjusted at 100 kg live weight, total number of piglets born per litter, and numbers of stillborn (STILLp) and born alive (LIVp) piglets per litter. Ten QTL with medium to large effects were detected at a chromosome-wide significance level of 5% affecting traits US_M (on SSC2, SSC3 and SSC17), LWGT (on SSC4), STILLp (on SSC6, SSC11 and SSC14) and LIVp (on SSC7, SSC16 and SSC18). The number of heterozygous male founders varied from 1 to 3 depending on the QTL.
Differences between simulated and estimated values for the means of the distributions for healthy (plain bar) and infected (open bar) cows as a function of the proportion of infected cows.  
Sensitivity (plain bar), specificity (open bar), and probability of correct classification (slash bar) as a function of the proportion of E. coli among infected cows.  
A mixed hidden Markov model (HMM) was developed for predicting breeding values of a biomarker (here, somatic cell score) and the individual probabilities of health and disease (here, mastitis) based upon the measurements of the biomarker. At a first level, the unobserved disease process (Markov model) was introduced and at a second level, the measurement process was modeled, making the link between the unobserved disease states and the observed biomarker values. This hierarchical formulation allows joint estimation of the parameters of both processes. The flexibility of this approach is illustrated using simulated data. Firstly, lactation curves for the biomarker were generated based upon published parameters (mean, variance, and probabilities of infection) for cows with known clinical conditions (health or mastitis due to Escherichia coli or Staphylococcus aureus). Next, estimation of the parameters was performed via Gibbs sampling, assuming the health status was unknown. Results from the simulations and mathematics show that the mixed HMM is appropriate to estimate the quantities of interest, although the accuracy of the estimates is moderate when the prevalence of the disease is low. The paper ends with some indications for further developments of the methodology.
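The two-level structure described above (a hidden disease process plus a measurement process) can be illustrated with a much simpler sketch than the paper's model. It assumes two hidden states (healthy and infected), Gaussian biomarker measurements, and parameters that are already known, and it computes filtered infection probabilities with the forward recursion. The paper instead estimates parameters jointly by Gibbs sampling and includes breeding values; none of that is reproduced here, and all names and numbers are hypothetical.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and sd."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def filtered_infection_probability(observations, init, trans, means, sds):
    """P(infected at time t | measurements up to t) for each t.

    States: 0 = healthy, 1 = infected. init holds the initial state
    probabilities, trans[i][j] the probability of moving from i to j,
    and means/sds the per-state measurement distribution.
    """
    probabilities = []
    prior = list(init)
    for y in observations:
        # Measurement level: weight each state by its likelihood.
        joint = [prior[s] * normal_pdf(y, means[s], sds[s]) for s in (0, 1)]
        total = joint[0] + joint[1]
        posterior = [j / total for j in joint]
        probabilities.append(posterior[1])
        # Disease level: propagate through the Markov transition matrix.
        prior = [posterior[0] * trans[0][s] + posterior[1] * trans[1][s]
                 for s in (0, 1)]
    return probabilities


if __name__ == "__main__":
    # Hypothetical somatic cell score series and parameters.
    scs = [2.0, 2.5, 5.5, 6.0, 3.0]
    print(filtered_infection_probability(
        scs, init=[0.9, 0.1], trans=[[0.9, 0.1], [0.3, 0.7]],
        means=[2.5, 5.5], sds=[1.0, 1.0]))
```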
Surface representing the space of solutions for the constraint $c^{T} A c / 2 \le F^{*}$
Pedigree structure for Example A. Values in brackets are the estimated breeding values of the animals; relationships among them were calculated from the pedigree shown.
Contour plot showing the space of feasible solutions for Example A. The x- and y-axes are the contributions of sires 1 and 2, respectively; the contribution of sire 3 is 0.5 - x - y. The shaded area is the space of feasible solutions. The ellipse is the contour line for F = 0.29, along which RSRO searches for the solution. The solid triangle (▲) is the solution at the first iteration of RSRO. The solid circle (•) is the optimum solution.
Pedigree structure for Example B. Values in brackets are the estimated breeding values of the animals; relationships among them were calculated from the pedigree shown.
An approach for optimising genetic contributions of candidates to control inbreeding in the offspring generation using semidefinite programming (SDP) was proposed. Formulations were derived for maximising genetic gain while restricting inbreeding to a preset value and for minimising inbreeding without regard to gain. Adaptations to account for candidates with fixed contributions were also shown. Using small but traceable numerical examples, the SDP method was compared with an alternative based upon Lagrangian multipliers (RSRO). The SDP method always found the optimum solution that maximises genetic gain at any level of restriction imposed on inbreeding, unlike RSRO, which failed to do so in several situations. In these situations, the expected gains from the solution obtained with RSRO were 1.5 to 9% lower than those expected from the optimum solution found with SDP, and the assigned contributions differed widely. In conclusion, SDP is a reliable and flexible method for solving contribution problems.
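The gain-maximising problem described above can be written out as a short sketch. The inbreeding constraint $c^{T} A c / 2 \le F^{*}$ and the total sire contribution of 0.5 come from the figure captions; the remaining symbols are assumed notation that the abstract does not give: $g$ is the vector of estimated breeding values, $A$ the additive relationship matrix, and $s$ and $d$ are 0/1 indicator vectors marking sires and dams.

```latex
\begin{align}
\max_{c}\quad & c^{T} g \\
\text{subject to}\quad & c^{T} A c / 2 \le F^{*}, \\
& s^{T} c = 0.5, \qquad d^{T} c = 0.5, \\
& c \ge 0 .
\end{align}

% One standard way to express the quadratic constraint as a linear matrix
% inequality (Schur complement, valid when A is positive definite):
\begin{equation}
c^{T} A c \le 2F^{*}
\quad\Longleftrightarrow\quad
\begin{pmatrix} A^{-1} & c \\ c^{T} & 2F^{*} \end{pmatrix} \succeq 0 .
\end{equation}
```

The second display shows why the problem fits a semidefinite programming solver: the objective and all constraints become linear in $c$ once the quadratic restriction is replaced by the matrix inequality. This is a generic reformulation and may differ in detail from the one used in the paper.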
Microarray analyses have become an important tool in animal genomics. While their use is becoming widespread, there is still a lot of ongoing research regarding the analysis of microarray data. In the context of a European Network of Excellence, 31 researchers representing 14 research groups from 10 countries performed and discussed the statistical analyses of real and simulated 2-colour microarray data that were distributed among participants. The real data consisted of 48 microarrays from a disease challenge experiment in dairy cattle, while the simulated data consisted of 10 microarrays from a direct comparison of two treatments (dye-balanced). While there was broader agreement with regard to methods of microarray normalisation and significance testing, there were major differences with regard to quality control. The quality control approaches varied from none, through using statistical weights, to omitting a large number of spots or omitting entire slides. Surprisingly, these very different approaches gave quite similar results when applied to the simulated data, although not all participating groups analysed both real and simulated data. The workshop was very successful in facilitating interaction between scientists with diverse backgrounds but a common interest in microarray analyses.
Parameter expanded and standard expectation maximisation algorithms are described for reduced rank estimation of covariance matrices by restricted maximum likelihood, fitting the leading principal components only. Convergence behaviour of these algorithms is examined for several examples and contrasted to that of the average information algorithm, and implications for practical analyses are discussed. It is shown that expectation maximisation type algorithms are readily adapted to reduced rank estimation and converge reliably. However, as is well known for the full rank case, the convergence is linear and thus slow. Hence, these algorithms are most useful in combination with the quadratically convergent average information algorithm, in particular in the initial stages of an iterative solution scheme.
The effects of additive, dominance, additive by dominance, additive by additive and dominance by dominance genetic effects on age at first service, non-return rates and interval from calving to first service were estimated. Practical considerations of computing additive and dominance relationships using the genomic relationship matrix are discussed. The final strategy utilized several groups of 1000 animals (heifers or cows) in which all animals had a non-zero dominance relationship with at least one other animal in the group. Direct inversion of relationship matrices was possible within the 1000 animal subsets. Estimates of variances were obtained using Bayesian methodology via Gibbs sampling. Estimated non-additive genetic variances were as large as or larger than the additive genetic variance in most cases, except for non-return rates and interval from calving to first service for cows. Non-additive genetic effects appear to be of sizeable magnitude for fertility traits and should be included in models intended for estimating additive genetic merit. However, computing additive and dominance relationships for all possible pairs of individuals is very time consuming in populations of more than 200 000 animals.
Size of the cohorts according to different levels of heritability and genomic selection intensity
Number and type of performances available in BLUP evaluations for the four tested scenarios
Quality of BLUP evaluations with or without accounting for pre-selection in the cohort of daughters of the selected young sires
Effect of accuracy of genomic evaluations on BLUP evaluations for foot angle in the cohort of selected young sires
In future Best Linear Unbiased Prediction (BLUP) evaluations of dairy cattle, genomic selection of young sires will cause evaluation biases and loss of accuracy once the selected ones get progeny. To avoid such bias in the estimation of breeding values, we propose to include information on all genotyped bulls, including the culled ones, in BLUP evaluations. Estimated breeding values based on genomic information were converted into genomic pseudo-performances and then analyzed simultaneously with actual performances. Using simulations based on actual data from the French Holstein population, bias and accuracy of BLUP evaluations were computed for young sires undergoing progeny testing or genomic pre-selection. For bulls pre-selected based on their genomic profile, three different types of information can be included in the BLUP evaluations: (1) data from pre-selected genotyped candidate bulls with actual performances on their daughters, (2) data from bulls with both actual and genomic pseudo-performances, or (3) data from all the genotyped candidates with genomic pseudo-performances. The effects of different levels of heritability, genomic pre-selection intensity and accuracy of genomic evaluation were considered. Including information from all the genotyped candidates, i.e. genomic pseudo-performances for both selected and culled candidates, removed bias from genetic evaluation and increased accuracy. This approach was effective regardless of the magnitude of the initial bias and as long as the accuracy of the genomic evaluations was sufficiently high. The proposed method can be easily and quickly implemented in BLUP evaluations at the national level, although some improvement is necessary to more accurately propagate genomic information from genotyped to non-genotyped animals. In addition, it is a convenient method to combine direct genomic, phenotypic and pedigree-based information in a multiple-step procedure.
Genomic selection is a recently developed technology that is beginning to revolutionize animal breeding. The objective of this study was to estimate marker effects to derive prediction equations for direct genomic values for 16 routinely recorded traits of American Angus beef cattle and quantify corresponding accuracies of prediction. Deregressed estimated breeding values were used as observations in a weighted analysis to derive direct genomic values for 3570 sires genotyped using the Illumina BovineSNP50 BeadChip. These bulls were clustered into five groups using K-means clustering on pedigree estimates of additive genetic relationships between animals, with the aim of increasing within-group and decreasing between-group relationships. All five combinations of four groups were used for model training, with cross-validation performed in the group not used in training. Bivariate animal models were used for each trait to estimate the genetic correlation between deregressed estimated breeding values and direct genomic values. Accuracies of direct genomic values ranged from 0.22 to 0.69 for the studied traits, with an average of 0.44. Predictions were more accurate when animals within the validation group were more closely related to animals in the training set. When training and validation sets were formed by random allocation, the accuracies of direct genomic values ranged from 0.38 to 0.85, with an average of 0.65, reflecting the greater relationship between animals in training and validation. The accuracies of direct genomic values obtained from training on older animals and validating in younger animals were intermediate to the accuracies obtained from K-means clustering and random clustering for most traits. The genetic correlation between deregressed estimated breeding values and direct genomic values ranged from 0.15 to 0.80 for the traits studied. 
These results suggest that genomic estimates of genetic merit can be produced in beef cattle at a young age but the recurrent inclusion of genotyped sires in retraining analyses will be necessary to routinely produce for the industry the direct genomic values with the highest accuracy.
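The cross-validation scheme described above (train on four groups, validate in the fifth, for all five combinations) can be sketched as follows. The sketch assumes that group membership is already available, for example from K-means clustering; the function name and the sire identifiers are hypothetical.

```python
from itertools import combinations


def cross_validation_folds(groups):
    """Yield (training_animals, validation_animals) pairs.

    groups maps a group label to the list of animals in that group.
    Each fold trains on all groups but one and validates in the
    group that was left out.
    """
    labels = sorted(groups)
    for train_labels in combinations(labels, len(labels) - 1):
        # Exactly one label is missing from each combination.
        (held_out,) = set(labels) - set(train_labels)
        training = [animal for g in train_labels for animal in groups[g]]
        yield training, list(groups[held_out])


# Hypothetical example: five groups of three sires each.
example_groups = {g: [f"sire_{g}_{i}" for i in range(3)] for g in range(1, 6)}
folds = list(cross_validation_folds(example_groups))
```

With five groups this produces five folds, and no animal appears in both the training and validation set of the same fold.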
Parameters of phenotypic traits 
Heritabilities (upper) and standard errors (lower) of GEBV 
Correlation|standard error between the direct additive genetic component of the phenotypic trait and GEBV from different prediction equations for Australian Brahman cattle 
The major obstacles for the implementation of genomic selection in Australian beef cattle are the variety of breeds and in general, small numbers of genotyped and phenotyped individuals per breed. The Australian Beef Cooperative Research Center (Beef CRC) investigated these issues by deriving genomic prediction equations (PE) from a training set of animals that covers a range of breeds and crosses including Angus, Murray Grey, Shorthorn, Hereford, Brahman, Belmont Red, Santa Gertrudis and Tropical Composite. This paper presents accuracies of genomically estimated breeding values (GEBV) that were calculated from these PE in the commercial pure-breed beef cattle seed stock sector. PE derived by the Beef CRC from multi-breed and pure-breed training populations were applied to genotyped Angus, Limousin and Brahman sires and young animals, but with no pure-breed Limousin in the training population. The accuracy of the resulting GEBV was assessed by their genetic correlation to their phenotypic target trait in a bi-variate REML approach that models GEBV as trait observations. Accuracies of most GEBV for Angus and Brahman were between 0.1 and 0.4, with accuracies for abattoir carcass traits generally greater than for live animal body composition traits and reproduction traits. Estimated accuracies greater than 0.5 were only observed for Brahman abattoir carcass traits and for Angus carcass rib fat. Averaged across traits within breeds, accuracies of GEBV were highest when PE from the pooled across-breed training population were used. However, for the Angus and Brahman breeds the difference in accuracy from using pure-breed PE was small. For the Limousin breed no reasonable results could be achieved for any trait. Although accuracies were generally low compared to published accuracies estimated within breeds, they are in line with those derived in other multi-breed populations. 
Thus PE developed by the Beef CRC can contribute to the implementation of genomic selection in Australian beef cattle breeding.
Description of SNP panels for chromosome 1
Accuracy of imputation for twelve genotyping scenarios
Accuracy of imputation for genotyping scenarios when removing subsets of individuals from the "Other" category
Accuracy and costs of imputation for different genotyping scenarios
Summary of informative SNP
Commercial breeding programs seek to maximise the rate of genetic gain while minimising the costs of attaining that gain. Genomic information offers great potential to increase rates of genetic gain but it is expensive to generate. Low-cost genotyping strategies combined with genotype imputation offer dramatically reduced costs. However, both the costs and accuracy of imputation of these strategies are highly sensitive to several factors. The objective of this paper was to explore the cost and imputation accuracy of several alternative genotyping strategies in pedigreed populations. Pedigree and genotype data from a commercial pig population were used. Several alternative genotyping strategies were explored. The strategies differed in the density of genotypes used for the ancestors and the individuals to be imputed. Parents, grandparents, and other relatives that were not descendants were genotyped at high-density, low-density, or extremely low-density, and associated costs and imputation accuracies were evaluated. Imputation accuracy and cost were influenced by the alternative genotyping strategies. Given the mating ratios and the numbers of offspring produced by males and females, an optimized low-cost genotyping strategy for a commercial pig population could involve genotyping male parents at high-density, female parents at low-density (e.g. 3000 SNP), and selection candidates at very low-density (384 SNP). Among the selection candidates, 95.5% and 93.5% of the genotype variation contained in the high-density SNP panels were recovered using genotyping strategies that cost $24.74 and $20.58 per candidate, respectively.
Background: Genomic predictions can be applied early in life without impacting selection candidates. This is especially useful for meat quality traits in sheep. Carcass and novel meat quality traits were predicted in a multi-breed sheep population that included Merino, Border Leicester, Polled Dorset and White Suffolk sheep and their crosses. Methods: Prediction of breeding values by best linear unbiased prediction (BLUP) based on pedigree information was compared to prediction based on genomic BLUP (GBLUP) and a Bayesian prediction method (BayesR). Cross-validation of predictions across sire families was used to evaluate the accuracy of predictions based on the correlation of predicted and observed values, and the regression of observed on predicted values was used to evaluate bias of methods. Accuracies and regression coefficients were calculated using either phenotypes or adjusted phenotypes as observed variables. Results and conclusions: Genomic methods increased the accuracy of predicted breeding values to an average of 0.2 across traits (range 0.07 to 0.31), compared to an average accuracy of 0.09 for pedigree-based BLUP. However, for some traits with smaller reference population size, there was no increase in accuracy or it was small. No clear differences in accuracy were observed between GBLUP and BayesR. The regression of phenotypes on breeding values was close to 1 for all methods, indicating little bias, except for GBLUP and adjusted phenotypes (regression = 0.78). Accuracies calculated with adjusted (for fixed effects) phenotypes were less variable than accuracies based on unadjusted phenotypes, indicating that fixed effects influence the latter. Increasing the reference population size increased accuracy, indicating that adding more records will be beneficial. Accuracies were greater for the Merino, Polled Dorset and White Suffolk breeds than for the Border Leicester breed, which had a smaller sample size and for which across-breed prediction was limited.
BayesR detected only a few large marker effects but one region on chromosome 6 was associated with large effects for several traits. Cross-validation produced very similar variability of accuracy and regression coefficients for BLUP, GBLUP and BayesR, showing that this variability is not a property of genomic methods alone. Our results show that genomic selection for novel difficult-to-measure traits is a feasible strategy to achieve increased genetic gain.
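The two evaluation criteria named above, accuracy as the correlation of predicted and observed values and bias as the regression of observed on predicted values, can be computed as in the following minimal sketch. The function name and the example numbers are hypothetical and are not data from the study.

```python
import math


def accuracy_and_bias(predicted, observed):
    """Return (correlation, regression slope of observed on predicted).

    A slope close to 1 indicates little bias in the scale of the
    predictions; the correlation measures prediction accuracy.
    """
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    # Sums of squares and cross-products around the means.
    ss_p = sum((p - mean_p) ** 2 for p in predicted)
    ss_o = sum((o - mean_o) ** 2 for o in observed)
    sp_po = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    correlation = sp_po / math.sqrt(ss_p * ss_o)
    slope = sp_po / ss_p
    return correlation, slope


# Hypothetical predicted breeding values and (adjusted) phenotypes
# for one validation sire family.
corr, slope = accuracy_and_bias(
    predicted=[0.1, -0.3, 0.5, 0.0, 0.2],
    observed=[0.4, -0.9, 1.1, -0.2, 0.6],
)
```

In a cross-validation across sire families, these two statistics would be computed within each validation family and then summarised across families.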
The purpose of this work was to study the impact of both the size of genomic reference populations and the inclusion of a residual polygenic effect on dairy cattle genetic evaluations enhanced with genomic information. Direct genomic values were estimated for German Holstein cattle with a genomic BLUP model including a residual polygenic effect. A total of 17,429 genotyped Holstein bulls were evaluated using the phenotypes of 44 traits. The Interbull genomic validation test was implemented to investigate how the inclusion of a residual polygenic effect impacted genomic estimated breeding values. As the number of reference bulls increased, both the variance of the estimates of single nucleotide polymorphism effects and the reliability of the direct genomic values of selection candidates increased. Fitting a residual polygenic effect in the model resulted in less biased genome-enhanced breeding values and decreased the correlation between direct genomic values and estimated breeding values of sires in the reference population. Enhancing genetic evaluation of dairy cattle with genomic information is highly effective in increasing reliability, as is the use of large genomic reference populations. We found that fitting a residual polygenic effect reduced the bias in genome-enhanced breeding values, decreased the correlation between direct genomic values and sires' estimated breeding values, and made genome-enhanced breeding values more consistent in mean and variance, as is the case for pedigree-based estimated breeding values.