Hans-Peter Piepho

Hohenheim University, Stuttgart, Baden-Württemberg, Germany

Publications (107) · 245 Total impact

  •
    ABSTRACT: The calibration data for genomic prediction should represent the full genetic spectrum of a breeding program. Data heterogeneity is minimized by connecting data sources through highly related test units. One of the major challenges of genome-enabled prediction in plant breeding lies in the optimum design of the population employed in model training. With highly interconnected breeding cycles staggered in time the choice of data for model training is not straightforward. We used cross-validation and independent validation to assess the performance of genome-based prediction within and across genetic groups, testers, locations, and years. The study comprised data for 1,073 and 857 doubled haploid lines evaluated as testcrosses in 2 years. Testcrosses were phenotyped for grain dry matter yield and content and genotyped with 56,110 single nucleotide polymorphism markers. Predictive abilities strongly depended on the relatedness of the doubled haploid lines from the estimation set with those on which prediction accuracy was assessed. For scenarios with strong population heterogeneity it was advantageous to perform predictions within a priori defined genetic groups until higher connectivity through related test units was achieved. Differences between group means had a strong effect on predictive abilities obtained with both cross-validation and independent validation. Predictive abilities across subsequent cycles of selection and years were only slightly reduced compared to predictive abilities obtained with cross-validation within the same year. We conclude that the optimum data set for model training in genome-enabled prediction should represent the full genetic and environmental spectrum of the respective breeding program. Data heterogeneity can be reduced by experimental designs that maximize the connectivity between data sources by common or highly related test units.
    Theoretical and Applied Genetics 04/2014; · 3.66 Impact Factor
  •
    ABSTRACT: Water, forage and predation constrain ungulate distributions in savannas. To understand these constraints, we characterized distributions of 15 herbivore species from water, locations of peak density and degree of clustering around the peaks using zero-inflated count data models and mapping census data collected in the Mara reserve and the adjoining pastoral ranches in Kenya during a wet and dry year. Herbivores followed a humped pattern (n = 46), suggesting constrained foraging in which they balance the benefits of proximity to water with the costs of foraging where food is depleted near water and travelling to more abundant food distant from water; an exponentially decreasing pattern (n = 11), indicating strong attraction to water or vegetation near water; or a uniform (n = 3) pattern. The details rather than the types of these patterns varied between years. Herbivores concentrated farther from water and more tightly around locations of their peak densities in the ranches than the reserve. Herbivores were more abundant and widely distributed from water in the wet than the dry year, and segregated along the distance-to-water gradient, presumably to minimize interspecific competition for food. Pastoralism compressed herbivore distributions and partially excluded some species (warthog, hartebeest, topi, wildebeest, zebra, eland, buffalo and elephant) from, while attracting others (Grant’s and Thomson’s gazelles, impala, giraffe) to the ranches, relative to the reserve. Regulating cultivation, fencing, settlements and livestock stocking levels in the ranches would allow continued wildlife access to water, reduce competition with, displacement or harassment of wildlife by people, livestock and dogs near water.
    Biodiversity and Conservation 03/2014; 23(3). · 2.26 Impact Factor
  • Jens Moehring, Emlyn R Williams, Hans-Peter Piepho
    ABSTRACT: The paper shows that unreplicated designs in multi-environmental trials are most efficient. If replication per environment is needed then augmented p-rep designs outperform augmented and replicated designs in triticale and maize. In plant breeding, augmented designs with unreplicated entries are frequently used for early generation testing. With a limited amount of seed, this design allows a maximum number of environments to be used in multi-environmental trials (METs). Check plots enable the estimation of block effects, error variances and a connection of otherwise unconnected trials in METs. Cullis et al. (J Agri Biol Environ Stat 11:381-393, 2006) propose to replace check plots from a grid-plot design by plots of replicated entries leading to partially replicated (p-rep) designs. Williams et al. (Biom J 53:19-27, 2011) apply this idea to augmented designs (augmented p-rep designs). While p-rep designs are increasingly used in METs, a comparison of the efficiency of augmented p-rep designs and augmented designs in the range between replicated and unreplicated designs in METs is lacking. We simulated genetic effects and allocated them according to these four designs to plot yields of a triticale and a maize uniformity trial. The designs varied in the number of environments, but had a fixed number of entries and total plots. The error model and the assumption of fixed or random entry effects were varied in simulations. We extended our simulation for the triticale data by including correlated entry effects which are common in genomic selection. Results show an advantage of unreplicated and augmented p-rep designs and a preference for using random entry effects, especially in case of correlated effects reflecting relationships among entries. Spatial error models had minor advantages compared to purely randomization-based models.
    Theoretical and Applied Genetics 02/2014; · 3.66 Impact Factor
  •
    ABSTRACT: Heterosis, the greater vigor of hybrids compared to their parents, has been exploited in maize breeding for more than 100 years to produce ever better performing elite hybrids of increased yield. Despite extensive research, the underlying mechanisms shaping the extent of heterosis are not well understood, rendering the process of selecting an optimal set of parental lines tedious. This study is based on a dataset consisting of 112 metabolite levels in young roots of four parental maize inbred lines and their corresponding twelve hybrids, along with the roots' biomass as a heterotic trait. Because the parental biomass is a poor predictor for hybrid biomass, we established a model framework to deduce the biomass of the hybrid from metabolite profiles of its parental lines. In the proposed framework, the hybrid metabolite levels are expressed relative to the parental levels by incorporating the standard concept of additivity/dominance, which we name the Combined Relative Level (CRL). Our modeling strategy includes a feature selection step on the parental levels which are demonstrated to be predictive of CRL across many hybrid metabolites. We demonstrate that these selected parental metabolites are further predictive of hybrid biomass. Our approach directly employs the diallel structure in a multivariate fashion, whereby we attempt to not only predict macroscopic phenotype (biomass), but also molecular phenotype (metabolite profiles). Therefore, our study provides the first steps for further investigations of the genetic determinants to metabolism and, ultimately, growth. Finally, our success on the small-scale experiments implies a valid strategy for large-scale experiments, where parental metabolite profiles may be used together with profiles of selected hybrids as a training set to predict biomass of all possible hybrids.
    PLoS ONE 01/2014; 9(1):e85435. · 3.73 Impact Factor
  •
    ABSTRACT: Genomic selection has been routinely implemented in plant breeding in two stages. The first stage usually omits the marker information and estimates adjusted means of genotypes across environments. The second stage uses the adjusted means to predict genomic breeding values. However, if the effects of markers vary substantially between different environments, it may be important to account for this variation for varieties adapted to different environments. Using two maize data sets, we investigated whether modelling the marker-by-environment interaction can improve the predictive ability of genomic selection relative to modelling genotype-by-environment interaction alone. Modelling the marker-by-environment interaction did not substantially increase the predictive ability relative to modelling only the genotype-by-environment interaction for the two tested data sets. Thus, genomic selection, carried out in a stagewise fashion, such that the marker information is omitted until the last stage of the process, may suffice for most practical purposes. Moreover, predictive ability did not reduce substantially even when the number of markers with consistent effects across environments used for genomic prediction was reduced to about 50.
    Plant Breeding 12/2013; 132(6). · 1.18 Impact Factor
  •
    ABSTRACT: We present experimental data for wheat, barley, and triticale suggesting that hybrids manifest on average higher yield stability than inbred lines. Yield stability is assumed to be higher for hybrids than for inbred lines, but experimental data proving this hypothesis is scarce for autogamous cereals. We used multi-location grain yield trials and compared the yield stability of hybrids versus lines for wheat (Triticum aestivum L.), barley (Hordeum vulgare L.), and triticale (×Triticosecale Wittmack). Our study comprised three phenotypic data sets of 1,749 wheat, 96 barley, and 130 triticale genotypes, which were evaluated for grain yield in up to five contrasting locations. Yield stability of the group of hybrids was compared with that of the group of inbred lines estimating the stability variance. For all three crops we observed a significantly (P < 0.05) higher yield stability of hybrids compared to lines. The enhanced yield stability of hybrids as compared to lines represents a major step forward, facilitating coping with the increasing abiotic stress expected from the predicted climate change.
    Theoretical and Applied Genetics 10/2013; · 3.66 Impact Factor
  •
    ABSTRACT: Maize (Zea mays L.) develops an extensive shoot-borne root system to secure water and nutrient uptake and to provide anchorage in the soil. In the present study, early coleoptilar node (first shoot-node) development was subjected to a detailed morphological and histological analysis. Subsequently, microarray profiling via hybridization of oligonucleotide microarrays representing transcripts of 31,355 unique maize genes at three early stages of coleoptilar node development was performed. These pairwise comparisons of wild-type versus mutant rtcs coleoptilar nodes which do not initiate shoot-borne roots revealed 828 unique transcripts that displayed RTCS-dependent expression. A stage specific GO enrichment analysis revealed overrepresentation of "cell wall", "stress" and "development" related transcripts among the differentially expressed genes. Differential expression of a subset of 15 of 828 genes identified by these microarray experiments was independently confirmed by qRT-PCR. In silico promoter analyses revealed that 100 differentially expressed genes contained at least one LBD motif within 1 kb upstream of the ATG start codon. EMSA (Electrophoretic Mobility Shift Assay) experiments demonstrated RTCS binding for four of these promoter sequences, supporting the notion that differentially accumulated genes containing LBD motifs are likely direct downstream targets of RTCS.
    Plant Physiology 07/2013; · 6.56 Impact Factor
  •
    ABSTRACT: Genotype-by-environment interaction (GxE) has been widely reported in dairy cattle. One way to analyse GxE is to apply reaction norm models. The first derivative of a reaction norm is the environmental sensitivity (ES). In the present study we conducted a large scale genome-wide association analysis to identify SNPs that affect general production (GP) and ES of milk traits in the German Holstein population. Sire estimates for GP and for ES were calculated from around 13 million daughter records, using linear reaction norm models. The daughters were offspring from 2,297 sires. Sires were genotyped for 54k SNPs. The environment was defined as the average milk energy yield performance of the herds at the time where the daughter observations were recorded. The sire estimates were used as observations in a genome-wide association analysis, using 1,797 sires. Significant SNPs were confirmed in an independent validation set (500 sires of the same population). In order to separate GxE scaling and other GxE effects, the observations were log-transformed in some analyses. Results from the reaction norm model revealed GxE effects. Numerous significant SNPs were validated for both GP and ES. Many SNPs affecting GP also affect ES. We showed that ES of milk traits is a typical quantitative trait, genetically controlled by many genes with small effects and few genes with larger effect. A log-transformation of the observation resulted in a reduced number of validated SNPs for ES, pointing to genes that not only caused scaling GxE effects. The results will have implications for breeding for robustness in dairy cattle.
    G3-Genes Genomes Genetics 05/2013; · 1.79 Impact Factor
  •
    ABSTRACT: Heterosis is the superior performance of heterozygous F1-hybrid plants compared to their homozygous genetically distinct parents. Seminal roots are embryonic roots that play an important role during early maize (Zea mays L.) seedling development. In the present study the most abundant soluble proteins of 2-4 cm seminal roots of the reciprocal maize F1-hybrids B73xMo17 and Mo17xB73 and their parental inbred lines B73 and Mo17 were quantified by label-free LC-MS/MS. In total, 1,918 proteins were detected by this shot-gun approach. Among those, 970 were represented by at least two peptides and were further analyzed. Eighty-five proteins displayed non-additive accumulation in at least one hybrid. The functional category protein metabolism was the most abundant class of non-additive proteins represented by 27 proteins. Within this category 16 of 17 non-additively accumulated ribosomal proteins showed high or above high parent expression in seminal roots. These results imply that an increased protein synthesis rate in hybrids might be related to the early manifestation of hybrid vigor in seminal roots. SIGNIFICANCE: In the present study a shot-gun proteomics approach allowed for the detection of 1,917 proteins and analysis of 970 seminal root proteins of maize that were represented by at least 2 peptides. The comparison of proteome complexity of reciprocal hybrids and their parental inbred lines indicates an increased protein synthesis rate in hybrids that may contribute to the early manifestation of heterosis in seminal roots.
    Journal of Proteomics 04/2013; · 5.07 Impact Factor
  • Regina G. Belz, Hans-Peter Piepho
    ABSTRACT: Chemical hormesis constitutes an alternative possible use of herbicidal agents for crop enhancement that is, however, compromised by the apparent variability of this low-dose stimulation phenomenon. Studies demonstrating the variability are rare and, therefore, this study investigated the interspecies variability of growth stimulation induced by the auxin-inhibitor PCIB [2-(p-chlorophenoxy)-2-methylpropionic acid] to determine if hormesis is generalizable enough and sufficiently stable between species/cultivars for practical use or which implications may have to be taken into account. In 85 complete dose–response bioassays with 23 cultivars of five species, the variability of PCIB effects was evaluated. The expression of PCIB hormesis proved to depend on the species/cultivar tested, ranging from a cultivar-dependent hormetic efficacy and an occasional lack of hormesis, to a complete lack of hormetic effectiveness in certain species/cultivars. Therefore, frequency estimations, as well as the pattern of dose-dependent variability of dose–response quantities, may inevitably depend on the biological model(s) used and, thus, apply only to the specific conditions for characterization. Comparing the frequency distribution of effective doses demonstrated a risk of a previously hormetic dose causing a loss of hormesis or inhibitory effects in another species/cultivar. Therefore, selecting a dose that will induce hormesis in every species/cultivar is unrealistic. This may limit the window for practical applications to stimulants with negligible varietal differences, to cultivar selective treatments, and/or to cultivars that enable a beneficial long-term use. Hence, efficient crop enhancement by chemical hormesis needs not only a good stimulant, but also a species/cultivar able to convert a specific low-dose treatment into an economic benefit.
    Journal of Plant Growth Regulation 01/2013; · 1.99 Impact Factor
  •
    ABSTRACT: Energy crop production for fermentation in biogas plants significantly increased in the last years. At present maize is the most common energy crop for biogas production due to high dry matter and methane yields together with advanced breeding activities. In Germany nearly 80% of all energy crop biogas production is based on maize for silage. This dominance frequently causes environmental risks like soil erosion, nutrient losses and increased use of pesticides. Furthermore it puts societies’ acceptance towards biogas production increasingly at risk. Thus, the development of innovative cropping systems is essential for a sustainable energy crop production. This paper reviews results from 3 years field trials at seven locations with 12 different double-cropping systems (turnip rape, rye and mixture of rye–winterpea as first crops and maize, sorghum, sunflower and mixture of maize–sunflower as succeeding second crops) in comparison with three sole crops systems as references: maize and sunflower after mustard as catch crop and energy rye, harvested at dough ripeness as whole-crop silage. Double-cropping systems with rye or rye–pea as first crop and maize as second crop achieved highest DM yields of 23 t DM ha−1 on average, followed by sole cropped maize and double-cropping system of turnip rape and maize. Sole-cropped sunflowers had the lowest yield with nearly 15 t DM ha−1, followed by winter rye harvested at dough ripeness. Yields of other double-cropping systems had an intermediate position between those treatments. Methane yield per hectare was positively correlated with DM yield. Double cropping systems mostly achieved lower DM contents than sole cropping systems due to later sowing and altered photoperiodic influences. Further research on cultivars suitable for late sowing dates is necessary. Methane yield of crops was very similar with 282–298 Nl kg−1 oDM, except for sorghum with 255 Nl kg−1 oDM on average. 
    Hence, differences in chemical components did not result in large changes in methane yield, again except for sorghum, probably due to higher lignin contents. Double-cropping systems had mostly higher yield stability than sole cropping systems. The cultivation of two crops within 1 year may also spread the risk of weather extremes among two crops (or more if mixtures are grown) resulting in higher yield stability. This property is becoming increasingly important with regard to climate change. Hence, double-cropping systems may contribute to a more sustainable energy crop production.
    European Journal of Agronomy 01/2013; 51:120–129. · 2.80 Impact Factor
  • Joseph O Ogutu, Hans-Peter Piepho, Holly T Dublin
    ABSTRACT: Animal population dynamics can be driven by rainfall variability through its influence on habitat suitability, availability and nutritional sufficiency of forage. To understand how rainfall influences ostriches, we related changes in ostrich recruitment in the Mara–Serengeti ecosystem to rainfall. Over a 15-year period, monthly counts of ostriches were made and the number of hatchlings, chicks, hens, cocks, and the size of the groups in which they occurred were recorded. Breeding was bimodal with a major peak in February and a minor peak in October. Ostriches formed larger groups in the wet (4.41 ± 5.17 (mean ± SD), range 1–72, n = 672 groups) than in the dry (2.49 ± 2.70, range 1–29, n = 398) season. The number of hatchlings plus chicks per hen increased across the duration of the study period and with increasing annual and early wet-season rainfall, affecting forage availability and quality. Recruitment was highest at intermediate levels of the five-year average of the late wet-season rainfall, implying that a change in long-term rainfall and habitat suitability would move recruitment away from the optimum. Outstanding adaptations to life in arid environments could make ostriches more resilient than sympatric ungulates if food shortages and water stress became more frequent because of widening climatic variability.
    Ostrich - Journal of African Ornithology 11/2012; 83(3):119-136. · 0.47 Impact Factor
  •
    ABSTRACT: The objective of the present study was to monitor the occurrence and distribution of a spectrum of trichothecene toxins in different parts of maize plants. Therefore maize plants were sampled randomly from 13 fields in southwest Germany and the fractions kernels, cobs, husks, stalks, leaves and rudimentary ears were analyzed for eight A-type and five B-type trichothecenes. Each of the toxins was found in at least three of the total of 78 samples. The study revealed that both A-type and B-type trichothecenes may be present in all parts of the maize plant but may be unevenly distributed. For the contents of deoxynivalenol, 3- and 15-acetyldeoxynivalenol, nivalenol, scirpentriol, 15-monoacetoxyscirpenol, HT-2 and T-2 toxin significant differences (p < 0.05) were found between different parts of the maize plants whereas no significant differences were observed for fusarenon-X, 4,15-diacetoxyscirpenol, neosolaniol, T-2 triol and T-2 tetraol. Up to twelve toxins co-occurring in one sample were detected. As a group B-type trichothecenes dominated over A-type trichothecenes concerning incidences and levels. Contamination was strongest with rudimentary ears based on incidence and mean and maximum contents; mean contents with few exceptions tended towards a higher level than in other fractions with significant (p < 0.05) differences compared to leaves for seven toxins.
    Toxins 10/2012; 4(10):778-87. · 2.13 Impact Factor
  •
    ABSTRACT: Plant breeders and variety testing agencies routinely test candidate genotypes (crop varieties, lines, test hybrids) in multiple environments. Such multi-environment trials can be efficiently analysed by mixed models. A single-stage analysis models the entire observed data at the level of individual plots. This kind of analysis is usually considered as the gold standard. In practice, however, it is more convenient to use a two-stage approach, in which experiments are first analysed per environment, yielding adjusted means per genotype, which are then summarised across environments in the second stage. Stage-wise approaches suggested so far are approximate in that they cannot fully reproduce a single-stage analysis, except in very simple cases, because the variance-covariance matrix of adjusted means from individual environments needs to be approximated by a diagonal matrix. This paper proposes a fully efficient stage-wise method, which carries forward the full variance-covariance matrix of adjusted means from the individual environments to the analysis across the series of trials. Provided the variance components are known, this method can fully reproduce the results of a single-stage analysis. Computations are made efficient by a diagonalisation of the residual variance-covariance matrix, which necessitates a corresponding linear transformation of both the first-stage estimates (e.g. adjusted means and regression slopes for plot covariates) and the corresponding design matrices for fixed and random effects. We also exemplify the extension of the general approach to a three-stage analysis. The method is illustrated using two datasets, one real and the other simulated. The proposed approach has close connections with meta-analysis, where environments correspond to centres and genotypes to medical treatments. We therefore compare our theoretical results with recently published results from a meta-analysis.
    Biometrical Journal 09/2012; 54(6):844-60. · 1.15 Impact Factor
  •
    ABSTRACT: Genomic selection (GS) is a method for predicting breeding values of plants or animals using many molecular markers that is commonly implemented in two stages. In plant breeding the first stage usually involves computation of adjusted means for genotypes which are then used to predict genomic breeding values in the second stage. We compared two classical stage-wise approaches, which either ignore or approximate correlations among the means by a diagonal matrix, and a new method, to a single-stage analysis for GS using ridge regression best linear unbiased prediction (RR-BLUP). The new stage-wise method rotates (orthogonalizes) the adjusted means from the first stage before submitting them to the second stage. This makes the errors approximately independently and identically normally distributed, which is a prerequisite for many procedures that are potentially useful for GS such as machine learning methods (e.g. boosting) and regularized regression methods (e.g. lasso). This is illustrated in this paper using componentwise boosting. The componentwise boosting method minimizes squared error loss using least squares and iteratively and automatically selects markers that are most predictive of genomic breeding values. Results are compared with those of RR-BLUP using fivefold cross-validation. The new stage-wise approach with rotated means was slightly more similar to the single-stage analysis than the classical two-stage approaches based on non-rotated means for two unbalanced datasets. This suggests that rotation is a worthwhile pre-processing step in GS for the two-stage approaches for unbalanced datasets. Moreover, the predictive accuracy of stage-wise RR-BLUP was higher (5.0-6.1 %) than that of componentwise boosting.
    Theoretical and Applied Genetics 08/2012; · 3.66 Impact Factor
  • André Schützenmeister, Hans-Peter Piepho
    ABSTRACT: In the framework of the general linear model, residuals are routinely used to check model assumptions, such as homoscedasticity, normality, and linearity of effects. Residuals can also be employed to detect possible outliers. Various types of residuals may be defined for linear mixed models. It is shown how residual plots can be used to check model assumptions by comparing empirical residual distributions with appropriate null distributions based on a parametric bootstrap approach. This allows constructing simultaneous tolerance bounds, which helps in assessing the normality and homoscedasticity of residuals of linear mixed models, identifying possible outliers and interpreting residual plots. The usefulness of this method is demonstrated by applying it to several previously published datasets.
    Computational Statistics & Data Analysis 06/2012; 56(6):1405–1416. · 1.30 Impact Factor
  •
    ABSTRACT: 1. The distributions of large herbivores in protected areas and their surroundings are becoming increasingly restricted by changing land use, with adverse consequences for wildlife populations. 2. We analyse changes in distributions of herbivore hotspots to understand their environmental and anthropogenic correlates using 50 aerial surveys conducted at a spatial resolution of 5 × 5 km² (n = 289 cells) in the Mara region of Kenya during 1977-2010. We compare the distributions across seasons, land use types (protection, pastoralism and agro-pastoralism) and 10 species with different body sizes and feeding styles. 3. Small herbivores that are the most susceptible to predation and dependent on high-quality forage concentrate in the greenest and wet areas and close to rivers in Masai pastoral ranches in both seasons. Livestock grazing creates conditions favouring small herbivores in these ranches, including high-quality short grasses and better visibility, implying facilitation. But in the reserve, they concentrate in browner, drier and flatter areas and farther from rivers, suggesting facilitation by large grazers in the wet season, or little competition with migratory herbivores occupying the reserve in the dry season. 4. In the wet season, medium herbivores concentrate in similar areas to small herbivores in the ranches and reserve. However, in the dry season, they stay in the reserve, and also concentrate in green and wet areas close to rivers when migrants occur in the reserve. As such areas typically have higher predation risk, this suggests facilitation by the migrants by absorbing most predation pressure or, alternatively, competitive displacement by the migrants from preferred habitats. 5. Large herbivores, which suffer the least predation, depend on bulk forage and are the most likely to engender conflicts with people, concentrate in the reserve all year. This suggests attraction to the taller and denser grass and perceived greater safety in the reserve in both seasons. 6. These results reveal how predation risk, forage quantity and quality, water, competition with and facilitation by livestock interact with individual life-history traits, seasons and land use in shaping the dynamics of herbivore hotspots in protected and human-dominated savannas.
    Journal of Animal Ecology 05/2012; · 4.84 Impact Factor
  •
    ABSTRACT: Genomic selection (GS) is emerging as an efficient and cost-effective method for estimating breeding values using molecular markers distributed over the entire genome. In essence, it involves estimating the simultaneous effects of all genes or chromosomal segments and combining the estimates to predict the total genomic breeding value (GEBV). Accurate prediction of GEBVs is a central and recurring challenge in plant and animal breeding. The existence of a bewildering array of approaches for predicting breeding values using markers underscores the importance of identifying approaches able to efficiently and accurately predict breeding values. Here, we comparatively evaluate the predictive performance of six regularized linear regression methods-- ridge regression, ridge regression BLUP, lasso, adaptive lasso, elastic net and adaptive elastic net-- for predicting GEBV using dense SNP markers. We predicted GEBVs for a quantitative trait using a dataset on 3000 progenies of 20 sires and 200 dams and an accompanying genome consisting of five chromosomes with 9990 biallelic SNP-marker loci simulated for the QTL-MAS 2011 workshop. We applied all the six methods that use penalty-based (regularization) shrinkage to handle datasets with far more predictors than observations. The lasso, elastic net and their adaptive extensions further possess the desirable property that they simultaneously select relevant predictive markers and optimally estimate their effects. The regression models were trained with a subset of 2000 phenotyped and genotyped individuals and used to predict GEBVs for the remaining 1000 progenies without phenotypes. Predictive accuracy was assessed using the root mean squared error, the Pearson correlation between predicted GEBVs and (1) the true genomic value (TGV), (2) the true breeding value (TBV) and (3) the simulated phenotypic values based on fivefold cross-validation (CV). 
The elastic net, lasso, adaptive lasso and the adaptive elastic net all had similar accuracies but outperformed ridge regression and ridge regression BLUP in terms of the Pearson correlation between predicted GEBVs and the true genomic value as well as the root mean squared error. The performance of RR-BLUP was also somewhat better than that of ridge regression. This pattern was replicated by the Pearson correlation between predicted GEBVs and the true breeding values (TBV) and the root mean squared error calculated with respect to TBV, except that accuracy was lower for all models, most especially for the adaptive elastic net. The correlation between the predicted GEBV and simulated phenotypic values based on the fivefold CV also revealed a similar pattern except that the adaptive elastic net had lower accuracy than both the ridge regression methods. All the six models had relatively high prediction accuracies for the simulated data set. Accuracy was higher for the lasso type methods than for ridge regression and ridge regression BLUP.
    BMC proceedings 05/2012; 6 Suppl 2:S10.
  • Regina G Belz, Hans-Peter Piepho
    ABSTRACT: Two hormetic modifications of a monotonically decreasing log-logistic dose-response function are most often used to model stimulatory effects of low dosages of a toxicant in plant biology. As only one of these empirical models has so far been properly parameterized to allow inference about quantities of interest, this study contributes the parameterized functions for the second hormetic model and compares the estimates of effective dosages between both models based on 23 hormetic data sets. Based on this, the impact on effective dosage estimations was evaluated, especially in case of a substantially inferior fit by one of the two models. The data sets evaluated described the hormetic responses of four different test plant species exposed to 15 different chemical stressors in two different experimental dose-response test designs. Out of the 23 data sets, one could not be described by either of the two models, 14 could be better described by one of the two models, and eight could be equally described by both models. In cases of misspecification by either of the two models, the differences between effective dosage estimates (0-1768%) greatly exceeded the differences observed when both models provided a satisfactory fit (0-26%). This suggests that the conclusions drawn depending on the model used may diverge considerably when using an improper hormetic model, especially regarding effective dosages quantifying hormesis. The study showed that hormetic dose responses can take on many shapes and that this diversity cannot be captured by a single model without risking considerable misinterpretation. However, the two empirical models considered in this paper together provide a powerful means to model, prove, and now also to quantify a wide range of hormetic responses by reparameterization. Despite this, they should not be applied uncritically, but only after statistical and graphical assessment of their adequacy.
    PLoS ONE 01/2012; 7(3):e33432. · 3.73 Impact Factor
  • ABSTRACT: The study was conducted to determine the effect of graded levels of feed intake on apparent (AID) and standardized (SID) ileal digestibilities of crude protein (CP) and amino acids (AA) in diets for piglets. The piglets were surgically fitted with simple T-cannulas at the distal ileum. The cornstarch-casein-soybean meal-based diets were fed at three graded levels of feed intake corresponding to 30, 45 and 60 g kg⁻¹ body weight (BW) per day. The AID and SID of most AA were quadratically affected by the feed intake level (P≤0.05). Initially, both AID and SID of most AA increased up to 1.9 percentage units as the feed intake level was increased from 30 to 45 g kg⁻¹ BW. Thereafter, these AID and SID values decreased by 2.6 and 2.7 percentage units, respectively, as the feed intake level was further increased from 45 to 60 g kg⁻¹ BW. Because the voluntary feed intake is highly variable in piglets after weaning, comparison of ileal AA digestibilities between and within studies may be confounded by variations in feed intake level. Thus, when designing digestibility studies with piglets, a standardization of feed intake should be taken into consideration.
    Journal of the Science of Food and Agriculture 11/2011; 92(6):1261-6. · 1.76 Impact Factor

Publication Stats

820 Citations
245.00 Total Impact Points


  • 2002–2014
    • Hohenheim University
      • Institute of Plant Production and Agroecology in the Tropics and Subtropics
      • Institute of Crop Science
      • State Plant Breeding Institute
      Stuttgart, Baden-Württemberg, Germany
  • 2013
    • University of Bonn
      • Institute of Crop Science and Resource Conservation (INRES)
      Bonn, North Rhine-Westphalia, Germany
  • 2006–2010
    • University of Tübingen
      • Center for Plant Molecular Biology
      Tübingen, Baden-Württemberg, Germany
  • 2009
    • Max Planck Institute for Plant Breeding Research
      Köln, North Rhine-Westphalia, Germany
  • 1991–2009
    • Christian-Albrechts-Universität zu Kiel
      • Institute of Crop Science and Plant Breeding
      Kiel, Schleswig-Holstein, Germany
    • Università degli Studi di Torino
      Torino, Piedmont, Italy
  • 2008
    • Leibniz Institute of Plant Genetics and Crop Plant Research
      Gatersleben, Saxony-Anhalt, Germany
  • 1994–2008
    • Universität Kassel
      • Department of Grassland Science and Renewable Plant Resources
      Kassel, Hesse, Germany