SCIENTIFIC DATA | (2025) 12:64 | https://doi.org/10.1038/s41597-024-04332-7
www.nature.com/scientificdata
Wang, Rose, Zetzsche, Ballvora, Friedt,
Kage, Léon, Lichthardt, Ordon, Snowdon,
Stahl, Stützel, Wittkop, Chen ✉
Wheat (Triticum aestivum L.) is a cornerstone of the world’s food supply. Its products cover 19% of the cal-
orie intake and 20% of the protein consumption of the world’s population. In Europe the relevance is even
higher, with 25% of calorie intake and 26% of protein consumption by humans accountable to wheat (average
of the years 2014–2018; ref. 1). After a rapid and sustained global increase in wheat yields in the second half of the
20th century2, many countries with high wheat yields - including France, the United Kingdom and Germany -
have recently experienced little to no yield progress2–5. Thus, the trajectory of total global food production is
currently below the rate of increase needed to adequately feed the world population in 2050 (ref. 6). Consequently,
significant progress in crop science and breeding is required to achieve the desired yield increases in wheat.
Improvements of genotypes and cropping systems are needed in the context of recent challenges of climate
change and the parallel increase in social demand for reductions in environmental pollution, atmospheric emis-
sions and the use of agrochemical inputs in crop production systems7,8. To study crop performance in terms
of variation due to environmental change and genotypic improvement (breeding), multi-environmental trials
(MET) are indispensable9–13.
The unique MET dataset presented here combines 29 environments (unbalanced combinations among six
years and six locations), 9 agricultural management scenarios (unbalanced combinations among three treatments,
depending on the combination of year and location), and 228 genotypes (released cultivars) with detailed
field phenotyping of 24 labour-intensive traits (e.g. total final biomass) of winter wheat (Triticum aestivum L.).
1Section of Intensive Plant Food Systems, Albrecht Daniel Thaer-Institute of Agricultural and Horticultural
Sciences, Humboldt Universität zu Berlin, Berlin, Germany. 2Department of Agronomy and Crop Science, Christian
Albrechts University of Kiel, Kiel, Germany. 3Julius Kuehn Institute (JKI), Federal Research Centre for Cultivated
Plants, Institute for Resistance Research and Stress Tolerance, Quedlinburg, Germany. 4Institute of Crop Science
and Resource Conservation, Chair of Plant Breeding, University of Bonn, Bonn, Germany. 5Department of Plant
Breeding, IFZ Research Centre for Biosystems, Land Use and Nutrition, Justus Liebig University, Giessen, Germany.
6Bundessortenamt, Hannover, Germany. 7Institute of Horticultural Production Systems, Leibniz University Hannover,
Hannover, Germany. 8These authors contributed equally: Tien-Cheng Wang, Till Rose. ✉e-mail: tsu-wei.chen@hu-berlin.de
Content courtesy of Springer Nature, terms of use apply. Rights reserved
The MET were conducted for six years (2015–2020) in Germany at six locations (Fig. 1): Gross Gerau, Hannover,
Klein Altendorf, Kiel, Quedlinburg, and Rauischholzhausen. The management scenarios comprised three treatments
(Fig. 2): nitrogen treatment with two total fertilizer levels (HN and LN, with 220 and 110 kg N ha−1,
respectively), fungicide treatment with (WF) or without fungicides (NF) and a water availability treatment with
three levels: irrigated (IR), rain-fed (RF) and rain-out shelter (RO). All MET were conducted with a panel of 228
cultivars released between 1963 and 2016. In total, there are 526,751 data points for 24 traits, including grain
and biomass yield, agronomic traits, grain quality traits and fungal disease infection scores (Table 1 and Fig. 3).
Parts of this MET dataset have been used to demonstrate that: (1) long-term breeding has improved grain
yield in European winter wheat independently of input intensity among management scenarios10; (2) traits for
sink and source capacity are co-selected throughout breeding history14; (3) stimuli affect the formation of yield
components in a cultivar- and stage-specific manner11; (4) the correlation of determination between traits and
year of release13; and (5) breeding progress in fungal disease resistance has contributed to breeding progress
Fig. 1 Locations and soil characteristics of six experimental fields in multi-environmental trials (MET) between
2015 and 2020. (a) Geographic locations. Abbreviations for six locations: Gross Gerau (GGE), Hannover
(HAN), Klein Altendorf (KAL), Kiel (KIE), Quedlinburg (QLB), and Rauischholzhausen (RHH). (b) Soil
properties. Colours in (b) indicate different locations, while numbers represent experimental years. Slight
variations in soil properties between years at the same location are due to field alternations within the location
across experimental years.
Fig. 2 Nine unbalanced managements in the multi-environmental trials (MET) dataset. The nine managements
comprise three treatments: nitrogen fertilizer, fungicide application and water availability. Nitrogen
treatment has two levels: high (HN: 220 kg N ha−1) and low (LN: 110 kg N ha−1). Fungicide treatment contains
two application levels: with (WF) or without (NF) fungicide application. Water availability treatment has three
levels: rain-fed (RF), irrigated (IR) and rainout-shelter (RO). Abbreviations for six locations: Gross Gerau
(GGE), Hannover (HAN), Klein Altendorf (KAL), Kiel (KIE), Quedlinburg (QLB), and Rauischholzhausen
(RHH).
in yield12. Here we present the complete MET dataset, including previously unpublished results, and further
showcase the value of this dataset for studying GxExM interactions by answering four research questions:
(1) How consistent are agronomic traits between years (Y), locations (L) and managements (M) and how does
the combination of Y, L and M affect trait consistency? (2) Can a well-calibrated crop model that considers
GxExM interactions properly represent the trait-trait correlations observed in the fields? (3) To what extent has
breeding progress in agronomic traits contributed to breeding progress for grain yield? (4) How do individual
agronomic traits contribute to yield stability?
The MET dataset was collected with the support of the project Breeding Innovations in Wheat for Efficient Cropping Systems
(BRIWECS). Experiments were conducted in Germany from 2015 to 2020 in six locations (Fig. 1 and Fig. 2),
including Gross Gerau (GGE), Hannover (HAN), Klein Altendorf (KAL), Kiel (KIE), Quedlinburg (QLB) and
Rauischholzhausen (RHH).
Management scenarios comprised three treatments with different levels of nitrogen fertilizer, fungicide application
and water availability (Fig. 2). Nitrogen fertilizer treatment includes two application levels: high (HN:
220 kg N ha−1) and low (LN: 110 kg N ha−1); both include soil mineral nitrogen (0–90 cm) measured in early
spring. Fungicide treatment contains two application levels: with (WF) or without (NF) fungicide application.
Water availability treatment has three levels: rain-fed (RF), irrigated (IR) and rainout-shelter treated (RO). Most
of the managements were grown under rain-fed conditions. Only Gross Gerau and Kiel were additionally tested with
irrigated and rainout-shelter treatments, respectively. The aim of the water availability treatments in these two
locations was to compare them with the main on-farm practice (HN_WF_RF). In Gross Gerau, in a subset
of seasons (2015, 2018, and 2019), all managements were irrigated, together with only one rain-fed treatment
for high nitrogen with fungicide application. In Kiel, all managements were grown under rain-fed conditions, along with a
management HN_WF treated with rainout-shelter in a subset of seasons (2016, 2017, and 2019).
The MET dataset contains a panel of 228 winter wheat cultivars, with year of release ranging from 1963 to
2016. The BRIWECS project was conducted in two phases. In Phase I10–14, experiments followed a randomized
block design (2015–2017) with 220 cultivars (except for Rauischholzhausen in 2017). In Phase II10, experiments
followed a full treatment-factorial design (2018–2019) with 52 selected cultivars that were a subset of the 220 cultivars
from Phase I. Note that each cultivar has two replicates for each treatment, except for rain-fed treatments
(2015–2019) and rainout shelter treatment (2019) in Kiel, which have three replicates.
The MET dataset contains in total 526,751 observations after removing outliers (Fig. 3). There are two
sample sources in the dataset: a 50 cm cut and whole plot (Table 1). A 50 cm cut sample was collected from a
| Trait full name | Trait source | Trait name in dataset | Trait range | Unit |
|---|---|---|---|---|
| above-ground dry mass at maturity | 50 cm cut | Biomass_bio | 0~3495 | g/m² |
| harvest index | 50 cm cut | Harvest_Index_bio | 0.1~0.79 | |
| grains per spike | 50 cm cut | Grain_per_spike_bio | 3.6~144.2 | number |
| plant height | 50 cm cut | Plantheight_bio | 40~145 | cm |
| grain yield | 50 cm cut | Seedyield_bio | 28.3~1815 | g/m² |
| spike number | 50 cm cut | Spike_number_bio | 48~1390 | number/m² |
| thousand grain weight | 50 cm cut | TGW_bio | 4.7~77.8 | g |
| day when 75% of the ears are visible | whole plot | BBCH59 | 123~181 | day of year |
| day when 75% hard dough | whole plot | BBCH87 | 175~213 | day of year |
| above-ground dry mass at maturity | whole plot | Biomass | 14.2~732.8 | dt/ha |
| crude protein percentage per grain dry mass | whole plot | Crude_protein | 6.2~21.3 | % |
| leaf tan spot caused by Drechslera tritici-repentis | whole plot | DTR | 0~100 | % leaf area |
| falling number | whole plot | Falling_number | 60~700 | s |
| fusarium head blight | whole plot | Fusarium | 0~27 | % spike |
| number of grains per unit area | whole plot | Grain | 143.7~3915.5 | number × 10^5/ha |
| leaf rust caused by Puccinia triticina | whole plot | Leaf_rust | 0~90 | % leaf area |
| powdery mildew caused by Blumeria graminis f. sp. tritici | whole plot | Powdery_mildew | 0~100 | % leaf area |
| grain protein yield | whole plot | Protein_yield | 0~22.2 | dt/ha |
| sedimentation | whole plot | Sedimentation | 2.1~83.3 | ml |
| grain yield | whole plot | Seedyield | 0~141.6 | dt/ha |
| leaf spot caused by Septoria tritici | whole plot | Septoria | 0~80 | % leaf area |
| above-ground biomass minus grain yield | whole plot | Straw | 8.9~625.4 | dt/ha |
| stripe rust caused by Puccinia striiformis | whole plot | Stripe_rust | 0~100 | % leaf area |
| thousand grain weight | whole plot | TGW | 11.9~67.4 | g |

Table 1. Names, sampling source, column name, range and unit of 24 traits.
non-border row of 50 cm from the whole plot at the time of grain maturity (BBCH87) to determine biomass,
grain yield, thousand grain weight (TGW), spike number, plant height, grains per spike, and harvest index. The whole
plot was evaluated non-destructively during the growing period and destructively at maturity. Non-destructive
measurements include heading date (BBCH59), grain maturity date (BBCH87) and the infected area of six fungal diseases
throughout the cropping cycle (leaf tan spot, Fusarium head blight, leaf rust, powdery mildew, Septoria,
stripe rust). Destructive measurements include dry mass of shoot, dry mass of straw, harvest index, grain yield,
TGW, grain protein, grain falling number and grain sedimentation. Harvest index was calculated as the grain yield
divided by the above-ground dry mass at maturity. For details of trait collection, see the materials and methods of ref. 10.
Fungal disease infection (% area) was recorded in the field for each plot with a visual infection score ranging
from 0 to 100 (ref. 12). Disease scores were collected from natural infections in the field, except for the management
scenarios HN_NF_RF and LN_NF_RF in Quedlinburg, where manual inoculation was applied with pathogens of
stripe rust, leaf rust and Fusarium head blight (for details see ref. 12). Total fungal infection area (TFI) is defined as
the sum of infected area from all six fungal diseases, assuming that infection scores are additive:

$$\mathrm{TFI} = \text{stripe rust} + \text{Septoria} + \text{powdery mildew} + \text{leaf rust} + \text{leaf tan spot} + \text{Fusarium head blight} \tag{1}$$
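As a minimal sketch of Eq. (1), the snippet below sums the six disease scores of one plot. The column names follow Table 1; the function name, the example values and the choice to return `None` when any score is missing are illustrative assumptions, not part of the published scripts.

```python
# Column names of the six disease scores as listed in Table 1.
DISEASE_COLUMNS = [
    "Stripe_rust", "Septoria", "Powdery_mildew",
    "Leaf_rust", "DTR", "Fusarium",
]

def total_fungal_infection(plot):
    """Sum the six disease scores of one plot (Eq. 1).

    Assumption: if any score is missing (None), TFI is undefined
    and None is returned rather than a partial sum.
    """
    scores = [plot.get(column) for column in DISEASE_COLUMNS]
    if any(score is None for score in scores):
        return None
    return sum(scores)

# Hypothetical plot record, for illustration only.
example_plot = {
    "Stripe_rust": 10.0, "Septoria": 5.0, "Powdery_mildew": 0.0,
    "Leaf_rust": 2.5, "DTR": 1.0, "Fusarium": 0.5,
}
tfi = total_fungal_infection(example_plot)
```

Because the scores are simply added, TFI can exceed 100 when several diseases are severe on the same plot.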
After quality control of the raw data, negative values or values with unrealistic ranges
(e.g., grain yield > 3000 dt/ha; TGW > 80 g/1000 grains) were recoded as “not available” (NA). For each growing
condition, a cultivar with a trait value (HI and spike number) beyond the range of the mean plus or minus four
standard deviations was considered an outlier and excluded from further calculations. After pre-processing,
a total of 526,751 data points were available (Fig. 3).
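The two filtering rules described above can be sketched as follows. This is an illustration under stated assumptions, not the code in data_cleaning.R: the function names are hypothetical, missing values are represented by `None`, and the sample standard deviation is assumed because the text does not say which estimator was used.

```python
from statistics import mean, stdev

def recode_unrealistic(value, upper):
    """Return None (NA) for negative values or values above `upper`."""
    if value is None or value < 0 or value > upper:
        return None
    return value

def flag_outliers(values, n_sd=4.0):
    """Flag values outside mean +/- n_sd standard deviations.

    `values` holds one trait within one growing condition.
    Assumption: the sample standard deviation is used.
    """
    present = [v for v in values if v is not None]
    if len(present) < 2:
        return [False] * len(values)
    centre, spread = mean(present), stdev(present)
    low, high = centre - n_sd * spread, centre + n_sd * spread
    return [v is not None and not (low <= v <= high) for v in values]

# Hypothetical harvest index values from one growing condition.
harvest_index = [0.5] * 29 + [0.9]
flags = flag_outliers(harvest_index)
```

With a four-standard-deviation threshold, a single value can only be flagged when the group is reasonably large, so the rule is conservative for small groups.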
To provide an unbiased estimate of the trait performance of each genotype under each combination of year, location and management
(referred to as growing condition), best linear unbiased estimators (BLUEs) were used for further validation (technical validation I, III
Fig. 3 An overview of the multi-environmental trial (MET) dataset containing 24 traits of winter wheat
collected across six locations in Germany (GGE, HAN, KAL, KIE, QLB, RHH) with nine managements
during six years (2015–2020). (a) Density plot of four agronomic traits (harvest index, grain number, grain
yield and straw dry mass at maturity) as examples to demonstrate the effect of managements (M) on trait
distributions across 29 combinations of year by location (Y/L). (b) Total number of observations for 24 traits
across all combinations of growing conditions (year by location by management; Y/L/M) from sampling sources
collected from 50 cm cut and whole plot. Abbreviation of locations: Gross Gerau (GGE), Hannover (HAN),
Klein Altendorf (KAL), Kiel (KIE), Quedlinburg (QLB), and Rauischholzhausen (RHH). The nine unbalanced
managements comprise three treatments: nitrogen fertilizer, fungicide application and water availability.
Nitrogen treatment has two levels: high (HN: 220 kg N ha−1) and low (LN: 110 kg N ha−1). Fungicide treatment
contains two application levels: with (WF) or without (NF) fungicide application. Water availability treatment
has three levels: rain-fed (RF), irrigated (IR) and covered with rainout-shelter (RO).
and IV, see the next sections). The MET dataset contains two experimental phases: randomized block design for
phase I (2015–2017; except for Quedlinburg and Rauischholzhausen in 2017) and full treatment-factorial design
for phase II (2018–2020; except for Gross Gerau, Klein Altendorf and Rauischholzhausen in 2018). For each combination
of year, location and management, we included random effects for both row and column to account for
potential gradients of soil fertility in the field. The calculation of BLUEs was based on the following model:
$$y_{irc} = \mu + g_i + R_r + C_c \tag{2}$$
where $y_{irc}$ is the performance of the ith cultivar in the rth row and the cth column, $\mu$ is the general mean, $g_i$ is the
fixed effect of the ith cultivar, $R_r$ is the random effect of the rth row and $C_c$ is the random effect of the cth column.
Fixed effects are denoted by lowercase letters, while random effects are denoted by uppercase letters.
full name abbreviation
Fig. 4 Fig. 5 Fig. 6 Fig. 7 Fig. 8 Fig. 9
R2sma R2sma trait-trait correlation BP SI
aboveground dry mass at maturity SDM v
straw dry mass at maturity Straw v v v v
flowering time FT v
grain number GN v v
grain protein concentration GP v v v
grain per spike GpS v v v
grain yield GY v v v v v
harvest index HI v v v v
total fungal disease infection area TFI v
light extinction coefficient k v
leaf area index LAI v
maturity MT v
radiation use efficiency rue v
spike number SN v v v
thousand grain weight TGW v v v
Table 2. Trait names and abbreviations for example analyses I-IV.
Fig. 4 Trait consistency (R2sma) of nine agronomic traits across all combinations of growing conditions
(Y/L/M). Each point represents the consistency of a trait between two Y/L/M. There are 9900 combinations
in total, resulting from the permutation of two out of 45 Y/L/M. Blue letters above the boxplot denote three
statistics of R2sma: M for maximum; A for average and m for minimum. Different dark red lowercase letters
below denote statistical significance at level of alpha = 0.05 based on Fisher’s post hoc test following ANOVA.
The abbreviations of the nine traits are: grain yield (GY); grain number (GN); thousand grain weight (TGW);
harvest index (HI); grain protein concentration (GP); above ground dry mass at maturity (SDM); grain per
spike (GpS); straw dry mass at maturity (Straw) and spike number (SN).
The data set15 is deposited on Figshare (https://doi.org/10.6084/m9.figshare.27910269). There
are six folders in the main directory: data, docs, figure, metadata, output and scripts. The folder data contains the
raw data with three subfolders: locations, management and weather. The folder metadata contains
information on the cultivars investigated (BRIWECS_BRISONr_information.csv) and the units used to describe all
traits (Unit.xlsx). Folder scripts contains four files as follows: File data_cleaning.R combines the files in folder data,
removes outliers and stores the output (BRIWECS_data_publication.csv) in folder output. File extract_management.R
combines the files in sub-folder management in folder data and stores four combined management files
(disease_record.xlsx, fertilizer.xlsx, plant_protection.xlsx and soil.xlsx) in folder output. File data_overview.qmd
generates visualizations showing the distributions and correlations of trait performance among different growing
conditions (Y/L/M), sowing dates, precipitation and global radiation levels in each growing condition, with all
relevant files stored in folder docs. Parts of the MET dataset have been published in previous studies (Table S1),
including SNP data10,14, climatic data11,16 and adjusted means for pathogen infections12,17.
Fig. 5 Consistency (R2sma) of grain yield (GY) with five groupings: (a) year, (b) location, (c) management,
(d) management-location and (e) management-year. Each point represents the R2sma of a trait between two
Y/L/M. Different lowercase letters denote statistical significance at level of p = 0.05 based on Fisher’s post
hoc test following analysis of variance. Abbreviation for locations: Gross Gerau (GGE), Hannover (HAN),
Klein Altendorf (KAL), Kiel (KIE), Quedlinburg (QLB), and Rauischholzhausen (RHH). The nine unbalanced
managements comprise three treatments: nitrogen fertilizer, fungicide application and water availability.
Nitrogen treatment has two levels: high (HN: 220 kg N ha−1) and low (LN: 110 kg N ha−1). Fungicide treatment
contains two application levels: with (WF) or without (NF) fungicide application. Water availability treatment in
this analysis has one level: rain-fed (RF).
For further validation, a subset from the MET dataset (Table 2) was
utilised. The subset included 15 traits in 220 genotypes growing in 45 growing conditions (year by location by
management; Y/L/M), comprising three years from phase I (Y: 2015–2017), five locations (L: GGE, HAN, KAL,
KIE, QLB), and managements from three rain-fed conditions (M: HN_WF_RF, HN_NF_RF, LN_NF_RF) for the
technical validation I-IV. This subset is more balanced in the number of genotypes and Y/L/M combinations, ensuring
comparability.
The first validation shows the consistency of trait performance (i.e., BLUEs) across growing conditions (Y/L/M).
Here, we define trait consistency (R2sma) as the R2 derived from standardized major axis (SMA)18,19 regression
of the BLUEs of a population (220 genotypes) between two growing conditions. SMA regression assumes that error
arises in both the dependent and the independent variable and is therefore suitable for non-causal relationships20. In this validation, we
demonstrate R2sma of nine agronomic traits (Table2), including grain yield (GY), harvest index (HI), straw dry
mass (Straw), above ground dry mass (SDM), grain number (GN), grain protein concentration (GP), grain per
spike (GpS), spike number (SN) and thousand grain weight (TGW).
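The definition of R2sma can be illustrated with a short standard-library sketch. It relies on two textbook properties of SMA regression: the slope equals the sign of the Pearson correlation times the ratio of standard deviations, and the R2 equals the squared Pearson correlation. The function name and the example values are assumptions made for illustration; the original analysis may have used a dedicated package.

```python
from math import sqrt

def sma_regression(x, y):
    """Standardized major axis regression of y on x.

    Returns (slope, intercept, r_squared). x and y are paired
    BLUEs of the same genotypes in two growing conditions.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    r = sxy / sqrt(sxx * syy)
    # SMA slope: sign of r times sd(y) / sd(x).
    slope = (1.0 if r >= 0 else -1.0) * sqrt(syy / sxx)
    intercept = mean_y - slope * mean_x
    return slope, intercept, r * r

# Hypothetical grain yield BLUEs (dt/ha) of five genotypes in two conditions.
yield_condition_a = [80.0, 85.0, 90.0, 95.0, 100.0]
yield_condition_b = [70.0, 78.0, 75.0, 88.0, 90.0]
slope, intercept, r2_sma = sma_regression(yield_condition_a, yield_condition_b)
```

Unlike ordinary least squares, the SMA fit is symmetric: swapping the two conditions inverts the slope and leaves R2sma unchanged.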
To further validate the effect of year, location or management on R2sma, trait consistency was analysed by
single or double grouping of growing conditions. Single grouping considers only year, location or management
individually. For instance, when grouping by management, we calculate R2sma between every pair of Y/L
under the same management level (e.g., HN_WF_RF). We showcased the results of two double groupings:
management-location (to examine similarity between years) and management-year (to examine similarity
between locations). One-way analysis of variance (ANOVA) was performed to examine the mean difference of R2sma
between levels within each group. Fisher’s least significant difference test was used to differentiate the means of
levels within each group once significance of the ANOVA (p-value < 0.05) was detected for the group.
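The single and double groupings can be sketched by enumerating pairs of growing conditions that share a grouping key. The helper name is hypothetical, and unordered pairs within a fully crossed 3 x 5 x 3 design are assumed; the actual dataset is unbalanced, so real pair counts can differ.

```python
from collections import defaultdict
from itertools import combinations

def pairs_within_group(conditions, key):
    """Group growing conditions by `key` and list all unordered pairs per group."""
    groups = defaultdict(list)
    for condition in conditions:
        groups[key(condition)].append(condition)
    return {name: list(combinations(members, 2))
            for name, members in groups.items()}

# Validation subset as described in the text: 3 years x 5 locations x 3 managements.
conditions = [
    (year, location, management)
    for year in (2015, 2016, 2017)
    for location in ("GGE", "HAN", "KAL", "KIE", "QLB")
    for management in ("HN_WF_RF", "HN_NF_RF", "LN_NF_RF")
]

# Single grouping: pairs of Y/L under the same management.
by_management = pairs_within_group(conditions, key=lambda c: c[2])
# Double grouping: same management and location, pairs differ only in year.
by_management_location = pairs_within_group(conditions, key=lambda c: (c[2], c[1]))
```

R2sma would then be computed for each pair, and the resulting values compared between group levels by ANOVA.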
R2 from standardized major axis regression (SMA; R2sma) was used to evaluate the consistency of traits
between growing conditions (Y/L/M). R2sma for grain yield ranged widely, from 0.09 to 0.84, with an
average of 0.47 (Fig. 4). In other words, on average, yield in one growing condition explained less than 50% of
the variation in yield in another growing condition. Unexpectedly, although grain yield is the most complex trait,
it showed the highest R2sma together with grain number. Average R2sma in TGW was at a similar level but significantly
lower than that in grain yield. Above ground and straw dry mass at maturity, spike number and grain
number per spike had low consistency (average R2sma < 0.22), especially straw dry mass at maturity (R2sma = 0.15)
and spike number (R2sma = 0.07), suggesting the strongest GxExM effects on tillering and canopy development.
Furthermore, the consistency (R2sma) of grain yield differed between years, locations, and managements.
Interestingly, R2sma showed the largest variations between locations (Fig. 5). On average, R2sma was the highest
Fig. 6 Comparison of trait-trait correlations between field experiments and crop model simulations. Field
dataset from three consecutive years (2015–2017) under high nitrogen and fungicide application in rain-fed
treatment (HN_WF_RF) from (A) Hannover and (B) Kiel was used. Simulation dataset comes from previous
publications (doi: 10.5281/zenodo.7569104)21,22. Each point represents the Pearson correlation coefficient (r)
between two traits observed in the field experiment (x-axis) and in the simulations of APSIM-wheat (y-axis).
The diagonal dashed line represents a one-to-one line and the distance of a point to the one-to-one line
represents the similarity of r between field and simulation. Abbreviations of the eleven traits are: flowering time (FT);
harvest index (HI); light extinction coefficient (k); leaf area index (LAI); maturity time (MT); grain number
(GN); grain protein concentration (GP); grain yield (GY); radiation use efficiency (rue); straw dry mass at
maturity (Straw) and thousand grain weight (TGW). Trait-trait combinations are shown in bold if their distance
to the one-to-one line is below 0.09 and the absolute values of both x and y are larger than 0.5.
in Hannover (60%) and the lowest in Quedlinburg (37%). Notably, grain yield was more consistent under the
management HN_NF_RF (high nitrogen without fungicide in rain-fed, average R2sma = 0.53) than HN_WF_RF
(high nitrogen and fungicide under rain-fed, average R2sma = 0.47) and LN_NF_RF (low nitrogen without fungicide
under rain-fed, average R2sma = 0.47). This indicates that plant protection is a management that increases
GxE. In our panel, the accumulation of disease resistance genes is an important result
of the breeding history12. If the contribution of these genes to yield is replaced by plant protection, the
genotypic characteristics are not fully exploited and the results are therefore less consistent.
The second validation is the extent to which the results from crop model simulations represent “real
world” data. To achieve this, the Pearson correlation coefficient (r) between two agronomic traits (referred to as
trait-trait correlation) in the field was compared with the trait-trait correlation simulated by the well-calibrated
crop simulation model APSIM-wheat (doi: 10.5281/zenodo.7569104)21–23. As examples, we selected two locations
(Hannover and Kiel) from one management scenario (high nitrogen and with fungicide under rain-fed condition;
HN_WF_RF), where the maximum number of directly comparable traits to APSIM-wheat can be found.
Note that each location also has a different number of traits measured. The analysis of trait-trait correlations
Fig. 7 Distribution of breeding progress (BP) of eight agronomic traits from all combinations of growing
conditions (Y/L/M). (a-h) Abbreviation of eight traits: straw dry mass at maturity (Straw), spike number (SN),
thousand grain weight (TGW), shoot dry mass at maturity (SDM), grain per spike (GpS), grain number (GN),
harvest index (HI), grain yield (GY). Unit abbreviations: Nbr stands for number; year stands for difference in
year of release between genotypes. Colours and star symbols refer to the significance level of the p-value of each term
in (5): * and red refer to significance at the 5% level; ** and green refer to significance at the 1% level; *** and blue
refer to significance at the 0.1% level; purple refers to not significant, with a p-value larger than the 5% level. CVBPtrait
stands for the coefficient of variation of BPtrait; BPtrait with an overbar stands for the mean BPtrait.
encompassed eleven traits (Table 2): grain number (GN), grain protein concentration (GP), grain yield (GY),
harvest index (HI), thousand grain weight (TGW), radiation use efficiency (rue), leaf area index (LAI), light
extinction coefficient (k), flowering time (FT), maturity time (MT), and straw dry mass at maturity (Straw). In
APSIM-wheat, radiation use efficiency (rue) and light extinction coefficient (k) are input parameters that can be
varied between simulations and the rest of the traits are simulated outputs. For each available pair of traits, r was
calculated for both the simulation and the MET dataset.
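The comparison of field and simulated trait-trait correlations can be sketched as below. The highlighting rule uses the thresholds given in the Fig. 6 caption (distance below 0.09, both absolute correlations above 0.5). Interpreting "distance" as the perpendicular distance to the one-to-one line is an assumption, as are the function names; the numeric inputs are illustrative.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences.

    Assumes neither sequence is constant (non-zero variance).
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / sqrt(sxx * syy)

def is_highlighted(r_field, r_sim, max_distance=0.09, min_abs_r=0.5):
    """Check whether a trait pair agrees between field and simulation.

    Assumption: distance is measured perpendicular to the line y = x,
    i.e. |r_field - r_sim| / sqrt(2).
    """
    distance = abs(r_field - r_sim) / sqrt(2)
    return (distance < max_distance
            and abs(r_field) > min_abs_r
            and abs(r_sim) > min_abs_r)

# Illustrative values: a strong, matching correlation and a mismatched one.
agrees = is_highlighted(0.75, 0.67)
disagrees = is_highlighted(0.71, 0.08)
```

If the vertical distance |r_field - r_sim| were intended instead, only the `distance` line would change.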
To validate whether crop models considering GxExM interactions correctly represent field observations, we
showed pairwise correlations among traits (trait-trait correlations) between simulations and detailed field observations
at two locations (Hannover and Kiel). In general, three relationships aligned well between field observations
and simulations (Fig. 6): (1) a positive correlation was observed between grain number and grain yield
(both locations r > 0.75; simulations: r = 0.67); (2) a negative correlation was observed between grain yield and
grain protein concentration (both locations r < −0.79; simulations: r = −0.74); (3) a negative correlation was
observed between grain number and thousand grain weight (both locations r < −0.53; simulations: r = −0.61).
Consistency in these well-known trait-trait correlations showcases the ability of the APSIM-wheat model to
represent relationships between yield components. However, two trait-trait correlations from field observations
are weak or missing in the simulation: (1) a positive correlation was observed between straw dry mass at maturity
and maturity time (Hannover: r = 0.71; Kiel: r = 0.45; simulation: r = 0.08), indicating a missing link between
phenology and the growth of straw (an indicator of canopy volume) in the APSIM-wheat model. Furthermore,
(2) a positive correlation was found between grain number and harvest index (Hannover: r = 0.67; Kiel: r = 0.52;
simulation: r = −0.04), indicating that the allocation of dry mass to straw and grains should be re-examined in
the crop model.
Correlations related to straw dry mass showed contrasts between locations and were frequently inconsistent
between field and simulation (Fig. 6). For instance, simulated results overestimated the positive correlation
between grain yield and straw dry mass at maturity (Hannover: r = 0.37; Kiel: r = −0.12; simulation: r = 0.62).
Additionally, simulated results overestimated the negative correlation between grain protein and straw dry mass
at maturity (Hannover: r = −0.2; Kiel: r = 0.28; simulation: r = −0.5). Together, these results indicate a potential
Fig. 8 Multi-linear regression analysis of breeding progress (BP) in five traits on the BP of grain yield with
no grouping (all) or single grouping of growing conditions (Y/L/M). Stacked bar plots represent the relative
importance of each trait in (4). Numbers in brackets refer to the number of observations for each level.
Abbreviations of five traits are: grain per spike (GpS); harvest index (HI); spike number (SN); straw dry mass at
maturity (Straw); total fungal infection area (TFI); and thousand grain weight (TGW). X-axis: no grouping (all:
all growing conditions) and levels from single grouping of growing conditions.
improvement of crop models by better considering the canopy development and dry mass allocation. Extensive
MET trait datasets like the one described here are essential to achieve this.
Since the data were collected to estimate breeding progress of different agronomic
traits in winter wheat, the third validation showcased to what extent breeding progress (BP) in agronomic traits
contributed to the BP in grain yield. BP was defined as the slope from simple linear regression between BLUEs
of cultivars and their year of release. Analysis was conducted on a subset of 191 cultivars representing the breeding
history of winter wheat in Germany between 1963 and 2013 (ref. 10). The agronomic traits (Table 2) include grain
per spike (GpS), harvest index (HI), spike number (SN), thousand grain weight (TGW) and straw dry mass at
Fig. 9 Multi-linear regression analysis of stability in six traits (SItrait) to stability of grain yield using nine
stability indices (SI). Stacked bar plot represents relative importance of each trait in (5). Abbreviations of seven
traits: grain protein concentration (GP); grain per spike (GpS); harvest index (HI); spike number (SN); straw
dry mass at maturity (Straw); total fungal infection area (TFI); and thousand grain weight (TGW). Nine SI:
coefficient of determination (r2i), coefficient of regression (bi), deviation mean squares (S2di), ecovalence (Wi),
environmental variance (S2xi), genotypic stability (D2i), genotypic superiority measure (Pi), stability variance
(σ2i), variance of rank (Si4).
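| Trait | HN_NF_RF | HN_WF_RF | LN_NF_RF | GGE | HAN | KIE | QLB | 2015 | 2016 | 2017 | all |
|---|---|---|---|---|---|---|---|---|---|---|---|
| BP (Intercept) | 0.25 | 0.14 | −0.1 | 0.2 | −0.21 | 0.25 | 0.19 | 0.42 | 0.19 * | 0.28 ** | 0.12 * |
| BPGpS | 0.07 | 0.58 | 0.09 | −1.4 | 2.8 | −0.73 | 0.15 | 1.7 | 0.66 * | 0.62 | 0.31 |
| BPHI | 49.6 | 70.5 | 270.3 ** | 181.5 | 82.3 | 134.9 | 102.2 | −142 | −30.5 | −83.5 | 101.1 * |
| BPSN | −0.03 | −0.02 | 0.005 | −0.03 | 0.02 | −0.04 | −0.02 | −0.06 | 0.13 * | 0.02 | −0.003 |
| BPStraw | 0.59 * | 0.27 | 0.72 ** | 0.66 | −0.48 | 1.1 | 0.21 | 0.66 * | 0.1 | 0.04 | 0.36 *** |
| BPTGW | 0.72 | 0.94 | 0.4 | 1.2 | 1.6 | 0.05 | 0.92 | −0.35 | 1 | 1.8 | 0.94 * |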
Table 3. Coefficients of regressors in multi-linear regression (3) for breeding progress of grain yield with
or without grouping (all) of growing conditions. Star symbols refer to the significance level of the p-value of each
term in (3): * significant at 5% level; ** significant at 1% level; *** significant at 0.1% level. Abbreviations for
locations: Gross Gerau (GGE), Hannover (HAN), Klein Altendorf (KAL), Kiel (KIE), Quedlinburg (QLB), and
Rauischholzhausen (RHH). The nine unbalanced managements comprise three treatments: nitrogen fertilizer,
fungicide application and water availability. The nitrogen treatment has two levels: high (HN: 220 kg N ha⁻¹)
and low (LN: 110 kg N ha⁻¹). The fungicide treatment contains two application levels: with (WF) or without (NF)
fungicide application. The water availability treatment in this analysis has one level: rain-fed (RF).
maturity (Straw). BP analysis was conducted across all combinations (all) of Y/L/M or within single groupings of
Y, L or M by a multi-linear regression (3):
$$BP_{GY} = BP_{TGW} + BP_{HI} + BP_{straw} + BP_{GpS} + BP_{SN} + \varepsilon \tag{3}$$
where all regressors of breeding progress are assumed to have fixed effects, and the error term is ε. With the
regression model, we further quantified the relative importance of each regressor from the multi-linear regression
using the R package relaimpo (ref. 24).
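As a rough illustration of the two steps described above, the following Python sketch computes BP as the slope of a simple linear regression of BLUEs on year of release and then fits a multi-linear regression in the form of (3) by ordinary least squares. It is not the authors' R code: the function names, the inclusion of an intercept (suggested by the intercept row of Table 3) and all numbers are assumptions made for illustration only.

```python
import numpy as np


def breeding_progress(release_year, blue):
    """Slope of the simple linear regression of BLUEs on year of release."""
    year = np.asarray(release_year, dtype=float)
    # Centring the years keeps the fit well conditioned; the slope is unchanged.
    slope, _intercept = np.polyfit(year - year.mean(), np.asarray(blue, dtype=float), deg=1)
    return slope


def fit_bp_regression(bp_traits, bp_gy):
    """OLS fit of BP_GY on the trait BPs; returns [intercept, coefficients...]."""
    bp_traits = np.asarray(bp_traits, dtype=float)
    bp_gy = np.asarray(bp_gy, dtype=float)
    design = np.column_stack([np.ones(len(bp_gy)), bp_traits])
    beta, *_ = np.linalg.lstsq(design, bp_gy, rcond=None)
    return beta


# Hypothetical grain-yield BLUEs (dt/ha) of five cultivars in one growing condition.
years = np.array([1965, 1975, 1985, 1995, 2005])
yield_blue = 60.0 + 0.4 * (years - 1965)
bp_gy_example = breeding_progress(years, yield_blue)  # slope in dt/ha per year

# Hypothetical BP values: rows are growing conditions, columns follow the order
# of the regressors in (3): TGW, HI, Straw, GpS, SN.
rng = np.random.default_rng(0)
bp_traits = rng.normal(size=(40, 5))
true_beta = np.array([0.1, 0.9, 100.0, 0.4, 0.3, 0.0])  # made-up coefficients
bp_gy = true_beta[0] + bp_traits @ true_beta[1:]
beta_hat = fit_bp_regression(bp_traits, bp_gy)
```

Because the toy response is noise-free, `beta_hat` recovers the made-up coefficients; with real BP values the fit would also need standard errors and p-values, which this sketch deliberately omits.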
Breeding progress (BP) of grain yield and six other agronomic traits (Table 2) varied largely between years,
locations and managements and showed contrasting distributions across the 40 growing conditions (Y/L/M) (Fig. 7).
Grain number, grain yield and harvest index all have BP values above zero, while the most inconsistent traits,
namely spike number and straw dry mass at maturity (Fig. 4), have negative BP values in 13 (33%) and 17 (42%)
of the 40 growing conditions, respectively. Breeding progress for grain yield (BP_GY) showed three-fold differences
between growing conditions, ranging from 0.23 to 0.68 dt ha⁻¹ year⁻¹ with an average BP_GY of 0.37 dt ha⁻¹ year⁻¹.
In general, BP values close to 0 are more likely to show a higher p-value in the regression.
The contribution of BP for five agronomic traits (Table 2) to BP for grain yield can be validated by multi-linear
regression (3) with or without grouping of growing conditions. In most cases the regression coefficient (β) was
not significant (p-value > 0.05) and showed no pattern across the groupings (Table 3). In the case without grouping
(all), BP_Straw, BP_TGW and BP_HI were significant regressors, which collectively explained 48% of the R² in BP_GY
(Fig. 8). Note that the relative importance results should be considered together with β. The positive β of
these three traits suggested that growing conditions stimulating stronger straw growth, higher harvest index
and heavier grain of the modern cultivars led to higher BP_GY. Non-significance in β could be related to the low
number of observations (number in brackets in Fig. 8), which could reduce the degrees of freedom, or to the
co-linearity among regressors.
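To make the notion of relative importance more concrete, the sketch below implements the LMG decomposition, in which each regressor's share is its sequential increase in R² averaged over all orderings of the regressors. LMG is one of the metrics offered by relaimpo; whether it is the metric used for the results reported here is an assumption, and the data and function names are hypothetical.

```python
from itertools import permutations

import numpy as np


def r_squared(X, y, cols):
    """R^2 of an OLS fit (with intercept) of y on the chosen columns of X."""
    design = np.column_stack([np.ones(len(y))] + [X[:, c] for c in cols])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    centred = y - y.mean()
    return 1.0 - (resid @ resid) / (centred @ centred)


def lmg_importance(X, y):
    """Average sequential R^2 gain of each regressor over all orderings."""
    n_reg = X.shape[1]
    shares = np.zeros(n_reg)
    orders = list(permutations(range(n_reg)))
    for order in orders:
        used, r2_prev = [], 0.0
        for c in order:
            used.append(c)
            r2_now = r_squared(X, y, used)
            shares[c] += r2_now - r2_prev  # gain from adding regressor c
            r2_prev = r2_now
    return shares / len(orders)


# Synthetic example: the first regressor drives most of the response.
rng = np.random.default_rng(1)
X = rng.normal(size=(40, 3))
y = 2.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=40)
shares = lmg_importance(X, y)
full_r2 = r_squared(X, y, [0, 1, 2])
```

By construction the shares add up to the R² of the full model, so dividing them by `full_r2` gives percentage contributions. The loop over all orderings grows factorially with the number of regressors, which is acceptable for the five to seven regressors considered here but not for much larger models.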
The last validation of the
dataset involved nine stability indices (SI) and showed the contribution of stability in six agronomic traits plus
one pathogen trait to the stability of grain yield (GY). The six agronomic traits (Table 2) were grain protein (GP),
grain per spike (GpS), harvest index (HI), spike number (SN), thousand grain weight (TGW) and straw dry mass at
maturity (Straw). The single pathogen trait considered in this case was the total fungal infection area (TFI). Nine
SI including both static and dynamic concepts of stability were chosen: coefficient of determination (r_i^2),
coefficient of regression (b_i), deviation mean squares (S_di^2), ecovalence (W_i), environmental variance (S_xi^2), genotypic
stability (D_i^2), genotypic superiority measure (P_i), stability variance (σ_i^2), variance of rank (S_i4). Each SI was
calculated for each genotype and each trait. For each SI, a multi-linear regression was implemented (4):
$$SI_{GY} = SI_{TGW} + SI_{HI} + SI_{straw} + SI_{GpS} + SI_{SN} + TFI + \varepsilon \tag{4}$$
where all regressors of stability indices are assumed to have fixed effects, and the error term is ε. Similar to
technical validation III, the relative importance of each regressor from the multi-linear regression was quantified using
the R package relaimpo (ref. 24), and the stability indices were calculated using the R package toolStability (ref. 25).
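For readers unfamiliar with stability indices, the sketch below computes three of the nine indices from a genotype-by-environment matrix of trait means, using their textbook definitions: environmental variance (a static concept), ecovalence (a dynamic concept) and the genotypic superiority measure. The function names and the toy matrix are hypothetical, and the scaling conventions in toolStability may differ from the ones assumed here.

```python
import numpy as np


def environmental_variance(x):
    """Variance of each genotype (row) across environments (columns)."""
    return x.var(axis=1, ddof=1)


def ecovalence(x):
    """Sum of squared genotype-by-environment interaction effects per genotype."""
    interaction = (
        x
        - x.mean(axis=1, keepdims=True)  # genotype means
        - x.mean(axis=0, keepdims=True)  # environment means
        + x.mean()                       # grand mean
    )
    return (interaction ** 2).sum(axis=1)


def superiority_measure(x):
    """Mean squared distance to the best genotype in each environment, halved."""
    best = x.max(axis=0, keepdims=True)
    return ((x - best) ** 2).sum(axis=1) / (2 * x.shape[1])


# Toy trait means: 3 genotypes (rows) in 3 environments (columns).
trait = np.array([
    [5.0, 6.0, 7.0],
    [4.0, 6.0, 8.0],
    [6.0, 6.0, 6.0],
])
s2_xi = environmental_variance(trait)
w_i = ecovalence(trait)
p_i = superiority_measure(trait)
```

In this toy matrix the third genotype is perfectly stable in the static sense (environmental variance of zero) but still has a non-zero ecovalence, because it does not follow the environmental mean. The first genotype tracks the environmental mean exactly and therefore has an ecovalence of zero. This is the distinction between static and dynamic stability concepts referred to above.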
To validate the contribution of stability in seven traits to yield stability, stability indices (SI)
were calculated and multi-linear regression analyses (4) were conducted. As shown in Table 4, the regression
coefficient (β) showed a significant and positive contribution of SI_TGW and SI_HI to SI_GY. Among the nine SI, the genotypic
superiority measure (P_i) for yield was best explained by the P_i of the seven traits and showed significance in β for every
trait considered.
The contribution of regressors to R² varied between SI (Fig. 9), ranging from 61% (variance of rank; S_i4) to
94% (P_i). SI_TGW and SI_HI were the most important traits and collectively contributed at least 36% to R² across SI.
Trait           b_i         D_i^2       P_i         r_i^2       S_xi^2      S_di^2      S_i4        σ_i^2       W_i
SI (Intercept)  0.06        −3.5        −24.7 ***   −0.2 **     0.29        −0.59       −15.7 *     −0.87       −5
SI_GP           0.27 ***    0.77        −1.3 **     0.16 *      1.8 ***     0.31        −0.02       0.46        0.4
SI_GpS          0.16 ***    0.41 ***    0.64 ***    0.01        0.41 ***    0.22 **     0.18 **     0.19 *      0.22 *
SI_HI           0.48 ***    94.5 ***    109.3 ***   0.39 ***    79.2 ***    82.6 ***    0.46 ***    74.7 ***    83.3 ***
SI_SN           0.02        0.002       0.04 ***    0.08 ***    0.004       0.008       0.12        0.01        0.009
SI_Straw        −0.2 ***    −0.05       0.19 ***    0.004       −0.08 **    −0.009      0.18 *      0.008       0.005
SI_TGW          −5e-04      0.4 **      0.28 ***    −6e-04      0.05 *      0.05 *      0.11        0.05 *      0.27
TFI             0.22 ***    0.96 ***    0.84 ***    0.58 ***    0.67 ***    1.2 ***     0.29 ***    1.3 ***     1.5 ***
Table 4. Coefficients of seven regressors from multi-linear regression (4) for nine stability indices of grain yield of
220 genotypes. Star symbols refer to the significance level of the p-value of each term in (4): * significant at 5% level;
** significant at 1% level; *** significant at 0.1% level. Nine stability indices (SI): coefficient of determination
(r_i^2), coefficient of regression (b_i), deviation mean squares (S_di^2), ecovalence (W_i), environmental variance
(S_xi^2), genotypic stability (D_i^2), genotypic superiority measure (P_i), stability variance (σ_i^2), variance of rank (S_i4).
Abbreviations of seven traits: grain protein concentration (GP); grain per spike (GpS); harvest index (HI); spike
number (SN); straw dry mass at maturity (Straw); total fungal infection area (TFI); and thousand grain weight
(TGW).
In the case of P_i, 77% of the R² of SI_GY could be explained by the three main contributors: total fungal infection area
(TFI), SI_HI and SI_GpS. Interestingly, the stability of the least consistent traits, spike number and straw (Fig. 4),
together explained less than 6% of SI_GY (Fig. 9). Furthermore, we showed that the relative importance of the
stability of a trait depended on the SI. For instance, TFI explained from 1.5% (coefficient of regression; b_i) to 26% (P_i)
of the R² in SI of grain yield.
The data were processed in R (version 4.3.2). The code to reproduce the results in this publication is publicly
available at https://github.com/tillrose/BRIWECS_Data_Publication (pre-processing and visualization) and
https://github.com/Illustratien/Scientific_Data_Analyis (technical validation I–IV). Both code repositories are subject to
the MIT license (https://opensource.org/license/mit).
Received: 20 August 2024; Accepted: 18 December 2024;
Published: xx xx xxxx
1. FAOSTAT. Available at https://www.fao.org/faostat/en/#home (2023).
2. Calderini, D. F. & Slafer, G. A. Changes in yield and yield stability in wheat during the 20th century. Field Crops Research 57, 335–347, https://doi.org/10.1016/S0378-4290(98)00080-X (1998).
3. Brisson, N. et al. Why are wheat yields stagnating in Europe? A comprehensive data analysis for France. Field Crops Research 119, 201–212, https://doi.org/10.1016/j.fcr.2010.07.012 (2010).
4. Lin, M. & Huybers, P. Reckoning wheat yield trends. Environmental Research Letters 7, 24016, https://doi.org/10.1088/1748-9326/7/2/024016 (2012).
5. Schauberger, B. et al. Yield trends, variability and stagnation analysis of major crops in France over more than a century. Scientific Reports 8, 16865, https://doi.org/10.1038/s41598-018-35351-1 (2018).
6. Steensland, A. 2020 global agricultural productivity report: productivity in a time of pandemics (Thompson, T., Ed., Virginia Tech College of Agriculture and Life Sciences Global Programs, 2020).
7. Garnett, T. et al. Agriculture. Sustainable intensification in agriculture: premises and policies. Science 341, 33–34, https://doi.org/10.1126/science.1234485 (2013).
8. Tilman, D., Balzer, C., Hill, J. & Befort, B. L. Global food demand and the sustainable intensification of agriculture. Proceedings of the National Academy of Sciences of the United States of America 108, 20260–20264, https://doi.org/10.1073/pnas.1116437108 (2011).
9. Cormier, F. et al. A multi-environmental study of recent breeding progress on nitrogen use efficiency in wheat (Triticum aestivum L.). Theoretical and Applied Genetics 126, 3035–3048, https://doi.org/10.1007/s00122-013-2191-9 (2013).
10. Voss-Fels, K. P. et al. Breeding improves wheat productivity under contrasting agrochemical input levels. Nature Plants 5, 706–714, https://doi.org/10.1038/s41477-019-0445-5 (2019).
11. Sabir, K. et al. Stage-specific genotype-by-environment interactions determine yield components in wheat. Nature Plants 9, 1688–1696, https://doi.org/10.1038/s41477-023-01516-8 (2023).
12. Zetzsche, H., Friedt, W. & Ordon, F. Breeding progress for pathogen resistance is a second major driver for yield increase in German winter wheat at contrasting N levels. Scientific Reports 10, 20374, https://doi.org/10.1038/s41598-020-77200-0 (2020).
13. Rose, T. & Kage, H. The Contribution of Functional Traits to the Breeding Progress of Central-European Winter Wheat Under Differing Crop Management Intensities. Frontiers in Plant Science 10, 1521, https://doi.org/10.3389/fpls.2019.01521 (2019).
14. Lichthardt, C., Chen, T.-W., Stahl, A. & Stützel, H. Co-Evolution of Sink and Source in the Recent Breeding History of Winter Wheat in Germany. Frontiers in Plant Science 10, 1771, https://doi.org/10.3389/fpls.2019.01771 (2019).
15. Wang, T.-C., et al. Multi-environment field trials for wheat yield, stability and breeding progress in Germany. Available at https://doi.org/10.6084/m9.figshare.27910269 (Figshare, 2024).
16. Sabir, K., et al. Stage-specific genotype-by-environment interactions determine yield components in wheat. Available at https://zenodo.org/records/8248543 (Zenodo, 2023).
17. Zetzsche, H., Heinze, J., Friedt, W. & Ordon, F. Data: Breeding progress for pathogen resistance is a second major driver for yield increase in German winter wheat at contrasting N levels. Available at https://zenodo.org/records/3697514 (Zenodo, 2020).
18. Sprent, P. & Dolby, G. R. Query: The Geometric Mean Functional Relationship. Biometrics 36, 547–550, https://doi.org/10.2307/2530224 (1980).
19. Correndo, A. A., Hefley, T. J., Holzworth, D. P. & Ciampitti, I. A. Revisiting linear regression to test agreement in continuous predicted-observed datasets. Agricultural Systems 192, 103194, https://doi.org/10.1016/j.agsy.2021.103194 (2021).
20. Warton, D. I., Wright, S. T. & Wang, Y. Distance-based multivariate analyses confound location and dispersion effects. Methods in Ecology and Evolution 3, 89–101, https://doi.org/10.1111/j.2041-210X.2011.00127.x (2012).
21. Wang, T.-C., Casadebaig, P. & Chen, T.-W. More than 1000 genotypes are required to derive robust relationships between yield, yield stability and physiological parameters: a computational study on wheat crop. Theoretical and Applied Genetics 136, 34, https://doi.org/10.1007/s00122-023-04264-7 (2023).
22. Casadebaig, P. et al. Assessment of the Potential Impacts of Wheat Plant Traits across Environments by Combining Crop Modeling and Global Sensitivity Analysis. PLOS ONE 11, e0146385, https://doi.org/10.1371/journal.pone.0146385 (2016).
23. Wang, T.-C., Chen, T.-W., Casadebaig, P. & Chenu, K. Data: More than 1000 genotypes are required to derive robust relationships between yield, yield stability and physiological parameters: a computational study on wheat crop. Available at https://zenodo.org/records/7569104 (Zenodo, 2023).
24. Groemping, U. Relative Importance for Linear Regression in R: The Package relaimpo. Journal of Statistical Software 17, 1–27, https://doi.org/10.18637/jss.v017.i01 (2006).
25. Wang, T.-C. & Chen, T.-W. toolStability: Tool for Stability Indices Calculation. R package version 0.1.1. Available at https://cran.r-project.org/package=toolStability (2022).
The collection of the MET dataset was supported by the German Federal Ministry of Education and Research
(BMBF) grant no. 031A354 to W.F., H.K., J.L., F.O., R.J.S. and H.S. within the project Breeding Innovations in
Wheat for Efficient Cropping Systems (BRIWECS) as part of the funding initiative Innovative Plant Breeding in
the Production Systems (IPAS). T.-W.C. was funded by the Deutsche Forschungsgemeinschaft (German Research
Foundation, DFG) under project numbers 419973621 and 442020478. T.-W.C., R.J.S. and A.S. were funded by
the DFG under project numbers 518863370, 518783157 and 518913298, respectively. We acknowledge support by
the Open Access Publication Fund of Humboldt-Universität zu Berlin.
T.-W.C. conceived the analyses. H.K., J.L., F.O., W.F., H.S. and A.S. conceived and designed the experiments. B.W.
selected the genotypes. T.R., H.Z., A.B., H.K., J.L., C.L., F.O., R.J.S., A.S., H.S., B.W. and T.-W.C. collected the data.
T.-C.W. and T.R. pre-processed the data and maintain the data repository. T.-C.W. and T.-W.C. wrote the manuscript. All
authors helped to revise the manuscript.
Open Access funding enabled and organized by Projekt DEAL.
The authors declare no competing interests.
Supplementary information The online version contains supplementary material available at https://doi.org/
10.1038/s41597-024-04332-7.
Correspondence and requests for materials should be addressed to T.-W.C.
Reprints and permissions information is available at www.nature.com/reprints.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and
institutional affiliations.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International
License, which permits use, sharing, adaptation, distribution and reproduction in any medium or
format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the
Creative Commons licence, and indicate if changes were made. The images or other third party material in this
article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the
material. If material is not included in the article’s Creative Commons licence and your intended use is not
permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from
the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
© The Author(s) 2025