Tree Genetics & Genomes

Published by Springer Nature
Online ISSN: 1614-2950
Print ISSN: 1614-2942
Recent publications
Known geographical distribution of focal species, a Pinus pungens and b P. rigida (Little 1971), in relation to populations sampled (black dots) for genetic analysis; phenotypic characterization of each species was illustrated by Pierre-Joseph Redouté (Michaux 1819)
Measures of genetic differentiation and diversity among sampled trees of P. pungens and P. rigida: a Principal components analysis of 2168 genome-wide single-nucleotide polymorphisms (SNPs) for Pinus pungens (blue, left side of PC1) and P. rigida (orange, right side of PC1); b log-likelihood values across ten replicate runs in fastSTRUCTURE for K = 2 through K = 7; c results of averaged K = 2 ancestry (Q) assignments for each sample arranged latitudinally in each species
Redundancy analysis (RDA) of the multilocus genotypes for each tree with climate and geographic predictor variables (full model). Direction and length of arrows on each RDA plot correspond to the loadings of each variable
Hypotheses associated with each SDM-GCM model prediction versus the ensemble SDM prediction based on relative grid cell counts of high habitat suitability (> 0.5) for P. rigida, P. pungens, and overlap across four time periods (LIG, LGM, HOL, and PD). Bolded text indicates statements supported by the best-fit model of demographic inference
The best-fit model (PSCMIGCs) and unscaled parameter estimates from ∂a∂i analysis. Time intervals (Ti) are represented in millions of years and associated with lineage population sizes (Ni) and a specific rate of symmetrical gene flow (Mi)
Long-lived species of trees, especially conifers, often display weak patterns of reproductive isolation, but clear patterns of local adaptation and phenotypic divergence. Discovering the evolutionary history of these patterns is paramount to a generalized understanding of speciation for long-lived plants. We focus on two closely related yet phenotypically divergent pine species, Pinus pungens and P. rigida, that co-exist along high elevation ridgelines of the southern Appalachian Mountains. In this study, we performed historical species distribution modeling (SDM) to form hypotheses related to population size change and gene flow to be tested in a demographic inference framework. We further sought to identify drivers of divergence by associating climate and geographic variables with genetic structure within and across species boundaries. Population structure within each species was absent based on genome-wide RADseq data. Signals of admixture were present range-wide, however, and species-level genetic differences associated with precipitation seasonality and elevation. When combined with information from contemporary and historical species distribution models, these patterns are consistent with a complex evolutionary history of speciation influenced by Quaternary climate. This was confirmed using inferences based on the multidimensional site frequency spectrum, where demographic modeling inferred recurring gene flow since divergence (2.74 million years ago) and population size reductions that occurred during the last glacial period (~ 35.2 thousand years ago). This suggests that phenotypic and genomic divergence, including the evolution of divergent phenological schedules leading to partial reproductive isolation, as previously documented for these two species, can happen rapidly, even between long-lived species of pines.
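As a brief illustration of the data structure behind this kind of demographic inference, a joint (two-population) site frequency spectrum simply tallies SNPs by their allele counts in each species. The sketch below is a toy example in plain Python, not the ∂a∂i interface and not the authors' pipeline; the function name `joint_sfs`, the input layout, and the example counts are assumptions made purely for illustration.

```python
def joint_sfs(counts_pop1, counts_pop2, n1, n2):
    """Tally SNPs into an (n1 + 1) x (n2 + 1) joint site frequency spectrum.

    counts_pop1[i] and counts_pop2[i] are the numbers of copies of the
    counted allele at SNP i among n1 and n2 sampled chromosomes.
    (Hypothetical input layout, chosen for illustration only.)
    """
    if len(counts_pop1) != len(counts_pop2):
        raise ValueError("both populations must cover the same SNPs")
    sfs = [[0] * (n2 + 1) for _ in range(n1 + 1)]
    for c1, c2 in zip(counts_pop1, counts_pop2):
        if not (0 <= c1 <= n1 and 0 <= c2 <= n2):
            raise ValueError("allele count outside the sample size")
        sfs[c1][c2] += 1  # one more SNP with this pair of counts
    return sfs


if __name__ == "__main__":
    # Made-up counts for four SNPs, two chromosomes sampled per species.
    pungens = [1, 1, 0, 2]
    rigida = [0, 0, 1, 2]
    for row in joint_sfs(pungens, rigida, 2, 2):
        print(row)
```

Entries far from the diagonal correspond to SNPs whose frequencies differ strongly between the two species; demographic models are compared by how well they reproduce the whole matrix.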
a Geographic distribution of the accessions sampled in Galicia, in the Northwest of Spain. The numbers represent several ranges, basins or natural parks. 1: Ancares Range, 2: Courel Range, 3: Eixe Range, 4: Segundeira Range, 5: Queixa Range, 6: Maceda Basin, 7: Faro Range, 8: Fragas do Eume Natural Park, 9: Xistral Range. b Geographical distribution of the genetic clusters identified in Galicia considering 15 populations sampled in stands across this region. The black points are the sampled populations. The Atlantic, Cantabrian and Western Mediterranean clusters defined in Fernández-Cruz and Fernandez-López (2016) are represented by the blue, green and orange colors, respectively
Genetic distances of the 51 varieties analyzed in this study and geographic distribution of the accessions sampled in Galicia taking into account the UPGMA dendrogram. a UPGMA dendrogram based on Nei’s distance including the 51 varieties. Varieties in black are cultivated at least in Galicia; varieties in grey are located in other cultivation areas; b geographic distribution of the sampled accessions of 42 varieties cultivated in Galicia according to the genetic clusters provided by the UPGMA dendrogram. The varieties included in the genetic clusters G1a, G1b, G2a and G2b are represented by black circles, squares, diamonds and triangles, respectively. Rub-DM-Br = Rubias-De Monterrei-Branca; Zaj-RdR-Joa = Zajorilas-Rubia del Real-Joaquinas; Marradi = Marrone di Marradi; Goujounac = Marron de Goujounac; Gal = variety cultivated in Galicia; Gal-Bie = variety cultivated in Galicia and El Bierzo; Gal-Por = variety cultivated in Galicia and Portugal; CIP = variety cultivated in the Central Iberian Peninsula; Fra = variety cultivated in France; Ita = variety cultivated in Italy
Genetic classification of 51 Castanea sativa varieties. a UPGMA dendrogram based on Nei’s distance including 51 chestnut varieties; b assignment in STRUCTURE at K = 4 of 51 chestnut varieties with 435 reference samples; c with reference samples from Atlantic, Basque Country and Mercurín populations; d with reference samples from Atlantic, Basque Country and Central Iberian populations; e with reference samples from Atlantic, Basque Country and Italian populations; f with reference samples from Atlantic, Basque Country and Greek populations. Considering the classification with 435 reference samples, varieties in blue belong to the Atlantic cluster. Varieties in green belong to the Cantabrian cluster. Varieties in orange belong to the Western Mediterranean cluster. Varieties in grey are hybrids among all these groups. Rub-DM-Br = Rubias-De Monterrei-Branca; Zaj-RdR-Joa = Zajorilas-Rubia del Real-Joaquinas; Marradi = Marrone di Marradi; Goujounac = Marron de Goujounac; Gal = variety cultivated in Galicia; Gal-Bie = variety cultivated in Galicia and Bierzo; Gal-Por = variety cultivated in Galicia and Portugal; ATL = Atlantic; CAN = Cantabrian; WMED = Western Mediterranean; EMED = Eastern Mediterranean; BC = Basque Country; MER = Mercurín; CIP = Central Iberian Peninsula; Fra = France; ITA = Italy; GRE = Greece
Geographic distribution of 42 sweet chestnut varieties (Castanea sativa) cultivated in the Northwest of the Iberian Peninsula and their assignment in STRUCTURE to a Atlantic cluster, b Cantabrian cluster, c Atlantic-Cantabrian hybrids, d Western Mediterranean cluster, e Atlantic-Western Mediterranean hybrids, f Cantabrian-Western Mediterranean hybrids, g Atlantic-Cantabrian-Western Mediterranean hybrids, h Cantabrian-Eastern Mediterranean hybrids, i Western-Eastern Mediterranean hybrids. Pie charts show the membership probability of each variety to each cluster at K = 4. Blue refers to the Atlantic cluster, green to the Cantabrian cluster, orange to the Western Mediterranean cluster and red to the Eastern Mediterranean cluster. Afo = Afonso; Aml = Amarela; Ama = Amarelante; Am3 = Amarelante 3; Ana = Anaxa; Ber = Bermella; B-C = Bermella-Castelá; Bla = Blanca; Cal = Calva; Cam = Campilla; Car = Carrelao; DSa = De Sangre; DVe = Das Verdes; Fam = Famosa; Gar = Garrida; Inx = Inxerta; Lar = Laruda; Lon = Longal; Lou = Loura; Lug = Luguesa; Neg = Negral; N-P = Negra-Patacuda; Pac = Paciao; Par = Parede; PdA = Puga de Afora; PdB = Puga do Bolo; Pon = Ponteareas; Poñ = Porteliña; Por = Portugués; Pre = Presa; Rai = Raigona; Rañ = Rañuda; Rap = Rapada; Vch = Vilachá; RMB = Rubias-De Monterrei-Branca; Ser = Serodia; Se2 = Serodia 2; Ven = Ventura; Ver = Vérdea; Vea = Verea; Vil = Villafranquina; Xud = Xudía
a Factorial correspondence analysis (FCA) to classify 51 varieties of C. sativa into the four predefined clusters. Western Mediterranean populations were split into several populations: Mercurín, El Tiemblo and Hervás and Pellice and Petralia; b zoom to the area where most of the varieties were plotted. R-DM-Br: Rubias-De Monterrei-Branca; ZRJ: Zajorilas-Rubia del Real-Joaquinas
Sweet chestnut is a valuable species, highly managed for centuries for nut and wood production, whose genetic structure has been affected by translocations. In this study, we selected a total of 51 genetically different clonal varieties from Galicia (NW of the Iberian Peninsula), the Central Iberian Peninsula, France and Italy that were genotyped at 9 microsatellites. Almost all Galician varieties include at least two accessions with the same genotype. Several datasets of reference samples, from 29 natural or naturalized populations, were used to classify them into several groups. Genetic distances among varieties reflected their cultivation area. Almost all Galician varieties cultivated in orchards were grouped in a single cluster, except for ‘Famosa’, ‘Longal’, ‘Garrida’ and ‘Presa’, which were assigned to the Central Iberian group, and ‘Luguesa’ and ‘Carrelao’, which were assigned to the French-Italian varieties. The Bayesian analysis with reference samples identified a group of varieties that could be autochthonous in Galicia because they were assigned to the Atlantic or the Cantabrian cluster. Other varieties from the Galician inner mountains that belong to the Mediterranean cluster could have been translocated because this gene pool was found previously in several populations in the Iberian and Italian Peninsulas. Additionally, a large number of hybrid varieties between the Western Mediterranean cluster and the Atlantic or the Cantabrian cluster were found. Further analysis indicated that these Mediterranean varieties could have originated in Mercurín, in the Central Iberian or Italian Peninsulas, and that ‘Luguesa’ and ‘Puga de Afora’ could have been translocated from France or Italy. The results of this work provide valuable information for a more efficient use of sweet chestnut genetic resources.
Map (a) of the interspecific seed orchard for hybrid larch, which included five blocks (BL1 to BL5). b–e show the locations of planted mother trees (Larix gmelinii var. japonica) and planted father trees (L. kaempferi). Open circles represent mother trees, black dots represent chosen mother trees, and gray dots represent chosen candidate father trees. Crosses represent father trees from which DNA could not be extracted owing to withering or other factors. The black and white segments of the pie graph represent assigned and unassigned progenies, respectively
Distribution of paternal contributions of the 59 assigned paternal clones in the seed orchard
Pollen dispersal patterns and male reproductive success are crucial factors for seed orchard management. Information on pedigree reconstruction is also important for backward selection in selecting potential superior paternal clones in an open-pollinated seed orchard. The breeding without breeding (BwB) strategy has been shown to be helpful in avoiding expensive and laborious controlled mating, and in achieving genetic gain without making any artificial crosses. Although it is known that the efficiency of BwB depends on the scale of evaluated materials and the accuracy of paternity assignment, empirical data regarding the application of this strategy for various seed orchard designs are limited. In this study, we performed paternity analysis of 360 progenies derived from a 15-year-old hybrid larch test plantation (Larix gmelinii var. japonica × L. kaempferi), with candidate male parents (L. kaempferi) located within 50 m of 11 mother trees (L. gmelinii var. japonica) from an open-pollinated seed orchard. We then examined the pollen dispersal pattern within the seed orchard using molecular marker data and statistical modeling. We were able to assign 59 fathers to 57% of all progenies by paternity analysis using SSRs within a 50-m radius, and the mean distance of pollen dispersal was 42.2 m. We evaluated the performance of 17 paternal clones with at least four progenies based on the volume genetic gain of the progeny. As a result, the top two superior clones (average volume genetic gain: 32%) are expected to be candidates for producing new F1 cultivars.
Millions of lodgepole pine seedlings are planted each year to replace losses due to harvest or large-scale natural disturbances such as fires and forest pests. In Canada, replacement seeds and seedlings used for reforestation are often regulated by explicit policies. For example, in the province of Alberta, seedlings must be grown from seeds collected within a strictly defined zone that includes the harvested area where the seedlings will be planted. Thus, traceability along the entire reforestation chain of custody, from seed collection to seedling outplanting, is vital to ensure policy compliance. Here, we report a case study in which we used genomic tools to determine if seedlings were sown from a contaminated seed source. The 165,000 seedlings under scrutiny were scheduled for deployment the same year in which the seeds were sown, necessitating fast processing to make decisions on deployment. The scenario was made more complex by the fact that most of the potentially contaminated seed sources represented wild-collected genetic material in close geographic proximity to each other, rather than pedigree genetic material. With genotyping data obtained from a high-density single nucleotide polymorphism array analyzed with clustering analyses, kinship estimations, and genetic assignment tests, we were able to determine the probable seed source of this suspect group of seedlings and make data-guided recommendations on whether these seedlings could be confidently deployed onto the landscape without violating policy guidelines. This case study demonstrates the unique utility of molecular markers to confidently assign seedlings to a non-pedigree parent seed source originating within a limited geographic range, thereby ensuring traceability within a reforestation pipeline.
Haplotype distribution map for common ash (Fraxinus excelsior) across its natural distribution range (grey) (Pliûra and Heuertz et al. 2003). Most of the data (n = 1280 individuals) is taken from the work by Heuertz et al. (2004a, b). The northern data primarily comes from 830 individuals analysed by Tollefsrud et al. (2016), whereas the data across Britain is mainly based on 498 individuals analysed by Sutherland et al. (2010). Finally, the data across Ireland is mainly based on 344 individuals analysed in the current work. Haplotypes marked with asterisks are found in Ireland. Pie chart size is proportional to the number of individuals analysed for a given population
Pairwise FST comparisons between sampling populations. Estimates are based on the nSSR data. Values which are significantly different from zero are marked with asterisks (*p ≤ 0.05; **p ≤ 0.01; ***p ≤ 0.001)
a Frequency of STRUCTURE groups within each common ash (Fraxinus excelsior) population sampled in Ireland, inferred under the LOCPRIOR model. b Bar plots of admixture coefficients for each sampled individual. c Spatially interpolated TESS3 groups. The number of groups was selected using the cross-validation scores (right), which are the root mean-squared errors between the genotypic matrix and the genotype likelihood matrix (for each locus) across replications
Principal coordinates analysis (PCoA) of the allelic composition of each population. The first and second principal coordinates are shown. Label colouring is according to the STRUCTURE group (K = 2) which is most frequent in each population
a Denuded canopies caused by ash dieback disease in trees sampled in Garryland Wood, Co. Galway. b Mean ash dieback score assigned to each sampling population (error bars are ± SD)
A large proportion of the western marginal range for common ash (Fraxinus excelsior L.) is located on the island of Ireland. However, the molecular diversity of common ash in Ireland has only been studied in a limited number of populations and using mainly non-standard chloroplast and nuclear simple sequence repeat (SSR) markers. This has prevented direct comparisons with studies on the rest of the species’ range across Europe. Here, four chloroplast and six nuclear SSR markers were used to infer the genetic diversity from 347 trees sampled across 20 populations. Results confirmed that, like Britain, Ireland is dominated by one main haplotype (H04) which originates from an Iberian glacial refugium. The occurrence of a second, rarer haplotype (H13) that also occurs as a rare haplotype in Britain but nowhere else, suggests at least some post-glacial recolonisation from the east. Chloroplast allelic richness was similar to Norway, which constitutes the species’ northern marginal range, but lower than in Britain and the European average. Nuclear allelic richness was also comparable with Norway, but Irish common ash differed in a complete absence of sub-population structure and geographic variability at both the chloroplast and nuclear level. Analysis of nuclear genetic structure indicated that common ash in Ireland mainly comprises one genetic group which is likely part of a single, western European meta-population. However, a less frequent genetic cluster is hypothesised to represent a mix of non-native alleles from imported plantation ash. Finally, conservation recommendations and the consequences of a uniform and low genetic diversity are discussed in the context of ash dieback disease, which was present in all populations sampled here.
Proportion of the different interaction variances (Af × S, Am × S, A, and D × S) in the total G × S interaction variance for height at different ages
Proportion of the different interaction variances (Af × S, Am × S, A, and D × S) in the total G × S interaction variance for circumference at different ages
Proportion of the different interaction variances (Af × S, Am × S, A, and D × S) in the total G × S interaction variance for bark thickness at different ages
The objective of this study was to better understand the underlying gene action in eucalyptus, under different plantation densities, for a diverse set of traits: growth, bark thickness, ecophysiological, and wood chemical property traits. We estimated the magnitude and relative proportion of the various genetic variance components using a eucalyptus genotype by spacing (G × S) interaction experiment. A clonally replicated progeny test including 888 clones belonging to 64 full-sib families of the Eucalyptus urophylla × Eucalyptus grandis hybrid was used to estimate genetic parameters, with genomic information used to estimate the relationship matrix. Two densities (833 and 2500 trees/ha) were used, representing contrasting environments in terms of the resources available to individual trees. Results showed that for height and circumference, additive-by-spacing (A × S) interaction variance increased from 18 to 55 months of age, while dominance-by-spacing (D × S) interaction variance decreased. For bark thickness, specific leaf area, nitrogen, calcium, and magnesium, A × S interaction variance was preponderant. For wood chemical properties, except Klason lignin, additive genetic effects interacted more strongly with spacing than non-additive effects did.
Plots for phenotypic characterization. Temporal witches’ broom disease progression from 2015 to 2019 with standard deviation whiskers at each evaluation point for (a) proportion of plants with any symptom (SINT), the proportion of plants with terminal brooms (TB), the proportion of plants with flower cushion brooms (FCB), and disease index (DI). (b) Progression of WBD severity in the quantitative traits: number of terminal brooms (NTB), number of flower cushion brooms (NFCB), total number of brooms (TNB), and length of terminal brooms (LTB). (c) Correlation heatmap among six mapped traits (p < 0.01). (d) Histogram showing NFCB values for evaluation years and p-values of pairwise mean comparisons (top brackets)
High-density linkage map representation shows ten linkage groups corresponding to each cacao chromosome. At the top, there is a scale in centiMorgan (cM). To the left, a density bar depicts higher marker density areas (red, 3.33 marker/cM) to lesser marker density regions (blue, 0.033 marker/cM)
(a) Genetic map recombination rate distributed over the 10 LGs on the physical map of the cacao genome (Argout et al. 2017). Scaffold successions are illustrated in gray and arranged at the bottom and right. The highest recombination rates are colored dark red and located around the centromere region. Segregation distortion is shown on the right for each LG. (b) The segregation distortion graph of linkage groups one and four shows low frequencies for the homozygous “a” and “b” alleles, respectively. The frequency of the heterozygous allele “h” is the one expected
Interval mapping plot for all 6 traits and the 10 linkage groups. Average data from 5 years. The black dashed line shows the average LOD score threshold for all characters at 0.95 (about 3.51), and the red dashed line indicates the average LOD score threshold at 0.99 (about 4.23). Significant peaks can be observed, surpassing the thresholds mentioned above. SINT: presence or absence of any WBD symptoms; TB: presence or absence of terminal brooms; NTB: number of terminal brooms; FCB: presence or absence of floral cushion broom; NFCB: number of flower cushion brooms; TNB: total number of all types of brooms; LTB: length of terminal broom; DI: disease index; T. diam: Trunk diameter
QTL estimated intervals (MIM) are shown alongside the genetic linkage groups (number in the top). Six traits were mapped and represented by a color (in parentheses): SINT (chocolate), NTB (fuchsia), NFCB (red), TNB (black), DI (purple), and trunk diameter (orange). For comparison purposes, three QTLs detected by Brown et al. (2005) were located physically (using the reference Criollo genome) and included in this linkage genetic map (QTL1.1 and QTL9.1 for the total number of brooms: lime color, QTL9.2 for trunk diameter, green color)
The genetic architecture of resistance to witches’ broom disease of cacao (WBD) was reexamined in an F2 population (Sca-6 × ICS-1), addressing symptom-specificity and the possible genetic basis for the differences in disease scores from terminal and cushion brooms. A high-density genetic linkage map was constructed with 494 individuals and 2968 SNPs, obtaining 10 linkage groups comprising 1595 centiMorgans. The trees were evaluated under field conditions with high WBD pressure from 2015 to 2019, with low spatial autocorrelation tested by Moran’s I. Five WBD symptoms and one tree growth trait were mapped, resulting in 23 minor-effect QTLs, primarily arranged in clusters and distributed in all linkage groups except 4 and 6, indicating that WBD resistance has a polygenic inheritance. Terminal and cushion brooms shared a genomic region in linkage group 9, suggesting pleiotropy. Under these conditions, the ICS-1 grandparent contributed more QTLs than Sca-6 to WBD resistance, indicating that the resistance pattern has changed and confirming the susceptible parent’s importance. Few QTLs were identified in the same or proximal loci when comparing the 5-year, annual, or biennial periods. Several candidate genes such as glutathione peroxidases, threonine-serine receptors, and endochitinases were potentially associated with WBD resistance. These findings strongly suggest that WBD resistance is more complex than previously postulated, and future directions are suggested to further investigate and improve insight into WBD resistance.
Scatter plots showing the dispersion of hybrid means for DBH against the number of ramets of the corresponding hybrid with no signs of attack by Septoria canker, at age 3. The main plot is divided into four sub-plots, according to the nothospecies studied (DxN, TDxD, TDxTD, and TxD). The code of each hybrid taxon is explained in the text. The continuous vertical and horizontal red lines mark the first (lower) and third (upper) quartiles. The dashed vertical and horizontal red lines mark the mean values of the full data set. The clonal means were estimated by including all living trees, regardless of the level of attack
Scatter plots showing the dispersion of hybrid means for DBH against the number of ramets of the corresponding hybrid with no signs of attack by Septoria canker, at age 4. The main plot is divided into four sub-plots, according to the nothospecies studied (DxN, TDxD, TDxTD, and TxD). The code of each hybrid taxon is explained in the text. The continuous vertical and horizontal red lines mark the first (lower) and third (upper) quartiles. The dashed vertical and horizontal red lines mark the mean values of the full data set. The clonal means were estimated by including all living trees, regardless of the level of attack
Scatter plots showing the dispersion of hybrid means for DBH against the number of ramets of the corresponding hybrid with no signs of attack by Septoria canker, at age 5. The main plot is divided into four sub-plots, according to the nothospecies studied (DxN, TDxD, TDxTD, and TxD). The code of each hybrid taxon is explained in the text. The continuous vertical and horizontal red lines mark the first (lower) and third (upper) quartiles. The dashed vertical and horizontal red lines mark the mean values of the full data set. The clonal means were estimated by including all living trees, regardless of the level of attack
Least square means (LSM) for diameter at the breast height (DBH in cm) and for each group of cloned hybrids under the same level of attack. The scored levels of attack were different for each year of assessment
Scatter plots showing the dispersion of the best linear unbiased predictions (BLUPs) for clonal effects obtained at age 5 by using the reduced model (x-axis), which excludes the levels of Septoria attack, and by using the full model (y-axis), which includes the levels of Septoria attack. The main plot is divided into four sub-plots, according to the nothospecies studied. The code of each hybrid taxon is explained in the text
A clonal trial, including 124 hybrid poplars, was planted in the center of Chile in 2014, and Septoria canker was detected in 2016. We propose a new approach to analyzing the relationship between fungal attack and growth. As an example, we report the analysis of the diameter of trees growing in the presence of Septoria canker for three consecutive years. We fitted two linear models: the original (reduced) model of the trial and an alternative (full) model that added the level-of-attack effect. We compared variances of additive random effects and their interaction with the level of fungal attack, genetic parameters (heritability and environmental variances), and BLUPs of the clonal values. The focus was on modeling the intra-clonal covariation that depends on the interaction between the genetic and micro-environmental effects. Our results show that the most severe fungal attack occurred in trees with the largest growth in diameter. Including the level-of-attack effect in the full model produced significant changes in the estimation of genetic parameters at age 5. We observed a genotype-by-micro-environment interaction at ages 3 and 4. We conclude that including the level of Septoria attack in the modeling of genetic parameters for diameter growth of poplar hybrids is a way to correct the prediction of the clonal value of each hybrid planted in a trial. Using a full model that included the pathogen effect allowed a better prediction (BLUP) of the clonal worth.
of breeding materials with ms1–1 and ms1–2. The dark-gray prefectures marked with large roman numerals are prefectures where MAS was performed in this study (I, Akita; II, Iwate; III, Fukushima; IV, Gunma; V, Chiba; VI, Tokyo; VII, Kanagawa; VIII, Nagano; IX, Aichi; X, Ishikawa; XI, Mie; XII, Wakayama). Numbers in parentheses indicate sample size. Light-gray prefectures marked with small roman numerals are prefectures where MAS was reported by Moriguchi et al. (2020) (i: Miyagi, ii: Yamagata, iii: Niigata, iv: Shizuoka, v: Tottori, vi: Kumamoto). In some prefectures, a few breeding materials were investigated by Hasegawa et al. (2021). The bold font shows the trees selected for this study. The superscripts indicate trees selected in previous studies; a Hasegawa et al. (2021), b Moriguchi et al. (2020)
Needle tissues for 10-individual mixed bulk DNA sample extraction. a Needle tissues from 10 individuals. b Tube containing 10-individual mixed bulk sample. Scale bars = 5 mm
Flow chart of the blind test for genotyping
Peaks for ms1–1, ms1–2, and Ms1 detected using bulk DNA or single-individual DNA. The vertical axis and horizontal axis represent the fluorescence intensity and the fragment length of PCR products, respectively. Red triangle: peak of ms1–1 (mutant allele); yellow triangle: peak of ms1–2 (mutant allele); black triangle: peak of Ms1 (wild type allele). The CJt020762_ms1–1 and CJt020762_ms1–2 markers were used to detect the ms1–1 and ms1–2 alleles, respectively
Cost comparison of MAS using 10-individual mixed bulk DNA samples and single-individual DNA to select trees with ms1. MAS was performed on breeding materials from Akita, Iwate, Fukushima, Gunma, Chiba, Tokyo, Kanagawa, Nagano, Aichi, Ishikawa, Mie, and Wakayama Prefectures (866 trees). The cost of labor for one person was set to US$8.11/h. Operating time of instruments (e.g., PCR, sequencer and electrophoresis) was not included
Recently, a candidate gene (CJt020762) for MALE STERILITY 1 (MS1) in Cryptomeria japonica has been identified, which made it possible to perform accurate selection of trees with mutant alleles (ms1–1 or ms1–2). Marker-assisted selection (MAS) is an effective method for drastically reducing the time required for a breeding cycle; however, a larger sample size for selection increases the labor and cost of analysis. In this study, firstly, we developed an efficient and low-cost marker selection method using bulk DNA extracted from a mixture of needle tissues from several individual trees. The time required for the extraction of bulk DNA, the accuracy of target peak identification in fragment analysis, and the numbers of samples required to identify trees with ms1 were compared for 3-, 5-, 7-, and 10-individual mixed bulk DNA samples. The results showed that MAS using 10-individual mixed bulk DNA samples was the most efficient and lowest cost for selecting trees with ms1. The accuracy of genotyping using 10-individual mixed bulk DNA samples was verified by conducting a blind test consisting of sample preparation, extraction of bulk DNA, and genotyping under blind conditions (i.e., all researchers were unaware of the correct genotype of the samples). Next, we tried selecting trees with ms1 from 866 breeding materials by MAS using 10-individual mixed bulk DNA samples. We successfully selected nine previously untested trees that were heterozygous for MS1. Finally, we showed that the use of bulk DNA in MAS enabled significant reductions in labor and cost by comparing the approaches using bulk DNA samples with single-individual (non-bulk) DNA samples, although it should be noted that the efficiency of selection depends on the proportion of samples with a target allele.
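The closing caveat, that efficiency depends on the proportion of samples carrying the target allele, can be made concrete with a small calculation. The sketch below assumes a simple two-stage scheme (one reaction per bulk, then individual retesting of every tree in a bulk that shows a mutant peak), independent carriers, and error-free detection; the abstract does not state the authors' exact retesting procedure, so the function `expected_tests_per_tree` and the carrier frequencies used are hypothetical.

```python
def expected_tests_per_tree(pool_size: int, carrier_freq: float) -> float:
    """Expected genotyping reactions per tree under two-stage pooling.

    Stage 1 runs one reaction per bulk of `pool_size` trees; stage 2 retests
    each tree individually only if the bulk is positive. Assumes carriers
    occur independently with probability `carrier_freq` (an illustrative
    assumption, not a value from the study).
    """
    if pool_size < 1:
        raise ValueError("pool_size must be at least 1")
    if not 0.0 <= carrier_freq <= 1.0:
        raise ValueError("carrier_freq must lie between 0 and 1")
    if pool_size == 1:
        return 1.0  # no pooling: exactly one reaction per tree
    p_bulk_positive = 1.0 - (1.0 - carrier_freq) ** pool_size
    return 1.0 / pool_size + p_bulk_positive


if __name__ == "__main__":
    for freq in (0.01, 0.05, 0.20, 0.50):  # hypothetical carrier frequencies
        for k in (3, 5, 7, 10):            # bulk sizes compared in the study
            cost = expected_tests_per_tree(k, freq)
            print(f"carrier freq {freq:.2f}, bulk of {k}: {cost:.3f} reactions per tree")
```

Values below 1 mean pooling saves reactions relative to testing every tree individually; as the carrier frequency rises, most bulks turn positive and the advantage disappears.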
The evolutionary trajectory of a population both influences and is influenced by characteristics of its genome. A disjunct population, for example, is likely to exhibit genomic features distinct from those of continuous populations, reflecting its specific evolutionary history and influencing future recombination outcomes. We examined genetic diversity, population differentiation and linkage disequilibrium (LD) across the highly disjunct native range of the Australian forest tree Eucalyptus globulus, using 203,337 SNPs genotyped in 136 trees spanning seven races. We found support for four broad genetic groups, with moderate FST, high allelic diversity and genome-wide LD decaying to an r² of 0.2 within 4 kb on average. These results are broadly similar to those reported previously in Eucalyptus species and support the ‘ring’ model of migration proposed for E. globulus. However, two of the races (Otways and South-eastern Tasmania) exhibited a much slower decay of LD with physical distance than the others and were also the most differentiated and least diverse, which may reflect the effects of selective sweeps and/or genetic bottlenecks experienced in their evolutionary history. We also show that FST and rates of LD decay vary within and between chromosomes across all races, suggestive of recombination outcomes influenced by genomic features, hybridization or selection. The results obtained from studying this species serve to illustrate the genomic effects of population disjunction and further contribute to the characterisation of genomes of woody genera.
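For orientation, the r² statistic behind an LD decay curve can be approximated from unphased data as the squared Pearson correlation between genotype dosages (0, 1 or 2 copies of an allele) at two SNPs. The sketch below is a minimal illustration of that approximation; the abstract does not say which estimator the authors used, and the function name `genotype_r2` and the example genotypes are assumptions.

```python
def genotype_r2(dosages_a, dosages_b):
    """Squared Pearson correlation between genotype dosages at two SNPs.

    Each list holds one value per tree (0, 1 or 2 copies of the counted
    allele). This is a common stand-in for r^2 when phase is unknown; it is
    not claimed to be the estimator used in the study.
    """
    n = len(dosages_a)
    if n != len(dosages_b) or n < 2:
        raise ValueError("need two equally long lists with at least 2 trees")
    mean_a = sum(dosages_a) / n
    mean_b = sum(dosages_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(dosages_a, dosages_b)) / n
    var_a = sum((a - mean_a) ** 2 for a in dosages_a) / n
    var_b = sum((b - mean_b) ** 2 for b in dosages_b) / n
    if var_a == 0 or var_b == 0:
        raise ValueError("r^2 is undefined for a monomorphic SNP")
    return cov * cov / (var_a * var_b)


if __name__ == "__main__":
    # Made-up genotypes for five trees at two SNPs.
    snp1 = [0, 1, 2, 1, 0]
    snp2 = [0, 1, 2, 2, 0]
    print(f"r^2 = {genotype_r2(snp1, snp2):.3f}")
```

Plotting such pairwise values against the physical distance between SNPs, and noting where the fitted curve drops below a threshold such as 0.2, gives the kind of decay distance quoted in the abstract.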
Lignin biosynthesis occurs via the phenylpropanoid pathway and is regulated by transcription factors (TFs) including R2R3-MYB family members. In this study, we functionally characterized the role of the R2R3-MYB TF VcMYB4a from blueberry (Vaccinium corymbosum) in the lignin biosynthetic pathway. Phylogenetic analysis indicated that VcMYB4a clusters in a subclade with other TFs that act as transcriptional repressors of lignin and phenolic acid biosynthesis. Furthermore, lignin accumulation appeared to be negatively correlated with VcMYB4a expression during fruit development. Heterologous expression of VcMYB4a repressed lignin accumulation in Arabidopsis. Overexpression of VcMYB4a decreased lignin content in blueberry calli, whereas inhibition of VcMYB4a expression increased lignin accumulation in blueberry leaves. Finally, transcriptome sequencing showed that overexpressing VcMYB4a in blueberry calli downregulated the expression of Vc4CL (Vc4CL5 and Vc4CL7), VcCOMT (VcCOMT1 and VcCOMT2), and VcCAD (VcCAD1 and VcCAD9) genes involved in the lignin biosynthetic pathway. Heterologous expression of VcMYB4a in Arabidopsis downregulated the expression of genes including AtC4H, At4CL (At4CL1 and At4CL5), AtCAD (AtCAD5 and AtCAD9), and AtCOMT1. The promoter sequences of these genes all contain MYB binding sites, and the VcCAD9 and AtCAD9 genes have the most MYB binding sites. In addition, VcCAD9 is more closely related to AtCAD9 than to other CAD homologs from blueberry and Arabidopsis according to phylogenetic analysis. These findings suggest that VcMYB4a functions as a repressor of lignin biosynthesis by downregulating expression of 4CL, COMT, and CAD family members, especially CAD9 homologs. Our studies provide prospects for breeding new blueberry varieties with high lignin contents.
Date palm orchards in the South of Iran, Khuzestan
Breeding goals in date palm and a variety of biotechnological methods to achieve these goals
The procedure of producing synthetic seed in date palm; shoot tip explants, callus tissue formations, induction of somatic embryogenesis, isolated embryoids mixed with alginate solution, encapsulated embryoids, beads cultivated on vermiculite, and germination and planting in the tray
Schematic showing the stepwise process of developing sex-specific PCR-based markers in date palm: DNA extraction and gel analysis, using various PCR-based markers, detecting sex-specific markers, purifying the slice of gel containing the marker, transferring the ligation product to bacteria, isolation of the recombinant plasmid, Sanger sequencing and sequence reconstruction, and BLAST analysis
Date palm (Phoenix dactylifera L.), a monocotyledonous species of the Arecaceae family, is widely cultivated in the arid regions of the Middle East and North Africa. Because of the prolonged generation cycle, the dioecious nature of date palm trees, and high heterozygosity, traditional breeding approaches in date palm are lengthy and laborious, and numerous crosses and back-crosses have led to little tangible advancement in date palm breeding. In recent years, the powerful potential of biotechnology has been considered for resolving fundamental difficulties associated with date palm breeding. Plant tissue culture, an important application of biotechnology, is an essential tool for vegetative propagation and a prerequisite for genetic modification. Genomic studies and molecular tools are integrated with modern plant breeding programs for precise determination of genetic diversity, identification of desired traits, germplasm conservation, and genetic drift control. This technology clarifies how the genome works in a specific evolutionary or environmental condition, determines relationships between genes, identifies the role of coding and non-coding parts of the genome, and identifies key points in regulating evolutionary processes and responding to plant internal and external factors. A comprehensive and clear picture of functional genomics paves the way for plant genetic engineering to improve the desired traits. This review surveys the recent approaches and applications of biotechnology in date palm breeding.
In the present study, phenotypic correlations and direct and indirect effects were estimated in a breeding population of cacao involving 22 full-sib families from 14 reciprocal and 8 direct crosses, with the aim of increasing selection efficiency for higher production. Path analysis was used to obtain estimates at the family level, within families, and at the individual level. Bean dry weight per tree was strongly correlated with the total number of pods per tree and with frosty pod rot incidence at the family level (r = 0.91 and −0.84, respectively; p < 0.001) and at the individual level (r = 0.89 and −0.50, respectively; p < 0.001). Path analysis revealed that the total number of pods per tree had the highest positive direct effects (0.66 to 1.05) on bean dry weight per tree. Likewise, indirect effects via the total number of pods per tree were important in explaining the significant association of the other variables with bean dry weight per tree. The significance of the correlations and the magnitudes of the direct and indirect effects varied with sample size and among families, reciprocal and direct crosses, years, and bimonthly periods. However, beyond the influence of these factors, the total number of pods per tree had the greatest effects on production. These results suggest that indirect selection on the total number of pods per tree would improve selection efficiency for high bean yield in these breeding populations, making selection faster and less costly than using a larger number of traits. The low heritability associated with the number of pods per tree might be beneficial in the second step of the selection process, considering other yield components of higher heritability, such as bean dry weight per pod. Also, extrapolation of the results should be done with care, because genetic parameter estimates are strictly valid only for the population and environment studied, especially since the number of parents used here is a small (although important) sample of the parents used in cacao breeding programs.
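The decomposition of a correlation into direct and indirect effects can be sketched as follows. In standard path analysis, the direct effects p solve R_xx p = r_xy, where R_xx holds the correlations among predictors and r_xy the correlations of each predictor with the response; the indirect effect of predictor i via predictor j is r_ij × p_j. The sketch reuses the two family-level correlations reported above (0.91 and −0.84), but the correlation between the two predictors (−0.80) is a hypothetical value chosen for illustration, and the code is not the authors' analysis.

```python
def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def path_coefficients(r_xx, r_xy):
    """Return (direct effects, matrix of indirect effects of i via j)."""
    direct = solve_linear(r_xx, r_xy)
    n = len(direct)
    indirect = [[r_xx[i][j] * direct[j] if i != j else 0.0 for j in range(n)]
                for i in range(n)]
    return direct, indirect


# Predictors: pods per tree, frosty pod rot incidence. Response: bean dry weight.
r_xx = [[1.0, -0.80], [-0.80, 1.0]]  # -0.80 is a hypothetical predictor correlation
r_xy = [0.91, -0.84]                 # family-level correlations from the abstract
direct, indirect = path_coefficients(r_xx, r_xy)
for name, d, row, r in zip(("pods", "rot"), direct, indirect, r_xy):
    print(f"{name}: direct={d:.3f} indirect={sum(row):.3f} total r={r}")
```

For each predictor, the direct effect plus the sum of its indirect effects reproduces its correlation with the response, which is the identity the analysis relies on.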
Gene map of Crataegus bretschneideri C. K. Schneid. chloroplast genome. Genes shown outside of the outer circle are transcribed clockwise and those inside are transcribed counterclockwise. Genes belonging to different functional groups are color-coded. The dashed area in the inner circle indicates the GC content of the chloroplast genome
Comparison of the LSC, IRs, and SSC border regions of five Crataegus chloroplast genomes. The chloroplast genome of Crataegus kansuensis E. H. Wilson is considered as a reference. LSC, large single copy region; SSC, small single copy region; IRa, inverted repeats a region; IRb, inverted repeats b region
Type and distribution of repeated sequences and SSRs in five Crataegus chloroplast genomes. a Repeat types number. b Number of repeat sequences by length. c SSR type number. d Number of identified SSR motifs. Mono., Di., Tri., Tetra., and Penta. represent mononucleotide, dinucleotide, trinucleotide, tetranucleotide, and pentanucleotide short sequence repeats
Divergence time estimation for Crataegus and Amelanchier based on the chloroplast genomes. The number at each node represents the median divergence time, and the node bars represent 95% HPD (highest posterior density). The accession numbers in Genbank (C. chungtienensis (KY419947), C. hupehensis (MW201730), C. kansuensis (MF784433), C. marshallii (MK920293), C. pinnatifida var. major (KY419945), C. pinnatifida (MN102356), Crataegus sp. (MK920294), Mespilus germanica (MK920295), A. alnifolia (MN068255), A. ovalis (MK920297), A. sanguinea (MN068262), and A. spicata (LMK920292)) are listed here. The ruler on the lower left represents the geologic timescale. Paleogene (23.03–66 Mya); Eocene (33.90–55.80 Mya); OLI (Oligocene, 23.03–33.90 Mya); Neogene (0–23.03 Mya); Miocene (5.33–23.03 Mya); Pliocene (1.81–5.33 Mya); PLE (Pleistocene, 0.01–1.81 Mya)
Phylogenetic trees of Crataegus accessions using maximum likelihood (ML) and Bayesian inference (BI) based on ITS (a) and LEAFY (b) sequences. The midpoints in ML and BI analyses are listed above the branches (ML/BI), and the root is positioned at the midpoint between the two longest branches. The color of each accession represents the different accession of Crataegus.
Crataegus bretschneideri C. K. Schneid. is one of the species cultivated in China. Due to its unclear taxonomic classification status, the conservation and utilization of this germplasm resource have been limited. In this study, we analyzed the chloroplast genomes and nuclear sequences to reveal the taxonomic relationships among C. bretschneideri and related species. We assembled the chloroplast genomes of C. bretschneideri and related species and varieties, including C. maximowiczii C. K. Schneid., C. maximowiczii var. ninganensis S. Q. Nie & B. J. Jen., C. pinnatifida Bunge, and C. pinnatifida var. major N. E. Br. The lengths of the chloroplast genomes ranged from 159,644 bp (C. bretschneideri) to 159,947 bp (C. pinnatifida var. major). The five Crataegus chloroplast genomes had similar features and possessed 86 to 88 protein-coding genes, 37 tRNA genes, and eight rRNA genes, which were arranged in the same order. Eight mutation hotspot regions, including matK, psaB, accD, petA, clpP, trnD-GUC, psbH-petB, and trnN-GUU-trnR-ACG, could be used as potential molecular markers for further studies of Crataegus genetic diversity. Phylogenetic analyses based on 17 chloroplast genomes of Crataegus and Amelanchier indicated that C. bretschneideri was related to C. maximowiczii and C. maximowiczii var. ninganensis. However, the phylogenetic trees constructed from nuclear sequences of 36 Crataegus accessions reflected a closer relationship between C. bretschneideri and C. pinnatifida. Furthermore, divergence time estimation suggested that C. bretschneideri and C. maximowiczii diverged in the late Miocene and that speciation of C. pinnatifida occurred during the middle to late Miocene. These findings revealed that C. bretschneideri is an independent species and may be of hybrid origin.
Geographic distribution of populations of E. dysenterica analyzed in this study, in the “Cerrado” biome (gray-shaded area). See Barbosa et al. (2015) and Diniz-Filho et al. (2016) for details and Table S1 for geographical coordinates of the populations
Mantel and partial Mantel tests between genetic (G), environmental (E), and geographical (S) pairwise distances between 23 populations of E. dysenterica in Brazilian Cerrado. The numbers over the arrows indicate the Mantel correlations between two sets, and the numbers in parentheses are the partial Mantel correlations taking into account the third matrix (i.e., the Mantel correlation between G and E is 0.551, and the partial of 0.182 refers to the Mantel partial correlation between G and E taking S into account)
Mantel correlograms for the genetic differentiation estimated by pairwise FST among E. dysenterica populations, using geographic (a) and environmental (b) distances to define five connectivity classes (for visualization purposes, the G and E distances were scaled to vary between 0 and 1). Open circles indicate non-significant (P > 0.05) Mantel correlations
Spatial patterns in the first principal coordinate of genetic (a) and environmental (b) distances, for the 23 populations of E. dysenterica from Brazilian Cerrado analyzed here (see Fig. 1)
Results of 1000 simulations of stochastic population differentiation under isolation-by-distance for the spatial configuration of E. dysenterica populations, with the statistical distribution of Mantel tests for the correlation between G and S (a), reflecting the pure spatial component and similar to the observed one, and the partial Mantel test between G and E taking S into account and expressing IBE (b), showing that observed patterns significantly (P < 0.01) depart from the neutral expectation. Vertical solid lines indicate the observed statistics
Although spatial analysis of population genetic structure has been one of the most important ways to infer microevolutionary processes, these studies are usually focused on neutral dynamics and limited dispersal, interpreted under the theoretical reasoning of isolation-by-distance. More recently, however, there has been a growing interest in how environmental variation is also involved in population differentiation, both by direct effects of local adaptation and by other processes related to environmentally or ecologically constrained dispersal. Here we evaluated patterns of genetic population structure and isolation-by-ecology, or environment (IBE), in Eugenia dysenterica DC (Myrtaceae), a fruit tree species of potential economic interest that is widely distributed throughout Central Brazil and endemic to the Cerrado biome (Neotropical savannas). We analyzed population structure using nuclear SSR markers for 736 individuals sampled from 23 localities (local populations) and disentangled the effects of geographical distances (matrix S) and the Grinnellian niche of populations (matrix E), based on climate and soil data, on molecular genetic variation, estimated by pairwise FST (matrix G). Spatial patterns in eigenvectors of G and E reveal northwest-southeast gradients, coherent with geographic range shifts after the Last Glacial Maximum. We used different forms of Mantel regression and correlation and redundancy analyses, as well as simulations of isolation-by-distance, to show that there is a significant partial correlation between G and E taking S into account, thus supporting the IBE process for E. dysenterica, in addition to other processes related to spatially constrained gene flow.
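The partial correlation between G and E taking S into account can be sketched from the three pairwise Mantel correlations using the usual first-order partial correlation formula, r_GE.S = (r_GE − r_GS r_ES) / sqrt((1 − r_GS²)(1 − r_ES²)). The example below is a minimal illustration on hypothetical four-population matrices, not the authors' data or code, and it omits the permutation test normally used to assess significance. All function names are assumptions.

```python
from itertools import combinations
from math import sqrt


def distance_matrix(values):
    """Pairwise absolute differences between one-dimensional values."""
    return [[abs(a - b) for b in values] for a in values]


def upper_triangle(d):
    """Unfold the upper triangle of a square distance matrix into a vector."""
    return [d[i][j] for i, j in combinations(range(len(d)), 2)]


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mantel_r(a, b):
    """Mantel correlation: Pearson correlation between unfolded distance matrices."""
    return pearson(upper_triangle(a), upper_triangle(b))


def partial_mantel_r(g, e, s):
    """Correlation between G and E after controlling for S."""
    r_ge, r_gs, r_es = mantel_r(g, e), mantel_r(g, s), mantel_r(e, s)
    return (r_ge - r_gs * r_es) / sqrt((1 - r_gs ** 2) * (1 - r_es ** 2))


# Hypothetical data: four populations on a transect with one environmental variable.
S = distance_matrix([0.0, 1.0, 2.0, 3.0])   # geographic positions
E = distance_matrix([0.0, 3.0, 1.0, 2.0])   # environmental values
G = [[s + 0.5 * e for s, e in zip(rs, re)] for rs, re in zip(S, E)]  # synthetic "FST"

print("r(G,E) =", round(mantel_r(G, E), 3))
print("r(G,E | S) =", round(partial_mantel_r(G, E, S), 3))
```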
A Overall range of Tilia cordata (EUFORGEN 2009) with context of sampled area indicated by a box. B The location of sampling sites (see Table 1 for key to numbers and exact location); the dashed line indicates the approximate location of the northern range edge in the UK, based on Pigott (1991)
Genotypic richness R (Dorken and Eckert 2001) across all samples. A score of one indicates no clonal reproduction (all genotypes are unique) whereas zero would be a monoclonal site (all genotypes identical). Mean R (0.57) is indicated by the dashed line
A stacked barplot summarising genotypic diversity and evenness across all samples. Samples are provided in north-south order from left to right. Each stacked bar represents all genotypes for a given stand, and the relative proportion of each within that quadrat. The number above each bar represents the number of individuals in that quadrat. Filled segments are clonal (repeated) genotypes, while those unfilled are unique and likely the product of recruitment from seed. Most quadrats have several filled segments of similar size even where the proportion of individuals recruited from seed is low, showing that clonal reproduction in T. cordata still maintains appreciable genetic diversity. Notable exceptions include site 5 and site 7
The spatial arrangement of example quadrats, showing the tight aggregation of clonal groups. As well as individual location, the age class of stems and the clonal status of each are provided. Point shape indicates stem maturity, with squares being juvenile, while circles are adult, based primarily on diameter at breast height (DBH). Point size gives an indication of relative DBH, but is not to scale. Point fill indicates clonal status: open symbols are part of clonal groups; dark grey symbols are not genotyped due to inaccessible leaves; light grey symbols indicate unique genotypes. Memberships of distinct clonal groups are circumscribed by a convex hull indicated by black lines. A minimum observed genotypic richness index R (0.17, sample #5); B 1st quartile R (0.48, sample #6); C median R (0.57, sample #15); D 3rd quartile R (0.57, sample #10); E maximum R (0.96, sample #17)
A beta regression model (pseudo-R² = 0.66, n = 18), examining the incidence of clonality and its covariates. A) Scatterplot of juvenile stem density in sampled stands against the proportional measure of genotypic richness R, showing that the number of recently recruited stems within the sample quadrat is a significant predictor of clonality. B) Scatterplot of mean daily maximum temperature during July plotted against the same response, showing that temperature during the flowering period is a significant predictor of clonality. In both instances, the line of best fit is calculated from the model’s complementary log-log link function with all other covariates set to their mean values. Only significant covariates are plotted here; the full model is specified in Table 4
Facultative clonality is extremely common in plants, but the relative emphasis on sexual versus asexual reproduction varies both between and within species, which in turn may influence individual fitness and population persistence. Tilia cordata is a temperate, entomophilous canopy tree that is partially clonal. Favourably warm climatic conditions have been linked with successful sexual reproduction in the species, with clonality being suggested as the reason for population persistence in colder periods. Despite this, the extent, character and structure of asexual reproduction in the species have never been described, nor has its relationship with climate. Fine-scale spatial genetic structure was assessed in 23 stands across a latitudinal gradient. The proportion of individuals that are of clonal origin has a wide range, with a mean of ~43%. Genetic diversity is high, with even mostly clonal stands possessing several distinct genotypes. A beta regression model shows that historic summer temperatures and density of recent recruits are predictors of the proportion of clonal recruitment. Clonal reproduction is less important in stands that experience higher temperatures during flowering, while stands with more saplings have more clones. Additional factors likely affect the balance between the two reproductive modes. The climatic relationship suggests a trend towards a higher proportion of recruitment from seed in a warming climate, although factors such as herbivory may prevent this.
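The genotypic richness index R of Dorken and Eckert (2001) used in the figures for this study is R = (G − 1)/(N − 1), where G is the number of distinct multilocus genotypes among N sampled stems. A minimal sketch follows; the genotype labels are hypothetical and the function name is an assumption.

```python
def genotypic_richness(genotypes):
    """R = (G - 1) / (N - 1): 1 if every stem is unique, 0 if the stand is monoclonal."""
    n = len(genotypes)
    if n < 2:
        raise ValueError("R is undefined for fewer than two sampled stems")
    g = len(set(genotypes))
    return (g - 1) / (n - 1)


# Hypothetical quadrat: seven stems, two clonal groups (A and B) plus two unique genotypes.
stems = ["A", "A", "A", "B", "B", "C", "D"]
print(genotypic_richness(stems))
```

Under this definition, 1 − R is the share of stems beyond the first that repeat an existing genotype, which appears consistent with the reported mean R of 0.57 and mean clonal proportion of ~43%.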
(A) The normal-growth (SN) and (B) necrotic weak-growth phenotypes (SW) of the hybrid seedlings approximately 2 weeks after germination. (C) Sampling scheme of each part of the SW seedlings. Leaf, true leaves; Coty, cotyledons; Hypo, hypocotyl; Root, root
Significantly enriched GO terms detected by goseq in the upregulated (A) and downregulated (B) DEGs of the peach (Ppe), sweet cherry (Pav), and ‘Somei-yoshino’ (Pye) genome referencing analysis. Significant enrichment of biological process (BP), cell component (CC), and molecular function (MF) GO terms was represented with a q value (FDR) and the number of DEGs (dot size). The GO terms “defense response” and “response to biotic stimulus” are indicated by red arrows
An example of gene expression pattern on each seedling part. The normalized gene expression levels for eight upregulated and four downregulated SW-specific DEGs were calculated by TMM method using TCC-GUI pipeline. SN, normal-growth seedlings; Coty, cotyledons; Leaf, true leaves; Hypo, hypocotyl; Root, root
A heatmap for the enrichment pattern of biological process GO terms in the comparison of normal seedlings (SN) vs. cotyledons (Coty), SN vs. true leaves (Leaf), and SN vs. hypocotyl (Hypo). Significant enrichment terms were represented with the Z score for upregulation (red) and downregulation (blue) detected by PAGE analysis. Several GO terms related to plant defense response (red), photosynthesis (blue), and cell cycles (orange) are indicated by arrows
Knowledge of post-zygotic hybrid incompatibility is essential to understand speciation. Although the genes and molecular mechanisms involved in hybrid incompatibility are being elucidated in model plants and crops, information on woody non-model plants is lacking. In the seedlings of a cross between the most famous ornamental cherry cultivar Cerasus × yedoensis ‘Somei-yoshino’ and its closely related wild species Cerasus itosakura, we discovered a hybrid incompatibility characterized by a phenotype in which growth stops after the expansion of the first true leaves and the seedling eventually dies. To elucidate the molecular mechanisms related to this seedling necrosis, we performed a comprehensive gene expression analysis on normal-growth (SN) and necrotic weak-growth (SW) hybrid seedlings. The RNA-seq results showed over 1500 differentially expressed genes (DEGs) specific to SW seedlings. Numerous genes associated with plant defense response, such as pathogenesis-related genes, and several receptor-like protein kinases were included in the SW-specific upregulated DEGs. The Gene Ontology enrichment analysis also showed a significant association with “defense response” in SW seedlings. This upregulation of defense-related genes was particularly pronounced in the hypocotyls. In contrast, reduced expression of photosynthesis-related genes and of genes related to cell division and the cell cycle was also observed in specific parts of SW seedlings. Our results suggest that an upregulated defense-related gene expression suppresses the meristem growth and deviation, resulting in growth failure as an autoimmune response in hybrid cherry seedlings.
The genus Rosa comprises more than 150 species spread across three subgenera, Hesperhodos, Hulthemia, and Rosa, most of which have high economic and ecological values. Here, we report 31 complete plastomes that belong to the genus Rosa, with the aim of better understanding the evolution and divergence of genes of the plastome in this genus. A comparative analysis was conducted to characterize the chloroplast genomes of 12 taxa that cover all the sections in the three subgenera of Rosa. Further, complete chloroplast genome sequences revealed six hotspots of nucleotide polymorphism, including five intergenic regions and one coding sequence. In addition, a pairwise analysis revealed that R. stellata and R. berberifolia have the highest average genetic distances (Da) and nucleotide divergence (Dxy) compared with other species. Moreover, the lowest Da and Dxy were observed between R. gallica and R. canina, followed by R. multiflora and R. chinensis var. spontanea. The phylogenetic relationships within Rosa inferred from the 44 chloroplast genomes revealed that R. subg. Hesperhodos is the clade that diverged the earliest. Its successive clades were identified as R. subg. Hulthemia and R. sect. Pimpinellifolia. The phylogenomic analysis also revealed rapid simultaneous diversification within the Rosa subgenus. Significant increases in Pi and dN for ycf1 and in dN/dS for ycf2 were observed across the genus. Finally, we found that most RNA editing sites identified in the genus are section-specific, suggesting that the subgenera or sections have a self-evolving lineage. Taken together, the plastome information is valuable for species identification, phylogenetic studies, molecular genetics, and breeding of Rosa species.
Linear regression models (blue line; gray shading, standard deviation) for Psidium cattleyanum ploidy vs. diversity indexes: A Shannon index, B Simpson index, C Nei index, and D number of bands
Population structure for each of the 328 individuals of Psidium cattleyanum from 12 populations based on ten microsatellite markers. A Bar plots of the estimated membership coefficients inferred by STRUCTURE; the most likely value of K was 11. Vertical bars represent individual genotypes, and colors in each bar represent the probability that a sampled individual belongs to a genetic cluster. The codes below the bars correspond to population codes from Table 1. B Discriminant analysis of principal components (DAPC)
Principal component analysis of environmental variables for Psidium cattleyanum cytotypes. Variable codes are as follows: SRAD01, solar radiation in January (kJ m⁻² day⁻¹); SRAD03, solar radiation in March (kJ m⁻² day⁻¹); BIO01, annual mean temperature; BIO04, temperature seasonality (standard deviation × 100); BIO07, temperature annual range; BIO10, mean temperature of warmest quarter; BIO12, annual precipitation; BIO15, precipitation seasonality (coefficient of variation); BIO16, precipitation of wettest quarter; BIO19, precipitation of coldest quarter
Linear regression models (blue line; gray shading, standard deviation) of pairwise comparisons. A Genetic distance vs. geographic distance. B Genetic distance vs. environmental distance. C Environmental distance vs. geographic distance
Niche overlap results. A Distribution of the sampled populations. B Detail of the population sampled. C Binary ensembles (majority consensus) of cytotypes. D Niche overlap table between cytotypes, higher values correspond to Schoener’s D (Schoener 1968) and lower values to Warren’s I
Polyploidy is defined as the presence of more than two complete chromosome sets in an organism and has frequently occurred throughout the history of angiosperms. Polyploidization is a process that typically results in instant speciation. Using Psidium cattleyanum, a natural polyploid complex with several cytotypes, we aim to test two hypotheses regarding speciation in polyploids: polyploidization promotes (1) interruption of gene flow and (2) intraspecific niche divergence. We analyzed 12 natural populations of P. cattleyanum, integrating population genetics data, assessed with microsatellite markers, and climatic niche analysis, using environmental niche modeling, to provide insights about polyploid speciation. We found strong genetic structure in populations and cytotypes and low environmental niche similarity between cytotypes. Genetic diversity declines with increasing ploidy level, which is probably associated with asexual reproduction. Our results corroborate that polyploidy is generating a reproductive barrier and is associated with niche divergence among cytotypes. Therefore, we infer future divergent lineages between cytotypes of P. cattleyanum and confirm the role of polyploidy as an evolutionary step in speciation in this group. Additionally, this study provides new information for the discussion about how polyploidy affects the genetic diversity of taxa and ecological niches.
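The two niche overlap statistics named in the figure for this study, Schoener's D and Warren's I, can be sketched from two suitability surfaces evaluated on the same grid cells. Each surface is first normalised to sum to one; then D = 1 − ½ Σ|p − q| and I = 1 − ½ Σ(√p − √q)². The suitability values below are hypothetical, and the code is an illustration rather than the authors' workflow.

```python
from math import sqrt


def normalise(suitability):
    """Rescale non-negative suitability scores so they sum to one."""
    total = sum(suitability)
    if total <= 0:
        raise ValueError("suitability scores must have a positive sum")
    return [v / total for v in suitability]


def schoener_d(a, b):
    """Schoener's D: 1 for identical surfaces, 0 for non-overlapping ones."""
    p, q = normalise(a), normalise(b)
    return 1.0 - 0.5 * sum(abs(x - y) for x, y in zip(p, q))


def warren_i(a, b):
    """Warren's I, based on the Hellinger distance between the two surfaces."""
    p, q = normalise(a), normalise(b)
    return 1.0 - 0.5 * sum((sqrt(x) - sqrt(y)) ** 2 for x, y in zip(p, q))


# Hypothetical suitability scores for two cytotypes over the same five grid cells.
cytotype_a = [0.9, 0.7, 0.2, 0.0, 0.1]
cytotype_b = [0.1, 0.3, 0.6, 0.8, 0.2]
print("D =", round(schoener_d(cytotype_a, cytotype_b), 3))
print("I =", round(warren_i(cytotype_a, cytotype_b), 3))
```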
Schematic representation of fine-scale spatial genetic structure (FSGS) on different life stages in the space–time complex. Pollination events (dotted lines), time (t), and distance (d) are represented
Sampling sites of forest tree populations from seasonally dry tropical forests (SDTF) analyzed in the studies considered in this review. Schematic SDTF distribution based on DRYFLOR (2016)
Gene dispersal processes shape demographic and microevolutionary dynamics of tree species. Gene dispersal patterns can be studied by spatially explicit methods. Spatial genetic structure (SGS), summarized in the Sp statistic, provides indirect estimates of gene dispersal across generations for a known or assumed population effective density. Sp is modulated by exogenous and endogenous factors, including the mating system, which can be assessed using outcrossing rates (tm). Knowledge of tm and Sp is particularly important for the conservation of species in fragmented biomes such as seasonally dry tropical forests (SDTF). The main aim of this review was to evaluate putative drivers of Sp and tm, and their consequences for gene dispersal in tree species from SDTF. We reviewed 59 genetic studies on SDTF tree species published between 2000 and 2020 and extracted data on propagule dispersal, successional stages, seasonality, mating system, population density, landscape features, type of molecular markers, pairwise kinship in the first distance class (F1), Sp statistic, mean gene dispersal distance (σg), and multilocus outcrossing rates (tm). Sp was significantly associated with the mating system, where Sp(outcrossing) > Sp(mixed-mating), and with population density, where Sp was higher in high-density populations. Outcrossing rate was significantly associated with the type of propagule dispersal, where tm was higher in populations of plants pollinated by wind, and in those with animal-mediated seed dispersal, tm(zoochory) > tm(anemochory) > tm(autochory), and with successional stage, where tm(late-successional) > tm(pioneer). These factors are relevant to inform management actions in conservation and restoration projects. Thus, knowledge of the determinants of gene dispersal processes can help to rescue SDTF through sustainable management.
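The Sp statistic is conventionally computed as Sp = −b / (1 − F1), where b is the slope of the regression of pairwise kinship on the natural logarithm of spatial distance and F1 is the mean kinship in the first distance class. The sketch below is a simplified illustration: it regresses hypothetical, synthetic distance-class means rather than individual pairs, and the function names are assumptions.

```python
from math import log


def ols_slope(x, y):
    """Ordinary least-squares slope of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sxx


def sp_statistic(distances, kinship, f1):
    """Sp = -b / (1 - F1), with b the slope of kinship on ln(distance)."""
    b = ols_slope([log(d) for d in distances], kinship)
    return -b / (1.0 - f1)


# Synthetic example: kinship declines linearly with ln(distance in metres).
distances = [10.0, 50.0, 100.0, 500.0, 1000.0]
kinship = [0.1 - 0.02 * log(d) for d in distances]
sp = sp_statistic(distances, kinship, f1=kinship[0])
print(round(sp, 4))
```

A steeper decline of kinship with distance gives a larger Sp, that is, stronger fine-scale spatial genetic structure.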
Description of germplasm and trials used to identify the 70 families supplied for screening in nursery and subsequent field trials. Germplasm sourced from East and West Papua, North and South Cape York was evaluated in first-generation progeny trials across Indonesia with seed from selections returned to CFBTI. Seed was distributed to industry partners in Sumatra for clonal replication and disease resistance screening in the three nursery trials and subsequent evaluation of survivors in nearby field trials
Reduction in percent survival in clonally replicated progeny trials of Acacia mangium following inoculation with Ceratocystis in three nursery screening trials in Indonesia
Relationships between parental breeding values in pairs of trials depicting the observed genetic correlations for survival. A filled circle identifies the family with the highest survival and smallest lesions in all three trials
Relationships among parental breeding values in pairs of trials describing the observed genetic correlations for lesion length. A filled circle identifies the family with the highest survival and smallest lesions in all three trials
Three screening trials of clonally replicated Acacia mangium seedlings were evaluated for survival and lesion length following inoculation with locally collected strains of Ceratocystis in Indonesia. Tolerance in the population was low, with 6.7% of the 1033 clones represented by more than 4 ramets surviving repeated inoculations. Differences in tolerance among populations were slight; however, populations with consistently higher survival and shorter lesion lengths were from Papua New Guinea rather than Queensland. Estimates of the proportion of the experimental variation attributable to differences among parents (heritability) were low to moderate for both survival and lesion length. Estimates of the proportion of the experimental variation that was attributable to differences among clones (repeatability) were greater but typically similar to the heritability estimates, indicating that initial improvements from selection will primarily be derived from identifying tolerant parents. While genetic correlations among experiments were positive, estimates could not exclude the existence of host–pathogen interactions. Two validation trials of the tolerant clones were assessed 9 months after establishment; these trials verified that one-third of the clones identified in the nursery screening were also tolerant to Ceratocystis in field trials. The experiments confirmed that nursery screening may be used to quickly focus efforts on parents that produce more tolerant progeny, that screening additional seedlings to increase selection intensity, rather than using clonal replication to increase accuracy, would lead to greater improvements in tolerance, and that field trials are required to verify disease tolerance at later ages.
Scatterplot of raw and spatially corrected family-level means for total dry weight. A weakly reactive family is shown circled. a Orthogonal regression fits are represented by the overlapping lines for raw (dashed; $y = 0.005 + 1.26x$) and spatially corrected (solid; $y = 0.008 + 1.24x$) datasets. A 1:1 relation is shown (dashed). b [CO2] response ratios (RDW) of spatially corrected family-level dry weights for each of the 124 experimental seedlots included in mixed modelling and power-fitted [CO2] response-ratio function ($y = 0.8782x^{-0.203}$). Pearson’s (r) and Spearman’s rank (ρ) correlation coefficients are reported for comparative a[CO2] and e[CO2] spatially corrected dry weights and the coefficient of determination (r²) for RDW. Superscripted ‡ denotes significance at p < 0.0001
Scatterplots of spatially corrected family-level mean total and seed weights and [CO2] response ratio (RDW). a Ordinary least squares fits and their respective coefficient of determination (r²) for a[CO2] (dashed; $y = 0.10 + 0.03x$) and e[CO2] (solid; $y = 0.11 + 0.05x$) dry weights plotted against seed weight. b Coefficient of determination (r²) is reported for RDW
Scatterplot of family-level Δ¹³C and spatially corrected total dry weight means. Correlation coefficients (r) are reported separately for a[CO2] and e[CO2]
Scatterplot matrix of family-level BLUPs and Δ¹³C of seedlings raised in a[CO2] (this study) and regionally averaged plantation-based stem volumes. Product moment correlation coefficients (r) are reported on each scatterplot (ns denotes non-significance at p = 0.05 and ‡ denotes significance at p < 0.0001)
Increasing [CO2] may influence commercial crop and timber yield. While selection of genotypes sensitive to elevated [CO2] (e[CO2]) appears possible in agricultural crops, there is limited evidence for genotype-by-CO2 (G × CO2) interactions in commercial tree species. We examined [CO2] responsiveness in 124 open-pollinated Eucalyptus globulus ssp. globulus (E. globulus) families with the aim of assessing whether G × CO2 interactions are detectable in seedlings for early-age screening. Plants were grown in ambient (a[CO2]; ~ 405 μmol mol⁻¹) and e[CO2] (640 μmol mol⁻¹) and harvested 25 days after germination. Total, shoot, and root dry weights were determined for each plant. Carbon isotopic discrimination against ¹³C (Δ¹³C) was determined at the family level. We observed highly significant (p < 0.0001) increases in mean total, shoot, and root dry weights. Mixed-model equations were used to estimate the main and interaction effects of the G × CO2 for each mass trait. The main effects from the mixed-model output ([CO2] and individual-tree effects) were significant for all traits. However, [CO2]-by-individual tree interactions were non-significant for all traits, indicating little G × CO2 interaction. A secondary aim was to examine the correlation between greenhouse and mature-age growth from breeding trials that use common families conducted under ambient [CO2]. These correlations were non-significant, suggesting early growth is not necessarily indicative of later-age responses. Our results suggest that while early growth of E. globulus is enhanced under e[CO2], genotypes respond relatively uniformly to e[CO2] and little opportunity exists for seedling-based selection at the population level based upon the response of plants during the first weeks of growth.
Principal coordinate analysis of macadamia accessions based on genetic distance, performed using GenAlex v6.502. Accessions differentiate into three genetic clusters. The font colour represents cultivar origin: pink for the S.A. (South African), green for the HAES (Hawaiian Agricultural Experimental Station), gold for the AUS (Australian) and grey for the OTH (Californian and Israeli) representative accessions. The pink cluster contains mainly South African accessions forming an M. tetraphylla-derived group, the green cluster contains mainly HAES and Australian accessions forming the M. integrifolia-derived group and the blue cluster contains the local FARM (Farmer’s breeding population) accessions. A 3D interactive version is available here:
Neighbour-joining tree using Nei’s genetic distance of macadamia displaying phylogenetic relationships among representative national collections, visualized in iTOL. Accessions separated into three major genetic clades. One major clade consists mostly of accessions from HAES, many of which are M. integrifolia-derived. The second major clade consists of accessions from the local Farmer’s population. The final major clade consists mainly of accessions from S.A., many of which are M. tetraphylla-derived. The colours are coordinated according to cultivar origin: green, HAES (Hawaiian Agricultural Experimental Station) representative accessions; gold, AUS (Australian) representative accessions; blue, FARM (local Farmers samples); pink, S.A. (South African) representative accessions
STRUCTURE analysis of macadamia accessions from K = 2 to K = 4. Accessions from national country representative collections are separated by black vertical lines. Commercial macadamia originate from two ancestral species, M. integrifolia in green, and M. tetraphylla in pink. The HAES (Hawaiian Agricultural Experimental Station) representative collection has a higher proportion of M. integrifolia, the AUS (Australian) representative collection consists mostly of hybrids, the S.A. (South African) representative collection has a higher proportion of M. tetraphylla, and the Farmer's (FARM) breeding population is genetically unique from the other three collections
Macadamia nuts are known globally for their high quality and economic value. Global macadamia commercial nut production amounts to 60,000 metric tonnes and is increasing steadily. South Africa is the leading producer with 29% of worldwide kernel production. Commercial macadamia germplasm was originally selected from a small genepool (mainly Macadamia integrifolia species) from a limited geographic distribution in Australia. These accessions were subsequently bred, cloned and exported across the world to start local macadamia industries. The South African macadamia industry was established with pre-commercial and commercial macadamia from different parts of the world, and local selections were also performed. Many of these accessions have unique genetic compositions that have not been characterized yet. We used 13 nuclear microsatellite markers to study the genetic diversity and structure of macadamia germplasm cultivated in South Africa. We compared four groups of accessions including 31 originating from the Hawaiian Agricultural Experimental Station (HAES), 19 from Australia (AUS), two from California and one from Israel (OTH), 31 from South Africa’s locally selected accessions (SA) and 26 from two local Farmers (FARM). We used STRUCTURE, PCoA and neighbour-joining phylogenetic analyses to show that the South African selected accessions include diverse hybrid genotypes with strong Macadamia tetraphylla composition, unlike the Hawaiian commercially released and Australian representative collections that mostly have M. integrifolia or hybrid composition. Our results suggest that the South African selections represent a unique and diverse set of germplasm for future macadamia improvement efforts that will benefit from genomic breeding technologies.
We evaluated differential expression of genes in leaf and xylem tissues for three Eucalyptus clones in the field using Illumina sequencing, under four contrasting fertilization regimes: a control combining nitrogen (N), phosphorus (P), and potassium (K) and three regimes with N, K, and P deficiency. The field results showed significantly better performance with the control fertilization regime for height and circumference at 14 months, but no differences between clones. The number of up- and down-regulated differentially expressed genes (DEGs) in pairwise clone comparisons was around 5900 for leaf and 6900 for xylem at FDR < 0.01. With fertilization treatment comparisons, DEGs were only observed for N deficiency versus control, with 45 up- and down-regulated DEGs for leaf and 1022 for xylem. The number of DEGs between fertilizer deficiency treatment and control varied greatly within each clone, showing an important clone-by-fertilization interaction. Gene ontology analysis showed that a great number of genes were related to stress, transport, and transcription factors. The co-expression analysis showed some significant correlations between complex network gene expression and tree growth induced by fertilization regime. For example, the co-expression of a 54-DEG network showed a significant correlation with tree height (0.84, P value: 1 × 10⁻⁴) and DBH (0.89, P value: 7 × 10⁻⁶) in the case of N deficiency versus control. Our results suggest that gene expression levels between different fertilization treatments and clones can provide a basis for future research on gene function in Eucalyptus under nutrient stress, with the perspective of new developments in Eucalyptus breeding.
Word cloud of the keywords provided for all oral and poster presentations. Created with Wordle (https://mrfeinberg.com).
Rapid human-induced environmental changes like climate warming represent a challenge for forest ecosystems. Due to their biological complexity and the long generation time of their keystone tree species, genetic adaptation in these ecosystems might not be fast enough to keep pace with conditions changing so quickly. The study of adaptation to environmental change and its genetic mechanisms is therefore key for ensuring sustainable support and management of forests. The 4-day conference of the European Research Group EvolTree on the topic of "Genomics and Adaptation in Forest Ecosystems" brought together over 130 scientists to present and discuss the latest developments and findings in forest evolutionary research. Genomic studies in forest trees have long been hampered by the lack of high-quality genomics resources and affordable genotyping methods. This has dramatically changed in the last few years; the conference impressively showed how such tools are now being applied to study past demography, adaptation and interactions with associated organisms. Moreover, genomic studies are now finally also entering the world of conservation and forest management, for example by measuring the value or cost of interspecific hybridization and introgression, assessing the vulnerability of species and populations to future change, or accurately delineating evolutionary significant units. The newly launched conference series of EvolTree will hopefully play a key role in the exchange and synthesis of such important investigations. Supplementary information: The online version contains supplementary material available at 10.1007/s11295-022-01542-1.
Alignment analysis of eight whole cp genome sequences by mVISTA with A. cissifolium as the reference genome
Comparison of boundaries among the LSC, SSC, and IR regions in 10 Acer species
Number of repeat elements in the cp genomes of eight Acer species. a Number of forward, reverse, complement, and palindromic repeats. b Number of simple sequence repeats (SSRs). c Frequencies of SSRs in the LSC, SSC, and IR regions. d Frequencies of SSRs in the intergenic regions, protein-coding genes, and introns. e Number of different SSR types detected in 8 cp genomes. f Frequencies of identified SSR motifs in 8 cp genomes
Phylogenetic tree based on the complete cp genome sequences of 106 species using the maximum likelihood (ML) method. Numbers above the lines indicate the bootstrap value of the phylogenetic analysis for each clade
Acer L. (Sapindaceae) consists of approximately 200 species with great ornamental and commercial value. However, due to a substantial divergence in inflorescence, leaf shape, and fruit shape during long-term natural evolution, it is remarkably difficult to distinguish them by morphological features. Eight compound-leaved maple species from Sect. Trifoliata, Pentaphylla, and Negundo play an important role in revealing the morphological variation of Acer. Hence, the complete chloroplast (cp) genomes of all eight compound-leaved maples native to Asia were characterized, and comparative genomic analysis was conducted to infer their phylogenetic relationships. A few differences were found in cp genome size and gene content among the eight Acer species. The gene rps2 was only identified in A. griseum. The differences in the cp genome sequences among the eight Acer species were clearly demonstrated, with matK-rps16, trnE-trnT, ndhC-trnV, ccsA-ndhD, and ycf1-trnN being the most divergent regions. The phylogenetic analysis revealed that Acer formed a monophyletic group with 100% bootstrap support, with A. glabrum (Sect. Glabra) and A. pseudoplatanus (Sect. Palmata) as the most basal species. Except for A. henryi and A. negundo, seven compound-leaved maples and some simple-leaved maples were highly supported to cluster into one clade with A. sutchuenense as the primitive species of Sect. Trifoliata and A. pentaphyllum as a series (Ser. Pentaphylla) of Sect. Pentaphylla. In addition, the plastid phylogeny reconstruction suggests that A. cissifolium (Sect. Negundo) may have ancestral connections with A. triflorum (Sect. Trifoliata). Finally, we conjectured that compound-leaved maples may have evolved from simple-leaved maples.
Distribution of pairwise relatedness coefficients for parent-offspring using likelihood estimator (expected value = 0.5). The vertical dashed line corresponds to the threshold used to identify selfed individuals in the population
Mean theoretical accuracies of breeding values for three classes of individuals: maternal parent (M), non-genotyped offspring (NO) and genotyped offspring (GO) for models fit using pedigree-based (ABLUP_S0), pedigree-based incorporating selfing rate in population (ABLUP_S05), pedigree-genomic-based models (ssGBLUP_S0) and pedigree-genomic-based incorporating selfing rate in population (ssGBLUP_S05)
Predictive abilities of two cross-validation methods: random (CV-random) or between family relatedness (CV-family) for four phenotypic traits and the four models: ABLUP_S0 (pedigree-based); ABLUP_S05 (pedigree rescaled to populational selfing rate); ssGBLUP_S0 (combined pedigree marker-based) and ssGBLUP_S05 (combined pedigree marker-based rescaled to population selfing rate)
Bias (regression slope) of two cross-validation methods: random (CV-random) or between family relatedness (CV-family) for the four phenotypic traits and the four models: ABLUP_S0 (pedigree-based); ABLUP_S05 (pedigree rescaled to populational selfing rate); ssGBLUP_S0 (combined pedigree marker-based) and ssGBLUP_S05 (combined pedigree marker-based rescaled to population selfing rate)
Spearman rank correlation of two cross-validation methods: random (CV-random) or between family relatedness (CV-family) for four phenotypic traits and the four models: ABLUP_S0 (pedigree-based); ABLUP_S05 (pedigree rescaled to populational selfing rate); ssGBLUP_S0 (combined pedigree marker-based) and ssGBLUP_S05 (combined pedigree marker-based rescaled to population selfing rate)
In forest tree breeding programs, open-pollinated families are frequently used to estimate genetic parameters and evaluate genetic merit of individuals. However, the presence of selfing events not documented in the pedigree affects the estimation of these parameters. In this study, 194 open-pollinated families of Eucalyptus globulus Labill. trees were used to compare the precision of estimated genetic parameters and accuracies of predicted breeding values with the conventional pedigree-based model (ABLUP) and the pedigree-genomic single-step model (ssGBLUP). The available genetic information for pairwise parent-offspring allows us to estimate an actual populational selfing rate of 5.4%. For all the growth and disease resistance traits evaluated, the inclusion of selfing rate was effective in reducing the upward bias, between 7 and 30%, in heritability estimates. The predictive abilities for ssGBLUP models were always higher than those for ABLUP models. In both cases, a considerable reduction of predictive abilities was observed when relatedness between training and validation populations was removed. We proposed a straightforward approach for the estimation of the actual selfing rate in a breeding population. The incorporation of this parameter allows for more reliable estimation of genetic parameters. Furthermore, our results proved that ssGBLUP was effective for the accurate estimation of genetic parameters and to improve the prediction of breeding values in presence of selfing events, thus a valuable tool for genomic evaluations in Eucalyptus breeding programs.
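The thresholding idea behind the selfing-rate estimate can be sketched as follows. This is a minimal illustration, not the authors' procedure: the relatedness values and the threshold of 0.75 are hypothetical and were chosen only to make the example run, and the function name `selfing_rate` is introduced here.

```python
from typing import Sequence


def selfing_rate(relatedness: Sequence[float], threshold: float) -> float:
    """Fraction of offspring whose parent-offspring relatedness exceeds a threshold.

    Outcrossed offspring are expected to sit near 0.5; offspring well above
    that value are flagged as putative selfs.
    """
    if not relatedness:
        raise ValueError("relatedness must contain at least one value")
    selfed = sum(1 for r in relatedness if r > threshold)
    return selfed / len(relatedness)


# Hypothetical parent-offspring relatedness estimates (not data from the study).
example_relatedness = [0.48, 0.52, 0.50, 0.91, 0.47, 0.55, 0.49, 0.51, 0.53, 0.46]

# Hypothetical threshold; the study's actual cut-off is not given in the abstract.
example_rate = selfing_rate(example_relatedness, threshold=0.75)
print(f"estimated selfing rate: {example_rate:.1%}")
```

In practice the threshold would be read from the empirical distribution of relatedness coefficients rather than fixed in advance.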
Overview of World citrus fruits production trends, breeding obstacles, molecular approaches to overcome breeding challenges, and outline representing citrus improvement protocol
Structural outline representing a general overview of molecular marker-based genotyping. (The image was created in BioRender and plant sample images were taken from
Phylogenetic relationships between key citrus species and their maternal and paternal ancestors (Luro et al. 2017). (Plant sample images were taken from Wikipedia and
Conceptual schematic representation of Omics study in citrus
Historical milestones revealing the sequencing information of citrus genome. (Plant sample images were taken from Wikipedia and
Citrus is an economically important fruit crop grown worldwide, with enormous health benefits. However, conventional citrus breeding has been hindered by a variety of genetic factors, thereby becoming obsolete and insufficient. Citrus research has mostly focused on botanical, taxonomic, and cytogenetic issues. Nowadays the knowledge base has strengthened, with commercially successful varietal releases as a plausible outcome. Unfortunately, this has been gradual, with only a few success stories among citrus rootstocks and even fewer among scion cultivars. Recent advancements in genetics, molecular biology, biotechnology, and omics (genomics, transcriptomics, proteomics, and metabolomics) have expedited citrus breeding and genetics research. Linkage mapping, genetic diversity, phylogenetic relationships, mutation breeding, mapping, and the international citrus genome sequencing initiatives, along with functional analysis, are comprehensively summarized in this review. While outlining future avenues, this review compiles up-to-date mechanistic information based on past and recent progress, facilitating its broader application to accelerate citrus breeding.
Map of southwestern Western Australia showing the location of the Corymbia calophylla provenances and the two experimental sites, Margaret River (MR) and Mount Barker (MtB). Provenance abbreviations (see Table 2)
Correlation between QSB resistance or height of Corymbia calophylla provenances at six years of age and climatic factors of the origin of each provenance: mean annual precipitation, maximum temperature of the warmest month, and 1/aridity index (see also Supplementary Table 3)
Relative level of QSB resistance of 3-month-old seedlings of Corymbia calophylla provenances in a glasshouse and the same provenances in field trials at Margaret River and Mount Barker after 2, 4, and 6 years of growth (data for 2-year-old trees from Ahrens et al. (2019)) (see also Supplementary Table 4)
Quambalaria shoot blight (QSB) has emerged recently as a severe disease of Corymbia calophylla (marri). In this study, QSB damage and growth were assessed in Corymbia calophylla trees at 4 and 6 years of age in two common gardens consisting of 165 and 170 open-pollinated families representing 18 provenances across the species' natural distribution. There were significant differences between provenances for all traits. The narrow-sense heritability for growth traits and QSB damage at both sites were low to moderate. The genetic correlation between QSB damage and growth traits was negative; fast-growing families were less damaged by QSB disease. Age-age genetic correlations for individual traits at four and six years were very strong, and the type-B (site-site) correlations were strongly positive for all traits. Provenances from cooler wetter regions showed higher resistance to QSB. The QSB incidence at 6 years was significantly correlated with environmental factors of the provenance's origin. The QSB incidence at years four and six was not correlated with the QSB expression in 3-month-old seedlings. Based on these results, selection for resistance could be undertaken using 4-year-old trees. There is potential for a resistance breeding program to develop populations of marri genetically diverse and resistant to QSB.
a Representative chromatograms of the four standard PMFs (S, sinensetin; N, nobiletin; HMF, heptamethoxyflavone; T, tangeretin) and the extracts from ‘Katsuyama kuganii’, Hanayu, and their hybrid HK detected by HPLC with an Inertsil® ODS-3 column at 254 nm. The concentration of the four standard PMFs was 50 ppm (STD 50 ppm). b The chemical structures of nobiletin and tangeretin
Frequency distributions of nobiletin (a–d) and tangeretin (e–h) accumulation for the HK (Hanayu × ‘Katsuyama kuganii’) population in March 2020 (a, e), July 2020 (b, f), August 2020 (c, g), and March 2021 (d, h). Solid lines indicate the PMF accumulation of Hanayu (1.361 mg/gDW for nobiletin and 1.428 mg/gDW for tangeretin). Dotted lines indicate the PMF accumulation of ‘Katsuyama kuganii’ (2.672 mg/gDW for nobiletin and 3.270 mg/gDW for tangeretin)
Frequency distributions of PMF accumulation for the KK (‘Kiyomi’ × ‘Katsuyama kuganii’) (a, b), HY (Hassaku × ‘Yoshida’ Ponkan) (c, d), and HG (Hanayu × ‘Genkou’) (e, f) populations. Solid lines indicate the PMF accumulation of ‘Kiyomi’ (0.527 mg/gDW for nobiletin and 0.000 mg/gDW for tangeretin), Hassaku (0.053 mg/gDW for nobiletin and 0.000 mg/gDW for tangeretin), and Hanayu (0.644 mg/gDW for nobiletin and 0.612 mg/gDW for tangeretin). Dotted lines indicate the PMF accumulation of ‘Katsuyama kuganii’ (2.734 mg/gDW for nobiletin and 2.281 mg/gDW for tangeretin), ‘Yoshida’ Ponkan (3.270 mg/gDW for nobiletin and 3.340 mg/gDW for tangeretin), and ‘Genkou’ (0.478 mg/gDW for nobiletin and 0.413 mg/gDW for tangeretin)
Genetic linkage map of Hanayu (HA) and significant quantitative trait loci (QTLs) identified. Numbering and orientation of HA linkage groups (LGs) are based on Clementine reference genome v. 1.0. QTLs are named literally using trait abbreviations followed by the number of the LG in which the QTL is located. The years and months of sample collection are presented with QTL names. Significant QTLs are shown at the side of each linkage group; the 1-LOD (boxes) and 1.5-LOD (range lines) support intervals are shown. The details of QTLs are presented in Table 1
Polymethoxyflavones (PMFs) are bioactive flavonoids exclusively found in the genus Citrus, and they show various health-promoting activities. Researchers have studied the biosynthesis of PMFs and focused on isolation and characterization of the responsible genes. However, our knowledge about the biosynthesis of PMFs is still limited. In this study, we aimed to reveal the loci for causative factors that are responsible for PMF accumulation. We investigated the frequency distributions of PMF accumulation in F1 hybrids derived from several patterns of parent combinations. The distribution patterns in F1 hybrids indicated that the major gene for PMF biosynthesis was recessive. We developed single-nucleotide polymorphism (SNP) markers by taking advantage of their high abundance in genomes. SNP genotyping was performed using high-resolution melting (HRM) analysis. We successfully mapped 119 SNP markers and were able to construct a linkage map consisting of nine linkage groups (LGs) with a total genetic distance of 1504.3 cM. Quantitative trait locus (QTL) analysis for PMF accumulation was carried out with the mapped SNP markers. We found three novel QTLs: two QTLs related to nobiletin accumulation in LG1 and LG8 and one QTL related to tangeretin accumulation in LG3. We found 27 candidate genes in QTL confidence intervals that could be associated with the biosynthesis of PMFs, and we suggest that flavanone 3-hydroxylase, flavonoid 3′-monooxygenase, and O-methyltransferase could be the key enzymes responsible for PMF biosynthesis. Our results provide valuable information that contributes to the elucidation of the PMF biosynthetic pathway.
Interspecific (red) and intraspecific (green) genetic distances for 74 samples of Myrtaceae from Sumatra using the traditional barcode markers matK, rbcL, and ITS and their combinations. The differences between inter- and intraspecific distances were significant for all markers or their combinations according to a Wilcoxon signed-rank test (p < 0.001)
Bayesian inference of tribes Syzygieae and Myrteae (family Myrtaceae) based on the combined sequences of matK, rbcL and ITS for samples collected in Sumatra, Indonesia. Posterior probability values are placed next to the nodes. Tip labels display species identification based on morphological taxonomy, and internal IDs. Colours represent genera Syzygium, Rhodamnia, and Decaspermum
Bayesian inference of tribes Syzygieae and Myrteae (family Myrtaceae) based on ITS sequences of samples collected in Sumatra, Indonesia. Posterior probability values are placed next to the nodes. Tip labels display species identification based on morphological identification, and internal IDs. Colours represent genera Syzygium, Rhodamnia, and Decaspermum. Accession numbers of sequences downloaded from NCBI are displayed beside the species IDs
Bayesian inference of tribes Syzygieae and Myrteae (family Myrtaceae) based on matK and rbcL sequences of samples collected in Sumatra, Indonesia. Posterior probability values are placed next to the nodes. Tip labels display species identification based on morphological taxonomy, and internal IDs. Colours represent genera Syzygium, Rhodamnia, and Decaspermum. Information on accession numbers of sequences downloaded from NCBI are included in Table S2
Given the difficulties for rapid biodiversity assessments in understudied regions, DNA barcoding appears as a suitable alternative. Still, this approach relies heavily on accurate reference sequence databases for correct taxonomic assignments. In this study, we evaluated the effectiveness of matK, rbcL, and ITS regions for the identification of Myrtaceae species with emphasis on the megadiverse genus Syzygium from Sumatra, Indonesia; and analyzed the applicability of species-tree inference for species assignment using barcode markers. ITS was the most variable barcode region (42.6% of variable sites), followed by matK (25.7%), and rbcL (14.9%). In terms of assignments of sequences using the BLAST algorithm, all markers were effective for genus-level attribution. For assignments at species rank, rbcL was able to attribute 30.15% of the samples at the species level, followed by matK (26.47%), and ITS (17.21%). These results are largely related to the availability of reference sequences for Myrtaceae in the databases since for the 27 species analyzed in this study, only 8 species had reference sequences for all three barcode regions available in GenBank. The species-tree inference based on the combination of matK, rbcL, and ITS markers recovered 41% of the species as monophyletic clades with strong node support. Due to its high level of differentiation, we recommend the ITS region as the most efficient barcode marker for the identification of Syzygium, and the traditional core-barcodes (matK + rbcL) as add-on barcodes.
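The "percentage of variable sites" used above to compare barcode regions can be illustrated with a short sketch. It assumes an already aligned set of sequences and counts a column as variable when it contains more than one unambiguous base; ignoring gaps and ambiguity codes is an assumption made here, not a detail taken from the study, and the toy alignment is invented.

```python
from typing import Sequence


def variable_site_fraction(alignment: Sequence[str]) -> float:
    """Fraction of alignment columns containing more than one of A, C, G, T."""
    if not alignment:
        raise ValueError("alignment must contain at least one sequence")
    length = len(alignment[0])
    if length == 0 or any(len(seq) != length for seq in alignment):
        raise ValueError("all sequences must share the same non-zero length")
    variable = 0
    for column in zip(*alignment):
        # Gaps and ambiguity codes are skipped (an assumption of this sketch).
        bases = {base for base in column if base in "ACGT"}
        if len(bases) > 1:
            variable += 1
    return variable / length


# Invented toy alignment: columns 3 and 5 vary, so 2 of 6 sites are variable.
toy_alignment = ["ACGTAC", "ACGTTC", "ACCTAC", "AC-TAC"]
toy_fraction = variable_site_fraction(toy_alignment)
print(f"variable sites: {toy_fraction:.1%}")
```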
Locations of sites sampled in this study. Numbers correspond to those in Table 1
Distribution of clones at Murlough Bay (top) and Boa Island (bottom). Coloured circles indicate different clones. White circles indicate unique genotypes
Frequency of clones by size class across all populations studied. N, number of genets in each class
Boxplots showing comparisons of levels of nuclear (R, D* and HO) and chloroplast (H) genetic diversity between hedgerow and woodland samples. Significance of differences (P) between Hedgerow and Woodland samples for each diversity measure was assessed using Mann–Whitney tests
Hedgerows are an important component of agricultural landscapes, but in recent years have increasingly faced threats such as habitat loss, land use change, climate change, invasive species, pests and plant pathogens. Given the potential importance of genetic diversity in countering these threats, and the spatial distribution of such diversity within and across natural populations, we analyzed levels and patterns of diversity in blackthorn (Prunus spinosa), a key component of many hedgerows. Twenty-one populations of blackthorn from a mixture of hedgerows and woodlands were genotyped for four nuclear and five chloroplast microsatellites. Three hundred twenty-one unique clonal genotypes were identified from 558 individuals analyzed, 207 of which were found in a single individual. With the exception of a single population that appears to have been planted recently from seed (Peatlands Park), all populations exhibited evidence of vegetative reproduction via suckering. Multi-ramet clones were highly spatially structured within populations, and ranged in size from < 1 to 258 m. These findings indicate that asexual reproduction is widespread in the populations of blackthorn studied. Although levels of clonality varied across study sites, there was clear spatial structuring of clones in each case. Such clonal organization should be taken into account in hedge management or where planting or replanting of hedgerows becomes necessary. Knowledge of the patterns and extent of spatial structuring of genotypes within potential source populations will allow the selection of genetically divergent material, rather than selection of clonal replicates of the same genotype.
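The counts quoted above allow a quick pooled summary of clonality. The sketch below assumes that genotypic richness is defined as (G − 1)/(N − 1), a common convention in clonal plant studies; the abstract does not state the definition used, and a pooled value is not the same as the per-population values compared in the boxplots.

```python
def genotypic_richness(n_genets: int, n_ramets: int) -> float:
    """Return (G - 1) / (N - 1), one common definition of genotypic richness."""
    if n_ramets < 2 or not 1 <= n_genets <= n_ramets:
        raise ValueError("need N >= 2 and 1 <= G <= N")
    return (n_genets - 1) / (n_ramets - 1)


# Pooled counts quoted in the abstract.
N_INDIVIDUALS = 558   # individuals analyzed
N_GENOTYPES = 321     # unique clonal genotypes
N_SINGLETONS = 207    # genotypes found in a single individual

pooled_richness = genotypic_richness(N_GENOTYPES, N_INDIVIDUALS)
multi_ramet_genotypes = N_GENOTYPES - N_SINGLETONS
singleton_share = N_SINGLETONS / N_GENOTYPES

print(f"pooled genotypic richness: {pooled_richness:.3f}")
print(f"genotypes sampled more than once: {multi_ramet_genotypes}")
print(f"share of genotypes sampled once: {singleton_share:.1%}")
```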
We investigated the efficiency of genomic selection in a large clonal population (N = 2023) of Pinus taeda L. The study population comprised 58 families that were tested across eight locations in the southern USA. The clones were genotyped with the Pita50K SNP array. Whole-genome regression models were used to obtain genomic estimated breeding values (GEBVs). The predictive ability of SNP markers for commercially important traits was estimated using various cross-validation scenarios that address the family structure in the population. In the random cross-validation scenario (clonal varieties randomly assigned to either training or validation sets), the predictive ability of GEBVs for stem volume, stem straightness, and fusiform rust disease incidence was 0.43, 0.57, and 0.26, respectively. In the family cross-validation scenario (whole families randomly assigned to either training or validation sets), the predictive ability for stem volume dropped to 0.36, but the change for the other two traits was small. In the third scenario, the predictive ability of the GEBVs of clones in a new environment was 0.32 for stem volume, 0.40 for stem straightness, and 0.18 for fusiform rust disease incidence. The predictive ability of the models dropped for all three traits when the GEBVs of untested varieties (varieties excluded from the training population) were predicted across multiple environments (range of 0.06 to 0.40 across traits). This study highlights the importance of genetic relatedness between the model training and validation sets of a cloned population of P. taeda. The expected genetic gain was about twice the expected genetic gain achieved by a traditional breeding strategy, mainly due to a 50% shorter breeding cycle achieved through the implementation of genomic selection.
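The link between a shorter breeding cycle and a doubling of gain can be made explicit with the standard breeder's equation expressed per unit time. This is a textbook sketch, not the calculation reported in the study: the symbols are introduced here for illustration, with $i$ the selection intensity, $r$ the selection accuracy, $\sigma_A$ the additive genetic standard deviation, and $L$ the length of the traditional breeding cycle.

```latex
\Delta G_{\text{year}} = \frac{i \, r \, \sigma_A}{L}
\qquad\Longrightarrow\qquad
\frac{\Delta G_{\text{year}}^{\text{GS}}}{\Delta G_{\text{year}}^{\text{trad}}}
= \frac{i \, r \, \sigma_A / (0.5\,L)}{i \, r \, \sigma_A / L} = 2
```

The ratio equals exactly 2 only if $i$, $r$, and $\sigma_A$ are the same under both strategies. The abstract attributes the gain "mainly" to the shorter cycle, so these terms probably differ somewhat in the actual comparison.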
Venn diagram of differentially expressed genes in Eucalyptus globulus (a and b) and E. grandis (c and d). Genes upregulated (a, c) and downregulated (b, d) under LT, HT, LT + CO2, HT + CO2, and CT + CO2
Lignin pathway biosynthesis (A). Transcripts encoding enzymes identified in our analysis are shown in red letters. Heatmap of lignin genes expressed in all treatments (B). The red and green colors indicate upregulated and downregulated genes, respectively
Photosynthesis activity (A), transpiration (B), and stomatal conductance (C) in E. globulus and E. grandis plants at different combinations of low, normal, and high temperatures with ambient CO2 and eCO2. CT, ambient (control) temperature; LT, low temperature and ambient CO2; HT, high temperature and ambient CO2; HT + CO2, high temperature and eCO2; LT + CO2, low temperature and eCO2; CT + CO2, control temperature and eCO2. Vertical bars indicate SEDs and the corresponding value
Changes in physiological processes (photosynthesis and osmoprotectant metabolites) and gene expression (boxed panels) induced by low and high temperatures combined with eCO2 in E. grandis and E. globulus. Blue arrows indicate gene expression increased by low temperature (LT), and red arrows indicate gene expression increased by high temperature (HT). eCO2 indicates combinations of elevated carbon dioxide with LT and HT (arrows). Black dash indicates no changes. Egl, E. globulus; Egr, E. grandis
Climate change may lead to severe losses in agriculture, including wood production. To understand the effects of climate change on physiology and molecular aspects of wood formation, we grew plants of Eucalyptus grandis and E. globulus for 35 days under three temperatures (10–12 °C, 20–22 °C, and 33–35 °C) combined with two CO2 concentrations (390 and 700 ppm). Biochemical analyses and RNAseq in stems were carried out together with leaf gas exchange measurements. We analyzed in depth the cell wall biosynthesis genes and their regulation by several transcription factors, as well as genes associated with carbon partitioning, cell wall remodeling, and hormonal regulation. E. globulus, a species adapted to low temperature, was more responsive to the treatments than E. grandis. Gene expression was more strongly affected by changes in temperature than by changes in CO2. The most relevant processes affected by the treatments were related to stress, secondary metabolism, hormonal response, and signaling. Ethylene and auxin biosynthetic genes were upregulated in both species, but more intensely in E. globulus. High CO2 stimulated lignin biosynthesis genes and increased S-containing oligomers in E. globulus. Genes related to cell wall carbohydrates and lignin were strongly induced by temperature and CO2, respectively. Photosynthesis activity and transpiration were highest under high temperature and high temperature + high CO2 in both species. Our results show that the responses of woody plants to eCO2 may differ depending on temperature.
Maximum likelihood phylogenetic tree constructed using SNPs at four-fold degenerate sites of all pear accessions
Population structure of 102 pear accessions. a, b Population structure based on the number of ancestry populations (K = 2 and K = 8). Each vertical bar represents a single accession that was assigned ancestry to one or more of the populations (different colors)
Genetic relationships of pears in different geographical regions. a, b Principal component analysis and nucleotide diversity of the 102 pear accessions that clustered into eight pear groups. c Genome-wide IBD sharing for the average pair of accessions across eight pear groups. d Map of dissemination routes of pears in China
Pear (Pyrus) is an important temperate fruit that originated in the southwestern region of China and has a cultivation history of more than 3000 years. However, the historic routes of pear dissemination in China have not been fully elucidated. In this study, a total of 2,412,930 single nucleotide polymorphisms (SNPs) at a density of 4.74 SNP/kb were identified by resequencing. The SNP-based phylogenetic analysis revealed that 102 pear samples from 23 provinces in China were divided into two major clades and eight geographic groups, and these divisions were supported by results of a population structure analysis and principal component analysis (PCA). Combined with the results of population diversity and identity-by-descent (IBD) analysis, it was revealed that the dissemination direction of pear was from southwest to southeast and from south to north. In the southern region, the dispersal pattern of pear spreading from west to east was generally in line with the course of the Yangtze River and Pearl River. The southern pear spread by multiple routes to neighboring areas to the north, and regions in the middle and lower reaches of the Yellow River played important roles in the further dissemination of pear in the northern region of China. Moreover, we identified comparatively higher genetic diversity in Ussurian pear than in other populations, which might reflect its low degree of domestication, with diversity closely resembling the high diversity of its wild counterpart. Our study provides new information to further our understanding of pear evolution in China, while laying a foundation of data for population genetic research, germplasm protection, and utilization for pear breeding in the future.
Chloroplast genome map of Prunus clarofolia. Genes inside the circle are transcribed clockwise, and those outside are transcribed counterclockwise. Genes of different functions are color-coded. The darker gray in the inner circle shows the GC content, while the lighter gray shows the AT content
Distribution of the percentage of molecular variation across 203 homologous regions: variation of the (a) coding and (b) non-coding homologous regions. The 20 regions with the highest variation are indicated with asterisks (*)
Phylogenetic Maximum Parsimony tree for the 23 Prunus species based on (a) 18 highly variable regions (marked with asterisks in Table 3), and (b) whole chloroplast genomes of 23 Prunus species, with Pyrus pyrifolia (Rosaceae) (NC_015996.1) as the outgroup. Bootstrap values are presented at each node. (c) The geographical distribution of different clades. Green indicates the distribution of clade I; red indicates clade II distribution; red border indicates clade III; blue indicates clade IV; purple border indicates clade V; yellow indicates clade VI; and blue border indicates clade VII
Prunus subgenus Cerasus is a large and diverse group. Its botanical classification has long been controversial and complicated. Molecular markers derived from the chloroplast genome may provide useful tools for phylogenetic resolution and taxonomic study in Prunus species. Thus, we compared chloroplast genomes of 23 Prunus species to identify sequences with high variation levels. The repeated sequences (RSs) and two types of sequence variation, comprising divergent homologous regions and short sequence repeats (SSRs), were identified. The species with the highest number of RSs was P. emarginata (60), whereas P. campanulata contained the highest number of SSRs (70). In contrast, P. dictyoneura and P. yedoensis contained the lowest number of RSs (37) and SSRs (46), respectively. Out of these homologous SSRs, 11 were shared in the 23 species, and seven of them showed variations. A total of 203 homologous regions were identified. Out of these, 20 regions (19 IGSs and one intron) showed a high variation from 6.75 to 19.31%. The phylogenetic tree inferred from 19 highly variable regions showed a topology similar to the one obtained using the whole chloroplast genome. We conclude that the seven variable SSRs and the 20 highly variable regions may serve as convenient molecular markers for phylogenetic and population genetics studies in subgenus Cerasus species. In addition, the molecular phylogenetic position of P. clarofolia was inferred based on the 19 highly variable regions.
Population structure and hierarchical organization of genetic relatedness of 181 genotypes from the whole hazelnut germplasm collection (WHGC) at K = 2, K = 3, and K = 5, as inferred by STRUCTURE software
Neighbor-joining dendrogram based on the Dice similarity index showing the relationships among 181 hazelnut genotypes from WHGC. Genotypes are colored according to their assignment to the different gene pools, as inferred by STRUCTURE software at K = 5: Central Europe (Q1), British Islands (Q2), Iberian Peninsula (Q3), Italian Peninsula (Q4), Balkans-Black Sea (Q5), and mosaic group (M). Entries of the final core collection (Cv-Dce30) are reported as CC
Two-dimensional PCoA scatterplot of 181 hazelnut genotypes from WHGC based on Dice’s distance. Genotypes are colored according to their assignment to the different gene pools, as inferred by STRUCTURE software at K = 5: Central Europe (Q1), British Islands (Q2), Iberian Peninsula (Q3), Italian Peninsula (Q4), Balkans-Black Sea (Q5), and mosaic group (M). Entries of the final core collection (Cv-Dce30) are reported as CC
Hazelnut (Corylus avellana L.) is one of the most important tree nut crops in Europe. Germplasm accessions are conserved in ex situ repositories, located in countries where hazelnut production occurs. In this work, we used ten simple sequence repeat (SSR) markers as the basis to establish a core collection representative of the hazelnut genetic diversity conserved in different European collections. A total of 480 accessions were used: 430 from ex situ collections and 50 landraces maintained on-farm. SSR analysis identified 181 genotypes, which represented our whole hazelnut germplasm collection (WHGC). Four approaches (utilizing MSTRAT, Power Core, and Core Hunter’s single- and multi-strategy) based on the maximization (M) strategy were used to determine the best sampling method. Core Hunter’s multi-strategy, optimizing both allele coverage (Cv) and Cavalli-Sforza and Edwards (Dce) distance with equal weight, outperformed the others and was selected as the best approach. The final core collection (Cv-Dce30) comprised 30 entries (16.6% of genotypes). It recovered all SSR alleles and preserved parameter variations when compared to WHGC. Entries represented all six gene pools obtained from the population structure analysis of WHGC, further confirming the representativeness of Cv-Dce30. Our findings contribute towards improving the conservation and management of European hazelnut genetic resources and could be used to optimize future research by identifying a minimum number of accessions on which to focus.
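The idea of choosing core entries to maximize allele coverage can be illustrated with a simple greedy heuristic. This is a stand-in for illustration only. It is not the algorithm used by MSTRAT, Power Core, or Core Hunter, and it ignores the Dce distance component. The function names and the toy genotype and allele labels are hypothetical.

```python
def greedy_core(alleles_by_genotype, core_size):
    """Greedily build a core set, each time adding the genotype with the most uncovered alleles.

    alleles_by_genotype maps a genotype name to the set of alleles it carries.
    Ties are broken alphabetically so the result is deterministic.
    """
    remaining = dict(alleles_by_genotype)
    covered = set()
    core = []
    while remaining and len(core) < core_size:
        best = max(sorted(remaining), key=lambda g: len(remaining[g] - covered))
        core.append(best)
        covered |= remaining.pop(best)
    return core, covered


def allele_coverage(core, alleles_by_genotype):
    """Share of all alleles in the whole collection that are present in the core."""
    all_alleles = set().union(*alleles_by_genotype.values())
    core_alleles = set().union(*(alleles_by_genotype[g] for g in core))
    return len(core_alleles) / len(all_alleles)


# Toy collection: four genotypes scored at two loci (A and B).
collection = {
    "g1": {"A1", "A2", "B1"},
    "g2": {"A1", "B1"},
    "g3": {"A3", "B2"},
    "g4": {"A2", "B2"},
}
core, covered = greedy_core(collection, core_size=2)
coverage = allele_coverage(core, collection)
```

In this toy case two of the four genotypes already carry every allele, which mirrors the abstract's finding that a small core (30 of 181 genotypes) can recover all SSR alleles.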
Geographic origin of 130 collection locations for 206 individuals representing 22 taxa that were sampled in this study in USA, Mexico, and Guatemala (Online Resource 3) using Albers projection. Inset is nuclear species phylogeny, modified from Willyard et al. (2021), with taxon symbols that match Fig. 4. USA states: AZ, Arizona; CA, California; CO, Colorado; ID, Idaho; MT, Montana; NE, Nebraska; NM, New Mexico; NV, Nevada; OR, Oregon; SD, South Dakota; TX, Texas; UT, Utah; WA, Washington; WY, Wyoming. Mexican states: AGS, Aguascalientes; BC, Baja California; CDMX, Ciudad de México; CHH, Chihuahua; CHS, Chiapas; COA, Coahuila; COL, Colima; DUR, Durango; EM, Estado de México; GRO, Guerrero; GTO, Guanajuato; HDG, Hidalgo; JAL, Jalisco; MIC, Michoacán; MOR, Morelos; NAY, Nayarit; NL, Nuevo León; OAX, Oaxaca; PUE, Puebla; QRO, Querétaro; SLP, San Luis Potosí; SNL, Sinaloa; SON, Sonora; TAM, Tamaulipas; TLA, Tlaxcala; VER, Veracruz; ZAC, Zacatecas
A Nucleotide sequences of Motifs 1 through 8 corresponding to Motifs A through H (Potter et al. 2013). B Motif repeat patterns for 29 mitochondrial haplotypes detected using nucleotide sequence alignment, including two different haplotypes discovered in GenBank (Hj, Hk). Motif x denotes any nucleotide sequence that does not match one of the eight basic motifs. Colors are used in Figs. 3 and 4
Median Joining Network using a binary matrix for the presence/absence of motifs in nucleotide alignments of the second intron in the nad1 mitochondrial gene (Fig. 2; Online Resources 4, 6). Connections with more than one mutational step are given with numbers. The size of the nodes is roughly proportional to the number of known samples (Online Resource 7 plus previously reported for P. ponderosa s.l. (Potter et al. 2013)). Haplotypes Hc1, Hc2, Hc3, Hs1, Ht1, and Ht2 have only been observed in P. coulteri, P. sabiniana, and P. torreyana. Haplotypes shown in the shaded area were generally found in the western part of the study area; haplotypes not shaded were generally found in the eastern part of the geographic range. Colors match Fig. 4
The geographic locations in the United States, Mexico, and Guatemala carrying the 27 mitochondrial haplotypes observed in this study and previously reported as P. ponderosa s.l. (Potter et al. 2013) using Albers projection. Locations where more than one haplotype was observed have a symbol for each haplotype. (a) H1, H5, and H8; (b) H2; (c) H3 and H6; (d) H4; (e) H7; (f) H9, H10, H11, Hc2, Hs1, Ht1, and Ht2; (g) H12, H14, and H21; (h) H13, H15, H16, H17, H18, H19, and H20. Haplotypes Hj, Hk, Hc1, and Hc3 are not included on the maps because geographic source locations are not known. Taxon symbols match Fig. 1; haplotype colors match Fig. 3
The mitochondrial phylogeography of some conifers shows evidence of introgression from sympatric congeners, with mitochondrial lineages not always reflecting species. This suggests that unique mitochondrial haplotypes previously reported in the ponderosa pines (Pinus subsection Ponderosae) from the USA might be more widespread in taxa not yet sampled. Recent nuclear and plastome phylogenies placed Pinus ponderosa paraphyletic in relation to Ponderosae in Mexico and Central America and confirmed that sympatric Pinus jeffreyi is more closely related to the California big-cone pines (Pinus subsection Sabinianae). We describe a broad survey of the repeated motifs in nad1 intron 2 of Ponderosae and Sabinianae, which revealed that most of the 27 mitochondrial haplotypes were not exclusive to a taxon but showed strong geographic patterns. In surprising contrast to nuclear and plastid phylogenies that resolve a monophyletic P. jeffreyi, unidirectional mitochondrial capture by P. jeffreyi (Sabinianae) from P. ponderosa was observed in all 28 samples of Jeffrey pine. Confirming the paraphyly of P. ponderosa sensu lato, mitochondrial haplotypes found mostly west and those found mostly east of the Great Basin each have more similarity to haplotypes found in Mexican taxa than they have to each other. Two distinctive haplotypes that were terminal nodes on the network were confirmed to be endemic to the Great Basin, USA, suggesting that they arose in place and have been maintained in isolation. Altogether, our results indicate a history of complex and intriguing mitochondrial relationships among the ponderosa pine species, especially between P. ponderosa and P. jeffreyi.
Haplotype assembly workflow. Inputs are reported as shaded parallelograms, outputs as shaded rounded rectangles. Briefly, de novo assembled contigs were polished with long and short reads, then each polished sequence was aligned on the reference genomes of C. maxima, C. reticulata, and C. medica, and the alignments were then parsed to assign contigs to the H1 and H2 haplotypes. Purge Haplotigs software was then applied on these two sets to correct possible errors introduced by this method. The two sets of contigs were then arranged into H1 and H2 chromosomes by using RaGOO and C. maxima as the guiding sequence
(A) Primary (derived from sour orange) and alternative (derived from citron) assemblies of the lemon genome. Chromosomes are represented as colored blocks; the two haplotypes of the same chromosome are represented by the same color. The position of genes showing a BLAST+ sequence identity of at least 96% and an alignment length of at least 60% compared to the smallest sequence are linked by straight lines. (B) Gene density distribution (blue histogram) across the primary and alternative assembly: the number of genes is calculated on a sliding window of 1 Mb with a 200 kb step
(A) Venn diagram with the number of expressed genes in flower, fruit, leaf, and root for the two haplotypes. (B) Bar plot representing the percentages of genes uniquely expressed in flower, fruit, leaf, and root for the two haplotypes
Enriched GO terms for the genes involved in ‘biological process’ and mapped in the alternative assembly in flower (A) and root (B). Different colors indicate different p values: light red corresponds to a p value < 0.005, light orange corresponds to a p value < 0.01, and yellow corresponds to a p value < 0.05. Small enrichments are created using only the best significant GO terms
Lemon (Citrus limon (L.) Burm. f.) is an evergreen tree belonging to the genus Citrus. The fruits are particularly prized for the organoleptic and nutraceutical properties of the juice and for the quality of the essential oils in the peel. Herein, we report, for the first time, the release of a high-quality reference genome of the two haplotypes of lemon. The sequencing was carried out by coupling Illumina short reads and Oxford Nanopore data, leading to the definition of a primary and an alternative assembly characterized by a genome size of 312.8 Mb and 324.74 Mb respectively, which agree well with an estimated genome size of 312 Mb. The analysis of transposable elements (TEs) allowed the identification of 2878 regions on the primary and 2897 on the alternative assembly distributed across the nine chromosomes. Furthermore, an in silico analysis of the microRNA genes was carried out using 246 mature miRNA and the respective pre-miRNA hairpin sequences of Citrus sinensis. Such analysis highlighted a high conservation between the two species, with 233 mature miRNAs and 51 pre-miRNA stem-loops aligning with a perfect match to the lemon genome. In parallel, total RNA was extracted from fruit, flower, leaf, and root, enabling the detection of 35,020 and 34,577 predicted transcripts on primary and alternative assemblies respectively. To further characterize the annotated transcripts based on their function, a gene ontology and a gene orthology analysis with other Citrus and Citrus-related species were carried out. The availability of a reference genome is an important prerequisite both for the setup of high-throughput genotyping analysis and for functional genomic approaches toward the characterization of the genetic determinism of traits of agronomic interest.
Overview of bunch-by-bunch SHELL genetic screening to reduce non-tenera seeds entering the oil palm seed supply chain. a. Commercial oil palm seeds are derived from controlled crosses of maternal dura (ShDeliDura/ShDeliDura) and paternal pisifera palms. For simplicity, the variant SHELL allele is shown as shAVROS in the figure, although the paternal pisifera palm can also be homozygous for any fruit form phenotype-associated SHELL variant present in commercial populations (i.e. shAVROS, shMPOB, shMPOB2 or shMPOB3) or compound heterozygous for any combination of these variant alleles. b. Fruit bunches derived from the dura × pisifera cross yield thick-shelled dura fruit form F1 seeds (as they arose from a dura mother palm). From each bunch intended for seed production, 96 depericarped seeds (~ 10%) are randomly selected, DNA is extracted from the embryo of each seed and each embryo is individually genotyped for SHELL fruit form phenotype-associated variant alleles. Three embryo genotypes are possible: c.ShDeliDura/shAVROS embryos derived from the intended dura × pisifera cross, d. contaminant ShDeliDura/ShDeliDura embryos derived from either self-pollination of the dura mother palm or fertilization with unintended dura or tenera pollen and e. contaminant presumed shAVROS/shAVROS embryos that could arise only if the mother palm was tenera rather than dura or if the embryo is aneuploid. f. Only those palms propagated from an ShDeliDura/shAVROS embryo will produce the intended thin-shelled tenera fruit. g. At the embryo genotyping stage, seeds derived from bunches with ≤ 2% non-tenera embryos in the sampled subset are accepted to enter the seed supply chain, while seeds derived from bunches with > 2% non-tenera embryos in the sampled subset are excluded from the supply chain (h)
SHELL genotyping. a. Embryos extracted from 121,895 seeds from 1304 fruit bunches (average 94 seeds/bunch) were genotyped for ShDeliDura, shAVROS, shMPOB, shMPOB2 and shMPOB3 alleles. Relative frequencies of each SHELL FFPV allele are shown. b. Bunch-by-bunch non-tenera contamination percentage among up to 96 randomly selected seed embryos from each of 1304 bunches (blue data points). An additional 13 bunches from ShDeliDura × ShDeliDura (D × D) crosses were included as blinded controls (red data points). The dashed line indicates the average non-tenera contamination percentage across the 1304 bunches (3.32%). The solid line indicates the weighted average non-tenera contamination percentage from a national seed production survey of 11 SPUs, including 4168 bunches and 585,346 seeds (7.29%). Gray lines represent the upper and lower 95% confidence interval for each non-tenera contamination measurement
SHELL allele genotyping identifies potential aneuploid embryos. a. 96 embryos randomly selected from a single D × P fruit bunch were genotyped for the ShDeliDura and shAVROS alleles in triplicate technical replicates. Genotyping controls (orange) included synthetic DNA constructs for the ShDeliDura allele alone (WT), a 1:1 mixture of ShDeliDura and shAVROS alleles (HET) and the shAVROS allele alone (MUT). Ninety-eight percent of embryos (94 of 96) reported the expected ShDeliDura/shAVROS genotype (blue). One embryo reported a ShDeliDura/ShDeliDura (dura) genotype (green, 3 of 3 technical replicates). One embryo reported a shAVROS/shAVROS (pisifera) genotype (yellow, 3 of 3 technical replicates). b. 96 embryos randomly selected from a second D × P fruit bunch were genotyped as described in (a). Ninety-nine percent of embryos reported the expected ShDeliDura/shAVROS genotype (blue). One seed reported a shAVROS/shAVROS (pisifera) genotype (3 of 3 technical replicates). c. 96 embryos randomly selected from a third D × P fruit bunch were genotyped as described in (a). Two embryos reported a dura ShDeliDura/ShDeliDura genotype (green, 3 of 3 technical replicates each). One embryo reported a ShDeliDura:shAVROS ratio intermediate between the heterozygous and homozygous mutant data range (red, 3 of 3 technical replicates)
Heterozygosity and copy number analyses of potential haploid or aneuploid embryos. a–d. Histograms of whole genome sequencing-based homozygous (green) and heterozygous (red) genotype calls relative to read depth coverage at each SNP position for the T128 diploid control genome (a), embryo 1 (b), embryo 2 (c) and embryo 3 (d). e. Estimated percent heterozygosity for T128 (black), embryo 1 (blue), embryo 2 (orange) and embryo 3 (gray) calculated for SNPs with 2–10X read depth. f–h. Embryo 1/T128 (blue), embryo 2/T128 (orange) and embryo 3/T128 (gray) average read depth calculated for sliding sequence windows including 1000 SNPs in one-SNP steps across the chromosome. Chromosomes 1 (f), 2 (g) and 8 (h) are shown. The position of the SHELL gene is indicated by a labeled arrow (g)
Optimal oil palm (Elaeis guineensis) crop yields rely on the purity of the tenera fruit form. The high yielding hybrid tenera fruit form is the consequence of heterozygosity for one of nine genetic variants within the SHELL gene. High-throughput genotyping allows cost-efficient screening prior to planting to decrease unintentional non-tenera palm cultivation. We present a paradigm for dramatically reducing non-tenera cultivation by SHELL genotyping a ~ 10% sampling of seeds per seed production fruit bunch. Identification and seed supply chain removal of bunches above a predetermined non-tenera threshold represent a new paradigm for applying SHELL genetic testing in the industry. In a demonstration involving 121,896 embryos from 1304 independent dura × pisifera controlled crosses from two independent seed production units, we found that 38.4% of bunches achieved a 100% pure tenera prediction rate. The remaining bunches (61.6%) had predicted non-tenera contamination ranging from 1.0 to 89.6%, with an overall average of 3.32% seeds per bunch. SHELL genotyping of expected tenera embryos identified rare aneuploid embryos, confirmed by whole genome sequencing-based heterozygosity and copy number analyses.
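The bunch-level decision can be sketched in a few lines. The 2% cut-off is the one given in the screening overview caption (bunches with ≤ 2% non-tenera embryos in the sampled subset are accepted). The function names and the simplified call labels ("tenera", "dura", "pisifera") are hypothetical stand-ins for the actual SHELL genotype calls.

```python
def non_tenera_fraction(embryo_calls):
    """Fraction of sampled embryos whose call is anything other than tenera."""
    return sum(call != "tenera" for call in embryo_calls) / len(embryo_calls)


def accept_bunch(embryo_calls, threshold=0.02):
    """Accept a bunch when the sampled non-tenera fraction is at or below the threshold."""
    return non_tenera_fraction(embryo_calls) <= threshold


# Two toy bunches of 96 sampled embryos each.
bunch_one_contaminant = ["tenera"] * 95 + ["dura"]
bunch_two_contaminants = ["tenera"] * 94 + ["dura", "pisifera"]
```

With 96 sampled embryos, a single non-tenera embryo (1/96, about 1.04%) keeps a bunch under the threshold, while two (2/96, about 2.08%) exclude it.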
Locations of the 18 Shorea parvifolia populations studied. Shorelines when sea levels were 50 m and 120 m lower than at present are shown as depth contours
Parsimony network for the 41 Shorea parvifolia cpDNA haplotypes. The dotted and solid lines indicate indels and nucleotide substitutions respectively. The sizes of circles are proportional to the frequencies of haplotypes; the smallest circles represent single individuals. The small filled circles indicate hypothetical intermediate haplotypes
Geographical distribution of the genetic clusters derived from STRUCTURE results across the distribution range of Shorea parvifolia. Based on the relationships between the number of clusters (K) and Evanno et al.’s (2005) ΔK (see Fig. S1), the results for K = 2 (a) and K = 3 (b) are shown here. Probabilities for each cluster within populations and individuals are shown as pie charts and bar plots respectively. Contour lines indicate the distribution probabilities of the clusters estimated using the kriging method. The two populations with small sample sizes (indicated by asterisks) were excluded from the spatial interpolation
A neighbor-joining (NJ) tree based on DA distances for 18 S. parvifolia populations
Geographical distribution of the 41 cpDNA haplotypes of Shorea parvifolia populations across Sundaland. Two common haplotypes (A, B) and their related haplotypes (AN, BN) are represented by filled and open circles respectively. Other haplotypes are indicated by circles with diagonal stripes
Shorea parvifolia (Dipterocarpaceae) is a widely distributed tree species which is important in terms of ecosystem functioning as well as forestry in Southeast Asia. During glacial periods, substantial precipitation decline is believed to have occurred in Southeast Asia, which considerably changed the distribution of the species. Repeated glacial and inter-glacial fluctuations were found to have influenced the genetic structure of the species, which is important to know for conservation and sustainable use. Leaf samples were collected from 18 populations covering most of the natural distribution of this species including the Malay Peninsula, Sumatra, and Borneo Islands. We investigated these samples using sequence data for eight chloroplast DNA (cpDNA) regions and 14 nuclear EST-SSR loci. The nucleotide diversity of cpDNA is higher in Malay Peninsula populations but the genetic diversity of nuclear DNA is higher in Borneo populations. The genetic structure revealed by nuclear DNA clearly separated Borneo populations from the rest, with an FST value of 0.150, while the genetic structure obtained from cpDNA was less pronounced (FST value = 0.136). Tajima’s D and Fu and Li’s D* for cpDNA showed statistical significance only in populations from Borneo. These results suggested that there has been recent population expansion of S. parvifolia in Borneo.
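For readers unfamiliar with the neutrality statistic mentioned above, the following is a minimal sketch of Tajima's D using the standard formula (Tajima 1989). It assumes aligned sequences of equal length with no gaps or missing data, and it counts only nucleotide differences. The function name `tajimas_d` and the toy alignments are hypothetical, and the code is not the authors' analysis.

```python
import math
from itertools import combinations


def tajimas_d(seqs):
    """Tajima's D for aligned, equal-length sequences; returns None when undefined."""
    n = len(seqs)
    # Number of segregating sites: alignment columns with more than one state.
    s = sum(1 for column in zip(*seqs) if len(set(column)) > 1)
    if n < 4 or s == 0:
        return None  # the variance term is zero for n < 4
    # Mean number of pairwise differences (pi).
    pairs = list(combinations(seqs, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    return (pi - s / a1) / math.sqrt(e1 * s + e2 * s * (s - 1))


# Toy alignments: an excess of singletons versus intermediate-frequency variants.
d_singletons = tajimas_d(["TAAA", "ATAA", "AATA", "AAAT", "AAAA"])
d_balanced = tajimas_d(["AA", "AA", "TT", "TT"])
```

An excess of rare variants gives a negative D, which is commonly read as consistent with recent population expansion, while variants at intermediate frequency give a positive D. Significance testing, as reported in the abstract, requires a further step that is not shown here.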
We aimed to test the extent to which plastid DNA gives incongruent phylogeographic patterns to nuclear DNA in a species of eucalypt, Eucalyptus behriana, a taxonomic group where chloroplast capture is a well-established phenomenon. Furthermore, we aimed to test the degree of influence chloroplast capture has on the observed patterns by broadly sampling co-occurring, related species. A genome skimming approach was used to sequence and assemble chloroplast genomes from population-level sampling of E. behriana, as well as samples of twenty-one other Eucalyptus section Adnataria species which co-occur with it. Phylogenetic analyses were first undertaken on just E. behriana to allow direct comparison to previously reported phylogeographic patterns based upon nuclear markers. A subsequent analysis including the related taxa was undertaken to investigate the degree of chloroplast capture and how this may be influencing the observed phylogeographic patterns. We found strong geographic structuring of plastid DNA relationships across the geographic range of E. behriana, with a basal divergence between the most northerly isolated population at West Wyalong and all other populations which does not match phylogeographic patterns based on nuclear markers. When outgroups were included, we found that E. behriana is highly polyphyletic with respect to all other species, starkly contrasting with the species' well-supported monophyly based upon nuclear markers, and that chloroplast capture is so widespread that geographic patterns of the plastid genomes are consistent across species boundaries.
For continuous genetic gains over time, a balance between genetic gain and maintaining the genetic base must be a constant concern of forest breeders. This study aims at determining the best thinning strategies for a population of Eucalyptus dunnii, by incorporating the effects of environmental heterogeneity and competition in the analysis, as well as the best growth trait regarding precision and accuracy. The population studied consisted of 160 open-pollinated families. The survival and growth (height, HT; diameter at breast height, DBH; and volume, VOL) were evaluated 4 years after planting. The growth rate data were analyzed and compared by four mixed models. Selection and thinning strategies were simulated by varying the number of families, individuals within families, and selected individuals, considering the estimated genetic gains and the effective size. The species showed good survival (89.7%) and productive performance (mean annual increment = 42 m³ ha⁻¹ y⁻¹). The Spatial+Competition Model provided the best fit for DBH and VOL. The strategies that allow a balance between improvement (genetic gains) and genetic conservation (effective size) consist of keeping 36 to 50% of the individuals in the test (370 to 510 trees ha⁻¹), by reducing more intensively the number of individuals from the worst-performing families. The selection of 100 individuals with a restriction of at most one individual per family yields the largest effective size (Ne), more than double the Ne obtained without restricting the individuals per family, with only a small drop in genetic gain.
a Gene map of the chloroplast genomes of Malus baccata and Malus toringo collected from China, Japan, and Korea and sequenced in this study. b Flowers and leaves of Malus baccata. c Flowers and leaves of Malus toringo. Photo credit: Min Sung Cho, Sungkyunkwan University, Republic of Korea
Comparison of the border positions of the large single-copy (LSC), small single-copy (SSC), and inverted repeat (IR) regions among eight wild Malus chloroplast genomes. Gene names are indicated in colored arrow boxes, and their lengths in the corresponding regions are displayed beside the boxes. Ψ indicates a pseudogene
Comparison of the chloroplast genomes of eight Malus species visualized by mVISTA. Gray arrows indicate genes with their orientation and position. Genome regions are color coded as pink blocks for the conserved coding genes (exon), blue blocks for introns, and peach blocks for noncoding sequences in intergenic regions (CNS). Thick lines below the alignment indicate the quadripartite regions of genomes; the LSC region is green, IR regions, aqua blue, and SSC region, orange
Seven most variable hotspot regions found in eight plastomes of wild Malus species by sliding window analysis. Six intergenic regions of trnK-rps16, trnR-atpA, petN-psbM, trnT-psbD, psbZ-trnG, ndhC-trnV, and one coding gene of ycf1
Maximum likelihood tree inferred from 79 protein-coding genes of 23 Malus (Rosaceae) taxa using Pyrus pyrifolia as outgroup. Bootstrap values over 50%, based on 1000 replicates, are shown on each node. The species indicated in red are the four Malus plastomes newly sequenced in this study. The subclade in the red square contains the species belonging to the series Baccatae (section Malus) and series Sieboldianae (section Sorbomalus), except for the four species shown in blue
Malus baccata (L.) Borkh. and Malus toringo (Siebold) Siebold ex de Vriese of the genus Malus Mill. (Rosaceae) are wild crabapples occurring in temperate East Asia. Despite their horticultural importance as ornamental trees and as natural resources for apple breeding, their phylogenetic relationships have never been clearly determined owing to a lack of resolution in previous studies. We characterized four complete chloroplast genomes of these two species and conducted various phylogenomic analyses comparing them with previously reported plastomes of other wild Malus species. The plastomes were highly conserved in genomic structure and gene content, containing 129 genes, including 84 protein-coding genes, eight rRNA genes, and 37 tRNA genes. Phylogenetic analysis of 23 representative Malus plastomes did not support the current classification of the major sections in Malus, revealing non-monophyly. The plastomes of M. toringo revealed two chloroplast types corresponding to their geographic distribution; M. toringo from China was more closely related to other sympatric species, while two conspecific M. toringo from Japan and Korea were in a sister relationship with M. baccata from Korea. We identified one positively selected gene (ndhD), seven mutation hotspots (trnK-rps16, trnR-atpA, petN-psbM, trnT-psbD, psbZ-trnG, ndhC-trnV, and ycf1), and variable SSRs as potentially useful plastid markers.
Sampling locations and distributions of trnT-L-F haplotypes in Turkey. Colors represent haplotypes, and pie charts are proportional to haplotype frequency at each location. Names of haplotypes, abbreviated taxon names, and the number of individuals per haplotype are given for each location in the legend. Taxon abbreviations: Q. aucheri (auc), Q. brantii (bra), Q. cerris var. austriaca (cer.a), Q. cerris var. cerris (cer.c), Q. coccifera (coc), Q. frainetto (fra), Q. hartwissiana (har), Q. ilex (ile), Q. infectoria subsp. infectoria (inf.i), Q. infectoria subsp. veneris (inf.v), Q. ithaburensis subsp. macrolepis (ith), Q. libani (lib), Q. macranthera subsp. syspirensis (mac), Q. petraea subsp. iberica (pet.i), Q. petraea subsp. petraea, Q. petraea subsp. pinnatiloba (pet.pi), Q. pubescens (pub), Q. pontica (pon), Q. robur subsp. pedunculiflora (rob.p), Q. robur subsp. robur (rob.r), Q. trojana subsp. trojana (tro.t), Q. trojana subsp. yaltirikii (tro.y), Q. vulcanica (vul)
Statistical parsimony network at the 95% confidence level. Mutational changes across haplotypes (both substitutions and indels) and their positions on the aligned trnT-L-F cpDNA sequences are shown on graph edges. Colors represent haplotypes. Sections and clades are indicated on the figure. Haplotype H16 represents the “Cerris-Ilex” clade, and Section Ponticae coincides with Haplotype H1
Scatter plot of isolation-by-distance estimates. Pairwise Nei’s genetic distance (D) is plotted against geographic distance between locations, analyzed separately for each given group. Regression lines and Mantel test results are indicated on the graph
Maximum likelihood majority-rule consensus tree of trnT-L-F sequences of oaks in Turkey. The ML bootstrap values, the MP bootstrap values, and Bayesian posterior probabilities are provided on the tree, above left, above right, and below, respectively. Three main clades emerged from the root, corresponding to Section Quercus (including Section Ponticae), the “Ilex” clade, and a clade of two sister groups, Section Cerris (including the “Cerris-Ilex” clade) and the “Coccifera” clade
The genus Quercus L. is one of the most abundant and important genera of woody plants in the Northern Hemisphere as well as in Turkey. In the current study, which is the most comprehensive study dealing with Turkish oaks, sequence variations of three noncoding regions (trnT(UGU)-L(UAA) IGS, trnL(UAA) intron, trnL(UAA)-F(GAA) IGS) of chloroplast DNA (cpDNA) were used for phylogeographic and phylogenetic analysis of 319 individuals representing 23 taxa (17 species). The trnT(UGU)-L(UAA) region was found to be the most variable and parsimony-informative region. Twenty-eight cpDNA haplotypes were identified based on 34 substitutions and 22 indels. The high number of haplotypes and hT > vT observed in populations of oaks in Turkey indicated that the Anatolian Peninsula might have been a refugium during glacial periods. Phylogeographic reconstruction and molecular variance analysis revealed that Quercus cpDNA haplotypes were geographically structured. Although local haplotype sharing among species from the same infrageneric clades was common, levels of hybridization differed between species pairs. Haplotype analysis revealed four infrageneric clades, namely Section Quercus, Section Cerris, and two clades corresponding to Section Ilex, namely “Ilex” and “Coccifera.” Furthermore, a Section Cerris haplotype was detected in the Aegean members of Q. ilex and Q. coccifera. Section Ponticae was placed in the Section Quercus cluster. In contrast to the phylogenetic reconstructions based on nuclear DNA sequence data, Group Ilex seems to be polyphyletic based on plastome phylogeny. The chloroplast phylogeny of oaks reflects the traces of recent and ancient introgression events during the diversification of species. In addition, incomplete lineage sorting may also explain this polymorphic assemblage. Therefore, further investigation is required to clarify the cpDNA phylogeny of oaks, especially for Section Ilex.
The distribution of mapped reads of each sample in the genome
a The four quadrant diagrams of differential m6A peaks and mRNA. b Scatterplot of GO category enrichment for differential m6A peaks in CK and LBD15-oe plants. c Scatterplot of KEGG pathways for differential m6A peaks in CK and LBD15-oe plants
qRT-PCR validation of the ten m6A-modified genes in CK and LBD15-oe plants. One-way ANOVA was calculated using IBM SPSS 19 software. * represents P < 0.05 and ** represents P < 0.01
a Enrichment of reads near the transcription start site (TSS) of genes. b Distribution of the differential m6A peaks. c The top 6 motifs enriched in differential m6A peaks
a Upregulated and downregulated differentially expressed genes in CK and LBD15-oe plants. b Heatmap of the gene expression in CK and LBD15-oe plants
N6-methyladenosine (m6A) plays an important role in the regulation of gene expression. Previously, we found an ortholog of Arabidopsis LBD15 that showed xylem-preferential expression and was involved in leaf development in Poplar 84 K. To investigate whether m6A modification affects the function of LBD15, m6A-immunoprecipitation sequencing data and the matched input RNA sequencing data for non-transgenic plants (CK) and LBD15 overexpression (LBD15-oe) plants were compared and analyzed. As a result, 7,156 differential m6A peaks were identified, with 2,896 upregulated m6A peaks and 4,260 downregulated m6A peaks. Correlation analysis of differentially expressed genes and differential m6A peaks indicated that a total of 119 differentially methylated genes showed a negative correlation with the differentially expressed genes. Among them, Nudix hydrolase, LRR receptor-like serine/threonine-protein kinase, tubulin, vacuole membrane protein KMS1, and MYB family transcription factor PHL11 may be involved in posttranscriptional gene regulation in LBD15 overexpression plants. The expression of ten m6A-modified genes was validated by qRT-PCR. Our results provide a basis for further elucidation of the regulatory mechanism of m6A modification and the epigenetic regulation of LBD15.
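The idea of relating differential m6A peaks to differential expression can be sketched as a simple four-quadrant classification, in which genes whose methylation and expression change in opposite directions count as negatively correlated. This is only an illustration: the function names, gene labels, and fold-change values are hypothetical, and the significance thresholds applied in the study are not given in the abstract.

```python
def classify_quadrant(m6a_log2fc, mrna_log2fc):
    """Assign a gene to a quadrant from the signs of its two log2 fold changes."""
    if m6a_log2fc == 0 or mrna_log2fc == 0:
        return "unchanged"
    methylation = "hyper" if m6a_log2fc > 0 else "hypo"
    expression = "up" if mrna_log2fc > 0 else "down"
    return f"{methylation}-{expression}"


def negatively_correlated(genes):
    """Return names of genes whose m6A level and mRNA level move in opposite directions.

    genes: list of (name, m6a_log2fc, mrna_log2fc) tuples (hypothetical data).
    """
    opposite = {"hyper-down", "hypo-up"}
    return [name for name, m6a, mrna in genes if classify_quadrant(m6a, mrna) in opposite]


# Invented example values (LBD15-oe relative to CK); not results from the study.
genes = [
    ("geneA", 1.8, -1.2),   # more methylated, less expressed
    ("geneB", -2.1, 1.5),   # less methylated, more expressed
    ("geneC", 1.1, 0.9),    # both increased
    ("geneD", -0.7, -1.4),  # both decreased
]

print([classify_quadrant(m6a, mrna) for _, m6a, mrna in genes])
print(negatively_correlated(genes))
```

A real analysis would first filter peaks and genes by statistical significance before assigning quadrants; the sign-only rule here is kept deliberately minimal.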
Predictive abilities (rgy) with standard error bars of the different models (see text for details) for basic wood density (BWD), extractives (EXT), lignin (LIG), and carbohydrates (CBO) at 4 years of age and volume at 6 years of age (VOL6) for an open-pollinated progeny trial of E. benthamii
Increase in selection efficiency (Seff) of genomic selection (GS) in comparison to traditional BLUP-based selection (PS) with a progressive reduction in the time necessary to complete a breeding cycle for growth and wood traits in an open-pollinated progeny trial of E. benthamii
Increase in selection efficiency (Seff) of genomic selection (GS) with increasing selection intensity (evaluated plants) for a fixed reduction of 45% in breeding cycle for growth and wood traits in an open-pollinated progeny trial of E. benthamii
The unique adaptation of Eucalyptus benthamii to low temperatures, coupled with fast growth and versatile wood quality, has made it a valued plantation species in frost-prone areas worldwide, but little is known about its quantitative genetic parameters for key industrial traits. We used GBLUP additive (GA), additive-dominant (GAD), single-step (HBLUP), and pedigree-based predictive models to estimate lignin, extractives, carbohydrates, and wood density at age 4 and tree volume at age 6. By capturing hidden relatedness and correcting pedigree errors, SNP data disentangled non-additive from additive variance, providing more realistic estimates of narrow-sense heritability than pedigrees and more accurate predictions of trait values. Predictive abilities (PAs) ranged from 0.12 for volume (pedigree-based model) to 0.44 for wood density (models H, GA, and GAD). Considerable dominance variance was seen for all traits. Growth was the trait most influenced by it, with PAs 48.9% higher when this effect was considered, a result with important consequences for both clonal propagation and overall selection efficiency (Seff). Using an HBLUP model, phenotypes of non-genotyped trees increased PAs by increasing sample size and provided realized relationships at reduced genotyping cost. In a recurrent selection program, the preclusion of progeny testing provides an increase in Seff between 232% and 299%. In a clonal selection program, the elimination of both the progeny trial and the initial clonal trial may increase Seff between 134% and 277%. Increasing selection intensity by genomic prediction resulted in an additional impact on Seff. This study provides groundwork to implement genomic selection in E. benthamii breeding.
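Why a shorter breeding cycle raises selection efficiency can be seen from the standard breeder's equation expressed per unit time, where expected gain is selection intensity times accuracy times additive genetic standard deviation, divided by cycle length. The sketch below compares two strategies under that textbook formulation. It is an assumption that this matches the study's definition of Seff, which the abstract does not spell out, and all numbers are invented rather than taken from the article.

```python
import math


def gain_per_year(intensity, accuracy, sigma_a, cycle_years):
    """Expected genetic gain per year from the breeder's equation: i * r * sigma_A / L."""
    return intensity * accuracy * sigma_a / cycle_years


def relative_efficiency(acc_gs, acc_ps, cycle_gs, cycle_ps, i_gs=1.0, i_ps=1.0):
    """Ratio of gain per year, genomic selection (GS) over phenotypic selection (PS).

    sigma_A is the same for both strategies, so it cancels out of the ratio.
    """
    return (i_gs * acc_gs / cycle_gs) / (i_ps * acc_ps / cycle_ps)


# Hypothetical scenario: equal accuracy and intensity, but GS halves the cycle length.
ratio = relative_efficiency(acc_gs=0.4, acc_ps=0.4, cycle_gs=5.0, cycle_ps=10.0)
print(round(ratio, 2))
```

Under these assumptions, halving the cycle doubles the gain per year even with no change in accuracy, and a higher selection intensity under genomic prediction (a larger `i_gs`) would raise the ratio further.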
Top-cited authors
Fabrizio Costa
  • Università degli Studi di Trento
Eric Van de Weg
  • Plant Research International
Michela Troggio
  • Fondazione Edmund Mach - Istituto Agrario San Michele All'Adige
T. Zhebentyayeva
  • Pennsylvania State University
Daniele Bassi
  • University of Milan