Avril M. Harder’s research while affiliated with HudsonAlpha Institute for Biotechnology and other places


Publications (21)


Comparative analyses of four reference genomes reveal exceptional diversity and weak linked selection in the yellow monkeyflower (Mimulus guttatus) complex
  • Preprint

April 2025 · 25 Reads · Amelia Lawrence · [...] · John Willis

Allopolyploidy expanded gene content but not pangenomic variation in the hexaploid oilseed Camelina sativa

November 2024 · 66 Reads · 2 Citations

Genetics

Ancient whole-genome duplications (WGDs) are believed to facilitate novelty and adaptation by providing the raw fuel for new genes. However, it is unclear how recent WGDs may contribute to evolvability within recent polyploids. Hybridization accompanying some WGDs may combine divergent gene content among diploid species. Some theory and evidence suggest that polyploids have a greater accumulation and tolerance of gene presence-absence and genomic structural variation, but it is unclear to what extent either is true. To test how recent polyploidy may influence pangenomic variation, we sequenced, assembled, and annotated twelve complete, chromosome-scale genomes of Camelina sativa, an allohexaploid biofuel crop with three distinct subgenomes. Using pangenomic comparative analyses, we characterized gene presence-absence and genomic structural variation both within and between the subgenomes. We found over 75% of ortholog gene clusters are core in Camelina sativa and <10% of sequence space was affected by genomic structural rearrangements. In contrast, 19% of gene clusters were unique to one subgenome, and the majority of these were Camelina-specific (no ortholog in Arabidopsis). We identified an inversion that may contribute to vernalization requirements in winter-type Camelina, and an enrichment of Camelina-specific genes with enzymatic processes related to seed oil quality and Camelina’s unique glucosinolate profile. Genes related to these traits exhibited little presence-absence variation. Our results reveal minimal pangenomic variation in this species, and instead show how hybridization accompanied by WGD may benefit polyploids by merging diverged gene content of different species.


A) Diagrams describing population size changes for each of the simulated demographic scenarios. Each diagram shows the final 1,000 generations of each scenario, which was preceded in each by a burn-in of 10,000 generations with a population size of 10,000 diploid individuals. Population sizes for each interval are noted for each scenario. B,D,F,H) True overall FROH frequencies. C,E,G,I) Length bin-specific true FROH values; horizontal lines correspond to bin median values.
The relationship between true and inferred FROH values depends on inference method and population demographic scenario
Each regression line represents linear model results for a single level of coverage with the shaded areas representing 95% confidence intervals. Each point represents data for a single simulated individual. Panels display outcomes using BCFtools in Genotypes mode (A-D), BCFtools in Likelihoods mode (E-H), and PLINK (I-L), as well as by population scenarios including large (A, E, I), small (B, F, J), bottlenecked (C, G, K), and declining (D, H, L) populations. Dashed line is 1:1 line and x- and y-axes are consistent within each demographic scenario. Note the differing slopes across demographic scenarios (e.g., among panels A-D) and differing overall accuracies across methods (e.g., differing distances between regression lines and 1:1 line among panels D, H, and L). Another version of this figure with consistent axis limits across panels and colorized by sequencing depth is available in S19 Fig.
PLINK outperforms BCFtools with respect to false negative rates, but underperforms with respect to false positive rates
A) False positive (i.e., incorrectly calling a base position as being located in a ROH) and B) false negative (i.e., failing to identify a base position as being located in a ROH) rates across demographic scenarios and methods. Horizontal lines indicate median values and shaded boxes are 50% quantiles. Note the difference in scale of y-axis between panels A and B. Both BCFtools approaches outperform PLINK with respect to false positive rates but the reverse is true for false negative rates. Increasing coverage corresponds to decreasing false positive rates and to increasing false negative rates. Values displayed for 5X and 50X coverages; data for all coverage levels presented in S9 and S10 Figs, as well as a scatter plot in S18 Fig.
Increasing true ROH length corresponds to increasing detection
Called FROH−True FROH displayed by length bin (short, intermediate, long, very long) and demographic scenario (A: large population; B: small population; C: bottlenecked population; D: declining population) at 15X (results for all coverage levels presented in S9–S10 Figs). For BCFtools Genotypes and PLINK, FROH for short ROHs is consistently underestimated whereas FROH for very long ROHs is overestimated when these ROHs are present. BCFtools Likelihoods does not overestimate ROHs in any length bin.
All three methods tested combine multiple true ROHs into single called ROHs, with increasing coverage only providing improvements for BCFtools Likelihoods
A) Diagram illustrating this lumping issue. B) Examples of this issue at 5X and 50X in a single simulated individual drawn from the small population demographic scenario. C) Number of true ROHs combined into a single called ROH for ROHs of varying lengths when called by all three methods at 5X and (D) at 50X in the small population (results for all coverage levels and demographic scenarios provided in S16 Fig). Points correspond to mean values and vertical and horizontal error lines indicate 95% confidence intervals. Dashed horizontal line corresponds to y = 1 (a 1:1 relationship between numbers of true and called ROHs).


Detectability of runs of homozygosity is influenced by analysis parameters and population-specific demographic history
  • Article
  • Full-text available

October 2024 · 86 Reads · 3 Citations

Wild populations are increasingly threatened by human-mediated climate change and land use changes. As populations decline, the probability of inbreeding increases, along with the potential for negative effects on individual fitness. Detecting and characterizing runs of homozygosity (ROHs) is a popular strategy for assessing the extent of individual inbreeding present in a population and can also shed light on the genetic mechanisms contributing to inbreeding depression. Here, we analyze simulated and empirical datasets to demonstrate the downstream effects of program selection and long-term demographic history on ROH inference, leading to context-dependent biases in the results. Through a sensitivity analysis we evaluate how various parameter values impact ROH-calling results, highlighting its utility as a tool for parameter exploration. Our results indicate that ROH inferences are sensitive to factors such as sequencing depth and ROH length distribution, with bias direction and magnitude varying with demographic history and the programs used. Estimation biases are particularly pronounced at lower sequencing depths, potentially leading to either underestimation or overestimation of inbreeding. These results are particularly important for the management of endangered species, as underestimating inbreeding signals in the genome can substantially undermine conservation initiatives. We also found that small true ROHs can be incorrectly lumped together and called as longer ROHs, leading to erroneous inference of recent inbreeding. To address these challenges, we suggest using a combination of ROH detection tools and ROH length-specific inferences, along with sensitivity analysis, to generate robust and context-appropriate population inferences regarding inbreeding history. We outline these recommendations for ROH estimation at multiple levels of sequencing effort, which are typical of conservation genomics studies.
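The quantities named in the figure captions above (FROH, per-base false positive and false negative rates, and the number of true ROHs lumped into one called ROH) can be illustrated with a minimal sketch. This is not the authors' pipeline: the interval representation (half-open base-pair coordinates), the function names, and the choice of denominators (truly non-ROH bases for the false positive rate, true ROH bases for the false negative rate) are assumptions made here for illustration only.

```python
from typing import List, Tuple

# Assumption: an ROH is a half-open interval [start, end) in base pairs.
Interval = Tuple[int, int]


def merge(intervals: List[Interval]) -> List[Interval]:
    """Sort intervals and merge any that overlap or touch."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def total_length(intervals: List[Interval]) -> int:
    """Number of bases covered by at least one interval."""
    return sum(end - start for start, end in merge(intervals))


def overlap_length(a: List[Interval], b: List[Interval]) -> int:
    """Number of bases covered by both interval sets."""
    a, b = merge(a), merge(b)
    i = j = shared = 0
    while i < len(a) and j < len(b):
        lo, hi = max(a[i][0], b[j][0]), min(a[i][1], b[j][1])
        if lo < hi:
            shared += hi - lo
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return shared


def froh(rohs: List[Interval], genome_length: int) -> float:
    """Fraction of the genome located in ROHs."""
    return total_length(rohs) / genome_length


def error_rates(true_rohs: List[Interval], called_rohs: List[Interval],
                genome_length: int) -> Tuple[float, float]:
    """Per-base (false positive rate, false negative rate).

    Assumed denominators: truly non-ROH bases for false positives,
    true ROH bases for false negatives.
    """
    true_len = total_length(true_rohs)
    shared = overlap_length(true_rohs, called_rohs)
    non_roh = genome_length - true_len
    fp = (total_length(called_rohs) - shared) / non_roh if non_roh else 0.0
    fn = (true_len - shared) / true_len if true_len else 0.0
    return fp, fn


def true_per_called(true_rohs: List[Interval],
                    called_rohs: List[Interval]) -> List[int]:
    """For each called ROH, count the true ROHs it overlaps (lumping)."""
    return [sum(1 for ts, te in true_rohs if ts < ce and cs < te)
            for cs, ce in called_rohs]


# Toy example: two true ROHs are lumped into a single called ROH.
true_rohs = [(0, 100), (200, 300)]
called_rohs = [(0, 300)]
genome_length = 1000
print(froh(true_rohs, genome_length), froh(called_rohs, genome_length))
print(error_rates(true_rohs, called_rohs, genome_length))
print(true_per_called(true_rohs, called_rohs))
```

In this toy case the called FROH exceeds the true FROH because the gap between the two true ROHs is counted as homozygous, which is the lumping behavior the abstract warns can mimic recent inbreeding.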


(A) Global occurrences of thiamine deficiency complex (TDC) across five taxa, with the number of unique species affected per taxon provided in parentheses in the figure key. Each point represents one species-level occurrence per watershed (i.e., a single point may indicate multiple affected populations within the watershed). Point shape indicates whether the data supporting the detection of TDC (i) have been published as part of a peer-reviewed article or (ii) are studies in progress. Reference information for data points is available in Table A1. Map made with Natural Earth and the “rnaturalearth” package in R (v4.0.3) using the WGS84 geographic coordinate system (South 2017; R Core Team 2024). (B) Results of an ISI Web of Science Search (Clarivate PLC) for articles on thiamine deficiency from 1985 to 2023 using the search methodology Bernhardt et al. (2010) used in their review of nanoparticles. First, we searched all fields for “thiamin* deficien*” OR “early mortality syndrome” OR “M74” (TDC, n = 2489) to estimate the number of papers published to-date on thiamine deficiency. Next, we searched for publications addressing fish by adding “fish OR teleost OR salmon*” (TDC + fish; n = 205). Finally, to find papers that addressed evolutionary considerations, we next added “adapt* OR genetic* OR evolution*” (TDC + fish + evolution; n = 25). We examined all returns on the TDC + fish + evolution step to include only those papers that specifically mention genetics or adaptation to accurately reflect the (few) papers addressing evolution to low thiamine in fishes. Search conducted on 30 August 2024.
Overview of thiamine treatment considerations and options, given various population characteristics. While certainly not exhaustive, this diagram provides a summary of potential evolutionary outcomes of different treatment decisions, including the decision not to treat. We view this as a starting point and acknowledge many of these recommendations can only be easily applied to the hatchery-supported portion of populations. Additionally, determining where a population falls on the spectrum from small to large population size should include consideration of the population’s specific demographic and evolutionary history.
Evolutionary perspectives on thiamine supplementation of managed Pacific salmonid populations

September 2024 · 21 Reads

Thiamine deficiency complex (TDC) has been identified in an ever-expanding list of species and populations. In many documented occurrences of TDC in fishes, juvenile mortality can be high—up to 90% at the population level. Such sweeping demographic losses and concomitant decreases in genetic diversity due to TDC can be prevented by treating pre-spawn females or fertilized eggs with supplemental thiamine. However, some fisheries managers are hesitant to widely apply thiamine treatments due to the potential for unforeseen evolutionary consequences. With these concerns in mind, we first review the existing data regarding genetic adaptation to low-thiamine conditions and provide perspectives on evolution-informed treatment strategies with specific population examples. We also provide practical treatment information, consider the potential logistical constraints of thiamine supplementation, and explore the consequences of deciding against supplementation. Until new evidence bolsters or refutes the genetic adaptation hypothesis, we suggest that TDC mitigation strategies should be designed to support maximum population genetic diversity through thiamine supplementation.


Allopolyploidy expanded gene content but not pangenomic variation in the hexaploid oilseed Camelina sativa

August 2024 · 196 Reads

Ancient whole-genome duplications (WGDs) are believed to facilitate novelty and adaptation by providing the raw fuel for new genes. However, it is unclear how recent WGDs may contribute to evolvability within recent polyploids. Hybridization accompanying some WGDs may combine divergent gene content among diploid species. Some theory and evidence suggest that polyploids have a greater accumulation and tolerance of gene presence-absence and genomic structural variation, but it is unclear to what extent either is true. To test how recent polyploidy may influence pangenomic variation, we sequenced, assembled, and annotated twelve complete, chromosome-scale genomes of Camelina sativa, an allohexaploid biofuel crop with three distinct subgenomes. Using pangenomic comparative analyses, we characterized gene presence-absence and genomic structural variation both within and between the subgenomes. We found over 75% of ortholog gene clusters are core in Camelina sativa and <10% of sequence space was affected by genomic structural rearrangements. In contrast, 19% of gene clusters were unique to one subgenome, and the majority of these were Camelina-specific (no ortholog in Arabidopsis). We identified an inversion that may contribute to vernalization requirements in winter-type Camelina, and an enrichment of Camelina-specific genes with enzymatic processes related to seed oil quality and Camelina’s unique glucosinolate profile. Genes related to these traits exhibited little presence-absence variation. Our results reveal minimal pangenomic variation in this species, and instead show how hybridization accompanied by WGD may benefit polyploids by merging diverged gene content of different species.


Evolutionary perspectives on thiamine supplementation of managed Pacific salmonid populations

February 2024 · 21 Reads

Thiamine deficiency complex (TDC) in fishes has been identified in an ever-expanding list of species and populations. In many documented occurrences of TDC in fishes, rates of juvenile mortality have reached 90% at the population level, with many females producing no surviving offspring. Such sweeping demographic losses and concomitant decreases in genetic diversity due to TDC can be prevented by treating pre-spawn females or fertilized eggs with supplemental thiamine. However, some fisheries managers are hesitant to widely apply thiamine treatments due to the potential for unforeseen evolutionary consequences. Specifically, these hesitations are due in part to apprehension that thiamine supplementation may impede genetic adaptation to low-thiamine conditions or may give hatchery fish an advantage over wild-origin fish. With these concerns in mind, we first review the existing data regarding genetic adaptation to low-thiamine conditions and provide perspectives on evolution-informed treatment strategies with specific population examples. We also provide practical treatment information, consider the potential logistical constraints of thiamine supplementation, and explore the consequences of deciding against supplementation. Until new evidence bolsters or refutes the genetic adaptation hypothesis, we suggest that TDC mitigation strategies should be designed to support maximum population genetic diversity through thiamine supplementation. Furthermore, we offer guidelines on when the adaptation strategy may be applicable to certain populations.



(a) Map of the area surrounding the study site, which is situated in Arizona near the New Mexico and Mexico borders. The site is located just southeast of the Chiricahua Mountains. (b) Map of the study site with all mounds included in this study marked with points. The mounds are located on primarily flat areas surrounding a cinder cone.
Schematic showing temporal alignments between the predictor and response variables tested in the study. For example: annual means used to predict the number of offspring produced in year t were calculated from environmental data collected from July, year t − 1 through June, year t, whereas winter rainy season means were calculated from data collected from December, year t − 1 through March, year t. Although not indicated in this figure, PRISM data were only used as predictor variables for population fitness and number of active mounds (i.e., not for measures of individual fitness). Summer and winter rainy season results are indicated by “S” and “W,” respectively.
Mean values of Tasseled Cap indices (a–c) and surface temperature (d) across days of the year. Means were calculated using all cells that were occupied in at least 1 year over the course of the study plus all cells directly adjacent to those occupied cells. Note that the x‐axes are offset such that the axis begins with July 1 and ends with June 30. White lines connect dates from July 1 in each year through June 30 in the subsequent year. Vertices for shaded polygons encompass one standard deviation around each mean. The points to the right of the dashed line indicate annual and rainy season mean values within each year.
Schematic summarizing the statistically significant relationships identified between environmental variables and fitness or population size. “S” and “W” indicate summer and winter rainy season results, respectively. The sign in each colored polygon indicates the direction of the relationship (i.e., the only negative relationship identified was between mean annual surface temperature and number of active mounds). Polygon color indicates environmental predictor variable with outline pattern indicating the scale at which variables were tested (i.e., individual and population fitness and population size).
(a–c) Significant positive relationships between surface temperature and measures of individual fitness. Panels a and b present the effects of mean annual and mean winter rainy season surface temperatures, respectively, on number of offspring produced by individual females while setting the non‐focal predictor variable in each negative binomial model equal to its mean value. Panel c presents the final negative binomial model predicting number of surviving offspring with mean annual surface temperature. (d) Linear regression describing negative effect of mean annual surface temperature on population size, as measured by number of active mounds. For all panels, shaded polygons represent 95% confidence intervals. Statistical results for models are presented in Tables 1–4 and Table A2.
Remotely sensed environmental measurements detect decoupled processes driving population dynamics at contrasting scales

August 2023 · 170 Reads

The increasing availability of satellite imagery has supported a rapid expansion in forward‐looking studies seeking to track and predict how climate change will influence wild population dynamics. However, these data can also be used in retrospect to provide additional context for historical data in the absence of contemporaneous environmental measurements. We used 167 Landsat‐5 Thematic Mapper (TM) images spanning 13 years to identify environmental drivers of fitness and population size in a well‐characterized population of banner‐tailed kangaroo rats (Dipodomys spectabilis) in the southwestern United States. We found evidence of two decoupled processes that may be driving population dynamics in opposing directions over distinct time frames. Specifically, increasing mean surface temperature corresponded to increased individual fitness, where fitness is defined as the number of offspring produced by a single individual. This result contrasts with our findings for population size, where increasing surface temperature led to decreased numbers of active mounds. These relationships between surface temperature and (i) individual fitness and (ii) population size would not have been identified in the absence of remotely sensed data, indicating that such information can be used to test existing hypotheses and generate new ecological predictions regarding fitness at multiple spatial scales and degrees of sampling effort. To our knowledge, this study is the first to directly link remotely sensed environmental data to individual fitness in a nearly exhaustively sampled population, opening a new avenue for incorporating remote sensing data into eco‐evolutionary studies.



Detectability of runs of homozygosity is influenced by analysis parameters as well as population-specific demographic history

September 2022 · 93 Reads · 3 Citations

Wild populations are increasingly threatened by human-mediated climate change and land use changes. As populations decline, the probability of inbreeding increases, along with the potential for negative effects on individual fitness. Detecting and characterizing runs of homozygosity (ROHs) is a popular strategy for assessing the extent of individual inbreeding present in a population and can also shed light on the genetic mechanisms contributing to inbreeding depression. However, selecting an appropriate program and parameter values for such analyses is often difficult for species of conservation concern, for which little is often known about population demographic histories or few high-quality genomic resources are available. Herein, we analyze simulated and empirical data sets to demonstrate the downstream effects of program selection on ROH inference. We also apply a sensitivity analysis to evaluate the effects of various parameter values on ROH-calling results and demonstrate its utility for parameter value selection. We show that ROH inferences can be biased when sequencing depth and the distribution of ROH length are not interpreted in light of program-specific tendencies. This is particularly important for the management of endangered species, as some program and parameter combinations consistently underestimate inbreeding signals in the genome, substantially undermining conservation initiatives. Based on our conclusions, we suggest using a combination of ROH detection tools and ROH length-specific inferences to generate robust population inferences regarding inbreeding history. We outline these recommendations for ROH estimation at multiple levels of sequencing effort typical of conservation genomics studies.


Citations (11)


... A genomic assessment within the Camelineae tribe found that the absence of indole glucosinolates was driven by the loss of a MYB34 homolog that in other Brassicaceae is necessary to express indole glucosinolate genes [20]. Corresponding to this transcription factor loss, Camelina sativa is losing CYP83B1 genes responsible for making indole glucosinolates and has lost the CYP81Fs responsible for hydroxylating indole glucosinolates [18]. The Camelineae tribe has also lost the ability to accumulate significant levels of short-chain methionine-derived glucosinolates and this is accompanied by a loss of all the enzymes that modify the short-chain methionine glucosinolates [18]. ...

Reference:

Phylogenetic and genomic mechanisms shaping glucosinolate innovation
Allopolyploidy expanded gene content but not pangenomic variation in the hexaploid oilseed Camelina sativa
  • Citing Article
  • November 2024

Genetics

... PLINK [7] scans chromosomes for consecutive homozygous genotypes by sliding a fixed-size window of detection, and an ROH is called if the count of consecutive homozygous SNPs satisfies the predefined condition. However, its algorithms were initially designed for SNP genotyping array data, and hence, it is necessary to adjust some of its parameters when applied to other data types [3,8]. Alternatively, several model-based programs, including Beagle, H3M2, and BCFtools, which are employing hidden Markov models (HMM), can identify potential sequences of homozygosity [9][10][11]. ...

Detectability of runs of homozygosity is influenced by analysis parameters and population-specific demographic history

... Patterns of ROH can be further confounded by method and data. Silva et al. (2024) used simulations to show that the accuracy of ROH detection varied across populations with different demographic histories; specifically, declining populations exhibited the highest error in FROH estimates relative to the known value of F. Hewett et al. (2023) also showed that demographic impacts on ROH distributions are stronger than on recombination and selection. Direct estimates of TMCRA (Table 1) can be imprecise because of the stochasticity of recombination and Mendelian segregation (Thompson 2013; Kardos et al. 2016) and may be biased in some cases because of apparent long ROH arising via the conflation of multiple short adjacent IBD segments (Chiang, Ralph, and Novembre 2016). ...

Detectability of runs of homozygosity is influenced by analysis parameters as well as population-specific demographic history

... The wide distribution of epigenetic regulatory mechanisms in plant development and a high plasticity of the epigenome compared to the genome and its sufficient stability to transmit adaptive changes in generations is assumed to be the main source of phenotypic plasticity [5,6]. Epigenetic diversity in plant populations under unfavorable environment fluctuations sharply increases on an almost unchanged genetic background, that allows to say about ecological epigenetics (eco-epi) [7]. DNA cytosine methylation is regarded as a fundamental epigenetic mechanism of phenotypic variations [8][9][10]. ...

Epigenetics in Ecology, Evolution, and Conservation

... Previous studies have suggested that immune regulation may play a pivotal role in the migratory behaviors of fish [11,12]. The Pacific saury is renowned for its extensive, seasonal, and wide-range migrations [13]. ...

Genomic signatures of adaptation to novel environments: hatchery and life history-associated loci in landlocked and anadromous Atlantic salmon (Salmo salar)

... Finally, three preliminary assemblies, including one monoploid assembly and two haploid assemblies, were yielded, which spanned 375.62 Mb (monoploid), 373.25 Mb (Haploid-1) and 372.15 Mb (Haploid-2), with a contig N50 length of 11.55 Mb, 4.86 Mb and 4.87 Mb, respectively (Table 2). The genome assembly was slightly larger than the estimated genome size of 369.48 Mb (Table 1) because some repeat fragments could be assembled by high-precision CCS reads [13]. Juicer [14] and 3D-DNA [15] were implemented to obtain the (Table 4). ...

High-Quality Reference Genome for an Arid-Adapted Mammal, the Banner-Tailed Kangaroo Rat (Dipodomys spectabilis)

Genome Biology and Evolution

... Despite broad scientific consensus on the urgency of conserving intraspecific genetic diversity, and hence adaptive potential (DeWoody et al. 2021), conservation strategies often provide vague guidance on how to translate this goal into practical action, particularly at national and subnational levels (Pierson et al. 2016; Cook and Sgrò 2017; Laikre et al. 2020). Hoban et al. (2020) proposed three pragmatic indicators for genetic monitoring, based on the number of populations with an effective population size above the standard (but controversial) threshold of 500, the proportion of populations conserved for each species and, finally, the number of populations in which genetic diversity is assessed (and tracked) using molecular markers. ...

The long-standing significance of genetic diversity in conservation

Molecular Ecology

... While local adaptation on chromosomal rearrangements and large inversions (Barth et al., 2017; Meyer et al., 2024; Schaal et al., 2022; Wilder et al., 2020) or hitchhiking caused by hard selective sweeps (Bierne, 2010; Haenel et al., 2019) can generate clear peaks of differentiation and signals of adaptation in high gene flow systems, selection on standing genetic variation will produce much smaller peaks of differentiation that can be difficult to identify (Hermisson & Pennings, 2005). Nevertheless, selection at putatively adaptive loci can manifest as downstream changes in mRNA expression profiles or sequences, such that RNA sequencing (RNA-Seq) offers an opportunity to identify adaptive genetic variation (McGuigan et al., 2011; Paaby & Rockman, 2014; Thorstensen et al., 2021; Yin et al., 2021). Furthermore, coupling RNA-Seq with a common garden breeding design that minimizes the effects of environmental conditions on gene expression can improve our ability to identify genes affected by selection on regulatory elements (Christie et al., 2016; Gil & Ulitsky, 2020; Harder et al., 2020; Signor & Nuzhdin, 2018). ...

Incipient resistance to an effective pesticide results from genetic adaptation and the canalization of gene expression

... Hydroelectric dams create migratory challenges for both upstream migrating adults and smolts migrating downstream (Nyqvist et al., 2017a, 2017b). The introduction of Alewife Alosa pseudoharengus in 2003, a prey fish high in the enzyme thiaminase, has led to a severe thiamine deficiency complex (TDC) that limits the survival of the offspring of afflicted adults (Harder et al., 2018, 2020). To address the thiamine deficiency issue, local managers have recently developed an experimental broodstock that is putatively resistant to TDC (Harder et al., 2020) and implemented a genetic marking program (e.g., Steele et al., 2019) to mark different experimental release groups. ...

Among family variation in survival and gene expression uncovers adaptive genetic variation in a threatened fish
  • Citing Article
  • December 2019

Molecular Ecology

... Hydroelectric dams create migratory challenges for both upstream migrating adults and smolts migrating downstream (Nyqvist et al., 2017a, 2017b). The introduction of Alewife Alosa pseudoharengus in 2003, a prey fish high in the enzyme thiaminase, has led to a severe thiamine deficiency complex (TDC) that limits the survival of the offspring of afflicted adults (Harder et al., 2018, 2020). To address the thiamine deficiency issue, local managers have recently developed an experimental broodstock that is putatively resistant to TDC (Harder et al., 2020) and implemented a genetic marking program (e.g., Steele et al., 2019) to mark different experimental release groups. ...

Thiamine deficiency in fishes: causes, consequences, and potential solutions

Reviews in Fish Biology and Fisheries