Arlin Stoltzfus
University of Maryland, College Park (UMD, UMCP) · Institute for Bioscience and Biotechnology Research
Ph.D.
Working collaboratively on the role of mutation in evolution
About
92 Publications · 18,209 Reads · 3,113 Citations
Introduction
I do collaborative computer-based work, both empirical and theoretical, on various topics in molecular evolution, bioinformatics, epidemiology, and evolutionary theory. My main interest is the role of mutation in evolution, particularly the role of mutation as a dispositional factor that makes some changes more likely than others. My recent book "Mutation, Randomness and Evolution" (https://books.google.com/books/about/Mutation_Randomness_and_Evolution.htm) provides an introduction to the topic.
Publications (92)
The effect of replacing the amino acid at a given site in a protein is difficult to predict. Yet, evolutionary comparisons have revealed highly regular patterns of interchangeability between pairs of amino acids, and such patterns have proved enormously useful in a range of applications in bioinformatics, evolutionary inference, and protein design....
We evaluate approaches to vaccine distribution using an agent-based model of human activity and COVID-19 transmission calibrated to detailed trends in cases, hospitalizations, deaths, seroprevalence, and vaccine breakthrough infections in Florida, USA. We compare the incremental effectiveness for four different distribution strategies at four diffe...
The onset of the COVID-19 pandemic drove a widespread, often uncoordinated effort by research groups to develop mathematical models of SARS-CoV-2 to study its spread and inform control efforts. The urgent demand for insight at the outset of the pandemic meant early models were typically either simple or repurposed from existing research agendas. Ou...
The joint distribution of selection coefficients and mutation rates is a key determinant of the genetic architecture of molecular adaptation. Three different distributions are of immediate interest: (1) the nominal distribution of possible changes, prior to mutation or selection, (2) the de novo distribution of realized mutations, and (3) the fixed...
Policymakers must make management decisions despite incomplete knowledge and conflicting model projections. Little guidance exists for the rapid, representative, and unbiased collection of policy-relevant scientific input from independent modeling teams. Integrating approaches from decision analysis, expert judgment, and model aggregation, we conve...
Predicting evolutionary outcomes is an important research goal in a diversity of contexts. The focus of evolutionary forecasting is usually on adaptive processes, and efforts to improve prediction typically focus on selection. However, adaptive processes often rely on new mutations, which can be strongly influenced by predictable biases in mutation...
The idea that adaptive change is subject to biases in variation by a "first come, first served" dynamic is not part of classic evolutionary reasoning. Yet, predictable effects of biases in the introduction of variation have been reported in models of population genetics, in laboratory evolution, and in retrospective analyses of natural adaptation....
Significance
How do mutational biases influence the process of adaptation? A common assumption is that selection alone determines the course of adaptation from abundant preexisting variation. Yet, theoretical work shows broad conditions under which the mutation rate to a given type of variant strongly influences its probability of contributing to a...
Evolutionary adaptation often occurs via the fixation of beneficial point mutations, but different types of mutation may differ in their relative frequencies within the collection of substitutions contributing to adaptation in any given species. Recent studies have established that this spectrum of adaptive substitutions is enriched for classes of...
Well-studied cases of programmed DNA rearrangements, e.g., somatic recombination in the emergence of specific antibodies, suggest a rubric for specially evolved mutation systems: they amplify the rates of specific types of mutations (by orders of magnitude), subject to specific modulation, using dedicated parts, with the favored types of mutations...
Mutation, Randomness, and Evolution presents a new understanding of how the course of evolution may reflect biases in variation and unites key concerns of molecular and microbial evolution, evo-devo, evolvability, and self-organization by placing these concerns on a solid theoretical and empirical foundation. It situates them within a broader movem...
Under the neo-Darwinian theory, selection is the potter and variation is the clay: peculiarities or regularities of variation may emerge from internal causes, but these are ultimately irrelevant, because selection governs the outcome of evolution. Chapter 6 addresses this sense of “randomness” as irrelevance or unimportance, featuring (1) an analog...
Chapter 9 presents an empirical case for the importance of mutational biases, based on studies of adaptation traced to the molecular level. Where Chapter 8 identified a variational cause of bias that does not depend on neutral evolution, absolute constraints, or high mutation rates, this chapter focuses on how quantitative biases in ordinary nucleo...
Chapter 10 includes a synopsis of key points from previous chapters as well as reflections on changing explananda, notions of causation, and the importance of identifying testable theories. The ongoing delay in recognizing the introduction process as a dispositional evolutionary cause reflects the lasting influence of the shifting-gene-frequencies...
Chapter 2 addresses how well the biological process of mutation is described by some of the ordinary meanings of “chance” or “randomness” in science: lack of purpose or foresight, uniformity (homogeneity), stochasticity, indeterminacy, unpredictability, spontaneity, and independence (chance). Ordinary mutations exhibit various kinds of heterogeneit...
Chapter 1 begins with a synopsis of the central argument concerning models of evolution (and theories of causation) that incorporate a mutational introduction process, using a study of laboratory adaptation that shows proportional effects of a 50-fold range of rates for different mutations. The exploration of the role of variation in this book cove...
Chapter 3 addresses the idea of randomness as a simplifying assumption, beginning with a discussion (using examples from phylogenetics) of the reasons that scientists employ simplifying assumptions that are known to be incorrect. That is, some ways of thinking about mutation may be useful, even if they are only approximately correct. Approximations...
Contemporary defenses of the randomness doctrine refer not to ordinary meanings of randomness, but to a special evolutionary meaning by which mutation is said to be independent of, or uncorrelated with, something like environment, selection, evolution, or adaptation. Chapter 4 addresses whether this type of claim is justified, either empirically or...
Chapter 7 maps out a broad framework for considering the problem of variation in evolution. Under the neo-Darwinian view that variation merely plays the role of supplying random infinitesimal raw materials, with no dispositional influence on the course of evolution, a substantive theory of form and its variation is not required to specify a complet...
Chapter 8 provides the formal basis to recognize biases in the introduction of variation as a cause of evolutionary biases. The shifting-gene-frequencies theory of the Modern Synthesis posits a “buffet” view in which evolution is merely a process of shifting the frequencies of pre-existing alleles, without new mutations. Within this theory, mutatio...
Policymakers make decisions about COVID-19 management in the face of considerable uncertainty. We convened multiple modeling teams to evaluate reopening strategies for a mid-sized county in the United States, in a novel process designed to fully express scientific uncertainty while reducing linguistic uncertainty and cognitive biases. For the scena...
A comprehensive phylogeny of species, i.e., a tree of life, has potential uses in a variety of contexts, including research, education, and public policy. Yet, accessing the tree of life typically requires special knowledge, complex software, or long periods of training. The Phylotastic project aims to make it as easy to get a phylogeny of species as...
The battle between microbes and their viruses is ancient and ongoing. Clustered regularly interspaced short palindromic repeat (CRISPR) immunity, the first and, to date, only form of adaptive immunity found in prokaryotes, represents a flexible mechanism to recall past infections while also adapting to a changing pathogenic environment. Critical to...
An underexplored question in evolutionary genetics concerns the extent to which mutational bias in the production of genetic variation influences outcomes and pathways of adaptive molecular evolution. In the genomes of at least some vertebrate taxa, an important form of mutation bias involves changes at CpG dinucleotides: if the DNA nucleotide cyto...
(1) A comprehensive phylogeny of species, i.e., a tree of life, has potential uses in a variety of contexts in research and education. This potential is limited if accessing the tree of life requires special knowledge, complex software, or long periods of training.
(2) The Phylotastic project aims to use web-services technologies to lower the barri...
Our understanding of evolution is shaped strongly by how we conceive of its fundamental causes. In the original Modern Synthesis, evolution was defined as a process of shifting the frequencies of available alleles at many loci affecting a trait under selection. Events of mutation that introduce novelty were not considered evolutionary causes, but p...
High-level debates in evolutionary biology often treat the Modern Synthesis as a framework of population genetics, or as an intellectual lineage with a changing distribution of beliefs. Unfortunately, these flexible notions, used to negotiate decades of innovations, are now thoroughly detached from their historical roots in the original Modern Synt...
While mutational biases strongly influence neutral molecular evolution, the role of mutational biases in shaping the course of adaptation is less clear. Here we consider the frequency of transitions relative to transversions among adaptive substitutions. Because mutation rates for transitions are higher than those for transversions, if mutational b...
In recent years, there has been an explosion in the popularity of hackathons — creative, participant-driven meetings at which software developers gather for an intensive bout of programming, often organized in teams. Hackathons have tangible and intangible outcomes, such as code, excitement, learning, networking, and so on, whose relative merits ar...
Time-bounded collaborative events in which teams work together under intense time pressure are becoming increasingly popular. While hackathons, that is, competitive overnight coding events, are one of the more prevalent examples of this phenomenon, there are many more distinct event design variations for different audiences and with divergent aims,...
A pattern in which nucleotide transitions are favored several-fold over transversions is common in molecular evolution. When this pattern occurs among amino acid replacements, explanations often invoke an effect of selection, on the grounds that transitions are more conservative in their effects on proteins. However, the underlying hypothesis of co...
Background
Studies of diversification and trait evolution increasingly rely on combining molecular sequences and fossil dates to infer time-calibrated phylogenetic trees. Available calibration software provides many options for the shape of the prior probability distribution of ages at a node to be calibrated, but the question of how to assign a Ba...
Many models of evolution calculate the rate of evolution by multiplying the rate at which new mutations originate within a population by a probability of fixation. Here we review the historical origins, contemporary applications, and evolutionary implications of these "origin-fixation" models, which are widely used in evolutionary genetics, molecul...
According to a classical narrative, early geneticists, failing to see how Mendelism provides the missing pieces of Darwin's theory, rejected gradual changes and advocated an implausible yet briefly popular view of evolution-by-mutation; after decades of delay (in which synthesis was prevented by personal conflicts, disciplinary rivalries, and anti-...
Background
Scientists rarely reuse expert knowledge of phylogeny, in spite of years of effort to assemble a great “Tree of Life” (ToL). A notable exception involves the use of Phylomatic, which provides tools to generate custom phylogenies from a large, pre-computed, expert phylogeny of plant taxa. This suggests great potential for a more generaliz...
Recently, various evolution-related journals adopted policies to encourage or require archiving of phylogenetic trees and associated data. Such attention to practices that promote sharing of data reflects rapidly improving information technology, and rapidly expanding potential to use this technology to aggregate and link data from previously publi...
Constructive neutral evolution (CNE) suggests that neutral evolution may follow a stepwise path to extravagance. Whether or not CNE is common, the mere possibility raises provocative questions about causation: in classical neo-Darwinian thinking, selection is the sole source of creativity and direction, the only force that can cause trends or build...
This Perl script will compute a graph showing the inversion paths between all (scrambled and unscrambled) configurations of a segmented gene (for help, type " ./scramble_space.pl – help").
The origin and evolution of "ORFans" (suspected genes without known relatives) remain unclear. Here we take advantage of a unique opportunity to examine the population diversity of thousands of ORFans, based on a collection of 35 complete genomes of isolates of E. coli and Shigella (which is included phylogenetically within E. coli). As expected fr...
Phylogenetic analyses can resolve historical relationships among genes, organisms or higher taxa. Understanding such relationships can elucidate a wide range of biological phenomena including the role of adaptation as a driver of diversification, the importance of gene and genome duplications in the evolution of gene function, or the evolutionary cons...
In scientific research, integration and synthesis require a common understanding of where data come from, how much they can be trusted, and what they may be used for. To make such an understanding computer-accessible requires standards for exchanging richly annotated data. The challenges of conveying reusable data are particularly acute in regard t...
Publishing re-usable phylogenetic trees, in theory and practice
Black cohosh (Actaea racemosa L., syn. Cimicifuga racemosa, Nutt., Ranunculaceae) is a popular herb used for relieving menopausal discomforts. A variety of secondary metabolites, including triterpenoids, phenolic dimers, and serotonin derivatives have been associated with its biological activity, but the genes and metabolic pathways as well as the...
Interoperability is the property that allows systems to work together independent of who created them, or how or for what purpose they were implemented. It is crucial for aggregating data from different online resources and for integrating different kinds of data. Interoperability is based on effective standards that become and remain broadly adopt...
In this chapter we describe the development of a new biomedical ontology in the context of the modern knowledge representation research field. We also present the modeled concepts and their relevance in the light of the history of evolutionary biology. CDAO stands for “Comparative Data Analysis Ontology” and allows the representation of data produc...
Comparative analysis is used throughout biology. When entities under comparison (e.g. proteins, genomes, species) are related by descent, evolutionary theory provides a framework that, in principle, allows N-ary comparisons of entities, while controlling for non-independence due to relatedness. Powerful software tools exist for specialized applicat...
Mutational biases refer to systematic asymmetries or nonuniformities in the occurrence of mutations, heritable changes that take place in an individual organism. Mutational biases arise by asymmetries in damage, repair and replication of the genetic material. Many such biases are known: familiar examples include transition:transversion bias, the Cp...
Spontaneous copying errors in replication often are assumed to be the main source of germline mutations in humans and other mammals. However, when laboratory data on context-dependent patterns of oxidative DNA damage are compared with patterns of mutation inferred from mammalian sequence evolution, the strength of the correlation suggests that dama...
In December, 2006, a group of 26 software developers from some of the most widely used life science programming toolkits and phylogenetic software projects converged on Durham, North Carolina, for a Phyloinformatics Hackathon, an intense five-day collaborative software coding event sponsored by the National Evolutionary Synthesis Center (NESCent)....
Since the genetic code first was determined, many have claimed that it is organized adaptively, so as to assign similar codons to similar amino acids. This claim has proved difficult to establish due to the absence of relevant comparative data on alternative primordial codes and of objective measures of amino acid exchangeability. Here we use a rec...
Claims of intron-structure correlations have played a major role in debates surrounding split gene origins. In the formative (as opposed to disruptive or "insertional") model of split gene origins, introns represent the scars of chimaeric gene assembly. When analyzed retrospectively, formative introns should tend to fall between modular units, if s...
figure1.nex. The NEXUS file corresponding to Figure 1.
example.nex. A simple NEXUS file used in the tutorial examples.
Evolutionary analysis provides a formal framework for comparative analysis of genomic and other data. In evolutionary analysis, observed data are treated as the terminal states of characters that have evolved (via transitions between states) along the branches of a tree. The NEXUS standard of Maddison, et al. (1997; Syst. Biol. 46: 590-621) provide...
Evolutionary trends responsible for systematic differences in genome and proteome composition have been attributed to GC:AT mutation bias in the context of neutral evolution or to selection acting on genome composition. A possibility that has been ignored, presumably because it is part of neither the Modern Synthesis nor the Neutral Theory, is that...
The rediscovery of Mendel's laws a century ago launched the science that William Bateson called "genetics," and led to a new view of evolution combining selection, particulate inheritance, and the newly characterized phenomenon of "mutation." This "mutationist" view clashed with the earlier view of Darwin, and the later "Modern Synthesis," by allow...
Nexplorer is a web-based program for interactive browsing and manipulation of character data in NEXUS format, well suited for use with alignments and trees representing families of homologous genes or proteins. Users may upload a sequence family dataset, or choose from one of several thousand already available. Nexplorer provides a flexible means t...
The comparative analysis of protein sequences depends crucially on measures of amino acid similarity or distance. Many such measures exist, yet it is not known how well these measures reflect the operational exchangeability of amino acids in proteins, since most are derived by methods that confound a variety of effects, including effects of mutatio...
Determining the relative contributions of mutation and selection to evolutionary change is a matter of great practical and theoretical significance. In this paper, we examine relative contributions of codon mutation rates and amino acid exchangeability on the frequencies of each type of amino acid difference in alignments of distantly related prote...
Theories regarding the evolution of spliceosomal introns differ in the extent to which the distribution of introns reflects either a formative role in the evolution of protein-coding genes or the adventitious gain of genetic elements. Here, systematic methods are used to assess the causes of the present-day distribution of introns in 10 families of...
The evolutionary origin of spliceosomal introns remains elusive. The startling success of a new way of predicting intron sites suggests that the splicing machinery determines where introns are added to genes.
According to New Synthesis doctrine, the direction of evolution is determined by selection and not by "internal causes" that act by way of propensities of variation. This doctrine rests on the theoretical claim that because mutation rates are small in comparison to selection coefficients, mutation is powerless to overcome opposing selection. Using...
The neutral theory often is presented as a theory of "noise" or silent changes at an isolated "molecular level," relevant to marking the steady pace of divergence, but not to the origin of biological structure, function, or complexity. Nevertheless, precisely these issues can be addressed in neutral models, such as those elaborated here with regard...
The 'introns-late' theory holds that spliceosomal introns have been added to genes during eukaryotic evolution. Few clear examples of recent intron gains have been well documented, but two such cases have now been reported, one with possible identification of the source of the intron.
Alignments of homologous genes typically reveal a great diversity of intron locations, far more than could fit comfortably in a single gene. Thus, a minority of these intron positions could be inherited from a single ancestral gene, but the larger share must be attributed to subsequent events of intron gain or intron "sliding" (movement from one po...
According to the exon theory of genes, protein-coding genes evolved originally by combinatorial assembly of mini-gene precursors of modern exons. If so, then exons should tend to encode discrete bits of protein structure, as first suggested by C.C.F. Blake. In order to assess the evidence for Blake's conjecture, we have developed methods for evaluat...
A tendency for exons to correspond to discrete units of protein structure in protein-coding genes of ancient origin would provide clear evidence in favor of the exon theory of genes, which proposes that split genes arose not by insertion of introns into unsplit genes, but from combinations of primordial mini-genes (exons) separated by spacers (intr...