
Barbara R Holland - University of Tasmania
Publications: 166
Reads: 24,555
Citations: 3,865
Publications (166)
We consider a model for inferring functional links between genes. We begin with the simple case of two genes whose presence or absence evolves stochastically along a phylogenetic tree. We develop a hidden Markov model where the hidden states of the model correspond to whether or not the genes perform a joint function. In the case that two genes do...
We consider a subfunctionalisation model of gene family evolution. A family of n genes that perform z functions is represented by an n × z binary matrix Y_t, where a 1 in the (i, j)th position indicates that gene i can perform function j. Y_t evolves according to a continuous time Markov chain (CTMC) that represents the processes of gene duplication,...
The use of information criteria to distinguish between phylogenetic models has become ubiquitous within the field. However, the variety and complexity of available models is much greater now than when these practices were established. The literature shows an increasing trajectory of healthy scepticism with regard to the use of information theory-ba...
The underlying structure of the canonical amino acid substitution matrix (aaSM) is examined by considering stepwise improvements in the differential recognition of amino acids according to their chemical properties during the branching history of the two aminoacyl-tRNA synthetase (aaRS) superfamilies. The evolutionary expansion of the genetic code...
Comparative biological studies often investigate the morphological, physiological or ecological divergence (or overlap) between entities such as species or populations. Here we discuss the weaknesses of using existing methods to analyse patterns of phenotypic overlap and present a novel method to analyse co‐occurrence in multidimensional space. We...
The size of plant stomata (adjustable pores that determine the uptake of CO2 and loss of water from leaves) is considered to be evolutionarily important. This study uses fossils from the major Southern Hemisphere family Proteaceae to test whether stomatal cell size responded to Cenozoic climate change. We measured the length and abundance of guard...
The ancient catacombs of Egypt harbor millions of well-preserved mummified Sacred Ibis (Threskiornis aethiopicus) dating from ~600BC. Although it is known that a very large number of these ‘votive’ mummies were sacrificed to the Egyptian God Thoth, how the ancient Egyptians obtained millions of these birds for mummification remains unresolved. Anci...
A gene family is a set of evolutionarily related genes formed by duplication. Genes within a gene family can perform a range of different but possibly overlapping functions. The process of duplication produces a gene that has identical functions to the gene it was duplicated from with subsequent divergence over time. In this paper, we explore diffe...
Molecular sequence data that have evolved under the influence of heterotachous evolutionary processes are known to mislead phylogenetic inference. We introduce the General Heterogeneous evolution On a Single Topology (GHOST) model of sequence evolution, implemented under a maximum-likelihood framework in the phylogenetic program IQ-TREE (http://www...
The ancient catacombs of Egypt harbor millions of well-preserved mummified Sacred Ibis (Threskiornis aethiopicus) dating from ~600BC. Although it is known that a very large number of these votive mummies were sacrificed to the Egyptian God Thoth, how the ancient Egyptians obtained millions of these birds for mummification remains unresolved. Ancien...
Molecular phylogenetics plays a key role in comparative genomics and has increasingly significant impacts on science, industry, government, public health, and society. In this opinion paper, we posit that the current phylogenetic protocol is missing two critical steps, and that their absence allows model misspecification and confirmation bias to...
Principal components analysis (PCA) has been one of the most widely used exploration tools in genomic data analysis since its introduction in 1978 (Menozzi et al. 1978). PCA allows similarities between individuals to be efficiently calculated and visualized, optimally in two dimensions. While PCA is well suited to analyses concerned with autosomal...
The "Lie closure" of a set of matrices is the smallest matrix Lie algebra (a linear space of matrices closed under the operation $ [A, B] = AB-BA $) which contains the set. In the context of Markov chain theory, if a set of rate matrices form a Lie algebra, their corresponding Markov matrices are closed under matrix multiplication, which has been f...
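The definition of the Lie closure above can be illustrated with a small sketch. It is not the method of the paper: the function names (`bracket`, `lie_closure`) are hypothetical, and the approach (repeatedly adding commutators that fall outside the current linear span, using exact rational arithmetic) is simply one direct reading of "smallest linear space closed under $ [A, B] = AB-BA $".

```python
from fractions import Fraction

def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def bracket(a, b):
    """Commutator [A, B] = AB - BA."""
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] - ba[i][j] for j in range(n)] for i in range(n)]

def _reduce(vec, echelon):
    """Eliminate every stored pivot coordinate from vec, in insertion order."""
    vec = list(vec)
    for pivot, row in echelon:
        c = vec[pivot]
        if c != 0:
            vec = [v - c * w for v, w in zip(vec, row)]
    return vec

def lie_closure(generators):
    """Return a basis (list of matrices) of the Lie closure of the generators.

    Illustrative sketch only: assumes square matrices of equal size with
    integer or rational entries.
    """
    echelon = []   # (pivot index, normalised flattened vector) pairs
    basis = []     # linearly independent matrices found so far
    queue = [[[Fraction(x) for x in row] for row in g] for g in generators]
    while queue:
        m = queue.pop()
        flat = [x for row in m for x in row]
        r = _reduce(flat, echelon)
        pivot = next((i for i, v in enumerate(r) if v != 0), None)
        if pivot is None:
            continue  # m already lies in the current span
        echelon.append((pivot, [v / r[pivot] for v in r]))
        # Any new basis element must be bracketed with all earlier ones.
        queue.extend(bracket(m, e) for e in basis)
        basis.append(m)
    return basis

# Usage: two nilpotent 2 x 2 matrices generate the traceless matrices sl(2).
E = [[0, 1], [0, 0]]
F = [[0, 0], [1, 0]]
dim = len(lie_closure([E, F]))  # 3
```

Because the bracket is bilinear, checking closure on pairs of basis elements is enough, and the loop must stop since the dimension of the span cannot exceed the number of matrix entries.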
Recently there has been renewed interest in phylogenetic inference methods based on phylogenetic invariants, alongside the related Markov invariants. Broadly speaking, both these approaches give rise to polynomial functions of sequence site patterns that, in expectation value, either vanish for particular evolutionary trees (in the case of phylogen...
Background
Gene duplication has been identified as a key process driving functional change in many genomes. Several biological models exist for the evolution of a pair of duplicates after a duplication event, and it is believed that gene duplicates can evolve in different ways, according to one process, or a mix of processes. Subfunctionalization i...
Models of codon evolution are commonly used to identify positive selection. Positive selection is typically a heterogeneous process, i.e., it acts on some branches of the evolutionary tree and not others. Previous work on DNA models showed that when evolution occurs under a heterogeneous process it is important to consider the property of model clo...
We give a non-technical introduction to convergence-divergence models, a new modeling approach for phylogenetic data that allows for the usual divergence of species post speciation but also allows for species to converge, i.e. become more similar over time. By examining the $3$-taxon case in some detail we illustrate that phylogeneticists have been...
Molecular sequence data that have evolved under the influence of heterotachous evolutionary processes are known to mislead phylogenetic inference. We introduce the General Heterogeneous evolution On a Single Topology (GHOST) model of sequence evolution, implemented under a maximum-likelihood framework in the phylogenetic program IQ-TREE. Extensive...
Background and aims:
Investigating species distributions across geographic barriers is a commonly utilized method in biogeography to help understand the functional traits that allow plants to disperse successfully. Here the biogeographic pattern analysis approach is extended by using chloroplast DNA whole-genome 'mining' to examine the functional...
Widespread species spanning strong environmental (e.g., climatic) gradients frequently display morphological and physiological adaptations to local conditions. Some adaptations are common to different species that occupy similar environments. However, the genomic architecture underlying such convergent traits may not be the same between species. Us...
"Phylogenetics" is the systematic study of reconstructing the past evolutionary history of extant species or taxa, based on present-day data, such as morphologies or molecular information (sequence data). This evolutionary history or phylogeny is ideally represented as a binary tree. In the method of "phylogenetic invariants," a pivotal role is pla...
We applied three statistical classification techniques—linear discriminant analysis (LDA), logistic regression, and random forests—to three astronomical datasets associated with searches for interstellar masers. We compared the performance of these methods in identifying whether specific mid-infrared or millimetre continuum sources are likely to ha...
We introduce a gene tree simulator that is designed for use in conjunction with approximate Bayesian computation approaches. We show that it can be used to determine the relative importance of hybrid speciation and introgression compared to incomplete lineage sorting in producing patterns of incongruence across gene trees. Important features of the...
Wastewater-based epidemiology is increasingly being used as a tool to monitor drug use trends. To minimize costs, studies have typically monitored a small number of days. However, cycles of drug use may display weekly and seasonal trends that affect the accuracy of monthly or annual drug use estimates based on a limited number of samples. This stud...
Detecting loci under selection is an important task in evolutionary biology. In conservation genetics detecting selection is key to investigating adaptation to the spread of infectious disease. Loci under selection can be detected on a spatial scale, accounting for differences in demographic history among populations, or on a temporal scale, tracin...
SNPs under selection at each year detected with BAYESCAN assuming prior odds of 10.
(PDF)
Overview of SNPs under selection detected with demographic and time-series methods.
(PDF)
SNPs under selection at each year detected with BAYESCAN assuming prior odds of 100.
(PDF)
Estimates of genetic diversity for the Tasmanian devil based on 1482 SNPs.
(PDF)
SNPs under selection in individual Tasmanian devil populations detected with WFABC assuming a) a small (Ne = 50) and b) a large (Ne = 500) effective population size.
(PDF)
Accurate estimation of evolutionary distances between taxa is important for many phylogenetic reconstruction methods. Specifically, in the case of bacteria, distances can be estimated using a range of different evolutionary models, from single nucleotide polymorphisms to large-scale genome rearrangements. Most such methods use the minimal distance...
Campylobacter jejuni (C. jejuni) is an important gastrointestinal pathogen with multiple hosts. Wild birds are a source of Campylobacter spp., but little is known about its genomic characteristics and evolution within wild bird populations. Isolates from the Australian purple swamphen (Porphyrio porphyrio melanotus) were compared to isolates associ...
The yeast Candida albicans, a commensal colonizer and occasional pathogen of humans, has a rudimentary mating ability. However, mating is a cumbersome process that has never been observed outside the laboratory, and the population structure of the species is predominantly clonal. Here we discuss recent findings that indicate that mating ability is...
The yeast Candida albicans can mate. However, in the natural environment mating may generate progeny (fusants) fitter than clonal lineages too rarely to render mating biologically significant: C. albicans has never been observed to mate in its natural environment, the human host, and the population structure of the species is largely clonal. It see...
We assess phylogenetic patterns of hybridization in the speciose, ecologically and economically important genus Eucalyptus, in order to better understand the evolution of reproductive isolation. Eucalyptus globulus pollen was applied to 99 eucalypt species, mainly from the large commercially important subgenus, Symphyomyrtus. In the 64 species that...
The Tasmanian devil (Sarcophilus harrisii) was widespread in Australia during the Late Pleistocene but is now endemic to the island of Tasmania. Low genetic diversity combined with the spread of devil facial tumour disease have raised concerns for the species’ long-term survival. Here, we investigate the origin of low genetic diversity by inferring...
A number of studies have suggested using comparisons between DNA sequences of closely related bacterial isolates to estimate the relative rate of recombination to mutation for that bacterial species. We consider such an approach which uses single locus variants: pairs of isolates whose DNA differ at a single gene locus. One way of deriving point es...
Seasonally reproducing animals show many behavioural and physiological changes during the mating period, including increased signalling for mate attraction. Mammals often rely on chemical signals for communication and coordination of mating and other social behaviours, but our understanding of the subtleties and functions of mammalian signalling co...
New Zealand is an isolated archipelago in the South-West Pacific with a unique fauna and flora, a feature partly attributable to it being the last sizable land mass to be colonized by man. In this chapter we test the hypothesis that different periods in the history of New Zealand – from pre-history to post-Polynesian/pre-European arrival and post-E...
Campylobacter jejuni is a thermophilic species that grows well at 42°C, a temperature associated with the avian body, and does not actively grow below 30°C, excluding C. jejuni subsp. doylei. We examined the phenotypic pattern of six Campylobacter sequence types (ST) that are associated with different hosts, at two temperatures, 22°C and 42°C. ST42...
In Figure 3b, the shadings used to indicate membership in Neoptera and Chiastomyaria were placed incorrectly. The corrected figure is shown here. FIGURE 3. Tree inference from analysis of the morphological and molecular data. a) Consensus tree of the morphological data analyzed with Bayesian inference, ML, maximum parsimony, and parsimony bootstrap...
This paper provides a review of the many applications of statistics within the field of phylogenetics, that is, the study of evolutionary history. The reader is assumed to be a statistician rather than a phylogeneticist, so some background is given on what phylogenetics is, along with a brief history of different approaches to phylogenetic inferenc...
A repeated cross-sectional study was conducted to determine the prevalence of Campylobacter spp. and the population structure of C. jejuni in European starlings and ducks cohabiting multiple public access sites in an urban area of New Zealand. The country's geographical isolation and relatively recent history of introduction of wild bird species, i...
Phylogenetic studies based on molecular sequence alignments are expected to become more accurate as the number of sites in the alignments increases. With the advent of genomic-scale data, where alignments have very large numbers of sites, bootstrap values close to 100% and posterior probabilities close to 1 are the norm, suggesting that the num...
'Phylogenetics' is the systematic study of reconstructing the past evolutionary history of extant species or taxa, based on present-day data, such as morphologies or molecular information (sequence data). This evolutionary history or phylogeny is ideally represented as a binary tree. In the method of 'phylogenetic invariants', a pivotal role is pla...
Background
Recombination rates vary at the level of the species, population and individual. Now recognized as a transient feature of the genome, recombination rates at a given locus can change markedly over time. Existing inferential methods, predominantly based on linkage disequilibrium patterns, return a long-term average estimate of past recombi...
Response of summary statistics to recombination rates increasing linearly.
Response of summary statistics to recombination rates decreasing linearly.
Response of summary statistics to recombination rates decreasing exponentially.
Response of summary statistics to recombination rates increasing exponentially.
Response of summary statistics to recombination rates decreasing logistically.
Determining the optimal scaling factor to capture n-tuple age.
Correlations between recombination summary statistics scaled by S.
Response of summary statistics to constant recombination rates.
Response of summary statistics to recombination rates increasing logistically. (PDF 279 kb)
Using a tensorial approach, we show how to construct a one-one correspondence between pattern probabilities and edge parameters for any group-based model. This is a generalisation of the "Hadamard conjugation" and is equivalent to standard results that use Fourier analysis. In our derivation we focus on the connections to group representation theor...
The relationships of the three major clades of winged insects - Ephemeroptera, Odonata and Neoptera - are still unclear. Many morphologists favor a clade Metapterygota (Odonata+Neoptera), but Chiastomyaria (Ephemeroptera+Neoptera) or Palaeoptera (Ephemeroptera+Odonata) have also been supported in some older and more recent studies. A possible explan...
In their 2008 and 2009 papers, Sumner and colleagues introduced the "squangles" - a small set of Markov invariants for phylogenetic quartets. The squangles are consistent with the general Markov model (GM) and can be used to infer quartets without the need to explicitly estimate all parameters. As GM is inhomogeneous and hence non-stationary, the s...
We investigate distances on binary (presence/absence) data in the context of a Dollo process, where a trait can only arise once on a phylogenetic tree but may be lost many times. We introduce a novel distance, the Additive Dollo Distance (ADD), that applies to data generated under a Dollo model, and show that it has some useful theoretical properti...
Single locus variants (SLVs) are bacterial sequence types that differ at only one of the seven canonical multilocus sequence typing (MLST) loci. Estimating the relative roles of recombination and point mutation in the generation of new alleles that lead to SLVs is helpful in understanding how organisms evolve. The relative rates of recombination an...
The general time reversible model (GTR) is presently the most popular model used in phylogenetic studies. However, GTR has an undesirable mathematical property that is potentially of significant concern. It is the purpose of this article to give examples that demonstrate why this deficit may pose a problem for phylogenetic analysis and interpretatio...
In this paper, we use an example shorebird data set to explore three related questions regarding the interplay between alignment and phylogeny estimation: 1) can gap-rich alignments be used for reasonably accurate and unbiased phylogenetic inference? 2) How much phylogenetic information is contained in gap characters as compared with the nucleotide...
Neotropical reef fish communities are species-poor compared to those of the Indo-West Pacific. An exception to that pattern is the blenny clade Chaenopsidae, one of only three rocky and coral reef fish families largely endemic to the Neotropics. Within the chaenopsids, the genus Acanthemblemaria is the most species-rich and is characterized by elab...
It is known that the Kimura 3ST model of sequence evolution on phylogenetic trees can be extended quite naturally to arbitrary split systems. However, this extension relies heavily on mathematical peculiarities of the associated Hadamard transformation, and providing an analogous augmentation of the general Markov model has thus far been elusive. I...
For the predominantly southern hemisphere plant group Styphelioideae (Ericaceae), published sequence datasets of five markers are now available for all except one of the 38 recognised genera. However, several markers are highly incomplete, so missing data are problematic for producing a genus level phylogeny. We explore the relative utility of...
Despite trends towards maximum likelihood and Bayesian criteria, maximum parsimony (MP) remains an important criterion for evaluating phylogenetic trees. Because exact MP search is NP-complete, the computational effort needed to find provably optimal trees skyrockets with increasing numbers of taxa, limiting analyses to around 25-30 taxa. This is,...