January 2025
·
30 Reads
January 2025
·
532 Reads
·
1 Citation
Grasses (Poaceae) comprise c. 11 800 species and are central to human livelihoods and terrestrial ecosystems. Knowing their relationships and evolutionary history is key to comparative research and crop breeding. Advances in genome‐scale sequencing allow for increased breadth and depth of phylogenomic analyses, making it possible to infer a new reference species tree of the family. We inferred a comprehensive species tree of grasses by combining new and published sequences for 331 nuclear genes from genome, transcriptome, target enrichment and shotgun data. Our 1153‐tip tree covers 79% of grass genera (including 21 genera sequenced for the first time) and all but two small tribes. We compared it to a newly inferred 910‐tip plastome tree. We recovered most of the tribes and subfamilies previously established, despite pervasive incongruence among nuclear gene trees. The early diversification of the PACMAD clade could represent a hard polytomy. Gene tree–species tree reconciliation suggests that reticulation events occurred repeatedly. Nuclear–plastome incongruence is rare, with very few cases of supported conflict. We provide a robust framework for the grass tree of life to support research on grass evolution, including modes of reticulation, and genetic diversity for sustainable agriculture.
October 2024
·
46 Reads
Specimen-associated biodiversity data are sought after for biological, environmental, climate, and conservation sciences. A rate shift is required for the extraction of data from specimen images to eliminate the bottleneck that the reliance on human-mediated transcription of these data represents. We applied advanced computer vision techniques to develop 'Hespi' (HErbarium Specimen sheet PIpeline), which extracts a pre-catalogue subset of collection data on the institutional labels on herbarium specimens from their digital images. The pipeline integrates two object detection models; the first detects bounding boxes around text-based labels and the second detects bounding boxes around text-based data fields on the primary institutional label. The pipeline classifies text-based institutional labels as printed, typed, handwritten, or a combination and applies Optical Character Recognition (OCR) and Handwritten Text Recognition (HTR) for data extraction. The recognized text is then corrected against authoritative databases of taxon names. The extracted text is also corrected with the aid of a multimodal Large Language Model (LLM). Hespi accurately detects and extracts text for test datasets including specimen sheet images from international herbaria. The components of the pipeline are modular and users can train their own models with their own data and use them in place of the models provided.
May 2024
·
1,041 Reads
·
1 Citation
Grasses (Poaceae) comprise around 11,800 species and are central for human livelihoods and terrestrial ecosystems. Knowing their relationships and evolutionary history is key to comparative research and crop breeding. Advances in genome-scale sequencing allow for increased breadth and depth of phylogenomic analyses, making it possible to infer a new reference species tree of the family. We inferred a comprehensive species tree of grasses by combining new and published sequences for 331 nuclear genes from genome, transcriptome, target enrichment and shotgun data. Our 1,153-tip tree covers 79% of grass genera (including 21 genera sequenced for the first time) and all but two small tribes. We compared it to a 910-tip plastome tree. The nuclear phylogeny matches that of the plastome at most deep branches, with only a few instances of incongruence. Gene tree–species tree reconciliation suggests that reticulation events occurred repeatedly in the history of grasses. We provide a robust framework for the grass tree of life to support research on grass evolution, including modes of reticulation, and genetic diversity for sustainable agriculture.
April 2024
·
4,041 Reads
·
76 Citations
Nature
Angiosperms are the cornerstone of most terrestrial ecosystems and human livelihoods [1,2]. A robust understanding of angiosperm evolution is required to explain their rise to ecological dominance. So far, the angiosperm tree of life has been determined primarily by means of analyses of the plastid genome [3,4]. Many studies have drawn on this foundational work, such as classification and first insights into angiosperm diversification since their Mesozoic origins [5–7]. However, the limited and biased sampling of both taxa and genomes undermines confidence in the tree and its implications. Here, we build the tree of life for almost 8,000 (about 60%) angiosperm genera using a standardized set of 353 nuclear genes [8]. This 15-fold increase in genus-level sampling relative to comparable nuclear studies [9] provides a critical test of earlier results and brings notable change to key groups, especially in rosids, while substantiating many previously predicted relationships. Scaling this tree to time using 200 fossils, we discovered that early angiosperm evolution was characterized by high gene tree conflict and explosive diversification, giving rise to more than 80% of extant angiosperm orders. Steady diversification ensued through the remaining Mesozoic Era until rates resurged in the Cenozoic Era, concurrent with decreasing global temperatures and tightly linked with gene tree conflict. Taken together, our extensive sampling combined with advanced phylogenomic methods shows the deep history and full complexity in the evolution of a megadiverse clade.
December 2023
·
413 Reads
Species trees, which depict the evolutionary relationships among organisms, underlie many evolutionary studies. Phylogenomics, the use of genome-scale datasets for phylogenetic inference, is the current gold standard for species tree inference. The development, maintenance, and execution of phylogenomic workflows is challenging, requiring programming, data management skills, and familiarity with changing best practices. We introduce OrthoFlow, a software wherein a single command automatically conducts end-to-end phylogenomic analysis—orthology inference and identification of phylogenomic markers, quality control, data matrix construction, diagnostics, and tree inference using supermatrix and supertree methods from multiple input data formats. To demonstrate the utility of OrthoFlow, we successfully recapitulate the evolutionary relationships among 24 yeast species. OrthoFlow increases the accessibility of researchers to conduct rigorous phylogenomic analysis flexibly. OrthoFlow is freely available from PyPI (https://pypi.org/project/orthoflow/), Bioconda (https://anaconda.org/bioconda/orthoflow) and GitHub (https://github.com/rbturnbull/orthoflow) under the Apache License 2.0.
December 2023
·
280 Reads
·
1 Citation
Species trees, which depict the evolutionary relationships among organisms, underlie many evolutionary studies. Phylogenomics, the use of genome-scale datasets for phylogenetic inference, is the current gold standard for species tree inference. The development, maintenance, and execution of phylogenomic workflows is challenging, requiring programming, data management skills, and familiarity with changing best practices. We introduce Orthoflow, a software wherein a single command automatically conducts end-to-end phylogenomic analysis—orthology inference and identification of phylogenomic markers, quality control, data matrix construction, diagnostics, and tree inference using supermatrix and supertree methods from multiple input data formats. To demonstrate the utility of Orthoflow, we successfully recapitulate the evolutionary relationships among 24 yeast species. Orthoflow increases the accessibility of researchers to conduct rigorous phylogenomic analysis flexibly. Orthoflow is freely available from PyPI (https://pypi.org/project/orthoflow/), Bioconda (https://anaconda.org/bioconda/orthoflow) and GitHub (https://github.com/rbturnbull/orthoflow) under the Apache License 2.0.
September 2023
·
147 Reads
·
4 Citations
Botanical Journal of the Linnean Society
Lomandra is the largest genus in Asparagaceae subfamily Lomandroideae and possesses economic, ecological, and ethnobotanical significance in Australia. Lomandra comprises four sections, L. section Capitatae, L. section Macrostachya, L. section Typhopsis and L. section Lomandra, the latter comprising series Lomandra and series Sparsiflorae, all recognized based solely on morphology. In this study, phylogenetic relationships were estimated for 79 Lomandroideae individuals, including 45 Lomandra species and subspecies (c. 63% of species and subspecies diversity). We generated genome-scale plastome sequence data and used maximum likelihood and Bayesian inference criteria for phylogenetic estimation. Lomandra was non-monophyletic, with Xerolirion divaricata nested within it. Two major clades were recovered: Capitatae–Macrostachya (CM) and Lomandra–Typhopsis (LT). The CM clade included a monophyletic Lomandra section Capitatae with a base chromosome number x = 7, and L. section Macrostachya (x = 8); the LT clade included L. sections Typhopsis and Lomandra, both x = 8. Section Lomandra series Lomandra and series Sparsiflorae were both recovered as non-monophyletic. Morphological characters were assessed to identify combinations of characters that characterize clades. A base chromosome number of x = 8 was plesiomorphic for Lomandra. The largest number of Lomandra species occupy the Mediterranean ecoregion and occupancy of sclerophyll vegetation was reconstructed as ancestral for the genus.
August 2023
·
198 Reads
·
3 Citations
Specimens or objects in natural history collections hold substantial research and cultural value that is enhanced where these items are made digitally available. Benefits of digitisation include increasing open access to collection-based biodiversity data, increasing productivity of scientific research, enabling novel research applications of digitally accessible data, reducing preservation requirements through reduced object handling, and expanding potential for “remote curation” in collections. However, the time available for object and data digitisation is limited for most collections. Well documented digitisation workflows can ensure that curation time is efficiently applied to achieve digitisation outputs, and that digitisation standards are consistently applied within and among projects. While this case study focused on the generation of digitisation workflows in a medium-sized Australian university-based herbarium, the findings of this study are relevant to collections globally. The curation workflows comprise a set of modular steps required for the digitisation of herbarium specimen data and images. Steps are clearly identified as requiring human-mediation versus those that can be automated, those that require on-site versus remote-access, and those that require transfer or transformation of data or files. This clarity enables consideration of the opportunities and challenges for increasing efficiencies for collection-based digitisation, data and file management. The maps provide a contextual framework for herbarium-based digitisation pathways for those who work with specimen-derived biodiversity data, and an insight into these tools for those who are not familiar with herbarium protocols.
August 2023
·
156 Reads
·
5 Citations
Advanced computer vision techniques hold the potential to mobilise vast quantities of biodiversity data by facilitating the rapid extraction of text‐ and trait‐based data from herbarium specimen digital images, and to increase the efficiency and accuracy of downstream data capture during digitisation. This investigation developed an object detection model using YOLOv5 and digitised collection images from the University of Melbourne Herbarium (MELU). The MELU‐trained ‘sheet‐component’ model (trained on 3371 annotated images, validated on 1000 annotated images, run using ‘large’ model type, at 640 pixels, for 200 epochs) successfully identified most of the 11 component types of the digital specimen images, with an overall model precision measure of 0.983, recall of 0.969 and mean average precision (mAP0.5–0.95) of 0.847. Specifically, ‘institutional’ and ‘annotation’ labels were predicted with mAP0.5–0.95 of 0.970 and 0.878 respectively. It was found that annotating at least 2000 images was required to train an adequate model, likely due to the heterogeneity of specimen sheets. The full model was then applied to selected specimens from nine global herbaria (Biodiversity Data Journal, 7, 2019), quantifying its generalisability: for example, the ‘institutional label’ was identified with mAP0.5–0.95 of between 0.68 and 0.89 across the various herbaria. Further detailed study demonstrated that starting with the MELU‐model weights and retraining for as few as 50 epochs on 30 additional annotated images was sufficient to enable the prediction of a previously unseen component. As many herbaria are resource‐constrained, the MELU‐trained ‘sheet‐component’ model weights are made available and application encouraged.
... Indeed, magnoliids can be traced in the fossil record to at least the Barremian (ca. 121-129 Ma; Massoni et al., 2015), and fossil-calibrated molecular dating studies have estimated the stem age of the clade to fall between 133 and 242 Ma (Magallón et al., 2015; Ramírez-Barahona et al., 2020; Zuntini et al., 2024). Species of magnoliids are found across a broad range of habitats and climates, but occur predominantly in tropical and warm temperate rain forests. ...
April 2024
Nature
... (Asparagaceae: Lomandroideae) includes four sections and two series, according to Lee and Macfarlane (1986), and has been studied intensively, especially in recent years (Wang 2023a, 2023b, 2023c; Gunn et al. 2024; Wang 2024; Wang & Gray 2024). To date, 67 species and ten nonautonymic subspecies are recognised (IPNI 2024; POWO 2024). ...
September 2023
Botanical Journal of the Linnean Society
... Historically, specimen-associated data were documented on paper [Walton et al., 2020]; recorded in field notebooks, transcribed into printed catalogs, and primary and secondary data were written on labels that were attached to the specimens. Mobilization of these specimen data is typically achieved by processing specimens through a digitization workflow, involving the production of a digital specimen image followed by the extraction of text data from that digital image either manually (i.e., via a human intermediary) or semi-automatically [Thompson and Birch, 2023, de la Hidalga et al., 2020, Kirchhoff et al., 2018, Nelson et al., 2015]. Their digitization, conforming to biological data standards [e.g., ABCD (Access to Biological Collections Data) [Holetschek et al., 2012] and DarwinCore [Wieczorek et al., 2012]], is essential for maintaining their accuracy and ensuring their availability for reuse. ...
August 2023
... Based on 1000 randomizations, we show that despite only having one-third as many records, herbarium specimens accumulate more taxonomic (a), phylogenetic (b), and functional (c) diversity than iNaturalist observations. ... greatly reduce both the cost and time it takes to digitize, while also increasing data standards [44,46,48–50]. For example, the Smithsonian's US National Herbarium, which houses roughly 3.8M specimens, was recently completely digitized and the use of high-throughput workflows reduced the cost of digitization from $3.32 down to $1.85 per specimen and allowed for the digitization of 3000-4000 specimens daily [35]. ...
August 2023
... Xanthodermatei and A. sect. Hondenses exhibit toxicity, such as A. xanthodermus Genev., which can induce gastrointestinal symptoms [49–51]. Despite a lack of research, A. daqinggouensis is unsuitable for consumption due to potential toxicity. ...
September 2021
... We used the script BYO_transcriptome.py to search for Rhododendron versions of the Angiosperms353 genes. The "Mega353" gene set, an expanded Angiosperms353 set with many additional taxa representing each sequence (McLay et al. 2021), was used as the reference. The three Rhododendron CDS files (see above, MarkerMiner method) were used as the input transcriptomes. ...
June 2021
... The order comprises ca. 36,265 species (Birch and Kocyan 2021). Asparagaceae is divided into seven subfamilies, viz. ...
May 2021
Molecular Phylogenetics and Evolution
... (chocolate lily). C. australasicum self-pollinates when its flowers are tripped (i.e. the lip pushed open), and can be pollinated by honey bees or native bees (Wang et al., 2010), whereas A. strictum is buzz pollinated and therefore not pollinated by honey bees (Gunn et al., 2020). Many Australian native bee species buzz pollinate (Smith & Saunders, 2019), and therefore, the use of these two species could provide some insight into the pollination services provided by native and non-native bees in the landscape. ...
April 2020
Molecular Phylogenetics and Evolution
... Taxonomic species are usually described based on morphological characteristics that can easily be altered by local adaptation, phenotypic plasticity, or neutral morphological polymorphism, which may cause a single variable species to be classified as many species (e.g., Gemeinholzer & Bachmann, 2005). On the other hand, very recent divergence and little differentiation might contribute to the inability of barcoding to separate species in some cases (Birch et al., 2017). ...
October 2017
... Asteliaceae contains two endemic Australian genera: Neoastelia, a monotypic genus of temperate rainforests in central eastern Australia and Milligania (5 species), a Tasmanian endemic with taxa occupying lowland, alluvial to alpine herbfield vegetation. Conversely, Astelia (30 species; Birch, 2015), the largest genus in the family, has a center of diversity in New Zealand, with secondary centers of diversity in Australia and Hawai'i, and two exceptional occurrences in Africa (La Réunion, Mauritius) and in South America (Patagonia). Blandfordiaceae and Boryaceae are Australian endemics containing a single genus Blandfordia (4 species) and Borya (13 species) and the monotypic genus Alania, respectively. ...
July 2015