About
Publications: 226
Reads: 143,211
Citations: 6,794
Additional affiliations
November 1994 - July 1998
November 1993 - November 1994
November 1993 - October 1994
Publications (226)
Scientific publications provide a wealth of peer-reviewed, high-quality data that have been maintained over time, resulting in data persistence. As data repositories with rich provenance information, publications are indispensable sources for the integration and extension of networks of interlinked Findable, Accessible, Interoperable and Reusable (...
BiCIKL (Biodiversity Community Integrated Knowledge Library) is a European Union (EU) Horizon 2020 project (2021–2024) building a new community of research infrastructures (RIs), researchers and other stakeholders, through improved access to interlinked, open and FAIR (Findable, Accessible, Interoperable, Reusable) biodiversity data alon...
Knowledge about biodiversity is largely embedded in a daily growing corpus of over 500 million pages of biodiversity literature that is not machine-actionable. It is thus not open to building a biodiversity knowledge graph, or facilitating the use of artificial intelligence tools. This hinders the completion of a much-needed taxonomic name referenc...
The EU and other states have made legislative efforts to clarify data mining in copyrightable works, but the situation remains obscure and confusing, especially in a globalised field where international legislation can contribute to opacity. The present paper aims at asserting a common position of three communities representing biodiversity science...
Taxonomy, and biodiversity science in general, mainly revolve around four types of entities, which are available digitally in ever increasing numbers from different services: (1) Physical specimens (kept in museums and other collections around the world) and observations are available digitally via the Global Biodiversity Information Facility (GBIF...
The Swiss Institute of Bioinformatics (SIB) Literature services (SIBiLS, Gobeill et al. 2020) provides powerful search capabilities to explore the life and health sciences literature by mirroring the United States National Institutes of Health's National Library of Medicine (NIH/NLM) (MEDLINE) and National Center for Biotechnology Information (NCBI)...
One Health is an integrated, unifying approach that aims to sustainably balance and optimize the health of people, animals and ecosystems in compliance with the United Nations development goals (Dye 2022). However, premier life and health sciences digital libraries such as PubMed Central® tend to exclude or marginally include scientific publication...
The Biodiversity Knowledge Hub (BKH) is a web platform acting as an integration point and broker of open, FAIR (Findable, Accessible, Interoperable, Reusable) and interlinked corpora of biodiversity data, services and knowledge. It serves the entire biodiversity research cycle, from specimens and observations to sequences, taxon names and finall...
Since 2008, the not-for-profit organization Plazi, based in Switzerland, has been supporting and promoting the development of persistent and openly accessible digital taxonomic literature. To achieve this goal, Plazi makes use of in-house software tools for data mining and extraction from taxonomic publications, along with other partner instituti...
One of the main challenges in biodiversity data reusability is finding ways to transform what is provided in research publications into different and reusable formats, following the FAIR (Findable, Accessible, Interoperable, Reusable) principles (Agosti and Egloff 2009).
Most often, data is restricted to text, figures and tables in the so-called “P...
A goal of natural history institutions is to contribute to the understanding of biodiversity and disseminate this knowledge through scholarly publications and other public activities. In today’s world, it is expected that this knowledge, imprisoned in separated silos of millions of publications scattered through ca. 500 million published pages, is...
The Audubon Core vocabulary terms subjectPart and subjectOrientation are used to describe the depicted part of an organism and its orientation in an image. We describe the criteria and process for developing controlled vocabularies for these two terms. The vocabularies take the form of Simple Knowledge Organization System (SKOS) concept schemes and...
Motivation
Since early 2020, the COVID-19 pandemic has confronted the biomedical community with an unprecedented challenge. The rapid spread of COVID-19 and ease of transmission seen worldwide is due to increased population flow and international trade. Front-line medical care, treatment research and vaccine development also require rapid and infor...
The paper summarises many years of discussions and experience of biodiversity publishers, organisations, research projects and individual researchers, and proposes recommendations for implementation of persistent identifiers for article metadata, structural elements (sections, subsections, figures, tables, references, supplementary materials and ot...
The BiCIKL project is born from a vision that biodiversity data are most useful if they are presented as a network of data that can be integrated and viewed from different starting points. BiCIKL’s goal is to realise that vision by linking biodiversity data infrastructures, particularly for literature, molecular sequences, specimens, nomenclature a...
This joint statement aims at encouraging all authors, publishers and editors involved in scientific publishing to give the bibliographic source of the authorities of taxonomic names. This initiative, written by members of the three communities, has been approved by the executive boards of the SPNHC (Society for the Preservation of Natural History C...
Taxonomy is the science of charting and describing the world's biodiversity. Organisms are grouped into taxa, each of which is assigned a rank, building the taxonomic hierarchy. The taxa are described in taxonomic treatments, well defined sections of scientific publications (Catapano 2019). They include a nomenclatural section and one or more sections inc...
Synospecies is a linked data application to explore changes in taxonomic names (Gmür and Agosti 2021). The underlying source of truth for the establishment of taxa, the assignment and re-assignment of names, are taxonomic treatments. Taxonomic treatments are sections of publications documenting the features or distribution of taxa in ways adhering...
Thousands of new species are discovered each year, and new results are published to add to the knowledge of existing species. A growing number of these are immediately accessible through the Biodiversity Literature Repository (BLR) and reused by Global Biodiversity Information Facility (GBIF), bringing the number of treatments covering plant specie...
To understand the loss of species, a benchmark is needed, e.g. the status of biodiversity in 1992, when the Convention on Biological Diversity recognized the biodiversity crisis, against which its status in successive years can be compared. Though we are far from knowing how many species there are on planet Earth, we keep track of their descriptions and number throu...
Threats to global biodiversity are increasingly recognised by scientists and the public as a critical challenge. Molecular sequencing technologies offer means to catalogue, explore, and monitor the richness and biogeography of life on Earth. However, exploiting their full potential requires tools that connect biodiversity infrastructures and resour...
BiCIKL is a European Union Horizon 2020 project that will initiate and build a new European starting community of key research infrastructures, establishing open science practices in the domain of biodiversity through provision of access to data, associated tools and services at each separate stage of and along the entire research cycle. BiCIKL wi...
The European Journal of Taxonomy (EJT) is a decade-old journal dedicated to the taxonomy of living and fossil eukaryotes. Launched in 2011, the EJT published exactly 900 articles (31 778 pages) from 2011 to 2021. The journal has been processed in its entirety by Plazi, liberating the data therein, depositing it into TreatmentBank, Biodiversity Lite...
Threats to global biodiversity are increasingly recognised by scientists and the public as a critical challenge. Molecular sequencing technologies offer means to catalogue, explore, and monitor the richness and biogeography of life on Earth. However, exploiting their full potential requires tools that connect biodiversity infrastructures and resour...
Understanding the distribution of species is essential for the conservation and management of biodiversity. But the availability of this kind of information is still scarce for the most diverse regions. The higher‐taxon approach (i.e. use of coarser taxonomic levels to represent species) as an easier and efficient method in representing species pat...
Plazi's TreatmentBank is a research infrastructure and partner of the recent European Union-funded Biodiversity Community Integrated Knowledge Library (BiCIKL) project to provide a single knowledge portal to open, interlinked and machine-readable, findable, accessible, interoperable and reusable (FAIR) data. Plazi is liberating published biodiversi...
Plazi is a Swiss non-governmental organization dedicated to the liberation of data imprisoned in flat, dead-end formats such as PDFs. In the process, the data therein is annotated and exported in various formats, following field-specific standards, facilitating free access and reutilization by several other service providers and end-users. This dat...
Taxonomic treatments, sections of publications documenting the features or distribution of a related group of organisms (called a “taxon”, plural “taxa”) in ways adhering to highly formalized conventions, and published in scientific journals, shape our understanding of global biodiversity (Catapano 2019).
Treatments are the building blocks of the e...
Citing the specimens used to describe new species or augment existing taxa is integral to the scholarship of taxonomic and related biodiversity-oriented publications. These so-called material citations (Darwin Core Term MaterialCitation), linked to the natural history collections in which they are archived, are the mechanism by which readers may re...
Automatic data mining is not an easy task and its success in the biodiversity world is deeply tied to the standardization and consistency of scientific journals' layout structure. The various formatting styles found in the over 500 million pages of published biodiversity information (Kalfatovic 2010) pose a remarkable challenge towards the goal o...
Biodiversity sciences, including taxonomy, are empirical sciences where all results are published in scholarly publications as part of the research life cycle. This creates a corpus of an estimated 500 million printed pages (Kalfatovic 2010) including billions of facts such as traits, biotic interactions, observations characterizing all the estimat...
In the landscape of general-purpose repositories, Zenodo was built at the European Laboratory for Particle Physics' (CERN) data center to facilitate the sharing and preservation of the long tail of research across all disciplines and scientific domains. Given Zenodo’s long tradition of making research artifacts FAIR (Findable, Accessible, Interoper...
Plazi is a non-profit organization focused on the liberation of data from taxonomic publications. As one of Plazi’s goals of promoting the accessibility of taxonomic data, our team has developed different ways of getting the outside community involved. The Plazi community on GitHub encourages the scientific community and other contributors to post...
The BiCIKL Project is born from a vision that biodiversity data are most useful if they are viewed as a network of data that can be integrated and viewed from different starting points. BiCIKL’s goal is to realize that vision by linking biodiversity data infrastructures, particularly for literature, molecular sequences, specimens, nomenclature and...
The Horizon 2020 project Biodiversity Community Integrated Knowledge Library (BiCIKL) (started 1st of May 2021, duration 3 years) will build a new European community of key research infrastructures, researchers, citizen scientists and other stakeholders in biodiversity and life sciences. Together, the 14 BiCIKL partners will solidify open scie...
Connecting basic data about bats and other potential hosts of SARS-CoV-2 with their ecological context is crucial to the understanding of the emergence and spread of the virus. However, when lockdowns in many countries started in March, 2020, the world's bat experts were locked out of their research laboratories, which in turn impeded access to lar...
Here we present a descriptive analysis of the bibliographic production of the world-renowned heteropterist Dr. Jocélia Grazia and comments on her taxonomic reach based on extracted taxonomic treatments. We analyzed a total of 219 published documents, including scientific papers, scientific notes, and book chapters. Additionally, we applied the Plaz...
Connecting basic data about bats and other potential mammal hosts of SARS-CoV-2 with their ecological context is now critical for understanding the emergence and spread of COVID-19. However, when global lockdown started in March 2020, the world’s bat experts were locked out of their research laboratories, which, in turn, locked up large volumes of...
As part of the CETAF COVID19 task force, Plazi liberated taxonomic treatments, figures, observation records, biotic interactions, taxonomic names, and collection and specimen codes involving bats and viruses from scholarly publications with the intention to create open access, findable, accessible, interoperable and reusable data (FAIR). The data i...
The growing corpus of hundreds of millions of pages of taxonomic literature reporting research results based on specimens is very rich in facts. In order to make them reusable, Plazi, Pensoft and Zenodo are building and maintaining the Biodiversity Literature Repository which includes a workflow to discover, describe, store, in order to make thes...
A deep irony of COVID-19 likely originating from a bat-borne coronavirus (Boni et al. 2020) is that the global lockdown to quell the pandemic also locked up physical access to much basic knowledge regarding bat biology. Digital access to data on the ecology, geography, and taxonomy of potential viral reservoirs, from Southeast Asian horseshoe bats...
Introduction
Scholarly literature is the primary source for biodiversity knowledge based on observations, field work, analysis and taxonomic classification. Publishing such literature in semantically enhanced formats (e.g., through Extensible Markup Language (XML) tagging) helps to make this knowledge easily accessible and available to humans and a...
This data publication originated as part of developing a biodiversity-related knowledge hub on COVID-19 via COVID19-TAF - Communities Taking Action (https://cetaf.org/covid19-taf-communities-taking-action), a community-rooted initiative raised jointly by the Consortium of European Taxonomic Facilities (CETAF, https://cetaf.org) and Distributed Sy...
This data publication originated as part of developing a biodiversity-related knowledge hub on COVID-19 via COVID19-TAF - Communities Taking Action (https://cetaf.org/covid19-taf-communities-taking-action), a community-rooted initiative raised jointly by the Consortium of European Taxonomic Facilities (CETAF, https://cetaf.org) and Distributed Sy...
This paper describes a set of guidelines for the citation of zoological and botanical specimens in the European Journal of Taxonomy. The guidelines stipulate controlled vocabularies and precise formats for presenting the specimens examined within a taxonomic publication, which allow for the rich data associated with the primary research material to...
The European Journal of Taxonomy (EJT) has been initiated by a consortium of European natural history publishers to take advantage of the shift from paper to e-only publishing. Whilst publishing in PDF format was originally considered the state of the art, it recently became obvious that complementary dissemination channels help to dissemina...
The European Journal of Taxonomy (EJT) was initiated by a consortium of European natural history publishers to take advantage of the shift from paper to electronic-only publishing (Benichou et al. 2011). Whilst publishing in PDF format was originally considered the state of the art, it recently became obvious that complementary dissemination c...
Scholarly publications in taxonomy are used as the sole carrier of the communication channel to publicize the description of new species, more generally any kind of taxon, their augmentations in form of re-descriptions to small notes such as additional observation records, or deprecations when the name of a taxon is changing. This is communicated i...
Zenodo (https://zenodo.org) is an open-access repository operated by CERN (European Organization for Nuclear Research), which provides researchers with an easy and stable platform to archive and publish their data and other output, such as software tools, manuals and project reports. In the context of the ICEDIG (Innovation and Consolidation for La...
Digitisation of Natural History Collections (NHC) has evolved from transcription of specimen catalogues in databases to web portals providing access to data, digital images, and 3D models of specimens. These portals increase global accessibility to specimens and help preserve the physical specimens by reducing their handling. The size of the NHC re...
A vast amount of biodiversity data is reported in the primary taxonomic literature. In the past, we have demonstrated the use of semantic enhancement to extract data from taxonomic literature and make it available to a network of databases (Miller et al. 2015). For technical reasons, semantic enhancement of taxonomic literature is most efficient wh...
The Swiss NGO Plazi (http://plazi.org) has developed an automated workflow for liberating data, including images and text, from new taxonomic publications issued in PDF format. This stepwise process extracts article metadata, illustrations and their captions, bibliographic references, scientific names, named geographic entities such as coordinates...
The paper investigates how to implement open access to data in collection institutions and in the DiSSCo research infrastructure. Large-scale digitisation projects generate lots of images, but data transcription often remains backlogged for years. The paper discusses minimum information standards (MIDS) for digital specimens, and tentatively define...
Essential Biodiversity Variables (EBV) are fundamental variables that can be used for assessing biodiversity change over time, for determining adherence to biodiversity policy, for monitoring progress towards sustainable development goals, and for tracking biodiversity responses to disturbances and management interventions. Data from observations o...
Essential Biodiversity Variables (EBVs) allow observation and reporting of global biodiversity change, but a detailed framework for the empirical derivation of specific EBVs has yet to be developed. Here, we re-examine and refine the previous candidate set of species traits EBVs and show how traits related to phenology, morphology, reproduction, ph...
The Open Biodiversity Knowledge Management System (OBKMS) is an end-to-end, eXtensible Markup Language (XML)- and Linked Open Data (LOD)-based ecosystem of tools and services that encompasses the entire process of authoring, submission, review, publication, dissemination, and archiving of biodiversity literature, as well as the text mining of publi...
ICEDIG is a design study for the new research infrastructure Distributed System of Scientific Collections (DiSSCo), focusing on the issues around digitisation of the collections and making their data freely and openly available following the FAIR principles (data being Findable, Accessible, Interoperable, and Re-usable).
As a design study, ICEDIG...
Background:
The biodiversity domain, and in particular biological taxonomy, is moving in the direction of semantization of its research outputs. The present work introduces OpenBiodiv-O, the ontology that serves as the basis of the OpenBiodiv Knowledge Management System. Our intent is to provide an ontology that fills the gaps between ontologies f...
We present OpenBiodiv - an implementation of the Open Biodiversity Knowledge Management System.
The need for an integrated information system serving the needs of the biodiversity community can be dated at least as far back as the sanctioning of the Bouchout declaration in 2007. The Bouchout declaration proposes to make biodiversity knowledge freel...
We present OpenBiodiv - an implementation of the Open Biodiversity Knowledge Management System. We believe OpenBiodiv is possibly the first pilot-stage implementation of a semantic system running on top of the biodiversity knowledge graph.
The need for an integrated information system serving the needs of the biodiversity community can be dated at l...
We present OpenBiodiv - an implementation of the Open Biodiversity Knowledge Management System.
The need for an integrated information system serving the needs of the biodiversity community can be dated at least as far back as the sanctioning of the Bouchout declaration in 2007. The Bouchout declaration proposes to make biodiversity knowledge freel...
Much biodiversity data is collected worldwide, but it remains challenging to assemble the scattered knowledge for assessing biodiversity status and trends. The concept of Essential Biodiversity Variables (EBVs) was introduced to structure biodiversity monitoring globally, and to harmonize and standardize biodiversity data from disparate sources to...
The EU BON ("Building the European Biodiversity Observation Network") project has made important contributions towards the achievement of global conservation targets. This infographic illustrates EU BON's contributions towards the achievement of Aichi Biodiversity Target 19 "By 2020, knowledge, the science base and technologies relating to biodiv...
Taxonomy is the discipline responsible for charting the world’s organismic diversity, understanding ancestor/descendant relationships, and organizing all species according to a unified taxonomic classification system. Taxonomists document the attributes (characters) of organisms, with emphasis on those that can be used to distinguish species from each o...
What forces structure ecological assemblages? A key limitation to general insights about assemblage structure is the availability of data that are collected at a small spatial grain (local assemblages) and a large spatial extent (global coverage). Here, we present published and unpublished data from 51,388 ant abundance and occurrence records of mo...
Messor nests in Iranian steppe rangelands can be so large that they are visible from space. When compared with reference soils, nest soil is higher in nutrients and lower in pH. Ant nests also homogenise the nutrients throughout the upper soil profile, although this effect diminishes when nests are abandoned. The denuded circles around nests are su...
The question whether taxonomic descriptions naming new animal species without type specimen(s) deposited in collections should be accepted for publication by scientific journals and allowed by the Code has already been discussed in Zootaxa (Dubois & Nemésio 2007; Donegan 2008, 2009; Nemésio 2009a–b; Dubois 2009; Gentile & Snell 2009; Minelli 2009;...
This plot is not part of the published stance but derives from it. The plot shows the number of authors by geographic region (courtesy of Dr. Diego Astua).
Taxonomy is the discipline responsible for charting the world’s organismic diversity, understanding ancestor/descendant relationships, and organizing all species according to a unified taxonomic classification system. Taxonomists document the attributes (characters) of organisms, with emphasis on those that can be used to distinguish species from each o...
This is the application by Donat Agosti for a Shuttleworth Fellowship with the goal of making scientific data publication an integral part of an open knowledge management system. A successful application will help move TreatmentBank from its current prototype state into a production system that converts and extracts data from taxonomic publications...
The goal of the deliverable is to provide a technical summary of web services providing a harmonised and unified access layer to taxonomic information resources defined by Appendix 3 of the INSPIRE Directive. We describe the basic web service architecture, the unified access protocol, the harmonisation of contributing databases, integration into th...
The objective of Workpackage 4 of the European Marine Observation and Data network (EMODnet) is to fill spatial and temporal gaps in European marine species occurrence data availability by carrying out data archaeology and rescue activities. To this end, a workshop was organised in the Hellenic Center for Marine Research Crete (HCMR), Heraklion Cre...