Markus Weiss’s research while affiliated with Bavarian Natural History Collections and other places
The scientific community has been developing and refining digital data standards to ensure that biodiversity data can be easily exchanged between different databases, systems and institutions. However, scientists still face the challenge of effectively analysing this vast amount of data. Variations in the quality, documentation and availability of metadata often make it difficult for scientists to compile appropriate datasets for their research. One contribution towards this task is the research data repository ARAMOB of the Arachnologische Gesellschaft (AraGes), which focuses on systematically collected data on spider assemblages. Mandatory requirements have been developed to ensure the quality and utility of the data for ecological research during a given project. A next step towards enhancing the data basis for convincing analyses of spider assemblages in Central Europe is the new option to publish data in the society’s open access journal Arachnologische Mitteilungen/Arachnology Letters; these data are integrated into the data repository and thus made effectively accessible and usable. These data papers will be presented as one printed page in the journal, accessible on the website of the AraGes and from the BioOne Digital Library, accompanied by a PDF document containing the metadata needed to use the published data effectively. The original dataset is published as spreadsheet tables, but also deposited in the ARAMOB data repository, which is managed with the modularized database software and virtual research environment Diversity Workbench. By this means, the published data packages are also accessible and analysable within a wider context through the ARAMOB web portal. On request, scientists can also exploit data with the free and well-documented Diversity Workbench software tools. The data pipelines involved are defined in the context of the National Research Data Infrastructure (NFDI).
There is a growing demand for monitoring pests in natural history collections (NHCs) and establishing integrated pest management (IPM) solutions (Crossman and Ryde 2022). In this context, up-to-date taxonomic reference lists and controlled vocabularies following standard schemes are crucial and facilitate recording organisms detected in collections.
The data pipeline described here results in the publication of a taxon reference list based on information from online resources and standard IPM literature. Most of the over 140 pest taxa on species level and above are insects; the rest belong to other animal groups and fungi.
The complete taxon names, synonyms, English and German common names, and the hierarchical classification (parent-child relationships) are organised in a client-server installation of DiversityTaxonNames (DTN) at the Bavarian Natural History Collections (SNSB). DTN is a Microsoft Structured Query Language (MS SQL) database tool of the Diversity Workbench (DWB) framework with a published entity-relationship (ER) diagram (Hagedorn et al. 2019). The list is managed using the Global Biodiversity Information Facility (GBIF) backbone taxonomy as external name resource, with linkage to the respective Wikidata Q item ID as an external persistent identifier (PID). Moreover, information on pest occurrence in NHCs is given, distinguishing the Consortium of European Taxonomic Facilities (CETAF) major NHC collection types affected (i.e., heritage sciences, life sciences and earth sciences) and the object categories, e.g., natural objects/specimens damaged. The data management in DTN enables long-term curation by list curators.
The generic data pipeline for the management and publication of a Global Taxonomic Reference List of Pests in NHCs is based on the DTN taxon lists concept and architecture and described under About "Taxon list of pest organisms for IPM at natural history collections compiled at the SNSB". It includes four steps (A–D) with significant results for best practices of data processing (Fig. 1).
A. The data is managed and processed for publication by list curators in the database DiversityTaxonNames (DTN).
As a result, the list can be kept up-to-date and is—without transformation—ready to be used for IPM solutions at any NHC with a DiversityCollection installation and as part of the DWB cloud services.
B. The up-to-date data is publicly available via the DTN REST Webservice for Taxon Lists with machine-readable Application Programming Interface (API).
As a result, the dynamic list publication service can be used as a reference backbone for establishing IPM solutions for pest monitoring at any NHC.
C. The data is provided via the GBIF checklist data publication pipeline of the SNSB, using GBIF validation tools, as a Darwin Core Archive (DwC-A, zip format) for GBIF.
As a result, the checklist information becomes part of the GBIF network with GBIF ChecklistBank and GBIF Global Taxonomy. This ensures future compliance of data with the Findability, Accessibility, Interoperability, and Reuse (FAIR) guiding principles.
D. The DTN REST Web service for Taxon Lists (currently 60 lists) is registered and accessible through the German Federation for Biological Data (GFBio) Terminology service.
As a result, the lists with external PIDs and other information are available as a service (see DTN lists overview). In the upcoming Research Data Commons of the German National Research Data Infrastructure (NFDI) Initiative (Diepenbroek et al. 2021), it will be part of a standardized layer of APIs with an agreed interface scheme for improved accessibility.
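Step C above can be illustrated with a minimal sketch of a Darwin Core Archive for a checklist. This is not the SNSB pipeline itself: it only shows the general shape of a DwC-A (a tab-separated Taxon core file plus a `meta.xml` descriptor) using standard Darwin Core terms. The function names, the taxon IDs and the two sample rows are assumptions made for illustration, and a publishable archive would normally also carry an EML metadata file, which is omitted here.

```python
import csv
import io
import zipfile

DWC = "http://rs.tdwg.org/dwc/terms/"

# Columns of the core taxon file; all are standard Darwin Core terms.
FIELDS = ["taxonID", "scientificName", "taxonRank", "parentNameUsageID", "vernacularName"]


def build_meta_xml(fields):
    """Return a minimal meta.xml describing a tab-separated Taxon core file."""
    lines = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        '<archive xmlns="http://rs.tdwg.org/dwc/text/">',
        '  <core encoding="UTF-8" fieldsTerminatedBy="\\t" linesTerminatedBy="\\n" '
        f'ignoreHeaderLines="1" rowType="{DWC}Taxon">',
        "    <files><location>taxon.txt</location></files>",
        '    <id index="0"/>',  # column 0 (taxonID) is the record identifier
    ]
    for index, name in enumerate(fields[1:], start=1):
        lines.append(f'    <field index="{index}" term="{DWC}{name}"/>')
    lines += ["  </core>", "</archive>"]
    return "\n".join(lines)


def build_archive(rows):
    """Pack taxon rows (dicts keyed by FIELDS) into an in-memory DwC-A zip."""
    table = io.StringIO()
    writer = csv.DictWriter(table, fieldnames=FIELDS, delimiter="\t", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("taxon.txt", table.getvalue())
        archive.writestr("meta.xml", build_meta_xml(FIELDS))
    return buffer.getvalue()


# Illustrative rows only: the IDs are made up, and the genus level is skipped for brevity.
rows = [
    {"taxonID": "1", "scientificName": "Dermestidae", "taxonRank": "family",
     "parentNameUsageID": "", "vernacularName": "skin beetles"},
    {"taxonID": "2", "scientificName": "Anthrenus verbasci", "taxonRank": "species",
     "parentNameUsageID": "1", "vernacularName": "varied carpet beetle"},
]

archive_bytes = build_archive(rows)
```

The parent-child relationships mentioned for DTN map naturally onto `parentNameUsageID`, which points from each taxon to the `taxonID` of its parent.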
The provided tools, API and data are part of the upcoming NFDI4Biodiversity service portfolio. Future scenarios include the use of the list items and properties as classes for diagnosis purposes with DiversityNaviKey (Triebel et al. 2021) including the publication of images for identifying pests.
"IndExs—Index of Exsiccatae" is an online database with bibliographic information on exsiccatae and exsiccata-like series launched in 2001 (Triebel and Scholz 2022). This type of series is a specific system in botany and mycology to create, publish and distribute well identified and documented reference material. In most cases the distributed specimens are numbered and each number consists of uniform material (herbarium duplicates) from a single collection event. Exsiccatal series are regularily published with small booklets containing the printed labels/ schedae of each numbered entity. The title of the series often shows the geographic and taxonomic focus of the series, e.g., "Delogne & Gravet, Hépat. Ardenne" and "Hertel, Lecideaceae Exs.". The persons editing the series are specialists, often recognized botanists and taxonomists. They are mostly not identical with the persons collecting and identifying the specimens distributed. Examples are E. M. Fries with "Fries, Herb. Norm. Pl. Suec.", G. L. Rabenhorst who published 24 series with more than 6,000 numbered entities and K. H. Rechinger with "Rechinger & Polunin, Exs. Herb. Baghdad". In the minority of cases the editors are anonymous persons and organisations devoted to plant exchange like "Société Dauphinoise pour l'échange des plantes". The more than 2.200 known series are widely distributed in public herbaria, either kept separately or integrated in the main collection. We estimate that more than 10 million specimens belong to such a series with printed labels. Approximately 70 series are running.
The oldest exsiccata might be that of Johann Balthasar Ehrhart (from 1732). It is followed by the better known exsiccatae edited by Jakob Friedrich Ehrhart, e.g., the series "Ehrhart, Pl. Crypt. Linn." starting in 1785. The two newest series started in 2020 and are bryophyte series from Taiwan and Vietnam.
The online database IndExs categorizes the series according to the group of organisms distributed and delivers editors, full title, standard abbreviation, editing institution, place of publication, range of (suggested) publication dates, range of numbered entities, exemplary images of printed labels as well as information sources and literature (Triebel et al. 2004). A stable and persistent exsiccata identifier, the so-called "IndExs Exsiccata ID", is given. This set of standardized information is available via the IndExs search interface and can be downloaded in several formats (csv, xls, xml). A machine-readable REST web service is under development. IndExs information and services with stable IndExs Exsiccata ID are used by data portals like the Macroalgal Herbarium Consortium Portal powered by Symbiota and the JACQ herbarium management system, and are integrated in collection management systems like DiversityCollection, a module of the Diversity Workbench (DWB) tool suite. It is envisaged to be included in future terminology services like the GFBio Terminology Service. IndExs is appropriate to build the curated reference list for exsiccatae in the frame of herbarium digitization approaches (Borsch et al. 2020).
IndExs stores information on the work of 1,300 editors of exsiccatae, active over a period of nearly 300 years. According to the data models of DiversityAgents (Weiss et al. 2016) and DiversityExsiccatae (Hagedorn et al. 2008), the information is managed in freely accessible interlinked instances of the SQL RDBMS applications DiversityAgents and DiversityExsiccatae. The applications are installed as part of the data network at the SNSB IT Center.
The Wikidata project started in 2012 and acts as central storage for the structured data of its Wikimedia sister projects (Anonymous 2022, Vrandečić and Krötzsch 2014). The content of Wikidata is available under a free license, exported using standard formats, and can be interlinked to other open data sets on the linked data web including life sciences (Anonymous 2022, Mitraka et al. 2015).
The study will use existing IndExs services for the 1,300 IndExs editors and 2,200 disambiguated exsiccata series and expand them for linked data / semantic web approaches. It will explore the usability of Wikidata:
for disambiguation of person names (=editors) in IndExs by adapting Wikidata Identifiers,
for adding information to existing Wikidata person Q-entities (items) via statements on persons' work,
for integrating IndExs information in Wikidata with URI for "IndExs Exsiccata ID" via a newly proposed P-entity (property) for this special kind of creative work in natural science,
for adding new Q-entities in Wikidata for IndExs editors.
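The name disambiguation step listed above could start from Wikidata's entity search. The following is a minimal sketch, not the study's actual workflow: it builds a request URL for the `wbsearchentities` action of the Wikidata API and filters a response by its description text. The helper names, the keyword filter and the sample response (including its placeholder IDs) are assumptions made for illustration; no network call is made.

```python
from urllib.parse import parse_qs, urlencode, urlparse

WIKIDATA_API = "https://www.wikidata.org/w/api.php"


def build_search_url(name, language="en", limit=5):
    """Build a wbsearchentities request URL for a person name."""
    params = {
        "action": "wbsearchentities",
        "search": name,
        "language": language,
        "type": "item",
        "limit": limit,
        "format": "json",
    }
    return f"{WIKIDATA_API}?{urlencode(params)}"


def pick_candidates(response, keyword):
    """Keep search hits whose description mentions the keyword (case-insensitive)."""
    return [
        (hit["id"], hit.get("label", ""))
        for hit in response.get("search", [])
        if keyword.lower() in hit.get("description", "").lower()
    ]


# Hand-written stand-in for an API response; the IDs are placeholders, not real Q-IDs.
sample_response = {
    "search": [
        {"id": "Q_EXAMPLE_1", "label": "Elias Magnus Fries",
         "description": "Swedish botanist and mycologist"},
        {"id": "Q_EXAMPLE_2", "label": "Fries", "description": "family name"},
    ]
}

url = build_search_url("Elias Magnus Fries")
candidates = pick_candidates(sample_response, "botanist")
```

A keyword filter like this only narrows the candidate list; the final match between an IndExs editor and a Wikidata item would still need to be confirmed by a curator.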
These editors of published booklet series with distributed physical material fulfill the Wikidata criteria of notability. They are often more or less well-known botanists and mycologists. Through their published work they may fulfill the criteria even better than certain persons categorized as botanical collectors who were assigned a Wikidata ID through activities of CETAF, DiSSCo and COST MOBILISE.
Systems to gain and document the floristic status (= occurrence status) are an important measure to describe the structure of a flora in space and time. There is a distinction between in situ systems and ex situ systems. The former rely on the status assignment for single observations in the field; the latter make summary classifications ("ex situ") for each taxon in relation to a major, often politically defined, geographic entity. In this paper we describe the progress on using status systems for the project "Flora of Bavaria".
Since 2013, the data collection Flora of Bavaria (BFL) has been managed in the database system Diversity Workbench (DWB). The data are dynamically provided by several portals and networks: the portal "BIB" with its regional user community as well as national and international scientific portals like GBIF, GFBio and PLADIAS. Six million records of occurrence data with status assignment are organized in the software DiversityCollection (DWB-DC). The vast majority of these data are categorized as so-called "Normalstatus", which means native in a wide sense. Several figures show the distribution of the status values according to the basis of the occurrence record, the time span of the record and the kind of relation to the geographical area.
A second kind of floristic status has been managed in the software DiversityTaxonNames (DWB-DTN) since 2015. It is assigned to the taxa organized as the taxonomic reference list of vascular plants in Bavaria ("Taxonomische Referenzliste der Gefäßpflanzen Bayerns – TaxRef") and is independent of the "in situ" status described above. The "ex situ" floristic status of a taxon is named its "Bayernstatus" and processed as a feature for each of the currently 6,999 accepted taxa on species level and below. Both types of status are compared.
The changing values of the "in situ" floristic status assigned to the taxa over time give a first hint of the change of the flora during the last 60 years. This estimation is confirmed by the "ex situ" values "Bayernstatus" as published in four versions of the "Taxonomische Referenzliste" within the last 15 years. It is recommended to calculate the in situ and ex situ proportion of neophytes. This value, in percent, is the number of "non-indigenous" taxa divided by the total number of taxa categorised as "non-indigenous" or "indigenous", multiplied by 100. Values from subsequent time slots give a hint of flora change.
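The proportion of neophytes described above can be written as a short calculation. The sketch below assumes only the formula given in the text; the function name, the time slot labels and the taxon counts are invented for illustration and are not figures from the Flora of Bavaria data.

```python
def neophyte_proportion(non_indigenous, indigenous):
    """Percentage of non-indigenous taxa among all taxa with either status."""
    total = non_indigenous + indigenous
    if total == 0:
        raise ValueError("no taxa with an indigenous or non-indigenous status")
    return 100.0 * non_indigenous / total


# Made-up counts per time slot: (non-indigenous taxa, indigenous taxa).
time_slots = {
    "slot A": (200, 1800),
    "slot B": (300, 1700),
}

proportions = {
    slot: neophyte_proportion(non_indigenous, indigenous)
    for slot, (non_indigenous, indigenous) in time_slots.items()
}
```

Comparing the values for consecutive time slots, computed separately for the in situ and the ex situ status, is what gives the hint of flora change mentioned in the text.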
The BiNHum project is a collective effort of five natural history museums and research collections of the Humboldt-Ring in Germany to collect, centralize and publish collection data in a unified web portal. The portal and the underlying data workflows provide an extensive set of tools to refine and enrich collection data from various sources. It provides a powerful, fast and intuitive search. As an additional benefit for current data providers and to encourage new providers to share data, we implemented a search widget as a lightweight version of the portal. The widget enables users to implement their own custom search within their homepage with only minimal effort. All BiNHum collection data, search functions, user annotations and data representations (including map view, multimedia files and data exports) of the portal are also available within the widget. The widget is maintained by the BiNHum project and new portal developments are automatically deployed to the widget installations. Using the search widget makes it easy for museums, institutes and curators to display their own data on their personal pages, prefiltered by their own criteria and adapted to local layout preferences.
BiNHum (http://wiki.binhum.net) is a project of five natural history museums and research collections representing the Humboldt-Ring: the State Museum of Natural History Karlsruhe (SMNK), the State Museum of Natural History Stuttgart (SMNS), the Zoological Research Museum Alexander Koenig in Bonn (ZFMK), the Bavarian Natural History Collections in Munich (SNSB), and the Botanic Garden and Botanical Museum Berlin-Dahlem (BGBM).
The three-year project is funded by the German Research Foundation and deals with:
-- data recovery and data mobilisation (led by SMNS, SMNK and the University of Ulm),
-- implementation of modern search technologies and development of the data portal (led by ZFMK),
-- data harvesting and data quality assurance (led by BGBM),
-- deployment of Diversity Workbench (DWB) as a virtual environment for BiNHum (led by SNSB).
The BiNHum data portal will provide rich metadata content and multimedia data types like 3D images and audio files: The digitized fossil leaves database MORPHYLL (at SMNS) with ecophysiologically-relevant morphometry as well as ZFMK's acoustic multimedia databases will be accessible. For these purposes, the BiNHum portal will offer search options for complex data features (such as sound frequencies or geometric shapes) in addition to autocorrection, autocomplete, faceting, “shopping cart” and other harvesting features. A first test version of the portal is already available (http://solr.binhum.net/solr/browse) using a scalable full-text search indexing server (SOLR 4.4.0) and a relational database (MySQL).
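The faceting feature mentioned above corresponds to standard Solr query parameters. The following sketch only builds a request URL with those parameters (`q`, `rows`, `wt`, `facet`, `facet.field`); it does not reproduce the BiNHum portal's actual configuration. The base URL, the core name `binhum` and the field names `family` and `country` are hypothetical.

```python
from urllib.parse import parse_qs, urlencode, urlparse


def build_facet_query(base_url, text, facet_fields, rows=10):
    """Build a Solr select URL with a full-text query and one facet per field."""
    params = [("q", text), ("rows", rows), ("wt", "json"), ("facet", "true")]
    # Solr accepts the facet.field parameter once per field to facet on.
    params += [("facet.field", field) for field in facet_fields]
    return f"{base_url}?{urlencode(params)}"


# Hypothetical endpoint and field names, for illustration only.
url = build_facet_query(
    "http://localhost:8983/solr/binhum/select",
    "Quercus",
    ["family", "country"],
)
```

The response to such a request would list, for each facet field, the distinct values and the number of matching records, which a portal can render as filter options.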
All partners use the Biological Collection Access Service (BioCASe) to provide their data to GBIF and to the BiNHum portal. For BiNHum the transformed data sources are harvested by GBIF’s Harvesting and Indexing Toolkit (HIT), which has been extended to fulfill project-specific purposes (e.g. identification history, organismic associations).
The database framework Diversity Workbench (DWB; http://www.diversityworkbench.net) consists of 13 independent relational database components (MS SQL), each with a rich client and devoted to a specific data domain. DWB is used as the core database management system by most of the partners. The applications are being expanded to support the transformation of existing data to current IT standards in the BiNHum context. Independent DWB installations for the BiNHum partners SMNK, SMNS, SNSB and ZFMK allow them to curate, quality-control and manage the original data before they are processed by BioCASe and the BiNHum harvesting portal. A new implementation is done for DiversityDescriptions, the DWB application which provides a triple store architecture for the management of descriptive and measurement data.
A majority of biodiversity research projects depend on field recording and ecology data. Therefore it is important to provide a seamless and transparent data flow from the field to the data storage systems and networks. Seamless means that data are available shortly after they are gathered; transparent means that the history of data operations can be traced back. DiversityMobile (with the complementing applications of the Diversity Workbench framework) is a GUI software that provides the option of gathering biological and ecological research data in a structured way by using mobile devices for data retrieval.
... Our previous work [31] has demonstrated that published exsiccata, a special kind of botanical publication and dissemination of information on plant diversity [32], have the utmost importance for the Hieracium nomenclature. The present contribution uncovers further aspects and complexities of this type of grey literature, which is still much under-explored and under-evaluated in plant taxonomy. ...
... However, the early literature of the 19th century as well as the information in collection catalogues and archives of that time are usually vague concerning the precise locality for collection of the fossil specimens. Many historical specimens registered as originating from Solnhofen actually derive from depocenters other than the Solnhofen Basin (e.g., Moser et al. 2017). This lack of accuracy regarding the provenance of numerous fossils, including many type specimens collected over more than 200 years, makes it difficult to reconstruct the composition of the individual faunas corresponding to the different basins within the Solnhofen Archipelago. ...