
Birgitta König-Ries
- Prof. Dr.
- Friedrich Schiller University Jena
About
272
Publications
47,835
Reads
3,589
Citations
Current institution
Additional affiliations
January 2008 - December 2011
January 2003 - present
Publications (272)
This paper presents outcomes from the inaugural "EcoHack: AI & LLM Hackathon for Applications in Evidence-based Ecological Research & Practice," which convened participants from across Europe and beyond, culminating in 11 team submissions. These submissions highlighted six broad application areas of AI for ecology: (1) AI-enhanced decision support...
Search systems are a crucial means for users to access information. However, considering only the top search results naturally comes with biases that increase even further with personalization, biased databases, or intransparent retrieval systems. We believe that it is essential that users can a) easily understand the characteristics of their searc...
Artificial intelligence (AI) is revolutionizing biodiversity research by enabling advanced data analysis, species identification, and habitat monitoring, thereby enhancing conservation efforts. Ensuring reproducibility in AI-driven biodiversity research is crucial for fostering transparency, verifying results, and promoting the credibility of ecol...
We document the creation of two datasets consisting of scientific publications in the interdisciplinary field of water research. Additional labels regarding the research domain and particular attributes allow for measuring the topical diversity and the attributes' representation in data subsets.
While deep learning has significantly advanced automatic plant disease detection through image-based classification, improving model explainability remains crucial for reliable disease detection. In this study, we apply the Automated Concept-based Explanation (ACE) method to plant disease classification using the widely adopted InceptionV3 model an...
We apply NER to a particular sub-genre of legal texts in German: the genre of legal norms regulating administrative processes in public service administration. The analysis of such texts involves identifying stretches of text that instantiate one of ten classes identified by public service administration professionals. We investigate and compare th...
Deep Learning (DL) techniques are increasingly applied in scientific studies across various domains to address complex research questions. However, the methodological details of these DL models are often hidden in the unstructured text. As a result, critical information about how these models are designed, trained, and evaluated is challenging to a...
Recently, there has been a growing interest in Multimodal Large Language Models (MLLMs) due to their remarkable potential in various tasks integrating different modalities, such as image and text, as well as applications such as image captioning and visual question answering. However, such models still face challenges in accurately captioning and i...
Recently, Large Language Models (LLMs) have transformed information retrieval, becoming widely adopted across various domains due to their ability to process extensive textual data and generate diverse insights. Biodiversity literature, with its broad range of topics, is no exception to this trend (Boyko et al. 2023, Castro et al. 2024). LLMs can h...
Hypotheses are critical components of scientific argumentation. Knowing established hypotheses is often a prerequisite for following and contributing to scientific arguments in a research field. In scientific publications, hypotheses are usually presented for specific empirical settings, whereas the related general claim is assumed to be known. Pre...
Research has become increasingly reliant on extensive data. The integration, sharing and reuse of research data poses a significant challenge, particularly in the context of interdisciplinary collaborative projects. An essential objective for a research infrastructure dedicated to data management is to facilitate efficient data discovery and integr...
On March 11 and 12, 2024, the Spring Symposium of the GI Fachgruppe Datenbanken took place in Jena. The overarching theme of the meeting was research data management beyond isolated repositories, and it explicitly aimed to foster connections between the FG Datenbanken and the NFDI community. Talks and posters focused on incentives and hurdles for F...
Scientific workflows facilitate the automation of data analysis tasks by integrating various software and tools executed in a particular order. To enable transparency and reusability in workflows, it is essential to implement the FAIR principles. Here, we describe our experiences implementing the FAIR principles for metabolomics workflows using the...
Knowledge Graphs (KGs) present factual information about domains of interest. They are used in a wide variety of applications and in different domains, serving as powerful backbones for organizing and extracting knowledge from complex data. In both industry and academia, a variety of platforms have been proposed for managing Knowledge Graphs. To us...
Ontologies are the prime way of organizing data in the Semantic Web. Often, it is necessary to combine several, independently developed ontologies to obtain a complete representation of a domain of interest. The complementarity of existing ontologies can be leveraged by merging them. Existing approaches for ontology merging mostly implement a binar...
With the exponential increase in scientific publications, new conceptual and technological tools are needed to help scientists, students, managers and policy-makers to navigate and digest current scientific knowledge. Hi Knowledge is an initiative to synthesise and visualise scientific knowledge, with an initial focus on invasion biology that is cu...
In the midst of a looming global biodiversity crisis, approaches to rapidly collect, curate, catalog, and integrate biodiversity data at global scales are more important than ever before. Historically, data collection and reuse have been linked to local access to funding for scientific research and infrastructure, generating blind spots in the dis...
In recent years, deep learning methods in the biodiversity domain have gained significant attention due to their ability to handle the complexity of biological data and to make processing of large volumes of data feasible. However, these methods are not easy to interpret, so the opacity of new scientific research and discoveries makes them somewhat...
The number of openly accessible digital plant specimen images is growing tremendously, and these images are available through data aggregators: the Global Biodiversity Information Facility (GBIF) contains 43.2 million images, and Integrated Digitized Biocollections (iDigBio) contains 32.4 million images (accessed on 29.06.2023). All these images contain great ecologica...
Scientific workflows facilitate the automation of different data analysis tasks by integrating various software and tools executed in a particular order. To enable transparency, accessibility, and reusability in workflows, it is essential to implement the 17 FAIR principles as much as possible. To do so, the research data management community has s...
Dataset discovery is a frequent task in daily research practice, yet studies are missing that explore the usability of user interfaces (UI) in data portals. In particular, very few user studies exist that analyze whether particular elements in the user interface are useful for search tasks. We aim to address those needs for more specific usability...
https://ceur-ws.org/Vol-3415/paper-7.pdf
Computing the semantic similarity between pairs of terms plays a vital role within a myriad of shared data applications, such as data integration and ontology evolution. A first step towards building such applications is to determine which terms are semantically similar to each other. One feasible way to compute the similarity of two terms is to as...
Biodiversity is the assortment of life on earth covering evolutionary, ecological, biological, and social forms. To preserve life in all its variety and richness, it is imperative to monitor the current state of biodiversity and its change over time and to understand the forces driving it. This need has resulted in numerous works being published in...
Macro- and microscopic images of organisms are pivotal in biodiversity research. Although bioimages have manifold applications such as assessing the diversity of form and function, FAIR bioimaging data in the context of biodiversity are still very scarce, especially for difficult taxonomic groups such as bryophytes. Here, we present a high-qual...
Our goal is to mobilize global species abundance and assemblage information, via a dedicated and openly accessible data repository and web portal. Preservation of raw observational data is a complement to the modelled geographic projections that are the focus of related projects such as the EBV Data Portal, which provides access to EBV (Essential B...
Nowadays, more and more biodiversity datasets containing observational and experimental data are collected and produced by different projects. In order to answer the fundamental questions of biodiversity research, these data need to be integrated for joint analyses. However, to date, too often, these data remain isolated in silos.
Both in academia...
Digital data have become an indispensable basis for biodiversity research. Sustainable curation, archiving, accessibility and integrability according to the FAIR principles ("Findable, Accessible, Interoperable and Reusable", Wilkinson et al. 2016) are essential for re-use to answer pressing questions in a rapidly changing environment.
As part of t...
The increasing volumes of data produced by high-throughput instruments coupled with advanced computational infrastructures for scientific computing have enabled what is often called a Fourth Paradigm for scientific research based on the exploration of large datasets. Current scientific research is often interdisciplinary, making data integrat...
Scientific data management plays a key role in the reproducibility of scientific results. To reproduce results, not only the results but also the data and steps of scientific experiments must be made findable, accessible, interoperable, and reusable. Tracking, managing, describing, and visualizing provenance helps in the understandability, reproduc...
Developing a precise argument is not an easy task. In real-world argumentation scenarios, arguments presented in texts (e.g. scientific publications) often constitute the end result of a long and tedious process. A lot of work on computational argumentation has focused on analyzing and aggregating these products of argumentation processes, i.e. arg...
Background
The advancement of science and technology plays an immense role in the way scientific experiments are being conducted. Understanding how experiments are performed and how results are derived has become significantly more complex with the recent explosive growth of heterogeneous research data and methods. Therefore, it is important that...
Obtaining fit-to-use data associated with diverse aspects of biodiversity, ecology and environment is challenging since often it is fragmented, sub-optimally managed and available in heterogeneous formats. Recently, with the universal acceptance of the FAIR data principles, the requirements and standards of data publications have changed substantia...
http://ceur-ws.org/Vol-2969/paper4-s4biodiv.pdf
Dataset search is receiving increasing attention in a scholar’s daily research practice. In biodiversity research, dataset retrieval in particular is a challenging and time-consuming task as most search services in current data portals only offer a simple keyword-based search. In this work we intro...
Reproducibility is one of the fundamental characteristics of science. To reproduce scientific results, scientists need to manage and describe the provenance of end-to-end experimental pipelines. To understand, query, and reason about how the results are derived, the provenance of the entire study needs to be described in an interoperable manner. Ontolog...
Machine learning (ML) is an increasingly important scientific tool supporting decision making and knowledge generation in numerous fields. With this, it also becomes more and more important that the results of ML experiments are reproducible. Unfortunately, that often is not the case. Rather, ML, similar to many other disciplines, faces a reproduci...
Computational notebooks have gained widespread adoption among researchers from academia and industry as they support reproducible science. These notebooks allow users to combine code, text, and visualizations for easy sharing of experiments and results. They are widely shared in GitHub, which currently has more than 100 million repositories, making...
One of the added values of long running and large scale collaborative projects is the ability to answer complex research questions based on the comprehensive set of data provided by their central repositories. In practice, however, finding data in such a repository to answer a specific question often proves to be a demanding task even for project s...
Biodiversity is the variety of life on earth which covers the evolutionary, ecological, and cultural processes that sustain life. Therefore, it is important to understand where biodiversity is, how it is changing over space and time, the driving factors of these changes and the resulting consequences on the diversity of life. To do so, it is necess...
Searching for scientific datasets is a prominent task in scholars' daily research practice. A variety of data publishers, archives and data portals offer search applications that allow the discovery of datasets. The evaluation of such dataset retrieval systems requires proper test collections, including questions that reflect real world information...
Earthworms are an important soil taxon as ecosystem engineers, providing a variety of crucial ecosystem functions and services. Little is known about their diversity and distribution at large spatial scales, despite the availability of considerable amounts of local-scale data. Earthworm diversity data, obtained from the primary literature or provid...
Scientific experiments and research practices vary across disciplines. The research practices followed by scientists in each domain play an essential role in the understandability and reproducibility of results. The “Reproducibility Crisis”, where researchers find difficulty in reproducing published results, is currently faced by several discipline...
The increasing amount of publicly available research data provides the opportunity to link and integrate data in order to create and prove novel hypotheses, to repeat experiments or to compare recent data to data collected at a different time or place. However, recent studies have shown that retrieving relevant data for data reuse is a time-consumi...
https://doi.org/10.1371/journal.pone.0246099
Soil is one of the most biodiverse terrestrial habitats. Yet, we lack an integrative conceptual framework for understanding the patterns and mechanisms driving soil biodiversity. One of the underlying reasons for our poor understanding of soil biodiversity patterns relates to whether key biodiversity theories (historically developed for aboveground...
Merging ontologies enables the reusability and interoperability of existing knowledge. With growing numbers of relevant ontologies in any given domain, there is a strong need for an automatic, scalable multi-ontology merging tool. We introduce CoMerger, which covers four key aspects of the ontology merging field: compa...
With the growing popularity of semantics-aware integration solutions, various ontology merging approaches have been proposed. Determining the success of these developments heavily depends on suitable evaluation criteria. However, no comprehensive set of evaluation criteria on the merged ontology exists so far. We develop criteria to evaluate the me...
Merging ontologies is the standard way to achieve interoperability of heterogeneous systems in the Semantic Web. Because of the possibility of different modeling, OWL restrictions from one ontology may not necessarily be compatible with those from other ontologies. Thus, the merged ontology can suffer from restriction conflicts. This problem so far...
With a rapidly growing body of knowledge, it becomes more and more difficult to keep track of the state of the art in a research field. A formal representation of the hypotheses in the field, their relations, the studies that support or question them based on which evidence, would greatly ease this task and help direct future research efforts. We p...
Personalized applications are a two-edged sword. They are convenient and assist users by keeping the focus on relevant topics, but they are often black boxes and users typically do not know why certain entries appear in their profile. As transparency and provenance are essential for researchers, in this paper, we introduce ScholarLensViz, a visualiz...
In this article, we identify possibilities and limits of processing as yet unused data sources for spatio-temporal biodiversity trend analyses in Germany. The sMon synthesis project (https://www.idiv.de/smon) of the German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig is a joint working group of federal and state authoritie...
Despite conservation commitments, most countries still lack large-scale biodiversity monitoring programs to track progress toward agreed targets. Monitoring program design is frequently approached from a top-down, data-centric perspective that ignores the socio-cultural context of data collection. A rich landscape of people and organizations, with...
Ontology merging systems enable the reusability and interoperability of existing knowledge. Ideally, they allow their users to specify which characteristics the merged ontology should have. In prior work, we have identified Generic Merge Requirements (GMRs) reflecting such characteristics. However, not all of them can be met simultaneously. Thus, i...
Ontologies are the backbone of the Semantic Web. As a result, the number of existing ontologies and the number of topics covered by them has increased considerably. With this, reusing these ontologies becomes preferable to constructing new ontologies from scratch. However, a user might be interested in a part and/or a set of parts of a given ontolo...
Dataset Retrieval is gaining importance due to a large amount of research data and the great demand for reusing scientific data. Dataset Retrieval is mostly based on metadata, structured information about the primary data. Enriching these metadata with semantic annotations based on Linked Open Data (LOD) enables datasets, publications and authors to...
Data is central in almost all scientific disciplines nowadays. Furthermore, intelligent systems have developed rapidly in recent years, so that in many disciplines the expectation is emerging that with the help of intelligent systems, significant challenges can be overcome and science can be done in completely new ways. In order for this to succeed...
Ontologies are the prime way of organizing data in the Semantic Web. Often, it is necessary to combine several, independently developed ontologies to obtain a knowledge graph fully representing a domain of interest. The complementarity of existing ontologies can be leveraged by merging them. Existing approaches for ontology merging mostly implement...
The increasing amount of research data provides the opportunity to link and integrate data to create novel hypotheses, to repeat experiments or to compare recent data to data collected at a different time or place. However, recent studies have shown that retrieving relevant data for data reuse is a time-consuming task in daily research practice. In...
Soil organisms, including earthworms, are a key component of terrestrial ecosystems. However, little is known about their diversity, their distribution, and the threats affecting them. We compiled a global dataset of sampled earthworm communities from 6928 sites in 57 countries as a basis for predicting patterns in earthworm diversity, abundance, a...
Ontologies reflect their creators’ view of the domain at hand and are thus subjective. For specific applications it may be necessary to combine several of these ontologies into a more comprehensive domain model by merging them. However, due to the subjective nature of the source ontologies, this can result in inconsistencies. Handling these inconsi...
Trait‐based approaches are widespread throughout ecological research as they offer great potential to achieve a general understanding of a wide range of ecological and evolutionary mechanisms. Accordingly, a wealth of trait data is available for many organism groups, but this data is underexploited due to a lack of standardization and heterogeneity...
Trait-based research spans from evolutionary studies of individual-level properties to global patterns of biodiversity and ecosystem functioning. An increasing amount of trait data is available for many different organism groups, published as open access data on a variety of file hosting services. Thus, standardization between datasets is generally...
Soil organisms provide crucial ecosystem services that support human life. However, little is known about their diversity, distribution, and the threats affecting them. Here, we compiled a global dataset of sampled earthworm communities from over 7000 sites in 56 countries to predict patterns in earthworm diversity, abundance, and biomass. We ident...
Semantic annotations of datasets are very useful to support quality assurance, discovery, interpretability, linking and integration of datasets. However, providing such annotations manually is often a time-consuming task. If the process is to be at least partially automated and still provide good semantic annotations, precise information extractio...
The study of biodiversity has grown exponentially in the last thirty years in response to demands for greater understanding of the function and importance of Earth's biodiversity and finding solutions to conserve it. Here, we test the hypothesis that biodiversity science has become more interdisciplinary over time. To do so, we analyze 97,945 peer‐...
Data discovery is a frequent task in a scholar's daily work. In biodiversity, data search is a particular challenge. Here, scholars have complex information needs such as the rich interplay of organisms and their environments that cannot be unambiguously expressed with a traditional keyword search, e.g., Does tree diversity reduce competition in a...
Concern about the functional consequences of unprecedented loss in biodiversity has prompted biodiversity-ecosystem functioning (BEF) research to become one of the most active fields of ecological research in the past 25 years. Hundreds of experiments have manipulated biodiversity as an independent variable and found compelling support that the fun...
Visualizations are an important tool to transport information. However, finding the right visualization can be challenging. Using the biodiversity research domain as a showcase, we investigate where exactly these challenges are and what a tool should look like that helps scientists overcome them. Our results are based on a survey we performed.
1. Trait-based approaches are widespread throughout ecological research, offering great potential for trait data to deliver general and mechanistic conclusions. Accordingly, a wealth of trait data is available for many organism groups, but, due to a lack of standardisation, these data come in heterogeneous formats.
2. We review current initiatives...
In an age where science is often interdisciplinary, it is frequently necessary to combine scientific data from different (sub-)disciplines and thus from different sources. Ontologies can play an important role in this integration process. However, existing ontologies will either cover just a part of the domain of interest or competing ontologies mo...
We introduce ADOnIS, an information system which coherently integrates two important, yet mostly disparate data sources, namely structured, tabular data, and unstructured data in terms of publications. The integration is achieved by providing the underlying background knowledge of the domains involved in terms of adequately tailored ontologies. Onc...
Semantic similarity plays a vital role within a myriad of shared data applications, such as data and information integration. A first step towards building such applications is to determine concepts, which are semantically similar to each other. One way to compute this similarity of two concepts is to assess their word similarity by exploiting diff...