Markus Stocker

Leibniz Information Centre for Science and Technology University Library · Research and Development

PhD

About

101
Publications
15,015
Reads
1,222
Citations
Introduction
Head of the Knowledge Infrastructures research group at TIB Leibniz Information Centre for Science and Technology. Advancing the software systems of knowledge infrastructures in their ability to support people and institutions in creating, maintaining and sharing knowledge about human and natural worlds. Working primarily on data science and knowledge graphs in scientific knowledge infrastructures within the earth and environmental sciences. On Twitter @envinf.
Additional affiliations
September 2009 - present
University of Eastern Finland
Position
  • Researcher
November 2006 - March 2007
University of Zurich
Position
  • Research Assistant
Education
September 2009 - December 2015
University of Eastern Finland
Field of study
  • Environmental Informatics
September 2001 - November 2006
University of Zurich
Field of study
  • Informatics

Publications (101)
Article
A recurrent problem in applications that build on environmental sensor networks is that of sensor data organization and interpretation. Organization focuses on, for instance, resolving the syntactic and semantic heterogeneity of sensor data. The distinguishing factor between organization and interpretation is the abstraction from sensor data with i...
Article
Full-text available
We present a software system for automated projection of situational knowledge for disease outbreak in agriculture. The system supports farmers and agricultural advisers in obtaining and maintaining awareness of present and future disease outbreaks in crops grown at agricultural parcels. It models objects such as plant pathogens and agricultural pa...
Article
Information systems that build on sensor networks often process data produced by measurement of physical properties. This data can serve in the acquisition of knowledge for real-world situations of interest to information services and, ultimately, to people. Such systems face a common challenge, namely the considerable gap between the data produced...
Article
Full-text available
Over the past decades, sensor networks have been deployed around the world to monitor over time and space a large number of properties appertaining to various environmental phenomena. A popular example is the monitoring of particulate matter and gases in ambient air undertaken, for instance, to assess air quality and inform decision makers and the...
Preprint
Information Extraction (IE) tasks are commonly studied topics in various domains of research. Hence, the community continuously produces multiple techniques, solutions, and tools to perform such tasks. However, running those tools and integrating them within existing infrastructure requires time, expertise, and resources. One pertinent task here is...
Preprint
Despite improved digital access to scholarly literature in the last decades, the fundamental principles of scholarly communication remain unchanged and continue to be largely document-based. Scholarly knowledge remains locked in representations that are inadequate for machine processing. The Open Research Knowledge Graph (ORKG) is an infrastructure...
Preprint
As the number of published scholarly articles grows steadily each year, new methods are needed to organize scholarly knowledge so that it can be more efficiently discovered and used. Natural Language Processing (NLP) techniques are able to autonomously process scholarly articles at scale and to create machine readable representations of the article...
Preprint
Full-text available
Background: Recent years are seeing a growing impetus in the semantification of scholarly knowledge at the fine-grained level of scientific entities in knowledge graphs. The Open Research Knowledge Graph (ORKG) https://www.orkg.org/ represents an important step in this direction, with thousands of scholarly contributions as structured, fine-grained...
Preprint
Full-text available
Leveraging a GraphQL-based federated query service that integrates multiple scholarly communication infrastructures (specifically, DataCite, ORCID, ROR, OpenAIRE, Semantic Scholar, Wikidata and Altmetric), we develop a novel web widget based approach for the presentation of scholarly knowledge with rich contextual information. We implement the prop...
Preprint
Review articles are a means to structure state-of-the-art literature and to organize the growing number of scholarly publications. However, review articles are suffering from numerous limitations, weakening the impact the articles could potentially have. A key limitation is the inability of machines to access and process knowledge presented within...
Preprint
Scholarly Knowledge Graphs (KGs) provide a rich source of structured information representing knowledge encoded in scientific publications. With the sheer volume of published scientific literature comprising a plethora of inhomogeneous entities and relations to describe scientific concepts, these KGs are inherently incomplete. We present exBERT, a...
Article
Full-text available
Research infrastructures play an increasingly essential role in scientific research. They provide rich data sources for scientists, such as services and software packages, via catalog and virtual research environments. However, such research infrastructures are typically domain-specific and often not connected. Accordingly, researchers and practiti...
Chapter
Full-text available
Review articles are a means to structure state-of-the-art literature and to organize the growing number of scholarly publications. However, review articles are suffering from numerous limitations, weakening the impact the articles could potentially have. A key limitation is the inability of machines to access and process knowledge presented within...
Chapter
A plethora of scholarly knowledge is being published on distributed scholarly infrastructures. Querying a single infrastructure is no longer sufficient for researchers to satisfy information needs. We present a GraphQL-based federated query service for executing distributed queries on numerous, heterogeneous scholarly infrastructures (currently, OR...
Preprint
Full-text available
A plethora of scholarly knowledge is being published on distributed scholarly infrastructures. Querying a single infrastructure is no longer sufficient for researchers to satisfy information needs. We present a GraphQL-based federated query service for executing distributed queries on numerous, heterogeneous scholarly infrastructures (currently, OR...
Chapter
Review articles summarize state-of-the-art work and provide a means to organize the growing number of scholarly publications. However, the current review method and publication mechanisms hinder the impact review articles can potentially have. Among other limitations, reviews only provide a snapshot of the current literature and are generally not r...
Chapter
Scientists always look for the most accurate and relevant answers to their queries in the literature. Traditional scholarly digital libraries list documents in search results, and therefore are unable to provide precise answers to search queries. In other words, search in digital libraries is metadata search and, if available, full-text search. We...
Article
Full-text available
Current science communication has a number of drawbacks and bottlenecks which have been subject of discussion lately: Among others, the rising number of published articles makes it nearly impossible to get a full overview of the state of the art in a certain field, or reproducibility is hampered by fixed-length, document-based publications which no...
Preprint
Full-text available
Review articles summarize state-of-the-art work and provide a means to organize the growing number of scholarly publications. However, the current review method and publication mechanisms hinder the impact review articles can potentially have. Among other limitations, reviews only provide a snapshot of the current literature and are generally not r...
Preprint
Full-text available
Scientists always look for the most accurate and relevant answers to their queries in the literature. Traditional scholarly digital libraries list documents in search results, and therefore are unable to provide precise answers to search queries. In other words, search in digital libraries is metadata search and, if available, full-text search. We...
Preprint
Full-text available
Scientists always look for the most accurate and relevant answer to their queries on the scholarly literature. Traditional scholarly search systems list documents instead of providing direct answers to the search queries. As data in knowledge graphs are not acquainted semantically, they are not machine-readable. Therefore, a search on scholarly kno...
Article
Full-text available
Research infrastructures play an increasingly essential role in scientific research. They provide rich data sources for scientists, such as services and software packages, via catalog and virtual research environments. However, such research infrastructures are typically domain-specific and often not connected. Accordingly, researchers and practiti...
Article
Full-text available
This document is an edited version of the original funding proposal entitled 'ORKG: Facilitating the Transfer of Research Results with the Open Research Knowledge Graph' that was submitted to the European Research Council (ERC) Proof of Concept (PoC) Grant in September 2020 (https://erc.europa.eu/funding/proof-concept). The proposal was evaluated b...
Chapter
We propose Plumber, the first framework that brings together the research community’s disjoint information extraction (IE) efforts. The Plumber architecture comprises 33 reusable components for various Knowledge Graphs (KG) information extraction subtasks, such as coreference resolution, entity linking, and relation extraction. Using these componen...
Preprint
Full-text available
In the last decade, a large number of Knowledge Graph (KG) information extraction approaches were proposed. Albeit effective, these efforts are disjoint, and their collective strengths and weaknesses in effective KG information extraction (IE) have not been studied in the literature. We propose Plumber, the first framework that brings together the...
Preprint
Full-text available
Current science communication has a number of drawbacks and bottlenecks which have been subject of discussion lately: Among others, the rising number of published articles makes it nearly impossible to get a full overview of the state of the art in a certain field, or reproducibility is hampered by fixed-length, document-based publications which no...
Article
Full-text available
When researchers analyze data, it typically requires significant effort in data preparation to make the data analysis ready. This often involves cleaning, pre-processing, harmonizing, or integrating data from one or multiple sources and placing them into a computational environment in a form suitable for analysis. Research infrastructures and their...
Chapter
The growth and population of the Semantic Web, especially the Linked Open Data (LOD) Cloud, has brought to the fore the challenges of ordering knowledge for data mining on an unprecedented scale. The LOD Cloud is structured from billions of elements of knowledge and pointers to knowledge organization systems (KOSs) such as ontologies, taxonomies, t...
Preprint
Full-text available
Scientific articles are typically published as PDF documents, thus rendering the extraction and analysis of results a cumbersome, error-prone, and often manual effort. New initiatives, such as ORKG, focus on transforming the content and results of scientific articles into structured, machine-readable representations using Semantic Web technologies....
Preprint
Full-text available
Due to the lack of structure, scholarly knowledge remains hardly accessible for machines. Scholarly knowledge graphs have been proposed as a solution. Creating such a knowledge graph requires manual effort and domain experts, and is therefore time-consuming and cumbersome. In this work, we present a human-in-the-loop methodology used to build a sch...
Article
The transfer of knowledge has not changed fundamentally for many hundreds of years: It is usually document-based - formerly printed on paper as a classic essay and nowadays as PDF. With around 2.5 million new research contributions every year, researchers drown in a flood of pseudo-digitized PDF publications. As a result research is seriously weakene...
Chapter
Full-text available
Due to the lack of structure, scholarly knowledge remains hardly accessible for machines. Scholarly knowledge graphs have been proposed as a solution. Creating such a knowledge graph requires manual effort and domain experts, and is therefore time-consuming and cumbersome. In this work, we present a human-in-the-loop methodology used to build a sch...
Chapter
Scientific articles are typically published as PDF documents, thus rendering the extraction and analysis of results a cumbersome, error-prone, and often manual effort. New initiatives, such as ORKG, focus on transforming the content and results of scientific articles into structured, machine-readable representations using Semantic Web technologies....
Article
The transfer of knowledge has not changed fundamentally for many hundreds of years: It is usually document-based - formerly printed on paper as a classic essay and nowadays as PDF. With around 2.5 million new research contributions every year, researchers drown in a flood of pseudo-digitized PDF publications. As a result research is seriously weake...
Chapter
Open Science Graphs (OSGs) are Scientific Knowledge Graphs whose intent is to improve the overall FAIRness of science, by enabling open access to graph representations of metadata about people, artefacts, institutions involved in the research lifecycle, as well as the relationships between these entities, in order to support stakeholder needs, such...
Chapter
Current science communication has a number of drawbacks and bottlenecks which have been subject of discussion lately: Among others, the rising number of published articles makes it nearly impossible to get a full overview of the state of the art in a certain field, or reproducibility is hampered by fixed-length, document-based publications which no...
Chapter
Answering questions on scholarly knowledge comprising text and other artifacts is a vital part of any research life cycle. Querying scholarly knowledge and retrieving suitable answers is currently hardly possible due to the following primary reason: machine inactionable, ambiguous and unstructured content in publications. We present JarvisQA, a BER...
Presentation
Full-text available
With the volume of publications growing exponentially each year, it is becoming increasingly important to provide machine-based tools for finding contents relevant to researchers. This requires translating unstructured text into machine-actionable data, for instance in the form of graphs that contain information from ontologies and that comply with...
Chapter
Full-text available
Environmental research infrastructures aim to provide scientists with facilities, resources and services to enable scientists to effectively perform advanced research. When addressing societal challenges such as climate change and pollution, scientists usually need data, models and methods from different domains to tackle the complexity of the comp...
Chapter
Full-text available
Whenever a community of practice starts developing an IT solution for its use case(s) it has to face the issue of carefully selecting “the platform” to use. Such a platform should match the requirements and the overall settings resulting from the specific application context (including legacy technologies and solutions to be integrated and reused,...
Chapter
Full-text available
The ENVRI Reference Model provides architects and engineers with the means to describe the architecture and operational behaviour of environmental and Earth science research infrastructures (RIs) in a standardised way using the standard terminology. This terminology and the relationships between specific classes of concept can be used as the basis...
Chapter
Full-text available
The Open Research Knowledge Graph (ORKG) provides machine-actionable access to scholarly literature that habitually is written in prose. Following the FAIR principles, the ORKG makes traditional, human-coded knowledge findable, accessible, interoperable, and reusable in a structured manner in accordance with the Linked Open Data paradigm. At the mo...
Preprint
Full-text available
The Open Research Knowledge Graph (ORKG) provides machine-actionable access to scholarly literature that habitually is written in prose. Following the FAIR principles, the ORKG makes traditional, human-coded knowledge findable, accessible, interoperable, and reusable in a structured manner in accordance with the Linked Open Data paradigm. At the mo...
Preprint
Full-text available
Reviewing scientific literature is a cumbersome, time consuming but crucial activity in research. Leveraging a scholarly knowledge graph, we present a methodology and a system for comparing scholarly literature, in particular research contributions describing the addressed problem, utilized materials, employed methods and yielded results. The syste...
Preprint
Full-text available
Answering questions on scholarly knowledge comprising text and other artifacts is a vital part of any research life cycle. Querying scholarly knowledge and retrieving suitable answers is currently hardly possible due to the following primary reason: machine inactionable, ambiguous and unstructured content in publications. We present JarvisQA, a BER...
Preprint
Full-text available
Current science communication has a number of drawbacks and bottlenecks which have been subject of discussion lately: Among others, the rising number of published articles makes it nearly impossible to get an overview of the state of the art in a certain field, or reproducibility is hampered by fixed-length, document-based publications which normal...
Article
Full-text available
Instruments play an essential role in creating research data. Given the importance of instruments and associated metadata to the assessment of data quality and data reuse, globally unique, persistent and resolvable identification of instruments is crucial. The Research Data Alliance Working Group Persistent Identification of Instruments (PIDINST) d...
Preprint
Full-text available
Instruments play an essential role in creating research data. Given the importance of instruments and associated metadata to the assessment of data quality and data reuse, globally unique, persistent and resolvable identification of instruments is crucial. The Research Data Alliance Working Group Persistent Identification of Instruments (PIDINST) d...
Article
Full-text available
The FAIR principles articulate the behaviors expected from digital artifacts that are Findable, Accessible, Interoperable and Reusable by machines and by people. Although by now widely accepted, the FAIR Principles by design do not explicitly consider actual implementation choices enabling FAIR behaviors. As different communities have their own, of...
Conference Paper
Full-text available
Despite improved digital access to scholarly knowledge in recent decades, scholarly communication remains exclusively document-based. In this form, scholarly knowledge is hard to process automatically. We present the first steps towards a knowledge graph based infrastructure that acquires scholarly knowledge in machine actionable form thus enabling...
Chapter
Despite improved digital access to scholarly literature in the last decades, the fundamental principles of scholarly communication remain unchanged and continue to be largely document-based. Scholarly knowledge remains locked in representations that are inadequate for machine processing. The Open Research Knowledge Graph (ORKG) is an infrastructure...
Article
Full-text available
We would like to present FAIR Research Data: Semantic Knowledge Graph Infrastructure for the Life Sciences (in short, FAIR.ReD), a project initiative that is currently being evaluated for funding. FAIR.ReD is a software environment for developing data management solutions according to the FAIR (Findable, Accessible, Interoperable, Reusable; Wilkin...
Preprint
Full-text available
Despite improved digital access to scientific publications in the last decades, the fundamental principles of scholarly communication remain unchanged and continue to be largely document-based. The document-oriented workflows in science publication have reached the limits of adequacy as highlighted by recent discussions on the increasing proliferat...
Chapter
Scientific information communicated in scholarly literature remains largely inaccessible to machines. The global scientific knowledge base is little more than a collection of (digital) documents. The main reason is in the fact that the document is the principal form of communication and—since underlying data, software and other materials mostly rem...
Conference Paper
Full-text available
Scientific information communicated in scholarly literature remains largely inaccessible to machines. The global scientific knowledge base is little more than a collection of (digital) documents. The main reason is in the fact that the document is the principal form of communication and - since underlying data, software and other materials mostly r...