Harald Sack

FIZ Karlsruhe - Leibniz Institute for Information Infrastructure | FIZ · Information Service Engineering

Prof. Dr.

About

309
Publications
38,534
Reads
2,020
Citations
Introduction
Hi, I'm Professor of Information Service Engineering and Vice President of FIZ Karlsruhe - Leibniz Institute for Information Infrastructure, as well as Professor at the Institute for Applied Informatics and Formal Description Methods (AIFB) at Karlsruhe Institute of Technology (KIT). Our research focuses on the intersection of symbolic and subsymbolic knowledge representation, especially Semantic Technologies, Knowledge Graphs, Knowledge Mining, Natural Language Processing, and Semantic Information Services.
Additional affiliations
October 2016 - present
Karlsruhe Institute of Technology
Position
  • Professor (Full)
Description
  • Professor of Information Service Engineering
October 2016 - April 2021
FIZ Karlsruhe - Leibniz Institute for Information Infrastructure
Position
  • CEO
Description
  • Professor and Vice-President Information Service Engineering
October 2009 - September 2016
Hasso Plattner Institute
Position
  • Senior Researcher
Description
  • Master Courses @ HPI: Semantic Web Technologies; OpenHPI: Knowledge Engineering with Semantic Web Technologies

Publications

Article
Full-text available
Keyword-based search, in general, is particularly applicable if the searcher really knows what she is looking for and how to find it. But in many cases either the objectives of the searcher are intrinsically fuzzy or she has no idea of the appropriate keywords. One way to solve this problem is to navigate and explore the search space along a...
Article
Full-text available
Purpose – Linking Open Data (LOD) provides a vast amount of well structured semantic information, but many inconsistencies may occur, especially if the data are generated with the help of automated methods. Data cleansing approaches enable detection of inconsistencies and overhauling of affected data sets, but they are difficult to apply automatica...
Conference Paper
Full-text available
The need to bridge between the unstructured data on the Document Web and the structured data on the Web of Data has led to the development of a considerable number of annotation tools. However, these tools are currently still hard to compare since the published evaluation results are calculated on diverse datasets and evaluated based on different m...
Article
Full-text available
Incorrect or outdated data is a common problem when working with Linked Data in real world applications. Linked Data is distributed over the web and under control of various dataset publishers. It is difficult for data publishers to ensure the quality and timeliness of the data all by themselves, though they might receive individual complaints by d...
Conference Paper
Full-text available
Without search engines, the information content of the World Wide Web would remain largely closed to the ordinary user. Current web search engines work well as long as the user knows what she is looking for. The situation becomes problematic if the user has insufficient expertise or prior knowledge to formulate the search query. Often a sequence o...
Article
Full-text available
This study tackles a significant challenge in ontology development for materials science: selecting the most appropriate upper‐level ontologies for creating application‐level ontologies and knowledge graphs. Focusing on the use case of Brinell hardness testing, the research assesses the performance of various top‐level ontologies (TLOs)—basic forma...
Preprint
Full-text available
This paper presents NFDIcore 2.0, an ontology compliant with the Basic Formal Ontology (BFO) designed to represent the diverse research communities of the National Research Data Infrastructure (NFDI) in Germany. NFDIcore ensures the interoperability across various research disciplines, thereby facilitating cross-domain research. Each domain's indiv...
Preprint
Full-text available
The NFDI4DataScience (NFDI4DS) project aims to enhance the accessibility and interoperability of research data within Data Science (DS) and Artificial Intelligence (AI) by connecting digital artifacts and ensuring they adhere to FAIR (Findable, Accessible, Interoperable, and Reusable) principles. To this end, this poster introduces the NFDI4DS Onto...
Preprint
Full-text available
Ontologies are widely used in materials science to describe experiments, processes, material properties, and experimental and computational workflows. Numerous online platforms are available for accessing and sharing ontologies in Materials Science and Engineering (MSE). Additionally, several surveys of these ontologies have been conducted. However...
Chapter
Full-text available
Preserving historical city architectures and making them (publicly) available has emerged as an important field of the cultural heritage and digital humanities research domain. In this context, the TRANSRAZ project is creating an interactive 3D environment of the historical city of Nuremberg which spans over different periods of time. Next to the e...
Article
Full-text available
Based on our experience within the NFDI4Culture and NFDI-MatWerk projects we propose generalized knowledge graph based research data management solutions, which are applicable to other consortia. Our solution covers the construction of a common NFDI core ontology adapted to specific domains via domain extensions as a basis for a knowledge graph (KG...
Article
Full-text available
The local prediction of fatigue damage within polycrystals in a high-cycle fatigue setting is a long-lasting and challenging task. It requires identifying grains tending to accumulate plastic deformation under cyclic loading. We address this task by transcribing ferritic steel microtexture and damage maps from experiments into a microstructure grap...
Preprint
Full-text available
Terminology sources, such as controlled vocabularies, thesauri and classification systems, play a key role in digitizing cultural heritage. However, Information Retrieval (IR) systems that allow users to query and explore these lexical resources often lack an adequate representation of the semantics behind the user's search, which can be conveyed through...
Preprint
Full-text available
The local prediction of fatigue damage within polycrystals in a high-cycle fatigue setting is a long-lasting and challenging task. It requires identifying grains tending to accumulate plastic deformation under cyclic loading. We address this task by transcribing ferritic steel microtexture and damage maps from experiments into a microstructure grap...
Preprint
Full-text available
The local prediction of fatigue damage within polycrystals in a high-cycle fatigue setting is a long-lasting and challenging task. It requires identifying grains tending to accumulate plastic deformation under cyclic loading. We address this task by transcribing ferritic steel microtexture and damage maps from experiments into a microstructure grap...
Article
Recent rises in political polarization across the globe are often ascribed to algorithmic content filtering on social media, news platforms, or search engines. The widespread usage of news recommendation systems (NRS) is theorized to drive users into homogeneous information environments and, thereby, drive affective, ideological, and perceived polariz...
Conference Paper
Full-text available
The ever-increasing amount of research output through scientific articles requires means to enable transparency and a better understanding of key entities of the research lifecycle, referred to as research artifacts, such as methods, software, datasets, etc. Research Knowledge Graphs (RKG) make research artifacts findable, accessible, interoperable...
Preprint
Full-text available
Due to the open world assumption, Knowledge Graphs (KGs) are never complete. In order to address this issue, various Link Prediction (LP) methods have been proposed so far. Some of these methods are inductive LP models which are capable of learning representations for entities not seen during training. However, to the best of our knowledge, none of the e...
Article
Full-text available
Knowledge Graphs (KGs) comprise interlinked information in the form of entities and relations between them in a particular domain and provide the backbone for many applications. However, KGs are often incomplete as links between the entities are missing. Link Prediction is the task of predicting these missing links in a KG based on the e...
Chapter
The entity type information in Knowledge Graphs (KGs) such as DBpedia, Freebase, etc. is often incomplete due to automated generation or human curation. Entity typing is the task of assigning or inferring the semantic type of an entity in a KG. This paper presents GRAND, a novel approach for entity typing leveraging different graph walk strategies...
Article
Full-text available
Among other ways of expressing opinions on media such as blogs and forums, social media (such as Twitter) has become one of the most widely used channels by populations for expressing their opinions. With an increasing interest in the topic of migration in Europe, it is important to process and analyze these opinions. To this end, this study aims...
Chapter
Cultural heritage portals often contain intangible objects digitized as audio files. This paper presents and discusses the adaptation of existing audio ontologies intended for non-cultural heritage applications. The resulting alignment of the German Digital Library-Europeana Data Model (DDB-EDM) with Music Ontology (MO) and Audio Commons Ontology (...
Article
Full-text available
Scholarly data is growing continuously, containing information about the articles from a plethora of venues including conferences, journals, etc. Many initiatives have been taken to make scholarly data available in the form of Knowledge Graphs (KGs). These efforts to standardize these data and make them accessible have also led to many challenges su...
Chapter
The entity type information in Knowledge Graphs (KGs) of different languages plays an important role in a wide range of Natural Language Processing applications. However, the entity types in KGs are often incomplete. Multilingual entity typing is a non-trivial task if enough information is not available for the entities in a KG. In this work, multi...
Preprint
Full-text available
Scholarly data is growing continuously, containing information about the articles from a plethora of venues including conferences, journals, etc. Many initiatives have been taken to make scholarly data available in the form of Knowledge Graphs (KGs). These efforts to standardize these data and make them accessible have also led to many challenges such...
Conference Paper
Full-text available
Frequently, Text Classification is limited by insufficient training data. This problem is addressed by Zero-Shot Classification through the inclusion of external class definitions and then exploiting the relations between classes seen during training and unseen classes (Zero-shot). However, it requires a class embedding space capable of accurately...
Conference Paper
Full-text available
The entity type information in Knowledge Graphs (KGs) such as DBpedia, Freebase, etc. is often incomplete due to automated generation. Entity Typing is the task of assigning or inferring the semantic type of an entity in a KG. This paper introduces an approach named Cat2Type which exploits the Wikipedia Categories to predict the missing entity type...
Chapter
In order to transform a Knowledge Graph (KG) into a low dimensional vector space, it is beneficial to preserve as much semantics as possible from the different components of the KG. Hence, some link prediction approaches have been proposed so far which leverage literals in addition to the commonly used links between entities. However, the procedure...
Conference Paper
Full-text available
Cultural heritage portals have the goal of providing users with seamless access to all their resources. This paper introduces initial efforts for a user-oriented restructuring of the German Digital Library (DDB). At present, cultural heritage objects (CHOs) in the DDB are modeled using an extended version of the Europeana Data Model (DDB-EDM), whic...
Conference Paper
Full-text available
Under the German government's initiative "NEUSTART Kultur", the German Digital Library or Deutsche Digitale Bibliothek (DDB) is undergoing improvements to enhance user-experience. As an initial step, emphasis is placed on creating a knowledge graph from the bibliographic record collection of the DDB. This paper discusses the challenges facing the D...
Conference Paper
Full-text available
Archival records are essential sources of information for historians and digital humanists to understand history. For modern information systems they are often analysed and integrated into Knowledge Graphs for better access, interoperability and re-use. However, due to restrictions of the representation of RDF predicates, temporal data within archiv...
Conference Paper
Full-text available
An increasing number of archival institutions aim to provide public access to historical documents. Ontologies have been designed, developed and utilised to model the archival description of historical documents and to enable interoperability between different information sources. However, due to the heterogeneous nature of archives and archival sy...
Poster
Full-text available
The necessity and potential of systematic archiving and publication of digital research data is currently a hot topic in the scientific landscape, due to various benefits such as ensuring reproducibility of research results and providing the basis for data-driven science. This requires measures to ensure data quality and particularly a documentatio...
Preprint
Full-text available
With the increasing trend in the topic of migration in Europe, the public is now more engaged in expressing their opinions through various platforms such as Twitter. Understanding the online discourses is therefore essential to capture the public opinion. The goal of this study is the analysis of social media platforms to quantify public attitudes t...
Preprint
Full-text available
Knowledge Graphs (KGs) have become the backbone of various machine learning-based applications over the past decade. However, the KGs are often incomplete and inconsistent. Several representation learning-based approaches have been introduced to complete the missing information in KGs. Besides, Neural Language Models (NLMs) have gained huge momentu...
Conference Paper
Full-text available
Materials are either an enabler or a bottleneck for the vast majority of technological innovations. The digitization of materials and processes is mandatory to create live production environments which represent physical entities and their aggregations and thus make it possible to represent, share, and understand materials changes. However, a common standard forma...
Chapter
Research on European history across various time layers gives insights into the development of the European cultural identity. Nuremberg, as one of the great European metropolises during the Middle Ages, experienced a number of transformations throughout the centuries. Within the TRANSRAZ research project, Nuremberg and the development of its ar...
Chapter
Full-text available
The entity type information in a Knowledge Graph (KG) plays an important role in a wide range of applications in Natural Language Processing such as entity linking, question answering, relation extraction, etc. However, the available entity types are often noisy and incomplete. Entity Typing is a non-trivial task if enough information is not availa...
Method
Full-text available
An overview of this machine-readable linked open data vocabulary is presented in AdA-Filmontology – Levels, Types, Values, which can serve as an annotation companion book, but also as an inspiration for the composition of other film analysis vocabularies. English version. Version 1.0 / July 2021. Junior research group “audio-visual rhetorics of affe...
Method
Full-text available
This machine-readable vocabulary, based on linked open data principles, is presented in a clearly arranged form in AdA-Filmontologie – Ebenen, Typen, Werte as companion material for film analysis. German version. Version 1.0 / July 2021. BMBF junior research group „Affektrhetoriken des Audiovisuellen” (audio-visual rhetorics of affect), Freie Universität Berlin / Hasso-Plattner Insti...
Preprint
Full-text available
The field of Materials Science is concerned with, e.g., properties and performance of materials. An important class of materials are crystalline materials that usually contain "dislocations", a line-like defect type. Dislocations decisively determine many important materials properties. Over the past decades, significant effort was put into unde...
Book
The First International Workshop on Enabling Data-Driven Decisions from Learning on the Web (L2D 2021) was held as part of the 14th ACM International Conference on Web Search and Data Mining (WSDM 2021) on March 12, 2021. The workshop collected novel, original research on the state of the art of online education empowered with data mining and machi...
Conference Paper
Full-text available
The field of Materials Science is concerned with, e.g., properties and performance of materials. An important class of materials are crystalline materials that usually contain "dislocations", a line-like defect type. Dislocations decisively determine many important materials properties. Over the past decades, significant effort was put into underst...
Conference Paper
Full-text available
The design and delivery of platforms for online education is fostering increasingly intense research. Scaling up education online brings new needs related to, for example, hardly manageable classes, overwhelming content alternatives, and academic dishonesty during remote interaction. However, with the impressive progress of the data m...
Conference Paper
Full-text available
The amount of scientific literature continuously grows, which poses an increasing challenge for researchers to manage, find and explore research results. Therefore, the classification of scientific work is widely applied to enable the retrieval, support the search for suitable reviewers during the reviewing process, and in general to organize the ex...
Conference Paper
Full-text available
A huge number of scholarly articles published every day in different domains makes it hard for the experts to organize and stay updated with the new research in a particular domain. This study gives an overview of a new approach, HierClasSArt, for knowledge aware hierarchical classification of the scholarly articles for mathematics into a predefine...
Article
Full-text available
Today, increasing numbers of people are interacting online and a lot of textual comments are being produced due to the explosion of online communication. However, a paramount inconvenience within online environments is that comments that are shared within digital platforms can hide hazards, such as fake news, insults, harassment, and, more in gener...
Conference Paper
Full-text available
Cultural heritage institutions store and digitize large amounts of multimedia data inside archives to make archival records findable by archivists, scientists, and the general public. Cataloging standards vary from archive to archive and, therefore, the sharing and use of this data are limited. To solve this issue, linked open data (LOD) is rising as a...
Conference Paper
Full-text available
An important part of European cultural identity relies on European cities and in particular on their histories and cultural heritage. Nuremberg, the home of important artists such as Albrecht Dürer and Hans Sachs, developed into the epitome of German and European culture already during the Middle Ages. Throughout history, the city experienced a numb...
Preprint
Full-text available
One of the grand challenges discussed during the Dagstuhl Seminar "Knowledge Graphs: New Directions for Knowledge Representation on the Semantic Web" and described in its report is that of a: "Public FAIR Knowledge Graph of Everything: We increasingly see the creation of knowledge graphs that capture information about the entirety of a class of ent...
Chapter
Full-text available
Knowledge Graphs have been recognized as the foundation for diverse applications in the fields of data mining, information retrieval, and natural language processing. So the completeness and the correctness of the KGs are of high importance. The type information of the entities in a KG is one of the most vital facts. However, it has been observed t...
Conference Paper
Full-text available
Scientific knowledge has been traditionally disseminated and preserved through research articles published in journals, conference proceedings, and online archives. However, this article-centric paradigm has been often criticized for not allowing to automatically process, categorize, and reason on this knowledge. An alternative vision is to gener...
Chapter
Short text categorization is an important task in many NLP applications, such as sentiment analysis, news feed categorization, etc. Due to the sparsity and shortness of the text, many traditional classification models perform poorly if they are directly applied to short text. Moreover, supervised approaches require large amounts of manually labeled...