Towards Customizable Chart Visualizations of
Tabular Data Using Knowledge Graphs
Vitalis Wiens1,2, Markus Stocker1, and Sören Auer1,2
1TIB Leibniz Information Centre for Science and Technology, Hannover, Germany
markus.stocker@tib.eu
2L3S Research Center, Leibniz University of Hannover, Germany
wiens@l3s.de, auer@l3s.de
Preprint of a short paper accepted for ICADL 2020.
After it is published, it will be found at https://doi.org/10.1007/978-3-030-64452-9_6
Abstract. Scientific articles are typically published as PDF documents,
thus rendering the extraction and analysis of results a cumbersome,
error-prone, and often manual effort. New initiatives, such as ORKG, focus on
transforming the content and results of scientific articles into structured,
machine-readable representations using Semantic Web technologies. In
this article, we focus on tabular data of scientific articles, which provide
an organized and compressed representation of information. However,
chart visualizations can additionally facilitate their comprehension. We
present an approach that employs a human-in-the-loop paradigm during
the data acquisition phase to define additional semantics for tabular
data. The additional semantics guide the creation of chart visualizations
for meaningful representations of tabular data. Our approach organizes
tabular data into different information groups which are analyzed for
the selection of suitable visualizations. The set of suitable visualizations
serves as a user-driven selection of visual representations. Additionally,
customization for visual representations provides the means for
facilitating the understanding and sense-making of information.
Keywords: Scholarly Communication, Knowledge Graphs, Customizable
Visualizations, Information Visualization
1 Introduction
Scholarly communication has not changed at its core over the last centuries.
Research articles are typically distributed as PDF documents, and the number
of publications increases every year [8]. As a consequence, searching,
understanding, and organizing information becomes a burden. Finding and
reviewing the literature ties up cognitive capacity [1] and consumes time,
which in turn reduces the time available for original research.
The purpose of scientific articles is to inform and share findings. As a means
for scholarly communication, the information is presented in documents using
text, figures, and tables. While the descriptive text provides detailed insights,
figures and tables serve as a visual, structured, and compressed representation
of information. However, this information is buried in PDF representations [10].
Current developments in scholarly communication exploit Semantic Web
technologies. These advancements transform scholarly communication from
document-based to knowledge-based information systems employing structured,
interlinked, and semantically rich knowledge graphs [1]. In contrast to other
Digital Library applications that primarily organize bibliographic metadata, the
Open Research Knowledge Graph (ORKG)¹ [7] captures the content of research
articles (e.g., research problem, materials, methods, and results).
Generally, the view on the information in scientific articles becomes static and
frozen following publication. Thus, further analysis of the presented information
remains a manual effort for readers. Knowledge-based representations provide
machine-readable access to information, which serves as input for various
applications, including those addressing its presentation to humans. Therefore,
it is beneficial to extract and transform the information of scientific articles into
structured and machine-readable representations. However, because such data
structures are designed for machine interoperability and processing, the
cognitive load for humans increases with their size and complexity.
Visualizations address specific information needs for the data at hand and
build on the human ability to understand complex data through visual
representations: “a picture is worth a thousand words” [13]. Following the
information seeking mantra (overview, zooming/filtering, and details on demand) [15],
we argue that a user-driven approach to the generation of visualizations and
their customization can further facilitate the sense-making of information.
In this article, we focus on the results of scientific articles in the form of
tables. Tables provide an organized and compressed depiction of information.
Various works, such as the recent work of Vu et al. [16], address the
transformation of tabular data into knowledge-based representations. In contrast,
the objective of our approach is to extract such information and provide
customizable and meaningful chart visualizations of tabular data from knowledge graphs.
In particular, we address the following challenges:
i) What minimal information structure is required in a knowledge graph to
obtain visual representations of tabular data?
ii) How can this structured information be analyzed for visualization generation?
Our approach employs a human-in-the-loop technique to transform tabular
data into knowledge graph representations with additional semantics. These
additional semantics serve as the foundation for obtaining views of the knowledge
graph that feed into various data visualizations. Using the additional semantics,
our approach recreates tables from knowledge graphs and enables the analysis
of their content for the creation of customizable chart visualizations.
The remainder of this article is structured as follows. Section 2 summarizes
related work, and Section 3 describes the proposed approach. Section 4 discusses
the limitations and implications for additional use cases. Finally, Section 5
concludes with an outlook on future work.
¹ https://orkg.org
2 Related Work
The related work can be categorized into two groups: a) transformation of
tables into knowledge graph representations; b) visualization of knowledge graphs.
Addressing the former, the recent work of Vu et al. [16] represents the
transformation process in the form of a mapping language (D-REPR). Heterogeneous
datasets, such as tables in CSV or JSON formats, with different layouts are
described in a model that defines components for the transformation into RDF.
These components describe the dataset resource, its attributes, and how data
alignment is realized. A semantic model component describes how the data is
transformed into RDF. Other approaches, such as XLWrap [9], focus on the
transformation of spreadsheets into RDF. R2RML [3] is a W3C recommendation
that addresses the mapping of relational databases to RDF. Since relational
databases can be seen as tables, R2RML techniques are also applied
to transform tabular data into Semantic Web representations such as RDF.
Due to the flexible nature of tables, the challenge of transforming tables into
Semantic Web representations typically results in transformation models that are
specifically tailored to individual datasets. Similarly, our approach is currently
tailored to the representation of row-based entries with one-dimensional values.
Several definitions of knowledge graphs and their features exist; however, a
unified definition is lacking [5]. Ehrlinger and Wöß [5] additionally argue that “an
ontology does not differ from a knowledge base”, meaning that visualization
methods for ontologies are also applicable to the visualization of the structure
of knowledge graphs. According to a recent survey [4], most methods and tools
visualize the content of ontologies using two-dimensional graph-based
representations in the form of node-link diagrams.
Approaches such as RelFinder [6] or the Neo4j graph visualization [11]
address the visualization of knowledge graphs based on their structure (i.e., nodes
and links). While node-link diagrams are well suited to represent the data
structure of knowledge graphs, in some contexts, such as the visualization of tables,
the structural representation does not facilitate the comprehension of
information. Knowledge graphs have different structures and also contain additional
information that does not contribute to the interpretation of information (e.g.,
URIs or class assertions). Therefore, in order to generate suitable visualizations,
the context and the semantics of the entries retrieved from a knowledge graph
need to be incorporated and processed properly for the reconstruction of a table.
The Wikidata Query Service² is an application that is closely related to
our approach. The system leverages SPARQL and presents results using different
visualization methods. It provides a selection of visual representations (e.g.,
Table, Tree, and Timeline) for the resulting data. While the Wikidata Query
Service provides a generic solution for the customizable visualization of
knowledge graphs, we present an approach that incorporates additional semantics and
guides the visualization generation process, designed for the visual
representation of tabular data in the form of customizable charts.
² https://query.wikidata.org/
Fig. 1. Overview: (1) A table for artificial results of Precision, Recall, F1-Score, and
Runtime. (2) Processing pipeline. (3) Resulting visual representation.
3 Approach
Our approach is motivated by and aligned with the objectives of the Open Research
Knowledge Graph (ORKG) [7], i.e., the structured representation of
contributions in scientific articles and the facilitation of information perception and
sense-making. Specifically, our approach addresses the customizable visualization
of tabular data that originates from knowledge graphs. As a running example,
we use an imaginary table summarizing the performance of different methods,
which is common in Computer Science articles (see Figure 1).
3.1 Data Acquisition and Transformation
First, the data acquisition phase transforms the table into a knowledge graph
representation and ensures the correct assignment of additional semantics using
a human-in-the-loop approach. Knowledge graph structures typically reflect a
triple-based representation < s p o >, where the subject s and the object o are
interlinked by the predicate p. Our approach augments tabular data with
additional semantics during the data acquisition phase. This preserves the context
and makes it easier to create further analyses and visualizations from the
structured data. Our transformation model builds upon the following heuristics:
i) The cell entries of the first column provide the subjects; in our example,
these are the methods. Thus, the cell values of a row are bound to the method.
Accordingly, our transformation model is row-based.
ii) Other columns provide values for measurements of a metric. Thus, our
transformation model adds two additional attributes to the cell value, namely the
metric and the unit of the cell value. The header values of the columns
determine the metric, while a human-in-the-loop approach assigns the units
for the corresponding columns.
As illustrated in Figure 2, a simple tabular input widget eases the process for
the user to enter the data and also ensures the correct assignment of additional
semantics for the table.
Fig. 2. Widget for the tabular data transformation process eases the data input process
and appends additional semantics to cell values.
While the particular value is generally of interest, it is also necessary to
incorporate its context. The numerical value “89” on its own is just a data point
lacking any meaning. Adding metric and unit to this value captures more context,
which allows the cell value to be described as follows: the value “89” describes
Precision, it has the unit percentage, and it refers to a method (Method A).
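The row-based transformation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the predicate names (hasMeasurement, hasValue, hasMetric, hasUnit), the cell identifier scheme, and all table values other than Precision = 89 for Method A are hypothetical and are not taken from the ORKG data model.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


def table_to_triples(header, rows, units):
    """Row-based transformation: the first column provides the subject,
    every other cell becomes a node annotated with value, metric and unit."""
    triples = []
    for row in rows:
        row_label = row[0]  # e.g. the method name
        for col in range(1, len(header)):
            metric = header[col]  # the column header determines the metric
            cell = f"{row_label}/{metric}"  # hypothetical cell identifier
            triples.append(Triple(row_label, "hasMeasurement", cell))
            triples.append(Triple(cell, "hasValue", str(row[col])))
            triples.append(Triple(cell, "hasMetric", metric))
            # The unit is supplied by the user (human-in-the-loop).
            triples.append(Triple(cell, "hasUnit", units[metric]))
    return triples


# Illustrative table; only "89 / Precision / Method A" comes from the text.
header = ["Method", "Precision", "Runtime"]
rows = [["Method A", 89, 120], ["Method B", 76, 95]]
units = {"Precision": "percentage", "Runtime": "seconds"}

triples = table_to_triples(header, rows, units)
for t in triples[:4]:
    print(t.subject, t.predicate, t.obj)
```

Each cell yields four triples in this sketch, so the context (row label, metric, unit) travels with the value instead of being implied by its position in the table.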
3.2 Information Extraction and Organization
The reconstruction of a table requires information about the transformation
model and its structural representation. This information is obtained from
the data acquisition phase. However, because the order of returned
triples is unknown, the ordering of rows and columns can change. Nevertheless,
we obtain a reconstructed table with sufficient context for our example.
Furthermore, the reconstructed table becomes interactive through corresponding
implementations, e.g., sorting the columns in ascending or descending order based on
their values. As illustrated in Figure 3, this simple back-and-forth transformation
already provides interactions with tabular data and another view on the information.
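As a minimal sketch of this reconstruction step, the following Python code rebuilds a table from triples that arrive in arbitrary order and then sorts the rows by one metric. The predicate names and the sample triples are assumptions made for illustration; an actual implementation would obtain the triples from the knowledge graph, for example via a query.

```python
from collections import defaultdict


def reconstruct_table(triples):
    """Rebuild {row_label: {metric: (value, unit)}} from unordered triples."""
    cell_props = defaultdict(dict)  # cell node -> {predicate: object}
    cell_owner = {}                 # cell node -> row label
    for s, p, o in triples:
        if p == "hasMeasurement":
            cell_owner[o] = s
        else:
            cell_props[s][p] = o
    table = defaultdict(dict)
    for cell, row_label in cell_owner.items():
        props = cell_props[cell]
        table[row_label][props["hasMetric"]] = (
            float(props["hasValue"]),
            props["hasUnit"],
        )
    return dict(table)


def sort_rows(table, metric, descending=True):
    """Return the row labels ordered by the value of one metric."""
    return sorted(table, key=lambda label: table[label][metric][0],
                  reverse=descending)


# Deliberately shuffled triples (hypothetical predicates and values).
triples = [
    ("Method B/Precision", "hasUnit", "percentage"),
    ("Method A", "hasMeasurement", "Method A/Precision"),
    ("Method A/Precision", "hasValue", "89"),
    ("Method B/Precision", "hasValue", "76"),
    ("Method A/Precision", "hasMetric", "Precision"),
    ("Method B", "hasMeasurement", "Method B/Precision"),
    ("Method B/Precision", "hasMetric", "Precision"),
    ("Method A/Precision", "hasUnit", "percentage"),
]

table = reconstruct_table(triples)
ranking = sort_rows(table, "Precision")
print(table)
print(ranking)
```

Because the table is keyed by row label and metric rather than by position, the result does not depend on the order in which the triples are returned; any display order has to be chosen explicitly, as the sorting helper does.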
The reconstructed table serves as input data for chart visualizations.
However, we argue that the context is vital for the creation of suitable chart
visualizations. In this article, we define the context of a cell value as follows:
Definition 1. Context(value(i, j)) = (RowLabel(0, j), Unit(i), Metric(i)),
where i ≥ 1 is the column index and j is the row index.
The RowLabel refers to the entries from the first column that are used as
subject anchors in the knowledge graph representation. The Unit is provided by the
user, and the Metric is obtained from the header values of the corresponding
column. Data units are a crucial factor in creating meaningful chart
visualizations. We argue that metrics with the same units provide reasonable candidates
for grouping information and avoid false interpretations when visualized in the
same chart: otherwise, significant differences in data ranges shift the attention
to the visual elements that have a higher presence in the chart (see Figure 4).
Fig. 3. Illustration of the original table and the reconstructed table from a knowledge
graph. Note: The ordering of the columns is not preserved.
Fig. 4. Column chart visualization (panels a and b) indicating the possible false first
impression through unrelated units and large differences in the data ranges.
The semantics of Units provide the means to create information groups by
clustering columns, i.e., the extraction of sub-tables through the matching of
compatible units. These groups reflect information that relates (or correlates)
to a certain extent. The semantics of Metrics guide the selection of suitable
chart visualization types; in particular, they define the compatible chart
types for individual metrics.
Units: The additional semantics of Units provide the means to align cell
values to a uniform representation for a particular unit, i.e., they serve
as alignment definitions between units. For example, percentage and per-mil are
brought into correspondence using an alignment factor of 10, and milliseconds
are transformed to seconds using an alignment factor of 1000. The semantics
for unit alignment enable the approach to detect compatible units and bring
them into correspondence for clustering related (or correlated) information.
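The unit alignment and the resulting column grouping can be illustrated with a short Python sketch. The alignment table below is a hypothetical stand-in for the unit semantics; it encodes only the two examples from the text (per-mil to percentage with factor 10, milliseconds to seconds with factor 1000), and the choice of base units is an assumption.

```python
# Hypothetical alignment definitions: unit -> (base unit, divisor).
# A value in `unit` is divided by `divisor` to express it in the base unit.
ALIGNMENT = {
    "percentage": ("percentage", 1),
    "per-mil": ("percentage", 10),
    "seconds": ("seconds", 1),
    "milliseconds": ("seconds", 1000),
}


def align(value, unit):
    """Convert a value to the base unit of its unit family."""
    base, divisor = ALIGNMENT[unit]
    return value / divisor, base


def group_columns_by_unit(column_units):
    """Cluster columns (metrics) whose units share the same base unit."""
    groups = {}
    for metric, unit in column_units.items():
        base, _ = ALIGNMENT[unit]
        groups.setdefault(base, []).append(metric)
    return groups


# Illustrative column-to-unit assignment as a user might provide it.
column_units = {
    "Precision": "percentage",
    "Recall": "per-mil",
    "Runtime": "milliseconds",
}

groups = group_columns_by_unit(column_units)
print(align(890, "per-mil"))
print(align(1500, "milliseconds"))
print(groups)
```

In this sketch, Precision and Recall end up in one information group because their units are compatible, while Runtime forms its own group and would be charted separately.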
Metrics: The semantics of metrics provide additional criteria for building
information groups (i.e., the subdivision of sub-tables). As mentioned before, units
provide reasonable candidates for clustering related (or correlated) information
into groups. However, identical units are used by different metrics. For
example, percentage can refer to performance measurements in information retrieval
tasks or to statistical distributions. The definition of compatible metrics refines the
grouping of related information and determines which columns serve as input.
Metrics also provide value validation mechanisms. In particular, they
define a data range. For example, the metric Precision has a range of [0, 100],
and Runtime cannot be expressed as negative values. These value range restrictions
define a validation mechanism for transformation models that populate
knowledge graphs with tabular data. However, the value range restrictions for the
myriad of measurement factors need to be defined individually for each metric.
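A value range check of this kind could look as follows in Python. The range table is hypothetical and covers only the two examples named in the text (Precision in [0, 100] when expressed as a percentage, Runtime non-negative); the sample cells are made up to show one accepted and two rejected values.

```python
import math

# Hypothetical per-metric value ranges (inclusive bounds).
METRIC_RANGES = {
    "Precision": (0.0, 100.0),    # expressed as a percentage
    "Runtime": (0.0, math.inf),   # cannot be negative
}


def is_valid(metric, value):
    """Check a single value against the range defined for its metric."""
    low, high = METRIC_RANGES[metric]
    return low <= value <= high


def invalid_cells(cells):
    """Return all (row_label, metric, value) entries that violate a range."""
    return [(label, metric, value)
            for label, metric, value in cells
            if not is_valid(metric, value)]


# Illustrative cell values, two of them deliberately out of range.
cells = [
    ("Method A", "Precision", 89.0),
    ("Method B", "Precision", 120.0),
    ("Method A", "Runtime", -3.0),
]

rejected = invalid_cells(cells)
print(rejected)
```

Such a check would typically run while the table is entered, so that implausible values are flagged before they are written to the knowledge graph.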
3.3 Customizable Visualization Generation
The analysis of the additional semantics does most of the heavy lifting.
However, the dimensions of the table also restrict the selection of
suitable chart visualizations. For example, spider charts require at least three
dimensions in order to span an area for a value. While this criterion is met when
the number of rows is adequate (e.g., visualizing Precision with the
corresponding methods as axial dimensions), the representation becomes invalid if the axis
mapping is flipped and the dimensional criterion is no longer met (e.g., only
Precision serves as the axial dimension). This simple example indicates that the
selection of the axis mapping is also crucial for the visualization suggestion. In
Figure 1, this corresponds to the feedback loop for the visualization suggestion.
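The dependency between axis mapping and chart suitability can be sketched as a simple rule check in Python. The minimum of three axes for spider charts comes from the text; the entry for column charts and the function and parameter names are assumptions added for illustration.

```python
# Minimum number of axial dimensions per chart type.
# "spider" follows the text; "column" is an assumed example entry.
MIN_AXES = {
    "column": 1,
    "spider": 3,
}


def suggest_charts(n_rows, n_metrics, rows_as_axes):
    """Return chart types whose dimensional criterion is met.

    rows_as_axes=True  -> row labels (e.g. methods) span the axes
    rows_as_axes=False -> metrics (e.g. only Precision) span the axes
    """
    n_axes = n_rows if rows_as_axes else n_metrics
    return [chart for chart, minimum in MIN_AXES.items() if n_axes >= minimum]


# Four methods, one metric (Precision).
methods_as_axes = suggest_charts(n_rows=4, n_metrics=1, rows_as_axes=True)
metric_as_axis = suggest_charts(n_rows=4, n_metrics=1, rows_as_axes=False)
print(methods_as_axes)
print(metric_as_axis)
```

With four methods as axes the spider chart remains a candidate, whereas flipping the mapping leaves a single axis and removes it from the suggestions, which is why the suggestion has to be recomputed whenever the user changes the axis mapping.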
4 Discussion
Our approach builds upon the semantics and the structure of the tabular data
representation in a knowledge graph. Thus, it is currently limited to the chosen
transformation model. Furthermore, the approach addresses the one-dimensional
representation of columns and rows. In our approach, the first column of the table
refers to unsorted entries. However, when dealing with order-dependent entries,
such as time series or physical distances, the position on the axis (sorting) is
significant for the comprehension of information. Currently, our approach does not
address order-dependent entries in the first column.
Fig. 5. Prototype for chart visualization using the comparison feature of ORKG:
a) The individual tables, selection options for leader-board generation, and a
leader-board visualization; b) Information organization for merged tables and the resulting
column chart. The value representation transformation is indicated in red.
The approach has been described in the context of tabular data visualizations
within a single paper. However, tables are frequently used in scientific articles of
various types. Incorporating additional semantics also enables new opportunities
for the analysis of information across papers. In particular, through the additional
semantics of units and metrics, the information distributed across several tables
(in different articles) can be organized for further analysis. Figure 5 showcases
the visualization generation of tables across different articles.
5 Conclusion
In this article, we have presented an approach for customizable chart
visualizations of tabular data using knowledge graphs. The approach builds on additional
semantics that are added during the data acquisition process. Using these
semantics, tables are reconstructed and organized in information groups, i.e., sub-tables
based on metrics and units. The semantics of Metrics select suitable
visualizations from the large space of all chart types. Customizations are enabled through
chart type selection and axis mappings. Using the paper comparison feature of
ORKG [12], the approach realizes advanced use cases, such as the visualization
of information distributed among tables in multiple articles and leader-boards.
The context plays an important role in extracting tabular data from
knowledge graphs and in the creation of visual representations. Our approach creates the
context using the a priori known data structure and its additional semantics.
Future work will address the extension of the additional semantics
to order-dependent entries in the first column. The semantics of Metrics
define the interplay among them and which chart visualizations are suitable.
Thus, future work will address the many definitions of metrics. Additionally, we
plan to investigate the alignment to existing vocabularies related to units [14]
and the RDF Data Cube Vocabulary [2] in order to increase the flexibility and
robustness of the approach. Furthermore, we argue that pattern matching and
sub-graph identification will enable the semi-automated generation
of context items that guide the information organization and the analysis,
enabling the chart visualization of non-tabular data from knowledge graphs.
In conclusion, we argue that the approach introducing additional semantics
and further rules will foster the creation of suitable and custom visual
representations for tabular data using knowledge graphs and that it facilitates
comprehension through different perspectives on the information in tables.
Acknowledgment
This work is co-funded by the European Research Council project
ScienceGRAPH (Grant agreement #819536). Additionally, we would like to thank our
colleagues Mohamad Yaser Jaradeh and Kheir Eddine for valuable discussions
and suggestions.
References
1. S. Auer, V. Kovtun, M. Prinz, A. Kasprzik, M. Stocker, and M. E. Vidal. Towards
a knowledge graph for science. In Proceedings of the 8th International Conference
on Web Intelligence, Mining and Semantics, pages 1–6, 2018.
2. R. Cyganiak and D. Reynolds. The RDF Data Cube Vocabulary.
https://www.w3.org/TR/vocab-data-cube/, 2014.
3. S. Das, S. Sundara, and R. Cyganiak. R2RML: RDB to RDF Mapping Language.
https://www.w3.org/TR/r2rml/, 2012.
4. M. Dudáš, S. Lohmann, V. Svátek, and D. Pavlov. Ontology visualization methods
and tools: a survey of the state of the art. Knowledge Eng. Review, 33, 2018.
5. L. Ehrlinger and W. Wöß. Towards a definition of knowledge graphs. SEMANTiCS
(Posters, Demos, SuCCESS), 48, 2016.
6. P. Heim, S. Hellmann, J. Lehmann, S. Lohmann, and T. Stegemann. RelFinder:
Revealing relationships in RDF knowledge bases. In International Conference on
Semantic and Digital Media Technologies, pages 182–187. Springer, 2009.
7. M. Y. Jaradeh, A. Oelen, K. E. Farfar, M. Prinz, J. D’Souza, G. Kismihók,
M. Stocker, and S. Auer. Open research knowledge graph: Next generation
infrastructure for semantic scholarly knowledge. In Proceedings of the 10th
International Conference on Knowledge Capture, K-CAP ’19, pages 243–246, New York,
NY, USA, 2019. Association for Computing Machinery.
8. R. Johnson, A. Watkinson, and M. Mabe. The STM report: An overview of scientific
and scholarly publishing. 5th edition, October 2018.
9. A. Langegger and W. Wöß. XLWrap – querying and integrating arbitrary
spreadsheets with SPARQL. In International Semantic Web Conference, pages 359–374.
Springer, 2009.
10. B. Mons. Which gene did you mean? BMC Bioinform., 6:142, 2005.
11. Neo4j. Neo4j graph visualization.
https://neo4j.com/developer/graph-visualization/, accessed March 2020.
12. A. Oelen, M. Y. Jaradeh, K. E. Farfar, M. Stocker, and S. Auer. Comparing
research contributions in a scholarly knowledge graph. In Proceedings of the Third
International Workshop on Capturing Scientific Knowledge co-located with the 10th
International Conference on Knowledge Capture (K-CAP 2019), Marina del Rey,
California, November 19th, 2019, volume 2526 of CEUR Workshop Proceedings,
pages 21–26. CEUR-WS.org, 2019.
13. O. Peña, U. Aguilera, and D. López-de-Ipiña. Linked open data visualization
revisited: a survey. Semantic Web Journal, 2014.
14. H. Rijgersberg, M. van Assem, and J. Top. Ontology of units of measure and
related concepts. Semantic Web, 4(1):3–13, 2013.
15. B. Shneiderman. The eyes have it: A task by data type taxonomy for information
visualizations. In Proceedings of the 1996 IEEE Symposium on Visual Languages,
Boulder, Colorado, USA, September 3-6, 1996, pages 336–343, 1996.
16. B. Vu, J. Pujara, and C. A. Knoblock. D-REPR: A language for describing and
mapping diversely-structured data sources to RDF. In Proceedings of the 10th
International Conference on Knowledge Capture, pages 189–196, 2019.
... Working with welldefined features, as suggested in Radiomics (8), might enable a compromise between the optimal consideration of the complex image information and a classification that is understandable for clinical experts (35). However, the complex multi-modal data used in phenotyping are difficult to interpret for humans with classical approaches such as heatmaps and two-dimensional diagrams (36,37). When omics or image data is involved there is a lack of backtracking within these tools, which links the classification to specific relevant locations or time frames of the underlying data. ...
Article
Full-text available
The quality and acceptance of machine learning (ML) approaches in cardiovascular data interpretation depends strongly on model design and training and the interaction with the clinical experts. We hypothesize that a software infrastructure for the training and application of ML models can support the improvement of the model training and provide relevant information for understanding the classification-relevant data features. The presented solution supports an iterative training, evaluation, and exploration of machine-learning-based multimodal data interpretation methods considering cardiac MRI data. Correction, annotation, and exploration of clinical data and interpretation of results are supported through dedicated interactive visual analytics tools. We test the presented concept with two use cases from the ACDC and EMIDEC cardiac MRI image analysis challenges. In both applications, pre-trained 2D U-Nets are used for segmentation, and classifiers are trained for diagnostic tasks using radiomics features of the segmented anatomical structures. The solution was successfully used to identify outliers in automatic segmentation and image acquisition. The targeted curation and addition of expert annotations improved the performance of the machine learning models. Clinical experts were supported in understanding specific anatomical and functional characteristics of the assigned disease classes.
Conference Paper
Full-text available
Despite improved digital access to scholarly knowledge in recent decades, scholarly communication remains exclusively document-based. In this form, scholarly knowledge is hard to process automatically. We present the first steps towards a knowledge graph based infrastructure that acquires scholarly knowledge in machine actionable form thus enabling new possibilities for scholarly knowledge curation, publication and processing. The primary contribution is to present, evaluate and discuss multi-modal scholarly knowledge acquisition, combining crowdsourced and automated techniques. We present the results of the first user evaluation of the infrastructure with the participants of a recent international conference. Results suggest that users were intrigued by the novelty of the proposed infrastructure and by the possibilities for innovative scholarly knowledge processing it could enable.
Conference Paper
Full-text available
Publishing data sources to knowledge graphs is a complicated and laborious process as data sources are often heterogeneous, hierarchical and interlinked. As an example, food price datasets may contain product prices of various units at different markets and times, and different providers can have many choices of formats such as CSV, JSON or spreadsheet. Beyond data formats, these datasets may have differing layout, where one dataset may be organized as a row-based table or relational table (prices are in one column), while another may use a matrix table (prices are in one matrix). To address these problems, we present a novel data description language for mapping datasets to RDF. In particular, our language supports specifying the locations of source attributes in the sources, mapping of the attributes to ontologies, and simple rules to join the data of these attributes to output final RDF triples. Unlike existing approaches, our language is not restricted to specific data layouts such as the Nested Relational Model, or to specific data formats, such as spreadsheet. Our broad data description language presents a format-independent solution, allowing interlinking among multiple heterogeneous sources and representing many diverse data structures that existing tools are unable to handle.
Article
Full-text available
Various ontology visualization tools using different visualization methods exist and new ones are being developed every year. The goal of this paper is to follow up on previous surveys with an updated classification of ontology visualization methods and a comprehensive survey of available tools. The tools are analyzed for the used visualization methods, interaction techniques and supported ontology constructs. It shows that most of the tools apply two-dimensional node-link visualizations with a focus on class hierarchies. Color and shape are used with little variation, support for constructs introduced with version 2 of the OWL Web Ontology Language is limited, and it often remains vague what tasks and use cases are supported by the visualizations. Major challenges are the limited maturity and usability of many of the tools as well as providing an overview of large ontologies while also showing details on demand. We see a high demand for a universal ontology visualization framework implementing a core set of visual and interactive features that can be extended and customized to respective use cases.
Conference Paper
Full-text available
Recently, the term knowledge graph has been used frequently in research and business, usually in close association with Semantic Web technologies, linked data, large-scale data analytics and cloud computing. Its popularity is clearly influenced by the introduction of Google's Knowledge Graph in 2012, and since then the term has been widely used without a definition. A large variety of interpretations has hampered the evolution of a common understanding of knowledge graphs. Numerous research papers refer to Google's Knowledge Graph, although no official documentation about the used methods exists. The prerequisite for widespread academic and commercial adoption of a concept or technology is a common understanding, based ideally on a definition that is free from ambiguity. We tackle this issue by discussing and defining the term knowledge graph, considering its history and diversity in interpretations and use. Our goal is to propose a definition of knowledge graphs that serves as basis for discussions on this topic and contributes to a common vision.
Conference Paper
Full-text available
In this paper a novel approach is presented for generating RDF graphs of arbitrary complexity from various spreadsheet layouts. Currently, none of the available spreadsheet-to-RDF wrappers supports cross tables and tables where data is not aligned in rows. Similar to RDF123, XLWrap is based on template graphs where fragments of triples can be mapped to specific cells of a spreadsheet. Additionally, it features a full expression algebra based on the syntax of OpenOffice Calc and various shift operations, which can be used to repeat similar mappings in order to wrap cross tables including multiple sheets and spreadsheet files. The set of available expression functions includes most of the native functions of OpenOffice Calc and can be easily extended by users of XLWrap. Additionally, XLWrap is able to execute SPARQL queries, and since it is possible to define multiple virtual class extents in a mapping specification, it can be used to integrate information from multiple spreadsheets. XLWrap supports a special identity concept which allows to link anonymous resources (blank nodes) – which may originate from different spreadsheets – in the target graph.
Conference Paper
The Semantic Web has recently seen a rise of large knowledge bases (such as DBpedia) that are freely accessible via SPARQL endpoints. The structured representation of the contained information opens up new possibilities in the way it can be accessed and queried. In this paper, we present an approach that extracts a graph covering relationships between two objects of interest. We show an interactive visualization of this graph that supports the systematic analysis of the found relationships by providing highlighting, previewing, and filtering features.
Article
Computational Biology needs computer-readable information records. Increasingly, meta-analysed and pre-digested information is being used in the follow-up of high-throughput experiments and other investigations that yield massive data sets. Semantic enrichment of plain text is crucial for computer-aided analysis. In general, people will think about semantic tagging as just another form of text mining, and that term has quite a negative connotation in the minds of some biologists who have been disappointed by classical approaches of text mining. Efforts so far have tried to develop tools and technologies that retrospectively extract the correct information from text, which is usually full of ambiguities. Although remarkable results have been obtained in experimental circumstances, the widespread use of information mining tools is lagging behind earlier expectations. This commentary proposes to make semantic tagging an integral process to electronic publishing.
Conference Paper
The document-centric workflows in science have reached (or already exceeded) the limits of adequacy. This is emphasized by recent discussions on the increasing proliferation of scientific literature and the reproducibility crisis. This presents an opportunity to rethink the dominant paradigm of document-centric scholarly information communication and transform it into knowledge-based information flows by representing and expressing information through semantically rich, interlinked knowledge graphs. At the core of knowledge-based information flows is the creation and evolution of information models that establish a common understanding of information communicated between stakeholders as well as the integration of these technologies into the infrastructure and processes of search and information exchange in the research library of the future. By integrating these models into existing and new research infrastructure services, the information structures that are currently still implicit and deeply hidden in documents can be made explicit and directly usable. This has the potential to revolutionize scientific work as information and research results can be seamlessly interlinked with each other and better matched to complex information needs. Furthermore, research results become directly comparable and easier to reuse. As our main contribution, we propose the vision of a knowledge graph for science, present a possible infrastructure for such a knowledge graph as well as our early attempts towards an implementation of the infrastructure.
Article
This paper describes the Ontology of units of Measure and related concepts OM, an OWL ontology of the domain of quantities and units of measure. OM supports making quantitative research data more explicit, so that the data can be integrated, verified and reproduced. The various options for modeling the domain are discussed. For example, physical quantities can be modeled either as classes, instances or properties. The design choices made are based on use cases from our own projects and general experience in the field. The use cases have been implemented as tools and web services. OM is compared with QUDT, another active effort for an OWL model in this domain. We note possibilities for integration of these efforts. We also discuss the role OWL plays in our approach.
Conference Paper
A useful starting point for designing advanced graphical user interfaces is the Visual Information Seeking Mantra: overview first, zoom and filter, then details on demand. But this is only a starting point in trying to understand the rich and varied set of information visualizations that have been proposed in recent years. The paper offers a task by data type taxonomy with seven data types (one-, two-, and three-dimensional data, temporal and multi-dimensional data, and tree and network data) and seven tasks (overview, zoom, filter, details-on-demand, relate, history, and extracts).