Better Call the Plumber: Orchestrating
Dynamic Information Extraction Pipelines
Mohamad Yaser Jaradeh1[0000-0001-8777-2780], Kuldeep Singh2[0000-0002-5054-9881], Markus Stocker3[0000-0001-5492-3212], Andreas Both4[0000-0002-9177-5463], and Sören Auer3[0000-0002-0698-2864]

1 L3S Research Center, Leibniz University Hannover, Germany jaradeh@l3s.de
2 Zerotha-Research & Cerence GmbH, Germany kuldeep.singh1@cerence.com
3 TIB Leibniz Information Centre for Science and Technology, Germany {markus.stocker, auer}@tib.eu
4 Anhalt University of Applied Sciences, Germany andreas.both@hs-anhalt.de
Abstract. We propose Plumber, the first framework that brings together the research community's disjoint information extraction (IE) efforts. The Plumber architecture comprises 33 reusable components for various Knowledge Graph (KG) information extraction subtasks, such as coreference resolution, entity linking, and relation extraction. Using these components, Plumber dynamically generates suitable information extraction pipelines and offers 264 distinct pipelines overall. We study the optimization problem of choosing suitable pipelines based on input sentences. To do so, we train a transformer-based classification model that extracts contextual embeddings from the input and finds an appropriate pipeline. We study the efficacy of Plumber for extracting KG triples using standard datasets over two KGs: DBpedia and the Open Research Knowledge Graph (ORKG). Our results demonstrate the effectiveness of Plumber in dynamically generating KG information extraction pipelines, outperforming all baselines agnostic of the underlying KG. Furthermore, we provide an analysis of collective failure cases, study the similarities and synergies among integrated components, and discuss their limitations.
Keywords: Information Extraction · NLP Pipelines · Software Reusability · Semantic Search · Semantic Web
1 Introduction and Motivation
In the last decade, publicly available KGs (DBpedia [2] and Wikidata [42]) have become rich sources of structured content used in various applications, including Question Answering (QA), fact checking, and dialog systems [39,4]. The research community developed numerous approaches to extract triple statements [44], keywords/topics [9], tables [45,23,22], or entities [35,36] from unstructured text to complement KGs. Despite extensive research, public KGs are not exhaustive and require continuous effort to align newly emerging unstructured information to the concepts of the KGs.
arXiv:2102.10966v1 [cs.CL] 22 Feb 2021
Research Problem: This work was motivated by an observation of recent approaches [35,45,15] that automatically align unstructured text to structured data on the Web. Such approaches are not viable in practice for extracting and structuring information because they only address very specific subtasks of the overall KG information extraction problem. Consider the exemplary sentence Rembrandt painted The Storm on the Sea of Galilee. It was painted in 1633. (cf. Figure 1). To extract statements aligned with the DBpedia KG from the given sentences, a system must first recognize the entity and relation surface forms in the first sentence. The second sentence requires an additional step of coreference resolution, where It must be mapped to the correct entity surface form (namely, The Storm on the Sea of Galilee). The last step requires the mapping of entity and relation surface forms to the respective DBpedia entities and predicates. There has been extensive research in aligning concepts in unstructured text to KGs, including entity linking [15,18], relation linking [36,38,4], and triple classification [14]. However, these efforts are disjoint, and little has been done to align unstructured text to complete KG triples (i.e., represented as subject, predicate, object) [25]. Furthermore, many entity and relation linking tools have been reused in pipelines of QA systems [39,26]. The literature suggests that once different approaches put forward by the research community are combined, the resulting pipeline-oriented integrated systems can outperform monolithic end-to-end systems [27]. For the KG information extraction task, however, to the best of our knowledge, approaches aiming at dynamically integrating and orchestrating various existing components do not exist.
Objective and Contributions: Based on these observations, we build a framework that enables the integration of previously disjoint efforts on the KG-IE task under a single umbrella. We present the Plumber framework (cf. Figure 2) for creating information extraction pipelines. Plumber integrates 33 reusable components released by the research community for the subtasks entity linking (EL), relation linking (RL), text triple extraction (TE) (subject, predicate, object), and coreference resolution (CR). Overall, there are 264 composable KG information extraction pipelines, generated from the possible combinations of the 33 available components: for DBpedia, 3 CRs, 8 TEs, and 10 EL/RLs give 3*8*10=240 pipelines, and 4*3*2=24 for ORKG; hence, 240+24=264 pipelines. Plumber implements a transformer-based classification algorithm that intelligently chooses the best pipeline based on the unstructured input text.
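The pipeline count follows directly from the Cartesian product of the per-task component sets; a minimal sketch (with placeholder component names, not the actual integrated tools):

```python
from itertools import product

# Component counts per subtask, as stated in the text. The names are
# illustrative placeholders, not the actual integrated tools.
dbpedia_components = {
    "coreference_resolution":  [f"CR-{i}" for i in range(3)],      # 3 CR components
    "triple_extraction":       [f"TE-{i}" for i in range(8)],      # 8 TE components
    "entity_relation_linking": [f"EL/RL-{i}" for i in range(10)],  # 10 EL/RL components
}

# A pipeline is one component choice per subtask, i.e., the Cartesian product.
dbpedia_pipelines = list(product(*dbpedia_components.values()))
n_dbpedia = len(dbpedia_pipelines)  # 3 * 8 * 10 = 240

n_orkg = 4 * 3 * 2  # 24 ORKG pipelines, per the counts in the text

print(n_dbpedia, n_orkg, n_dbpedia + n_orkg)  # 240 24 264
```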
We perform an exhaustive evaluation of Plumber on the two large-scale KGs DBpedia and the Open Research Knowledge Graph (ORKG) [24] to investigate the efficacy of Plumber in creating KG triples from unstructured text. We demonstrate that, independent of the underlying KG, Plumber can find and assemble different extraction components to produce better-suited KG triple extraction pipelines, significantly outperforming existing baselines. In summary, we provide the following novel contributions: i) The Plumber framework is the first of its kind for dynamically assembling and evaluating information extraction pipelines based on sequence classification techniques for a given input text. Plumber is easily extensible and configurable, thus enabling the rapid creation and adjustment of new information extraction components and pipelines. Researchers can also use the framework for running IE components independently for specific subtasks such as triple extraction and entity linking. ii) A collection of 33 reusable IE components that can be combined to create 264 distinct IE pipelines. iii) The exhaustive evaluation and our detailed ablation study of the integrated components and composed pipelines on various input texts will guide future research for collaborative KG information extraction.
We motivate our work with a running example; the sentence Rembrandt painted The Storm on the Sea of Galilee. It was painted in 1633. Multiple steps are required to extract formally represented statements from the given text. First, the pronoun it in the second sentence should be replaced by The Storm on the Sea of Galilee using a coreference resolver. Next, a triple extractor should extract the correct text triples from the natural language text, i.e., <Rembrandt, painted, The Storm on the Sea of Galilee> and <The Storm on the Sea of Galilee, painted in, 1633>. In the next step, the entity and relation linking component aligns the entity and relation surface forms extracted in the previous step to the DBpedia entities (dbr:Rembrandt for Rembrandt van Rijn and dbr:The_Storm_on_the_Sea_of_Galilee for The Storm on the Sea of Galilee) and relations (dbo:Artist for painted and dbp:year for painted in). Figure 1 illustrates our running example and shows three Plumber IE pipelines with different results. In Pipeline 1, the coreference resolver is unable to map the pronoun it to the respective entity in the previous sentence. Moreover, the triple extractor generates incomplete triples, which also hinders the task of the entity and relation linker in the last step. Pipeline 2 uses a different set of components, and its output differs from the first pipeline. Here, the coreference resolution component is able to correctly co-relate the pronoun it to The Storm on the Sea of Galilee and extract the text triple correctly. However, the overall result is only partially correct because the second triple is not extracted. Also, the linking component is not able to spot the second entity. Pipeline 3 correctly extracts both triples. This pipeline employs the same component as the second pipeline for coreference resolution but also includes an additional information extraction component (i.e., ReVerb [16]) and a joint entity and relation linking component, namely Falcon [35]. With this combination of components, the text triple extractors were able to compensate for the loss of information in the second pipeline by adding one more component. Using the extracted text triples, the last component of the pipeline, a joint entity and relation linking tool, can map both triple components correctly to the corresponding KG entities.

[Figure 1: Pipeline 1 (Stanford Coref Resolver → OpenIE → EARL) resolves It = The Storm, extracts <Rembrandt, painted, Storm>, and links Rembrandt = dbr:Artemisia_(Rembrandt), Storm = dbr:September_Storm. Pipeline 2 (Neural Coref → ClausIE → DBpedia Spotlight) resolves It = The Storm on the Sea of Galilee, extracts <Rembrandt, painted, The Storm on the Sea of Galilee>, and links Rembrandt = dbr:Rembrandt. Pipeline 3 (Neural Coref → ClausIE + ReVerb → Falcon) extracts both triples and links Rembrandt = dbr:Rembrandt, The Storm on the Sea of Galilee = dbr:The_Storm_on_the_Sea_of_Galilee, painted = dbo:Artist.]

Fig. 1. Three example information extraction pipelines showing different results for the same text snippet. Each pipeline consists of coreference resolution, triple extractors, and entity/relation linking components.
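The coreference resolution → triple extraction → entity/relation linking flow of Pipeline 3 can be sketched as a chain of stages with a uniform call interface. The stage implementations below are toy stand-ins hard-coded for the running example, not the actual NeuralCoref, ClausIE/ReVerb, or Falcon components:

```python
def coref_resolve(text):
    # Toy stand-in for a coreference resolver: substitute the pronoun
    # "It" with its antecedent from the running example.
    return text.replace("It ", "The Storm on the Sea of Galilee ")

def extract_triples(text):
    # Toy stand-in for a text triple extractor: split each sentence
    # around a hard-coded predicate surface form.
    triples = []
    for sentence in text.rstrip(".").split(". "):
        for predicate in ("was painted in", "painted"):
            if f" {predicate} " in sentence:
                subj, obj = sentence.split(f" {predicate} ", 1)
                triples.append((subj, predicate, obj))
                break
    return triples

def link(triples):
    # Toy stand-in for a joint entity/relation linker: a lookup table
    # mapping surface forms to the DBpedia identifiers from the example.
    lookup = {
        "Rembrandt": "dbr:Rembrandt",
        "The Storm on the Sea of Galilee": "dbr:The_Storm_on_the_Sea_of_Galilee",
        "painted": "dbo:Artist",
        "was painted in": "dbp:year",
    }
    return [tuple(lookup.get(part, part) for part in t) for t in triples]

def run_pipeline(text, stages):
    # Each stage consumes the previous stage's output.
    data = text
    for stage in stages:
        data = stage(data)
    return data

triples = run_pipeline(
    "Rembrandt painted The Storm on the Sea of Galilee. It was painted in 1633.",
    [coref_resolve, extract_triples, link],
)
```

Running this chain yields the two linked triples of Pipeline 3; swapping any stage for a weaker stand-in reproduces the partial outputs of Pipelines 1 and 2.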
The remainder of this article is organized as follows. Related work is reviewed
in Section 2. Section 3 presents Plumber, which is extensively evaluated in
Section 4. Section 5 discusses the results, and Section 6 concludes and outlines
directions for future research and work.
2 Related Work
In the last decade, many open source tools have been released by the research
community to tackle IE tasks for KGs. These IE components are not only used
for end-to-end KG triple extraction but also for various other tasks, such as:
Text Triple Extraction: The task of open information extraction is a well-studied task in the NLP community [1]. It relies on NER (Named Entity Recognition) and RE (Relation Extraction). SalIE [33] uses MinIE [21] in combination with PageRank and clustering to find facts in the input text. Furthermore, OpenIE [1] leverages linguistic structures to extract self-contained clauses from the text. A comprehensive survey by Niklaus et al. [32] provides details about such techniques.
Entity and Relation Linking: Entity and relation linking is a widely studied topic in the NLP, Web, and Information Retrieval research communities [3,4,11]. Often, entity and relation linking is performed independently. DBpedia Spotlight [10] is one of the first approaches for entity recognition and disambiguation over DBpedia. TagMe [18] links entities to DBpedia using in-link matching to disambiguate candidate entities. Other tools such as RelMatch [38] do not perform entity linking and only focus on linking the relation in the text to the corresponding KG relation. Recon [4] uses graph neural networks to map relations between the entities, with the assumption that entities are already linked in the text. EARL [15] is a joint linking tool over DBpedia that models the task as a generalized traveling salesperson problem. Sakor et al. [35] proposed Falcon, a linguistic-rules-based tool for joint entity and relation linking over DBpedia.
Coreference Resolution: This task is used in conjunction with other tasks in NLP pipelines to disambiguate text and resolve syntactic complexities. The Stanford Coreference Resolver [34] uses a multi-pass sieve of deterministic coreference models. Clark and Manning [8] use reinforcement learning to fine-tune a neural mention-ranking model for coreference resolution. More recent neural approaches have also been proposed [37].
Frameworks and Dynamic Pipelines: There have been few attempts in various domains aiming to consolidate the disjoint efforts of the research community under a single umbrella for solving a particular task. The Gerbil platform [41] provides an easy-to-use web-based platform for the agile comparison of entity linking tools using multiple datasets and uniform measuring approaches. OKBQA [26] is a community effort for the development of multilingual open knowledge bases and QA systems. Frankenstein integrates 24 QA components to build QA systems collaboratively on top of the Qanary integration framework [6]. Other ETL pipeline systems exist, such as Apache NiFi; Semantic Web Pipes [31] and LarKC [17] are further prominent examples.
End-to-End Extraction Systems: More recently, end-to-end systems are gaining more attention due to the boom of deep learning techniques. Such systems draw on the strengths of deep models and transformers [13,29]. Kertkeidkachorn and Ichise [25] present an end-to-end system to extract triples and link them to DBpedia. Other attempts such as KG-Bert [44] leverage deep transformers (i.e., BERT [13]) for the triple classification task, given the entity and relation descriptions of a triple. KG-Bert does not attempt end-to-end alignment of KG triples from a given input text. Liu et al. [28] design an encoder-decoder framework with an attention mechanism to extract and align triples to a KG.
3 Dynamic Information Extraction Pipelining Framework
Plumber has a modular design (see Figure 2) where each component is integrated as a microservice. To ensure a consistent data exchange between components, the framework maps the output of each component to a homogeneous data representation using the Qanary [6] methodology. Plumber follows the three design principles of i) Isolation, ii) Reusability, and iii) Extensibility, inspired by [39,41].
Dynamic pipeline selection: Plumber uses a RoBERTa [29]-based classifier that, given a text and a set of requirements, predicts the best pipeline to extract KG triples. The RoBERTa model acts as an intermediary that classifies the contextual embeddings extracted from the input text into a class that represents one of the possible pipelines. For RoBERTa's training, we run each input sequence on all possible pipelines and compute the F1 score (i.e., estimated performance). RoBERTa is fed with the sentence, and the pipeline with the best sentence-level performance among all pipelines serves as the target class. Hence, in practice, the user points Plumber to a piece of text, and internally it uses RoBERTa to classify the text into a class (i.e., the pipeline) to execute against the input text.
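This label construction for training the selector can be sketched without any transformer code: each sentence is scored against every candidate pipeline, and the index of the best-scoring pipeline becomes that sentence's target class. The scorer below is a hypothetical stub; in Plumber the scores are per-sentence F1 values obtained by actually executing the pipelines:

```python
def build_target_classes(sentences, pipelines, score):
    """Target class of a sentence = index of the pipeline with the
    highest (estimated) F1 score on that sentence."""
    targets = []
    for sentence in sentences:
        scores = [score(sentence, p) for p in pipelines]
        targets.append(max(range(len(pipelines)), key=scores.__getitem__))
    return targets

# Hypothetical stub scorer: pipeline "B" benefits from longer inputs.
def toy_f1(sentence, pipeline):
    return len(sentence) / 100 if pipeline == "B" else 0.25

targets = build_target_classes(
    ["Short text.", "A much longer and more complex input sentence."],
    ["A", "B"],
    toy_f1,
)
print(targets)  # [0, 1]
```

The resulting (sentence, target class) pairs are what a sequence classifier such as RoBERTa would then be fine-tuned on.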
Architecture: Plumber includes the following modules: i) IE Components Pool: All information extraction components that are integrated within the framework are part of the pool. The components are divided based on their respective tasks, i.e., coreference resolution, text triple extraction, as well as entity and relation linking. ii) Pipeline Generator: This module creates possible pipelines depending on the requirements of the components (i.e., the underlying KG). Users can manually select the underlying KG and, using the metadata associated with each component, Plumber aggregates the components for the concerned KG. iii) IE Pipelines Pool: Plumber stores the configurations of the possible pipelines in the pool of pipelines for faster retrieval and easier interaction with other modules. iv) Pipeline Selector: Based on the requirements (i.e., underlying KG) and the input text, a RoBERTa-based model extracts contextual embeddings from the text and classifies the input into one of the possible classes. Each class corresponds to one pipeline configuration that is held in the IE pipelines pool. v) Pipeline Runner: Given the input text and the generated pipeline configuration, the module executes the pipeline and produces the final KG triples.

[Figure 2: the IE Components Pool (CR components: Stanford Resolver, Neuralcoref, …; TE components: OpenIE, ReVerb, …; EL/RL components: EARL, Falcon, …; E2E components: T2KG, Seq2RDF, …) feeds the Pipeline Generator, which fills the IE Pipelines Pool; the RoBERTa-based Pipeline Selector picks the best pipeline configuration for the natural language text and requirements, and the Pipeline Runner executes it against the knowledge graph to produce aligned triples.]

Fig. 2. Overview of Plumber's architecture highlighting the components for pipeline generation, selection, and execution. Plumber receives an input sentence and requirement (underlying KG) from the user. The framework intelligently selects a suitable pipeline based on the contextual features captured from the input sentence.
4 Evaluation
In this section, we detail the empirical evaluation of the framework in comparison
to baselines on different datasets and knowledge graphs. As such, we study the
following research question: How does the dynamic selection of pipelines based on
the input text affect the end-to-end information extraction task?
4.1 Experimental Setup
Knowledge Graphs: To study the effectiveness of Plumber in building dynamic KG information extraction pipelines, we use the following KGs during our evaluation:
DBpedia [2] contains information extracted automatically from Wikipedia infoboxes. DBpedia consists of approximately 11.5B triples [35].
The Open Research Knowledge Graph (ORKG) [24] collects structured scholarly knowledge published in research articles, using crowdsourcing and automated techniques. In total, ORKG consists of approximately 984K triples.
Datasets: Throughout our evaluation, we employed a set of existing and newly created datasets for structured triple extraction and alignment to knowledge graphs: the WebNLG [20] dataset for DBpedia, and COV-triples for ORKG.
WebNLG is the dataset of the Web Natural Language Generation Challenge, which introduced the task of aligning unstructured text to DBpedia. In total, the dataset contains 46K triples, with 9K triples in the testing and 37K in the training set.
COV-triples is a handcrafted dataset that focuses on COVID-19-related scholarly articles. It consists of 21 abstracts from peer-reviewed articles and aligns the natural language text to the corresponding KG triples in the ORKG. Three Semantic Web researchers verified annotation quality, and triples approved by all three researchers are part of the dataset. The dataset contains only 75 triples. Hence, we use the WebNLG dataset for training, and the 75 triples are used as a test set.
Components and Implementation: The Plumber framework integrates 33 components spanning different IE tasks: triple extraction, entity and relation linking, and coreference resolution. Most of the components used are open source, and they have been evaluated and used by the community in their respective publications. Plumber's code and all related resources are publicly available online at https://git.io/JtT1s.
Baselines: We include the following baselines:
T2KG [25] is an end-to-end static system that aligns a given natural language text to DBpedia KG triples.
Frankenstein [39] dynamically composes Question Answering pipelines over the DBpedia KG. It employs logistic-regression-based classifiers per component to predict the accuracy and greedily composes a dynamic pipeline of the best components per task. We adapted Frankenstein for KG information extraction over DBpedia.
4.2 Experiments
This section summarizes a variety of experiments comparing the Plumber framework against the baselines. Note that evaluating the performance of individual components or their combinations is out of this evaluation's scope, since they were already used, benchmarked, and evaluated in their respective publications. We report values of the standard metrics Precision (P), Recall (R), and F1 score (F1). In all experiments, end-to-end components (e.g., T2KG) are not part of Plumber.
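Precision, recall, and F1 at the triple level can be computed as below; exact match on the full (subject, predicate, object) tuple is an assumption here, since the matching criterion is not spelled out in the text:

```python
def triple_prf(predicted, gold):
    # Exact-match scoring over (subject, predicate, object) tuples.
    pred_set, gold_set = set(predicted), set(gold)
    tp = len(pred_set & gold_set)                 # correctly extracted triples
    p = tp / len(pred_set) if pred_set else 0.0   # precision
    r = tp / len(gold_set) if gold_set else 0.0   # recall
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1

gold = [
    ("dbr:Rembrandt", "dbo:Artist", "dbr:The_Storm_on_the_Sea_of_Galilee"),
    ("dbr:The_Storm_on_the_Sea_of_Galilee", "dbp:year", "1633"),
]
pred = gold[:1]  # e.g., a pipeline that recovers only the first triple
p, r, f1 = triple_prf(pred, gold)
print(p, r, round(f1, 3))  # 1.0 0.5 0.667
```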
Performance of Static Pipelines: In this experiment, we report results of the static pipelines, i.e., no dynamic selection of a pipeline based on the input text is considered. We ran all 264 pipelines, and Table 2 (rows noted T2KG and Static) reports the performance of the best Plumber pipeline against the baselines. The best static Plumber pipeline for DBpedia comprises NeuralCoref [8] for coreference resolution, OpenIE [1] for text triple extraction, TagMe [18] for EL, and Falcon [35] for RL. Also, in the case of Frankenstein, we chose its best-performing static pipeline. The results in Table 2 confirm that the static pipeline composed of the components integrated in Plumber outperforms all baselines on DBpedia. We observe that the performance of pipeline approaches is better than that of end-to-end monolithic information extraction approaches. Although the Plumber pipeline outperforms the baselines, the overall performance is relatively low. All components have been trained on distinct corpora in their respective publications, and our aim was to put them together to understand their collective strengths and weaknesses. Note that Frankenstein addresses the QA pipeline problem, and not all of its components are comparable or applicable in the context of information extraction. Thus, we integrated the NeuralCoref coreference resolution component and the OpenIE triple extraction component used in the Plumber static pipeline into Frankenstein to provide the same experimental settings.
Static Pipeline for Scholarly KG: In order to assess how Plumber performs on domain-specific use cases, we evaluate the static pipelines' performance on a scholarly knowledge graph. We use the COV-triples dataset for ORKG. To the best of our knowledge, no baseline exists for information extraction of research contribution descriptions over ORKG. Hence, we execute all static pipelines in Plumber tailored to ORKG to select the best one, as shown in Table 2 (COV-triples rows). Plumber pipelines over ORKG extract statements determining the reproductive number estimates for the COVID-19 infectious disease from scientific articles, as shown below.
@prefix orkg:  <http://orkg.org/orkg/resource/> .
@prefix orkgp: <http://orkg.org/orkg/property/> .

orkg:R48100 orkgp:P16022 "2.68" .
In this example, orkg:R48100 refers to the city of Wuhan in China in the ORKG
and orkgp:P16022 is the property “has R0 estimate (average)”. The number
“2.68” is the reproductive number estimate.
Comparison of the Classification Approaches for Dynamic Pipeline Selection: In this experiment, we study the effect of the transformer-based pipeline selection approach implemented in Plumber against the pipeline selection approach of Frankenstein. For a comparable experimental setting, we re-use Frankenstein's classification approach in Plumber, keeping the underlying components precisely the same. We perform a 10-fold cross-validation of the classification performance of the employed approaches. Table 1 indicates that the Plumber pipeline selection significantly outperforms the baseline across the board.
Table 1. 10-fold CV of pipeline selection classifiers w.r.t. Precision, Recall, and F1 score.

Approach           Dataset      Knowledge Graph   P      R      F1
Frankenstein [39]  WebNLG       DBpedia           0.732  0.751  0.741
Frankenstein [39]  COV-triples  ORKG              0.832  0.858  0.845
Plumber            WebNLG       DBpedia           0.877  0.900  0.888
Plumber            COV-triples  ORKG              0.901  0.917  0.909

Performance Comparison for KG Information Extraction Task: Our third experiment focuses on comparing the performance of Plumber against the previous baselines for an end-to-end information extraction task. The results in Table 2 illustrate that the dynamic pipelines built using Plumber for KG information extraction outperform the best static pipelines of Plumber, the dynamically selected pipelines of Frankenstein (rows noted Dynamic), and end-to-end baselines such as Kertkeidkachorn and Ichise [25]. We also observe that in the cross-domain experiments on the COV-triples dataset, dynamically selected pipelines perform better than the static pipeline. In the cross-domain experiment, both the static and dynamic Plumber pipelines perform relatively better than on DBpedia; unlike the components for DBpedia, the components integrated into Plumber for ORKG are customized for KG triple extraction. We conclude that when components are integrated into a framework such as Plumber aiming for the KG information extraction task, it is crucial to dynamically select the pipeline based on the input text. The superior performance of Plumber shows that dynamic pipeline selection has a positive impact agnostic of the underlying KG and dataset. This also answers our overall research question.
4.3 Ablation Studies
Plumber and the baselines render relatively low performance on all the employed datasets. Hence, in the ablation studies, our aim is to provide a holistic picture of underlying errors and the collective successes and failures of the integrated components.
In the first study, we calculate the proportion of errors in Plumber. The modular architecture of the proposed framework allows us to benchmark each component independently. We consider the erroneous cases of Plumber on the test set of the WebNLG dataset and calculate the performance (F1 score) of the Plumber dynamic pipeline (cf. Table 2) at each step in the pipeline. The results show that the coreference resolution components caused 21.54% of the errors, 33.71% are caused by text triple extractors, 18.17% by the entity linking components, and 26.58% by the relation linking components.

Table 2. Overall performance comparison of static and dynamic pipelines for the KG information extraction task.

System                       Dataset      Knowledge Graph   P      R      F1
T2KG [25]                    WebNLG       DBpedia           0.133  0.140  0.135
Frankenstein (Static) [39]   WebNLG       DBpedia           0.177  0.189  0.181
Plumber (Static)             WebNLG       DBpedia           0.210  0.225  0.215
Plumber (Static)             COV-triples  ORKG              0.403  0.423  0.413
Frankenstein (Dynamic) [39]  WebNLG       DBpedia           0.199  0.208  0.203
Frankenstein (Dynamic) [39]  COV-triples  ORKG              0.403  0.424  0.413
Plumber (Dynamic)            WebNLG       DBpedia           0.287  0.307  0.297
Plumber (Dynamic)            COV-triples  ORKG              0.411  0.437  0.424

We conclude that the text triple extractor components contribute the largest share of the errors over DBpedia. One possible reason for their limited performance is that open-domain information extraction components were not originally released for the KG information extraction task. Also, these components do not incorporate any schema or prior knowledge to guide the extraction. We observe that errors mainly occur when the sentence is complex (with more than one entity and predicate) or when relations are not explicitly mentioned in the sentence. We further analyzed the text triple extractor errors at the level of the triple subject, predicate, and object: most errors are in predicates (40.17%), followed by objects (35.98%) and subjects (23.85%).
Further Analysis: Aiming to understand why IE pipelines perform with low accuracy, we conduct a more in-depth analysis per IE task. In the first analysis, we evaluated each component independently on the WebNLG dataset. Researchers [12,40] proposed several criteria for micro-benchmarking tools/components for KG tasks (entity linking, relation linking, etc.) based on the linguistic features of a sentence. We base our analysis on the following:
I) Text Triple Extraction: We consider the number of words (wc) in the input sentence; a sentence is termed "simple" if it does not exceed the average word count of 7.41 [39], and sentences with more words than seven are considered complex. Furthermore, the presence of a comma in a sentence (sub-clause) separating clauses is another factor. Atomic sentences (e.g., "cats have tails") are a type of sentence that also affects triple extractors' behavior. Moreover, nominal relations, as in "Durin, son of Thorin", are another factor impacting performance. Uppercase and lowercase mentions of words (i.e., correct capitalization of the first character and not the entire word) in a sentence are standard error sources for entity linking components; we consider this a micro-benchmarking criterion as well.
II) Coreference Resolution: We focus on the length of the coreference chain (i.e., the number of aliases for a single mention). Additionally, the number of clusters is another criterion in the analysis. A cluster refers to a group of mentions that require disambiguation (e.g., in "mother bought a new phone, she is so happy about it", the first cluster is mother→she and the second is phone→it). The presence of proper nouns in the sentence is studied, as well as acronyms. Furthermore, the demonstrative nature of the sentence is also observed as a factor; demonstrative sentences are those that contain demonstrative pronouns (this, that, etc.).
III) Entity Linking: The number of entities in a sentence (e=1,2) is a crucial observation for the entity linking task. Capitalization of the surface form is another criterion for micro-benchmarking entity linking tools. An entity is termed explicit when the entity's surface form in a sentence matches the KG label, and implicit when there is a vocabulary mismatch. For example, in the sentence "The wife of Obama is Michelle Obama.", the surface form Obama is expected to be linked to dbr:Barack_Obama and is considered an implicit entity [40]. The last linguistic feature is the number of words (w) in an entity label (e.g., The Storm on the Sea of Galilee has seven words).
IV) Relation Linking: Similar to the entity linking criteria, we focus on the
number of relations in a sentence (rel=1,2). The type of relation (i.e., explicit or implicit) is another parameter. A covered relation (a sentence without a predicate surface form) is also used as a feature for micro-benchmarking: in "Which companies have launched a rocket from Cape Canaveral Air Force station?", the dbo:manufacturing relation is not mentioned in the sentence. Covered relations highly depend on common-sense knowledge (i.e., reasoning) and the structure of the KG [40]. Lastly, the number of words (w <= N) in a predicate surface form is also considered.
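Covered relations can likewise be flagged by checking for the absence of any predicate surface form; the predicate lexicon below is a toy stand-in of our own, and the real criterion additionally relies on the KG structure [40], which this simplification ignores:

```python
import re

def relation_feature(sentence: str, predicate_lexicon: set) -> str:
    """'covered' when no predicate surface form occurs in the sentence."""
    tokens = set(re.findall(r"[a-z]+", sentence.lower()))
    return "surface" if tokens & predicate_lexicon else "covered"

# toy predicate surface forms for the relations of interest
lexicon = {"wife", "spouse", "manufacturer", "manufacturing"}

print(relation_feature("The wife of Obama is Michelle Obama.", lexicon))
print(relation_feature(
    "Which companies have launched a rocket from Cape Canaveral Air Force station?",
    lexicon))
```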
Figure 3 illustrates the micro-benchmarking of various Plumber components per task. We observe that, across IE tasks, the F1 score of the components varies significantly based on the sentence's linguistic features. In fact, there exists no single component that performs equally well on all the micro-benchmarking criteria. This observation further validates our hypothesis of designing Plumber to build dynamic information extraction pipelines based on the strengths and weaknesses of the integrated components. We also note in Figure 3 that all the CR components report limited performance for demonstrative sentences (demonstratives). When there is more than one coreference cluster in a sentence, all CR components observe a discernible drop in F1 score. The NeuralCoref [8] component performs best for proper nouns, whereas PyCobalt [19] performs best for the acronyms feature (almost tied by NeuralCoref). In the TE task, Graphene [7] shows the most stable performance across all categories. However, the performance of all components (except the Dependency Parser) drops significantly when the number of words in a sentence exceeds seven (wc > 7). Case sensitivity also affects performance: all components observe a noticeable drop in F1 score for lowercase entity mentions in the sentence. Similar behavior is observed for entity linking components, where case sensitivity is a significant cause of poor performance. When a sentence has one entity and it is implicit (e=1, implicit), all entity linking components face challenges in correctly linking the entity to the underlying KG. Relation linking components also report lower performance for implicit relations.
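The per-feature scores in Figure 3 follow a generic recipe: slice the evaluation set by linguistic feature and average each component's F1 over every slice. A minimal sketch, with the component outputs and gold annotations mocked for illustration:

```python
from collections import defaultdict

def micro_benchmark(samples, component, feature_of):
    """Average per-feature F1 of one component over (sentence, gold) samples."""
    buckets = defaultdict(list)
    for sentence, gold in samples:
        pred = component(sentence)
        tp = len(pred & gold)
        precision = tp / len(pred) if pred else 0.0
        recall = tp / len(gold) if gold else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        buckets[feature_of(sentence)].append(f1)
    return {feature: sum(scores) / len(scores) for feature, scores in buckets.items()}

# mocked extractor: always predicts the first token; mocked gold annotations
component = lambda s: {s.split()[0]}
samples = [("cats have tails", {"cats"}),
           ("the storm on the sea of galilee hangs there", {"galilee"})]
feature_of = lambda s: "wc<=7" if len(s.split()) <= 7 else "wc>7"

print(micro_benchmark(samples, component, feature_of))
```

Repeating this per component and per feature yields exactly the kind of heatmap cells shown in Figure 3.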
5 Discussion
Even though the dynamic pipelines of Plumber outperform static pipelines, the overall performance of Plumber and the baselines on the KG information extraction task remains low. Our detailed and exhaustive ablation studies suggest that when individual components are plugged together, their individual performance is a major error source. However, this behavior is expected: earlier research in other domains observed a similar trend, e.g., the GERBIL framework [41] in 2015 and Frankenstein [39] in 2018. Within two years, the community
[Figure 3: four heatmaps — (a) F1 score heatmap of the EL task, (b) F1 score heatmap of the TE task, (c) F1 score heatmap of the CR task, (d) F1 score heatmap of the RL task.]

Fig. 3. Comparison of F1 scores per component for different IE tasks based on the various linguistic features of an input sentence (number of entities, word count in a sentence, implicit vs. explicit relation, etc.). Darker colors indicate a higher F1 score.
has released several components dedicated to solving entity linking and relation linking [35, 15, 30], which were two gaps identified by [39] for the QA task. We observe that state-of-the-art components for information extraction still have much potential to improve their performance (both in terms of runtime and F1 score). It is essential to highlight that some of the issues observed in our ablation study are very basic and have been repeatedly pointed out by researchers in the community. For instance, Derczynski et al. [12] in 2015, followed by Singh et al. [39] in 2018, showed that case sensitivity is a main challenge for EL tools. Our observation in Figure 3 again confirms that case sensitivity of entity surface forms remains an open issue even for newly released components. In contrast, on specific datasets such as CoNLL-AIDA, several EL approaches reported F1 scores higher than 0.90 [43], showing that EL tools are highly customized to particular datasets. In a real-world scenario like ours, the underlying limitations of these approaches are uncovered.
6 Conclusion and Future Work
In this paper, we presented the Plumber approach and framework for information extraction. Plumber effectively selects the best possible pipeline for a given input sentence using sentential contextual features and a state-of-the-art transformer-based classification model. Plumber has a service-oriented architecture that is scalable, extensible, reusable, and agnostic of the underlying KG. The core idea of Plumber is to combine the strengths of already existing but disjoint research efforts for KG information extraction and to build the foundation of a platform that promotes reusability for the construction of large-scale and semantically structured KGs. Our empirical results suggest that the performance of the individual components directly impacts the end-to-end information extraction accuracy.
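Conceptually, Plumber's selection step reduces to classifying an input sentence into one of the composable pipelines. The sketch below enumerates a toy component pool and mocks the classifier with a hand-written scorer; the actual framework scores 264 pipelines with the transformer-based model, and both the pool and the scorer here are illustrative assumptions:

```python
from itertools import product

def build_pipelines(cr_pool, te_pool, el_pool, rl_pool):
    """Enumerate every pipeline: one component per IE subtask."""
    return list(product(cr_pool, te_pool, el_pool, rl_pool))

def select_pipeline(sentence, pipelines, scorer):
    """Return the pipeline the classifier scores highest for this sentence."""
    return max(pipelines, key=lambda pipeline: scorer(sentence, pipeline))

pipelines = build_pipelines(
    ["NeuralCoref", "PyCobalt"],
    ["Graphene", "OpenIE", "ClausIE"],
    ["Falcon", "TagMe"],
    ["Falcon RL", "EARL RL"],
)
print(len(pipelines))  # 2 * 3 * 2 * 2 = 24 toy pipelines

# mocked scorer: prefer Graphene-based pipelines for longer sentences
scorer = lambda s, p: float(len(s.split()) > 7 and p[1] == "Graphene")
chosen = select_pipeline(
    "Durin son of Thorin forged the greatest halls of Khazad-dum",
    pipelines, scorer)
print(chosen)
```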
This article does not analyze failure cases with respect to the internal system architecture or the algorithms employed in a particular IE component. Instead, the focus of the ablation studies is to holistically study the collective success and failure cases across the various tasks. Our studies provide the research community with insightful results over two knowledge graphs, 33 components, and 264 pipelines. Our work is a step in the larger research agenda of offering the research community an effective way of synergistically combining and orchestrating various focused IE approaches, balancing their strengths and weaknesses while taking different application domains into account. We plan to extend our work in the following directions: i) extending Plumber to other KGs such as UMLS [5] and Wikidata [42], ii) addressing multilinguality with Plumber, and iii) creating high-performing RL components.
References

1. Angeli, G., Johnson Premkumar, M.J., Manning, C.D.: Leveraging linguistic structure for open domain information extraction. pp. 344–354. ACL (2015)
2. Auer, S., Bizer, C., Kobilarov, G., Lehmann, J., Cyganiak, R., Ives, Z.: DBpedia: A nucleus for a web of open data. In: The Semantic Web. pp. 722–735 (2007)
3. Balog, K.: Entity linking. In: Entity-Oriented Search, pp. 147–188. Springer (2018)
4. Bastos, A., Nadgeri, A., Singh, K., Mulang, I.O., Shekarpour, S., Hoffart, J.: RECON: Relation extraction using knowledge graph context in a graph neural network (2020)
5. Bodenreider, O.: The Unified Medical Language System (UMLS): integrating biomedical terminology. Nucleic Acids Research 32, D267–D270 (2004)
6. Both, A., Diefenbach, D., Singh, K., Shekarpour, S., Cherix, D., Lange, C.: Qanary - A methodology for vocabulary-driven open question answering systems. vol. 9678, pp. 625–641 (2016)
7. Cetto, M., Niklaus, C., Freitas, A., Handschuh, S.: Graphene: Semantically-linked propositions in open information extraction. In: Proceedings of the 27th COLING. pp. 2300–2311 (2018)
8. Clark, K., Manning, C.D.: Deep reinforcement learning for mention-ranking coreference models. In: Proceedings of the 2016 EMNLP. pp. 2256–2262 (2016)
9. Cui, W., Liu, S., Wu, Z., Wei, H.: How hierarchical topics evolve in large text corpora. IEEE TVCG 20(12), 2281–2290 (2014)
10. Daiber, J., Jakob, M., Hokamp, C., Mendes, P.N.: Improving efficiency and accuracy in multilingual entity extraction. In: Proceedings of the 9th I-Semantics (2013)
11. Delpeuch, A.: OpenTapioca: Lightweight entity linking for Wikidata (2019)
12. Derczynski, L., Maynard, D., Rizzo, G., Van Erp, M., Gorrell, G., Troncy, R., Petrak, J., Bontcheva, K.: Analysis of named entity recognition and linking for tweets. Information Processing & Management 51, 32–49 (2015)
13. Devlin, J., Chang, M.W., Lee, K., Toutanova, K.: BERT: Pre-training of deep bidirectional transformers for language understanding. In: NAACL. pp. 4171–4186 (2019)
14. Dong, T., Wang, Z., Li, J., Bauckhage, C., Cremers, A.B.: Triple classification using regions and fine-grained entity typing. In: Proceedings of the AAAI Conference on Artificial Intelligence. vol. 33, pp. 77–85 (2019)
15. Dubey, M., Banerjee, D., Chaudhuri, D., Lehmann, J.: EARL: Joint entity and relation linking for question answering over knowledge graphs. In: Lecture Notes in Computer Science, pp. 108–126. Springer International Publishing (2018)
16. Fader, A., Soderland, S., Etzioni, O.: Identifying relations for open information extraction. In: Proceedings of the 2011 EMNLP. pp. 1535–1545 (Jul 2011)
17. Fensel, D., van Harmelen, F., Andersson, B., Brennan, P., Cunningham, H., Della Valle, E., Fischer, F., Huang, Z., Kiryakov, A., Lee, T.K., Witbrock, M., Zhong, N.: Towards LarKC: A platform for web-scale reasoning. In: IEEE ICSC. pp. 524–529 (2008)
18. Ferragina, P., Scaiella, U.: TAGME: on-the-fly annotation of short text fragments (by Wikipedia entities). pp. 1625–1628 (2010)
19. Freitas, A., Bermeitinger, B., Handschuh, S.: Lambda-3/PyCobalt: Coreference resolution in Python. https://github.com/Lambda-3/PyCobalt
20. Gardent, C., Shimorina, A., Narayan, S., Perez-Beltrachini, L.: Creating training corpora for NLG micro-planners. pp. 179–188 (2017)
21. Gashteovski, K., Gemulla, R., del Corro, L.: MinIE: Minimizing facts in open information extraction. In: Proceedings of the 2017 EMNLP. pp. 2630–2640 (2017)
22. Hou, Y., Jochim, C., Gleize, M., Bonin, F., Ganguly, D.: Identification of tasks, datasets, evaluation metrics, and numeric scores for scientific leaderboards construction. In: Proceedings of the 57th ACL. pp. 5203–5213 (2019)
23. Ibrahim, Y., Riedewald, M., Weikum, G., Zeinalipour-Yazti, D.: Bridging quantities in tables and text. In: 2019 IEEE 35th ICDE. pp. 1010–1021 (2019)
24. Jaradeh, M.Y., Oelen, A., Farfar, K.E., Prinz, M., D'Souza, J., Kismihók, G., Stocker, M., Auer, S.: Open Research Knowledge Graph: Next generation infrastructure for semantic scholarly knowledge. In: K-CAP 2019, Marina Del Rey (2019)
25. Kertkeidkachorn, N., Ichise, R.: T2KG: An end-to-end system for creating knowledge graph from unstructured text. In: AAAI Workshops. vol. WS-17 (2017)
26. Kim, J.D., Unger, C., Ngomo, A.C.N., Freitas, A., Hahm, Y.g., Kim, J., Nam, S., Choi, G.H., Kim, J.u., Usbeck, R., et al.: OKBQA framework for collaboration on developing natural language question answering systems (2017)
27. Liang, S., Stockinger, K., de Farias, T.M., Anisimova, M., Gil, M.: Querying knowledge graphs in natural language (2020)
28. Liu, Y., Zhang, T., Liang, Z., Ji, H., McGuinness, D.: Seq2RDF: An end-to-end application for deriving triples from natural language text (2018)
29. Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Zettlemoyer, L., Stoyanov, V.: RoBERTa: A robustly optimized BERT pretraining approach (2019)
30. Mihindukulasooriya, N., Rossiello, G., Kapanipathi, P., Abdelaziz, I., Ravishankar, S., Yu, M., Gliozzo, A., Roukos, S., Gray, A.: Leveraging semantic parsing for relation linking over knowledge bases. ISWC (to appear) (2020)
31. Morbidoni, C., Polleres, A., Tummarello, G., Le-Phuoc, D.: Semantic web pipes (2007)
32. Niklaus, C., Cetto, M., Freitas, A., Handschuh, S.: A survey on open information extraction. In: Proceedings of the 27th COLING. pp. 3866–3878 (2018)
33. Ponza, M., Del Corro, L., Weikum, G.: Facts that matter. In: Proceedings of the 2018 EMNLP. pp. 1043–1048. ACL (2018)
34. Raghunathan, K., Lee, H., Rangarajan, S., Chambers, N., Surdeanu, M., Jurafsky, D., Manning, C.: A multi-pass sieve for coreference resolution. In: EMNLP (2010)
35. Sakor, A., Onando Mulang', I., Singh, K., Shekarpour, S., Esther Vidal, M., Lehmann, J., Auer, S.: Old is gold: Linguistic driven approach for entity and relation linking of short text. pp. 2336–2346. ACL (2019)
36. Sakor, A., Singh, K., Patel, A., Vidal, M.E.: Falcon 2.0: An entity and relation linking tool over Wikidata. In: CIKM (2020)
37. Sanh, V., Wolf, T., Ruder, S.: A hierarchical multi-task approach for learning embeddings from semantic tasks. Proceedings of the AAAI 33, 6949–6956 (2019)
38. Singh, K., Mulang, I.O., Lytra, I., Jaradeh, M.Y., Sakor, A., Vidal, M., Lange, C., Auer, S.: Capturing knowledge in semantically-typed relational patterns to enhance relation linking. In: Proceedings of the Knowledge Capture Conference, K-CAP 2017, Austin, TX, USA, December 4-6, 2017. pp. 31:1–31:8 (2017)
39. Singh, K., Radhakrishna, A.S., Both, A., Shekarpour, S., Lytra, I., Usbeck, R., Vyas, A., Khikmatullaev, A., Punjani, D., Lange, C., Vidal, M.E., Lehmann, J., Auer, S.: Why reinvent the wheel: Let's build question answering systems together. pp. 1247–1256. WWW '18 (2018)
40. Singh, K., Saleem, M., Nadgeri, A., Conrads, F., Pan, J.Z., Ngomo, A.C.N., Lehmann, J.: QaldGen: Towards microbenchmarking of question answering systems over knowledge graphs. In: ISWC. pp. 277–292 (2019)
41. Usbeck, R., Röder, M., et al.: GERBIL: General entity annotator benchmarking framework. In: Proceedings of the 24th WWW. pp. 1133–1143 (2015)
42. Vrandečić, D., Krötzsch, M.: Wikidata: A free collaborative knowledgebase. Communications of the ACM 57(10), 78–85 (2014)
43. Yang, X., Gu, X., Lin, S., Tang, S., Zhuang, Y., Wu, F., Chen, Z., Hu, G., Ren, X.: Learning dynamic context augmentation for global entity linking. In: EMNLP-IJCNLP. pp. 271–281 (2019)
44. Yao, L., Mao, C., Luo, Y.: KG-BERT: BERT for knowledge graph completion (2019)
45. Yu, W., Li, Z., Zeng, Q., Jiang, M.: Tablepedia: Automating PDF table reading in an experimental evidence exploration and analytic system. pp. 3615–3619. WWW '19 (2019)