Chapter

A Novel Ensemble Method for Named Entity Recognition and Disambiguation Based on Neural Network: 17th International Semantic Web Conference, Monterey, CA, USA, October 8–12, 2018, Proceedings, Part I

... The survey in [28] presents a thorough overview of the main approaches to EL, while more recent works (like [4], [17] and [9]) exploit the idea of neural networks and deep learning. To the best of our knowledge, [26], [6] and [2] are the only previous works that focus on the related (yet different) problem of MetaEL, i.e., on how to combine the outputs of multiple EL tools for providing a unified set of entity annotations. ...
... Two of our baselines consider this approach. [2] describes a framework to combine the responses of multiple EL tools which relies on the joint training of two deep neural models. However, this work is not applicable in our MetaEL problem since it makes use of external knowledge (pre-trained word embeddings and entity abstracts) as well as entity type information (a type taxonomy from each extractor), as opposed to our MetaEL task which only considers plain lists of entity annotations. ...
Preprint
Full-text available
Entity linking (EL) is the task of automatically identifying entity mentions in text and resolving them to a corresponding entity in a reference knowledge base like Wikipedia. Throughout the past decade, a plethora of EL systems and pipelines have become available, where performance of individual systems varies heavily across corpora, languages or domains. Linking performance varies even between different mentions in the same text corpus, where, for instance, some EL approaches are better able to deal with short surface forms while others may perform better when more context information is available. To this end, we argue that performance may be optimised by exploiting results from distinct EL systems on the same corpus, thereby leveraging their individual strengths on a per-mention basis. In this paper, we introduce a supervised approach which exploits the output of multiple ready-made EL systems by predicting the correct link on a per-mention basis. Experimental results obtained on existing ground truth datasets and exploiting three state-of-the-art EL systems show the effectiveness of our approach and its capacity to significantly outperform the individual EL systems as well as a set of baseline methods.
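The per-mention selection idea in this abstract can be sketched in a few lines. The snippet below is a hypothetical illustration, not the paper's method: it uses a single toy feature (surface-form length), learns from ground truth which EL system is most often correct per feature bucket, and then trusts that system per mention. All names (`train_selector`, the `short`/`long` buckets, the 5-character threshold) are assumptions.

```python
from collections import defaultdict

def train_selector(training_data):
    """Learn, per feature bucket (here: short vs. long surface form),
    which EL system is most often correct on the training mentions.
    training_data: list of (mention, {system: link}, gold_link)."""
    correct = defaultdict(lambda: defaultdict(int))
    for mention, outputs, gold in training_data:
        bucket = "short" if len(mention) <= 5 else "long"
        for system, link in outputs.items():
            if link == gold:
                correct[bucket][system] += 1
    return {b: max(c, key=c.get) for b, c in correct.items()}

def predict(selector, mention, outputs):
    """Return the link proposed by the system trusted for this bucket."""
    bucket = "short" if len(mention) <= 5 else "long"
    return outputs[selector[bucket]]
```

A real system would use richer per-mention features (context length, mention ambiguity) and a trained classifier rather than a lookup, but the selection-per-mention structure is the same.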
... Earlier joint methods, such as NERD-ML [74] (more below), relied on a dedicated extractor for each specific language. More recently, [75] has proposed a multilingual ensemble that combines multiple extractors for joint NER and NED, where the ensemble idea is to combine the output of several alternative components into a single and presumably better result, for example by averaging or voting. The ensemble produces a list of entities with their types and disambiguation links (called ground truths). ...
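The averaging/voting idea mentioned in this excerpt can be illustrated with a minimal majority vote over extractor outputs. This is a hedged sketch, not code from [75]: it assumes each extractor returns a dict mapping a mention to an (entity link, type) pair, and keeps the most common pair per mention.

```python
from collections import Counter

def vote(annotations):
    """annotations: list of per-extractor outputs, each a dict
    mention -> (entity_link, type). Majority vote per mention;
    mentions found by only some extractors are still included."""
    mentions = set().union(*annotations)
    result = {}
    for m in mentions:
        votes = Counter(a[m] for a in annotations if m in a)
        result[m] = votes.most_common(1)[0][0]
    return result
```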
... We have only found a few studies that investigate more than one language: NewsReader [112] which considers English, Dutch, Spanish and Italian; [118] which considers English and Dutch; and [119] which considers morphological languages such as Turkish and Russian. Hence, dealing with multilingual texts in general has so far received little attention [75], perhaps in part due to the limited availability of training datasets for languages other than English. Recently proposed multi-and cross-lingual approaches such as [112], [119] can train a model on a richly-resourced language and then apply it to a more sparsely-resourced one. ...
Article
Full-text available
An enormous amount of digital information is expressed as natural-language (NL) text that is not easily processable by computers. Knowledge Graphs (KG) offer a widely used format for representing information in computer-processable form. Natural Language Processing (NLP) is therefore needed for mining (or lifting) knowledge graphs from NL texts. A central part of the problem is to extract the named entities in the text. The paper presents an overview of recent advances in this area, covering: Named Entity Recognition (NER), Named Entity Disambiguation (NED), and Named Entity Linking (NEL). We comment that many approaches to NED and NEL are based on older approaches to NER and need to leverage the outputs of state-of-the-art NER systems. There is also a need for standard methods to evaluate and compare named-entity extraction approaches. We observe that NEL has recently moved from being stepwise and isolated into an integrated process along two dimensions: the first is that previously sequential steps are now being integrated into end-to-end processes, and the second is that entities that were previously analysed in isolation are now being lifted in each other’s context. The current culmination of these trends are the deep-learning approaches that have recently reported promising results.
Conference Paper
Full-text available
Aligning named entity taxonomies for comparing or combining different named entity extraction systems is a difficult task. Often taxonomies are mapped manually onto each other or onto a standardized ontology but at the loss of subtleties between different class extensions and domain specific uses of the taxonomy. In this paper, we present an approach and experiments for learning customized taxonomy alignments between different entity extractors for different domains. Our inductive data-driven approach recasts the alignment problem as a classification problem. We present experiments on two named entity recognition benchmark datasets, namely the CoNLL2003 newswire dataset and the MSM2013 microposts dataset. Our results show that the automatically induced mappings outperform manual alignments and are agnostic to changes in the extractor taxonomies, implying that alignments are highly contextual.
Poster
Full-text available
Named Entity Extraction is a mature task in the NLP field that has yielded numerous services gaining popularity in the Semantic Web community for extracting knowledge from web documents. These services are generally organized as pipelines, using dedicated APIs and different taxonomies for extracting, classifying and disambiguating named entities. Integrating one of these services in a particular application requires implementing an appropriate driver. Furthermore, the results of these services are not comparable due to different formats. This prevents the comparison of the performance of these services as well as their possible combination. We address this problem by proposing NERD, a framework which unifies 10 popular named entity extractors available on the web, and the NERD ontology which provides a rich set of axioms aligning the taxonomies of these tools.
Article
Full-text available
The automated categorization (or classification) of texts into predefined categories has witnessed a booming interest in the last ten years, due to the increased availability of documents in digital form and the ensuing need to organize them. In the research community the dominant approach to this problem is based on machine learning techniques: a general inductive process automatically builds a classifier by learning, from a set of preclassified documents, the characteristics of the categories. The advantages of this approach over the knowledge engineering approach (consisting in the manual definition of a classifier by domain experts) are a very good effectiveness, considerable savings in terms of expert labor power, and straightforward portability to different domains. This survey discusses the main approaches to text categorization that fall within the machine learning paradigm. We will discuss in detail issues pertaining to three different problems, namely document representation, classifier construction, and classifier evaluation.
Conference Paper
Full-text available
Most current statistical natural language processing models use only local features so as to permit dynamic programming in inference, but this makes them unable to fully account for the long distance structure that is prevalent in language use. We show how to solve this dilemma with Gibbs sampling, a simple Monte Carlo method used to perform approximate inference in factored probabilistic models. By using simulated annealing in place of Viterbi decoding in sequence models such as HMMs, CMMs, and CRFs, it is possible to incorporate non-local structure while preserving tractable inference. We use this technique to augment an existing CRF-based information extraction system with long-distance dependency models, enforcing label consistency and extraction template consistency constraints. This technique results in an error reduction of up to 9% over state-of-the-art systems on two established information extraction tasks.
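The substitution of simulated annealing for Viterbi decoding can be illustrated on a toy model. The sketch below is not the paper's CRF system: it uses hand-set emission scores plus a non-local "label consistency" bonus that rewards identical tokens receiving identical labels, Gibbs-samples one label at a time while lowering the temperature, then finishes with greedy (zero-temperature) sweeps. All scores, names and the annealing schedule are illustrative assumptions.

```python
import math
import random

def anneal_labels(tokens, emissions, labels, bonus=2.0, seed=0):
    """Sample a label sequence from a model with non-local consistency
    bonuses, annealing the temperature, then run greedy (ICM) sweeps."""
    rng = random.Random(seed)
    state = [labels[0]] * len(tokens)

    def local_score(i, lab):
        # Emission score plus a bonus for each identical token
        # elsewhere in the sequence that carries the same label.
        s = emissions[i].get(lab, 0.0)
        s += bonus * sum(1 for j, t in enumerate(tokens)
                         if j != i and t == tokens[i] and state[j] == lab)
        return s

    for temp in (2.0, 1.0, 0.5, 0.1):      # cooling schedule
        for _ in range(20):
            i = rng.randrange(len(tokens))
            weights = [math.exp(local_score(i, lab) / temp) for lab in labels]
            state[i] = rng.choices(labels, weights)[0]
    changed = True
    while changed:                          # zero-temperature sweeps
        changed = False
        for i in range(len(tokens)):
            best = max(labels, key=lambda lab: local_score(i, lab))
            if best != state[i]:
                state[i], changed = best, True
    return state
```

With a strong consistency bonus, a second, ambiguous occurrence of a token inherits the label that its clearer occurrence supports, which is exactly the non-local effect Viterbi decoding cannot express.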
Conference Paper
Implementing the multilingual Semantic Web vision requires transforming unstructured data in multiple languages from the Document Web into structured data for the multilingual Web of Data. We present the multilingual version of FOX, a knowledge extraction suite which supports this migration by providing named entity recognition based on ensemble learning for five languages. Our evaluation results show that our approach goes beyond the performance of existing named entity recognition systems on all five languages. In our best run, we outperform the state of the art by a gain of 32.38% F1-Score points on a Dutch dataset. More information and a demo can be found at http://fox.aksw.org, as well as an extended version of the paper describing the evaluation in detail.
Conference Paper
Among different recommendation techniques, collaborative filtering usually suffers from limited performance due to the sparsity of user-item interactions. To address this issue, auxiliary information is usually used to boost the performance. Due to the rapid collection of information on the web, the knowledge base provides heterogeneous information including both structured and unstructured data with different semantics, which can be consumed by various applications. In this paper, we investigate how to leverage the heterogeneous information in a knowledge base to improve the quality of recommender systems. First, by exploiting the knowledge base, we design three components to extract items' semantic representations from structural content, textual content and visual content, respectively. To be specific, we adopt a heterogeneous network embedding method, termed TransR, to extract items' structural representations by considering the heterogeneity of both nodes and relationships. We apply stacked denoising auto-encoders and stacked convolutional auto-encoders, which are two types of deep learning based embedding techniques, to extract items' textual representations and visual representations, respectively. Finally, we propose our final integrated framework, termed Collaborative Knowledge Base Embedding (CKE), to jointly learn the latent representations in collaborative filtering as well as items' semantic representations from the knowledge base. To evaluate the performance of each embedding component as well as the whole system, we conduct extensive experiments with two real-world datasets from different scenarios. The results reveal that our approaches outperform several widely adopted state-of-the-art recommendation methods.
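Of the components listed, the TransR scoring function is compact enough to sketch: entity vectors are projected into a relation-specific space by a matrix M_r, and a triple (h, r, t) is scored by the distance || M_r h + r − M_r t || (lower is more plausible). The pure-Python helper below is an illustrative assumption, not the CKE implementation.

```python
def transr_score(h, t, r, M_r):
    """Distance of triple (h, r, t) under TransR: project head and tail
    entity vectors into the relation space via M_r, translate by r,
    and return the Euclidean norm of the residual."""
    def matvec(M, v):
        return [sum(M[i][j] * v[j] for j in range(len(v)))
                for i in range(len(M))]
    hp, tp = matvec(M_r, h), matvec(M_r, t)
    diff = [hp[i] + r[i] - tp[i] for i in range(len(r))]
    return sum(d * d for d in diff) ** 0.5
```

Training then pushes this distance down for observed triples and up for corrupted ones; here only the scoring step is shown.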
Article
Continuous word representations, trained on large unlabeled corpora, are useful for many natural language processing tasks. Many popular models to learn such representations ignore the morphology of words, by assigning a distinct vector to each word. This is a limitation, especially for morphologically rich languages with large vocabularies and many rare words. In this paper, we propose a new approach based on the skip-gram model, where each word is represented as a bag of character n-grams. A vector representation is associated to each character n-gram, words being represented as the sum of these representations. Our method is fast, allowing models to be trained on large corpora quickly. We evaluate the obtained word representations on five different languages, on word similarity and analogy tasks.
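The bag-of-character-n-grams representation described here is easy to sketch: a word is wrapped in boundary markers, split into n-grams, and its vector is the sum of the n-gram vectors. The helper names and the 3-to-4-character n-gram range below are assumptions for illustration, not the paper's exact settings.

```python
def char_ngrams(word, n_min=3, n_max=4):
    """Boundary-marked character n-grams of a word
    ('<' and '>' mark the word boundaries)."""
    w = f"<{word}>"
    return [w[i:i + n] for n in range(n_min, n_max + 1)
            for i in range(len(w) - n + 1)]

def word_vector(word, ngram_vectors, dim):
    """A word's vector is the sum of its n-gram vectors;
    n-grams unseen in training contribute nothing."""
    vec = [0.0] * dim
    for g in char_ngrams(word):
        for i, x in enumerate(ngram_vectors.get(g, [0.0] * dim)):
            vec[i] += x
    return vec
```

Because the vector is composed from subword units, an out-of-vocabulary word still gets a non-trivial representation whenever it shares n-grams with training words.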
Conference Paper
A considerable portion of the information on the Web is still only available in unstructured form. Implementing the vision of the Semantic Web thus requires transforming this unstructured data into structured data. One key step during this process is the recognition of named entities. Previous works suggest that ensemble learning can be used to improve the performance of named entity recognition tools. However, no comparison of the performance of existing supervised machine learning approaches on this task has been presented so far. We address this research gap by presenting a thorough evaluation of named entity recognition based on ensemble learning. To this end, we combine four different state-of-the-art approaches by using 15 different algorithms for ensemble learning and evaluate their performance on five different datasets. Our results suggest that ensemble learning can reduce the error rate of state-of-the-art named entity recognition systems by 40%, thereby leading to over 95% F-score in our best run.
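One simple way to realize such ensemble learning is stacking: treat the tuple of base NER labels for a token as a feature vector and learn a mapping to the gold label. The decision-table learner below is a minimal stand-in for the 15 algorithms evaluated in the paper; all names and the `"O"` fallback are assumptions for illustration.

```python
from collections import Counter, defaultdict

def train_stacker(rows):
    """rows: list of (base_label_tuple, gold_label). For each observed
    combination of base-system outputs, remember the most frequent
    gold label (a minimal 'decision table' meta-learner)."""
    table = defaultdict(Counter)
    for base, gold in rows:
        table[base][gold] += 1
    return {b: c.most_common(1)[0][0] for b, c in table.items()}

def stack_predict(model, base, fallback="O"):
    """Label a token from its base-system outputs; unseen
    combinations fall back to the non-entity label."""
    return model.get(base, fallback)
```

Unlike plain voting, a trained combiner of this kind can learn to override the majority when the data shows a particular combination of base outputs is systematically wrong.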
Conference Paper
We analyze some of the fundamental design challenges and misconceptions that underlie the development of an efficient and robust NER system. In particular, we address issues such as the representation of text chunks, the inference approach needed to combine local NER decisions, the sources of prior knowledge and how to use them within an NER system. In the process of comparing several solutions to these challenges we reach some surprising conclusions, as well as develop an NER system that achieves 90.8 F1 score on the CoNLL-2003 NER shared task, the best reported result for this dataset.
Appendix B: MUC-7 test scores introduction
  • N. Chinchor