Thierry Declerck

Deutsches Forschungszentrum für Künstliche Intelligenz | DFKI · Language Technology

MA

163
Publications
15,594
Reads
1,365
Citations

Publications (163)
Article
Full-text available
This article provides a comprehensive and up-to-date survey of models and vocabularies for creating linguistic linked data (LLD) focusing on the latest developments in the area and both building upon and complementing previous works covering similar territory. The article begins with an overview of some recent trends which have had a significant im...
Conference Paper
Full-text available
We describe ongoing work consisting in adding pronunciation information to wordnets, as such information can indicate specific senses of a word. Many wordnets associate with their senses only a lemma form and a part-of-speech tag. At the same time, we are aware that additional linguistic information can be useful for identifying a specific sense of...
Conference Paper
Full-text available
In this paper we describe the contributions made by the European H2020 project "Prêt-à-LLOD" ('Ready-to-use Multilingual Linked Language Data for Knowledge Services across Sectors') to the further development of the Linguistic Linked Open Data (LLOD) infrastructure. Prêt-à-LLOD aims to develop a new methodology for building data value chains applic...
Conference Paper
Full-text available
In this paper we describe our current work on representing a recently created German lexical semantics resource in OntoLex-Lemon and in conformance with WordNet specifications. Besides presenting the representation effort, we show the utilization of OntoLex-Lemon to bridge from WordNet-like resources to full lexical descriptions and extend the cov...
Conference Paper
Full-text available
We describe work consisting in porting two large German lexical resources into the OntoLex-Lemon model in order to establish complementary interlinkings between them. One resource is OdeNet (Open German WordNet) and the other is a further development of the German version of the MMORPH morphological analyzer. We show how the Multiword Expressions...
Conference Paper
Full-text available
This paper results from observations that have been made while studying ontological and linked data-based approaches to the encoding of biographical data. Based on certain issues we discovered and which will be described here, we aim to call for a collaborative work towards guidelines for modelling biographical data in the standard Semantic Web rep...
Conference Paper
In this paper we give an overview of a number of completed and ongoing efforts to port electronic versions of past classification schemes in the field of folktale narratives to the Linked Data framework. Three of those schemes are in the field of folktales, including: (1) the work by Vladimir Propp on the Morphology of the Folktale, (2...
Conference Paper
Clinical care providers express their judgments and observations towards the patient status in clinical narratives. In contrast to sentiment expressions in general domains targeted by language technology, clinical sentiments are influenced by related medical events such as clinical precondition or outcome of a treatment. We argue that patient statu...
Conference Paper
Full-text available
The Open Linguistics Working Group (OWLG) brings together researchers from various fields of linguistics, natural language processing, and information technology to present and discuss principles, case studies, and best practices for representing, publishing and linking linguistic data collections. A major outcome of our work is the Linguistic Link...
Conference Paper
In this paper, we present on-going work pursued in the context of the Pheme project. There, the detection of rumors in social media is playing a central role in two use cases. In order to be able to store and to query for information on specific types of rumors that can be circulated in such media (but also in “classical” media), we started to buil...
Conference Paper
Full-text available
This paper presents an approach for the formal representation of components in German compounds. We assume that such a formal representation will support the segmentation and analysis of unseen compounds that feature components already seen in other compounds. An extensive language resource that explicitly codes components of compounds is G...
Conference Paper
Full-text available
Our study presents a social media content representation, visualization and assessment method for modeling the emergence and resolution of rumorous claims during crisis events. Interpreting the factuality with which a claim is expressed is typically context-dependent and can be subject to semantic drift. We identify and quantify temporally anchored...
Article
This paper reports on an approach and experiments to automatically build a cross-lingual multi-word entity resource. Starting from a collection of millions of acronym/expansion pairs for 22 languages where expansion variants were grouped into monolingual clusters, we experiment with several aggregation strategies to link these clusters across langu...
Article
Representing time-dependent information has become increasingly important for reasoning and querying services defined on top of RDF and OWL. In particular, addressing this task properly is vital for practical applications such as modern biographical information systems, but also for the Semantic Web/Web 2.0/Social Web in general. Extending binary r...
Article
Full-text available
The recent massive growth in online media and the rise of user-authored content (e.g., weblogs, Twitter, Facebook) have led to challenges of how to access and interpret the strongly multilingual data in a timely, efficient, and affordable manner. The goal of this project is to deliver innovative, portable open-source real-time methods for cross-lingu...
Conference Paper
http://euralex2014.eurac.edu/en/callforpapers/Documents/EURALEX%202014_gesamt.pdf
Conference Paper
http://euralex2014.eurac.edu/en/callforpapers/Documents/EURALEX%202014_gesamt.pdf
Conference Paper
Full-text available
In this submission we describe results of work within the ABaC:us project dedicated to the extraction of lexical data from a corpus of sacred literature written in historical German language: All tokens occurring in the corpus have been semi-automatically mapped onto their corresponding lemmata in modern High German, which is one of the major achie...
Conference Paper
Full-text available
http://www.lrec-conf.org/proceedings/lrec2014/index.html
Conference Paper
Full-text available
We describe on-going work towards publishing language resources included in dialectal dictionaries in the Linked Open Data (LOD) cloud, and so to support wider access to the diverse cultural data associated with such dictionary entries, like the various historical and geographical variations of the use of such words. Beyond this, our approach allow...
Article
In this demo and poster paper, we describe the concept and implementation of an ontology-based storyteller for fairy tales. Its main functions are (i) annotating the tales by extracting timeline information, characters and dialogues with corresponding emotions expressed in the utterances, (ii) populating an existing ontology for fairy tales with th...
Conference Paper
Full-text available
In this position paper we discuss some of the experience we gained in describing lexical data using representation formalisms that are compatible with the publication of such data in the Linked Data framework. While we see a huge potential in the emerging Linguistic Linked Open Data, also supporting the publication of less-resourced language data on...
Conference Paper
Full-text available
We describe work on porting linguistic and semantic annotation applied to the Austrian Baroque Corpus (ABaC:us) to a format supporting its publication in the Linked Open Data Framework. This work includes several aspects, like a derived lexicon of old forms used in the texts and their mapping to modern German lemmas, the description of morpho-synt...
Article
This paper explains the application of ontologies in financial domains to a query expansion process. The final goal is to improve financial information retrieval effectiveness. The system is composed of an ontology and a Lucene index that stores and retrieves natural language concepts. An initial evaluation with a limited number of queries has been...
Article
The management of Drug-Drug Interactions (DDIs) is a critical issue resulting from the overwhelming amount of information available on them. Natural Language Processing (NLP) techniques can provide an interesting way to reduce the time spent by healthcare professionals on reviewing biomedical literature. However, NLP techniques rely mostly on the a...
Conference Paper
Cross lingual querying of financial and business data from multi-lingual sources requires that inherent challenges posed by the diversity of financial concepts and languages used in different jurisdictions be addressed. Ontologies can be used to semantically align financial concepts and integrate financial facts with other company information from...
Chapter
This chapter describes the general benefits provided by the collaborative Wiktionary effort, but at the same time stresses the lack of standardization in these resources and the resulting difficulty of making wide use of them. It points to already existing work in the field of senses, in which the use of the Lexical Markup Framework (LMF) ha...
Article
Full-text available
The digital universe is expanding at very high rates. New ways of retrieving and enriching text and audio content are required. In this work, a methodology for actor level emotion magnitude prediction in text and speech is proposed. A model is trained ...
Article
Full-text available
Lexica and terminology databases play a vital role in many NLP applications, but currently most such resources are published in application-specific formats, or with custom access interfaces, leading to the problem that much of this data is in “data silos” and hence difficult to access. The Semantic Web and in particular the Linked Data initiative...
Book
Full-text available
We are delighted to hereby present the proceedings of CHAT 2012. Altogether, 7 papers have been selected for presentation (4 regular papers and 3 short papers). The workshop papers cover various topics on automated approaches to terminology extraction and creation of terminology resources, compiling multilingual terminology, ensuring interoperabili...
Conference Paper
Full-text available
The project described in this paper was at first concerned with the specific issue of annotating historical texts belonging to the Memento mori genre. To produce a digital version of these texts that could be used to answer the specific questions of the researchers involved, a multi-layered approach was adopted: Semantic annotations were applied to...
Conference Paper
We present on-going work on the automated ontology-based detection and recognition of characters in folktales, restricting ourselves for the time being to the analysis of referential nominal phrases occurring in such texts. The focus of the work reported here was to investigate the interaction between an ontology and linguistic analysis of indefin...
Article
Full-text available
The work described here concerns the use of complementary resources in sports video analysis; soccer in our case. Structured web data such as match tables with teams, player names, score goals, substitutions, etc. and multiple, unstructured, textual web data sources (minute-by-minute match reports) are processed with an ontology-based information e...
Article
Full-text available
We investigate the extension of classification schemes in the Humanities into semantic data repositories, the benefits of which could be the automation of so far manually conducted processes, such as detecting motifs in folktale texts. In parallel, we propose linguistic analysis of the textual labels used in these repositories. The resulting resour...
Conference Paper
Full-text available
This paper presents a metadata model for the description of language resources proposed in the framework of the META-SHARE infrastructure, aiming to cover both datasets and tools/technologies used for their processing. It places the model in the overall framework of metadata models, describes the basic principles and features of the model, elaborat...
Article
Ontologies often contain multilingual textual information in annotation properties, such as rdfs:label and rdfs:comment. While the motivation for using such annotation properties is to provide a human readable description of abstract conceptualization of the domain, we notice that the importance of appropriate natural language use and representatio...
Article
Full-text available
Recently, an overall trend towards increasing complexity of ontologies could be observed, not only in terms of domain modeling, where the complexity should correspond to the information to be modeled, but also as regards the addition of further information, which could be modeled as external resources to the domain model and linked to its relevant...
Conference Paper
Full-text available
We describe ongoing work on the semi-automatic derivation of ontological structures from text. We first apply pattern-based linguistic heuristics to plain text in order to identify relevant segments out of which candidate ontology classes and relations can be derived. The second step proposes a consolidation of those candidates on the basis...
Conference Paper
In this demonstration paper we present the current text technology infrastructure we have been establishing for annotating a corpus of baroque texts (in German) belonging to the genre "Danse Macabre" with linguistic and domain-specific information (the personalized death). While the developed and assembled tools are already covering the automatic...
Chapter
The Language Grid is a distinctive language service infrastructure in the sense that it accommodates a wide variety of user needs, ranging from technical novices to experts; language resource consumers to language resource providers. As these language services are various in type and each of them can be idiosyncratic in many aspects, the service in...
Conference Paper
We present on-going work on the linguistic and semantic processing of the labels of Thompson's Motif-Index of Folk-Literature, which was proposed by Stith Thompson for the classification of narrative elements in folk-literature. We automatically extracted the labels of an on-line version of the Index, and wrote specialised grammars for pro...
Chapter
Full-text available
This chapter describes the current state of APftML (Augmented Proppian fairy tale Markup Language), which is a schema combining linguistic and domain specific annotation for supporting Cultural Heritage and Digital Humanities research, exemplified in the fairy tale domain. APftML should in particular guide automated text analysis to detect and mark...
Chapter
Information Extraction is the process of extracting from text specific facts in a given target domain. The chapter gives an overview of the field covering components involved in the development and evaluation of information extraction systems, such as part-of-speech tagging or named entity recognition. The chapter introduces available tools such as...
Article
Full-text available
Propp's influential structural analysis of fairy tales created a powerful schema for representing storylines in terms of character functions, which is directly exploitable for computational semantic analysis, and procedural generation of stories of this genre. We tackle two resources that draw on the Proppian model: one formalizes it as a semanti...
Conference Paper
Full-text available
We propose applying standardized linguistic annotation to terms included in labels of knowledge representation schemes (taxonomies or ontologies), hypothesizing that this would help improve ontology-based semantic annotation of texts. We share the view that currently used methods for including lexical and terminological information in such hie...
Conference Paper
Full-text available
We describe the implementation of an enterprise monitoring system that builds on an ontology-based information extraction (OBIE) component applied to heterogeneous data sources. The OBIE component consists of several IE modules—each extracting on a regular temporal basis a specific fraction of company data from a given data source—and a merging too...
Article
Full-text available
This poster submission presents the current state of development of a markup scheme that combines narrative and linguistic information for the fine-grained annotation of folktales. The scheme builds on and extends an existing markup language called PftML (Proppian fairy tale Markup Language) and combines this with textual and linguistic annotat...
Article
Full-text available
In this paper we present on-going work on the derivation of candidate components of ontology schema (so-called T-Box) from the shallow analysis of unstructured text. We discuss here examples dealing with German text in two domains: Economics and Radiology.
Article
Full-text available
This paper presents our work in the field of semantic multimedia annotation and indexing with the use of complementary textual resources analysis. We describe the advantages of complementary sources of information as a support for annotation and test whether these data can be used for automatic annotation and event detection.
Article
Full-text available
In this paper, we describe the state of our work on the possible derivation of ontological structures from textual analysis. We propose an approach to semi-automatic generation of domain ontologies from scratch, on the basis of heuristic rules applied to the result of a multi-layered processing of textual documents.
Conference Paper
In this demo we present the current state of development of ontology-based information extraction in real world applications, as they are defined in the context of the MUSING European R&D project dealing with Business Intelligence applications. We present in some detail the current state of ontology development, including a time and domain ontologie...
Conference Paper
We describe an implemented hybrid reasoning architecture that is used in an EU-funded project called MUSING (www.musing.eu) which is dedicated to the investigation of semantic-based business intelligence solutions. The reasoning platform builds on publicly available software, such as Pellet, OWLIM, Jena, and Sesame. The project uses and extend...
Conference Paper
In this paper we describe collaborative and integrative work in the K-Space Network of Excellence. A goal of the work presented consists of combining results of the analysis of soccer videos with the semantic analysis of textual complementary sources, in order to support the semantic annotation and indexing of soccer videos. We present briefly a fo...
Conference Paper
Full-text available
Within the CLARIN e-science infrastructure project it is foreseen to develop a component-based registry for metadata for Language Resources and Language Technology. With this registry it is hoped to overcome the problems of the currently available systems with respect to inflexible fixed schema, unsuitable terminology and interoperability problems. T...
Conference Paper
Full-text available
In this poster, we present the current state of our work on the possible derivation of ontological structures from textual analysis. We propose an approach to the extension of existing domain ontologies or even to the semi-automatic generation of such ontologies from scratch, on the basis of heuristic rules applied to the result of a mul...