Johannes Dellert

University of Tuebingen | EKU Tübingen · Department of Linguistics

Doctor of Philosophy


19 Publications
1,747 Reads
179 Citations


Publications (19)
Chapter
Adding to the budding landscape of advanced analysis tools for Probabilistic Soft Logic (PSL), we present a graphical explorer for grounded PSL models. It exposes the structure of the model from the perspective of any single atom, listing the ground rules in which it occurs. The other atoms in these rules serve as links for navigation through the r...
Article
Integrating datasets from different disciplines is hard because the data are often qualitatively different in meaning, scale and reliability. When two datasets describe the same entities, many scientific questions can be phrased around whether the (dis)similarities between entities are conserved across such different data. Our method, CLARITY, quan...
Article
In speech, the connection between sounds and word meanings is mostly arbitrary. However, among basic concepts of the vocabulary, several words can be shown to exhibit some degree of form–meaning resemblance, a feature labelled vocal iconicity. Vocal iconicity plays a role in first language acquisition and was likely prominent also in pre-historic l...
Article
In the Proto-Quechuan lexicon, many two-segment phonetic substrings recur in semantically related roots, even though they are not independent morphemes. Such elements may have been morphemes before the Proto-Quechuan stage (i.e., in Pre-Proto-Quechuan). On the other hand, this may simply be due to chance, or to phonesthesia. In this paper, we intro...
Preprint
Integrating datasets from different disciplines is hard because the data are often qualitatively different in meaning, scale, and reliability. When two datasets describe the same entities, many scientific questions can be phrased around whether the similarities between entities are conserved. Our method, CLARITY, quantifies consistency across datas...
Preprint
This technical report describes a new prototype architecture designed to integrate top-down and bottom-up analysis of non-standard linguistic input, where a semantic model of the context of an utterance is used to guide the analysis of the non-standard surface forms, including their automated normalization in context. While the architecture is gene...
Article
This article describes the first release version of a new lexicostatistical database of Northern Eurasia, which includes Europe as the most well-researched linguistic area. Unlike in other areas of the world, where databases are restricted to covering a small number of concepts as far as possible based on often sparse documentation, good lexical re...
Article
Based on a recently published large-scale lexicostatistical database, we rank 1,016 concepts by their suitability for inclusion in Swadesh-style lists of basic stable concepts. For this, we define separate measures of basicness and stability. Basicness in the sense of morphological simplicity is measured based on information content, a generalizati...
Article
This paper presents a large comparative lexical database which covers about a thousand concepts across twenty Uralic languages. The dataset will be released as the first part of NorthEuraLex, a lexicostatistical database of Northern Eurasia which is being compiled within the EVOLAEMP project. The chief purpose of the lexical database is to serve as...
Conference Paper
In this paper we present a parsing architecture that allows processing of different mildly context-sensitive formalisms, in particular Tree-Adjoining Grammar (TAG), Multi-Component Tree-Adjoining Grammar with Tree Tuples (TT-MCTAG) and "simple" Range Concatenation Grammar (RCG). Furthermore, for tree-based grammars, the parser computes not only syn...
Article
We present a very large network of crosslinguistic polysemies, and compare the notion of semantic relatedness it encodes to the catalogue of semantic shifts maintained by the Russian Academy of Sciences. We separately evaluate all types of semantic shifts featured in the catalogue, including shifts occurring during semantic evolution, during borrow...
Conference Paper
Existing algorithms for minimal unsatisfiable subset (MUS) extraction are defined independently of any symbolic information, and in current implementations domain experts mostly do not have a chance to influence the extraction process based on their knowledge about the encoded problem. The MUStICCa tool introduces a novel graphical user interface f...
Article
TuLiPA - Parsing extensions of TAG with range concatenation grammars. In this paper we present a parsing framework for extensions of Tree Adjoining Grammar (TAG) called TuLiPA (Tübingen Linguistic Parsing Architecture). In particular, besides TAG, the parser can process Tree-Tuple MCTAG with Shared Nodes (TT-MCTAG), a TAG extension which has been pr...
Conference Paper
In this paper, we present an open-source parsing environment (Tuebingen Linguistic Parsing Architecture, TuLiPA) which uses Range Concatenation Grammar (RCG) as a pivot formalism, thus opening the way to the parsing of several mildly context-sensitive formalisms. This environment currently supports tree-based grammars (namely Tree-Adjoining Grammar...
Conference Paper
Developing linguistic resources, in particular grammars, is known to be a complex task in itself, because of (among other things) redundancy and consistency issues. Furthermore, some languages can prove hard to describe because of specific characteristics, e.g. the free word order in German. In this context, we present (i) a framework allowi...
Conference Paper
We present an approach for querying collections of heterogeneous linguistic corpora that are annotated on multiple layers using arbitrary XML-based markup languages. An OWL ontology provides a homogenising view on the conceptually different markup languages so that a common querying framework can be established using the method of ontology-based qu...
Article
Proceedings of the Sixth International Workshop on Treebanks and Linguistic Theories. Editors: Koenraad De Smedt, Jan Hajič and Sandra Kübler. NEALT Proceedings Series, Vol. 1 (2007), 127-138. © 2007 The editors and contributors. Published by Northern European Association for Language Technology (NEALT), http://omilia.uio.no/nealt. Electronically p...
