David Lindemann

Universidad del País Vasco / Euskal Herriko Unibertsitatea | UPV/EHU · Departamento de Filología Inglesa, Alemana y Traducción e Interpretación

PhD

About

36
Publications
3,853
Reads
45
Citations
Introduction
David Lindemann currently works at the Jožef Stefan Institute, Slovenia, as a researcher in the framework of the Elexis project, and as a lecturer at the Dept. of English and German Philology and Translation and Interpretation, UPV/EHU University of the Basque Country. David does research in Computational Lexicography, Corpus Linguistics, and Digital Humanities.
Additional affiliations
January 2012 - April 2017
Universidad del País Vasco / Euskal Herriko Unibertsitatea
Position
  • Researcher


Publications (36)
Article
Full-text available
This report has been prepared by the “Bibliographical Data” Working Group of the DARIAH-ERIC consortium, which develops public digital research infrastructure for the arts and humanities. The Group consists of more than 30 members from 15 countries, most of whom are researchers and curators in the public sector who are engaged in bibliographical da...
Conference Paper
Full-text available
In this paper, we present ongoing work on Elexifinder (https://finder.elex.is), a lexicographic literature discovery portal developed in the framework of the ELEXIS (European Lexicographic Infrastructure) project. Since the first launch of the tool, the database behind Elexifinder has been enriched with publication metadata and full texts stemming...
Conference Paper
Full-text available
In this paper, we present a workflow for historical dictionary digitization, with a 1745 Spanish-Basque-Latin dictionary as a use case. We start with scanned facsimile images and arrive at a representation of attestations of modern standard Basque lexemes as Linked Data, in the form they appear in the dictionary. We are also able to produce an index of the dicti...
Conference Paper
Full-text available
Aligning senses across resources and languages is a challenging task with beneficial applications in the field of natural language processing and electronic lexicography. In this paper, we describe our efforts in manually aligning monolingual dictionaries. The alignment is carried out at sense-level for various resources in 15 languages. Moreover,...
Presentation
Full-text available
Lexical resources originally meant as human-readable dictionaries, or lexical-semantic databases designed for other purposes, are most often developed isolated from each other, so that a linking of data across resources, which doubtlessly means an added value to both human readers and knowledge-based computational applications, implies some sort of...
Article
Full-text available
This short paper presents preliminary considerations regarding LexBib, a corpus, bibliography, and domain ontology of Lexicography and Dictionary Research, which is currently being developed at University of Hildesheim. The LexBib project is intended to provide a bibliographic metadata collection made available through an online reference platform....
Preprint
Full-text available
This short paper presents preliminary considerations regarding LexBib, a corpus, bibliography, and domain ontology of Lexicography and Dictionary Research, which is currently being developed at University of Hildesheim. The LexBib project is intended to provide a bibliographic metadata collection made available through an online referen...
Poster
Full-text available
This poster presents preliminary considerations for a new project: A merged set of Basque (legacy) lexical resources or unified lexical database. At this preliminary stage, our main attention lies on the catalogue of data sources, on philological problems (e.g. regarding lemmatization), and on the design of the database. We propose a data model and...
Conference Paper
Full-text available
In this paper, we present a generic workflow for retro-digitizing and structuring large entry-based documents, using the 33,000 entries of Internationale Bibliographie der Lexikographie, by Herbert Ernst Wiegand, as an example (published in four volumes (Wiegand 2006-2014)). The goal is to convert the large bibliography, at present available as col...
Article
Full-text available
The aim of the project is to develop a prototype for a generator of argument structure or valency realisations in terms of syntagmatic and paradigmatic combinations of Spanish, German and French nouns. The two main applications of the tool prototype we aim to develop are (1) the generation of noun phrases as argument structure realizations...
Conference Paper
Full-text available
This paper presents preliminary considerations regarding objectives and workflow of LexBib, a project which is currently being developed at the University of Hildesheim. We briefly describe the state of the art in electronic bibliographies in general, and bibliographies of lexicography and dictionary research in particular. The LexBib project is in...
Presentation
Full-text available
See presentation on videolectures.net http://videolectures.net/WNLEXworkshop2018_lindemann_wordnets/
Article
'Purism' can characterise attitudes about a wide range of linguistic phenomena, but the most common forms of linguistic purism are those concerned with the lexicon. When standardisation of language is at issue, questions of purism are unavoidable. Are processes of standardisation necessarily motivated by puristic attitudes? Or is purism a consequen...
Conference Paper
Full-text available
The aim of the study presented here is to describe the intersection of discourse spaces in lexicography or metalexicography and the Digital Humanities (DH). This involves identifying contributions on lexicographic topics that are to be regarded, explicitly or implicitly, as part of DH and, conversely, topics relevant to lexicography,...
Conference Paper
Full-text available
In this paper, we present a simple method for drafting sense-disambiguated bilingual dictionary content using lexical data extracted from merged wordnets, on the one hand, and from BabelNet, a very large resource built automatically from wordnets and other sources, on the other. Our motivation for using English-Basque as a showcase is the fact that...
Chapter
Full-text available
In this article, we present a set of computational methods based on corpora or on the extraction of data from existing lexical resources for drafting bilingual dictionary content. These methods operate on three structural levels: (1) lemma lists, (2) syntactic entities, and (3) translation equivalents. The described methods are applied to the langu...
Conference Paper
Full-text available
This paper presents a simple method for drafting bilingual dictionary content using existing lexical and NLP resources for Basque. The method consists of five steps, three belonging to a semi-automatic drafting, and another two to semi-automatic and manual post-editing: (1), the building of a corpus-based frequency lemma list; (2) the drafting of s...
Thesis
Full-text available
In this PhD thesis, we present research carried out during the last five years. Bilingual Lexicography with German and Basque is the main issue shared by the whole range of research lines we have been following. To create a new German and Basque bilingual dictionary has been our goal, a German-Basque electronic dictionary, which is directed first a...
Article
Full-text available
This paper presents a simple methodology to create corpus-based frequency lemma lists, applied to the case of the Basque language. Since the first work on the matter in 1982, the amount of text written in Basque and language resources related to this language has grown exponentially. Based on state-of-the-art Basque corpora and current NLP technolo...
Chapter
Full-text available
In 1982, Ibon Sarasola published a frequency dictionary of Basque, based on a 1977 corpus. In the following decades, the number of texts written in Basque and of electronic resources has grown exponentially. Based on the data available today, our aim is to develop a frequency lemma list of Standard Basque...
Article
Full-text available
In this paper, we introduce the new electronic dictionary project EuDeLex, which is currently being worked on at UPV-EHU University of the Basque Country. The introduction addresses the need for and functions of a new electronic dictionary for that language pair, as well as general considerations about bilingual lexicography and German as foreign...
Conference Paper
Full-text available
This paper presents a set of Bilingual Dictionary Drafting (BDD) methods including manual extraction from existing lexical databases and corpus based NLP tools, as well as their evaluation on the example of German-Basque as language pair. Our aim is twofold: to give support to a German-Basque bilingual dictionary project by providing draft Bilingua...
Article
Full-text available
Lexicography over the last decades has incorporated Corpus Linguistics methods. Lexicographers who start to work on an electronic dictionary, starting from scratch as Computational Linguists, and with little or no previous work done on their language pair, have to evaluate the contributions Corpus Linguistics methods may provide to their project, n...
Chapter
Full-text available
In this paper, we introduce the new electronic dictionary project EuDeLex, which is currently being worked on at UPV-EHU University of the Basque Country. The introduction addresses the need for and functions of a new electronic dictionary for that language pair, as well as general considerations about bilingual lexicography and German as foreign l...
Article
Full-text available
In the 19th century, in the field of Basque-German bilingual lexicography, three authors left us works published at the time, manuscripts edited later, and unedited manuscripts. Notable among these is the manuscript by CAF Mahn, dated 1840, which brings together extensive lexicographic material from various sources, possibly including the...


Projects (4)
Project
DaF-Didaktik für baskische MuttersprachlerInnen (Teaching German as a foreign language to native speakers of Basque)
Project
Our goal is an online bibliography of Lexicography and Dictionary Research (i.e. metalexicography) that offers hand-validated publication metadata as needed for citations, represents metadata using unambiguous identifiers where possible, and is complemented with the output of a Natural Language Processing toolchain applied to the full texts. Items are tagged using nodes of a domain ontology developed in the project; term candidates extracted from the full texts serve as suggestions for a mapping to the domain ontology. Main considerations regarding the project have been presented in Lindemann et al. 2018.
Project
Integration of lexical resources, such as (historical) dictionaries and NLP lexicons of the Basque language, and their representation as linked data.