Figure 2 - uploaded by David Lindemann
XML transformation 

Source publication
Conference Paper
In this paper, we present a simple method for drafting sense-disambiguated bilingual dictionary content using lexical data extracted from merged wordnets, on the one hand, and from BabelNet, a very large resource built automatically from wordnets and other sources, on the other. Our motivation for using English-Basque as a showcase is the fact that...

Contexts in source publication

Context 1
... a concept-oriented collection of lexical data into a headword-oriented dictionary draft is a computationally trivial transformation task. As mentioned in Section 2, we are able to represent our dictionary draft datasets in XML, as illustrated in Figure 2 below. In connection with this transformation, we have to mention two issues that are far from trivial for lexicographers: (1) the modelling of homography, i.e. the level at which we distinguish between homograph headword strings that point to dictionary entries related to different parts of speech (cf. in English sound N, sound V, and sound ADV), and (2) the modelling of a distinction between homonymy and polysemy (cf. ...
Context 2
... the bits of XML code shown in Figure 2, we have to point out, of course, that it is a simplified presentation of what is possible. Here we just include the text attributes (alternatively representable as text values) for lexical items and abbreviated glosses. ...
Context 3
... central issue which is also linked to data modelling is the internal representation of polysemy (besides its disambiguation from homonymy) that results from a transformation as illustrated in Figure 2. Two questions arise: (1) Does the draft dictionary entry contain all word senses of a lemma we want to represent? ...
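
The concept-to-headword transformation described in Context 1 can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation and not the markup of Figure 2: the element and attribute names (dictionary, entry, sense, equivalent, text, pos, concept, gloss), the concept identifiers, and the toy English-Basque records are all assumptions made for this example. The sketch groups concept records by headword string and part of speech, which is one possible answer to the homography question, and it places every concept under the same entry as a sense, so it deliberately does not distinguish homonymy from polysemy.

```python
from collections import defaultdict
import xml.etree.ElementTree as ET

# Toy concept-oriented input (hypothetical data, not taken from the paper):
# one record per concept, listing its English lemmas and Basque equivalents.
SYNSETS = [
    {"id": "c1", "pos": "N", "gloss": "audible vibration",
     "en": ["sound"], "eu": ["soinu", "hots"]},
    {"id": "c2", "pos": "N", "gloss": "narrow sea channel",
     "en": ["sound", "strait"], "eu": ["itsasarte"]},
    {"id": "c3", "pos": "V", "gloss": "emit a sound",
     "en": ["sound"], "eu": ["jo"]},
]


def build_entries(synsets):
    """Invert concept records into a mapping keyed by (headword, pos)."""
    entries = defaultdict(list)
    for syn in synsets:
        for lemma in syn["en"]:
            # Homographs with different parts of speech get separate keys.
            entries[(lemma, syn["pos"])].append(syn)
    return entries


def to_xml(entries):
    """Serialize the headword-oriented mapping as a simple XML tree."""
    root = ET.Element("dictionary")
    for (lemma, pos), senses in sorted(entries.items()):
        entry = ET.SubElement(root, "entry", {"text": lemma, "pos": pos})
        for syn in senses:
            # Each concept becomes one sense, with an abbreviated gloss.
            sense = ET.SubElement(
                entry, "sense", {"concept": syn["id"], "gloss": syn["gloss"]}
            )
            for equivalent in syn["eu"]:
                ET.SubElement(sense, "equivalent", {"text": equivalent})
    return root


root = to_xml(build_entries(SYNSETS))
if hasattr(ET, "indent"):  # pretty-printing helper, available from Python 3.9
    ET.indent(root)
print(ET.tostring(root, encoding="unicode"))
```

In this toy run the noun entry for "sound" receives two senses (c1 and c2) that a lexicographer would probably treat as homonyms rather than as senses of one polysemous word, which illustrates why the second modelling issue named in Context 1 is not settled by the transformation itself.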

Similar publications

The current study investigated the use of the bilingual Vocabulary Size Test (VST) in a Korean EFL environment. Thirty-two university students with intermediate to high proficiency participated in this study. The students were given a Korean bilingual version of the VST and reported their official English scores. The findings of this study are a...


Wordnets are the most widely used lexical resources in natural language processing (NLP). There exist wordnets in more than 40 languages by now and all of these are connected to the original Princeton WordNet. The origins of linguistic linked data (LD) can thus in some sense be traced to the WordNet project. The implementation of the linking, however, has not relied on stable identifiers and has thus led to technical problems of reference when new versions of a wordnet are released. This chapter describes how linked data principles have been applied in the development of the Global WordNet Grid (GWG), an attempt to form a catalogue of interlingual contexts that extends beyond the Anglo-Saxon roots of the Princeton WordNet. We will describe in particular how LD technologies have been used in realizing a Collaborative Interlingual Index (CILI) that builds on standard LD vocabularies and the resource description framework (RDF) data model. We finally describe a method to link wordnets to external resources such as DBpedia/Wikipedia.