Marten Postma

  • Doctor of Philosophy
  • PhD Student at Vrije Universiteit Amsterdam

About

Publications: 27
Reads: 3,317
Citations: 184
Current institution
Vrije Universiteit Amsterdam
Current position
  • PhD Student
Additional affiliations
September 2013 - present
Vrije Universiteit Amsterdam
Position
  • PhD Student

Publications (27)
Article
Full-text available
Recently, Yuan et al. (2016) have shown the effectiveness of using Long Short-Term Memory (LSTM) for performing Word Sense Disambiguation (WSD). Their proposed technique outperformed the previous state-of-the-art on several benchmarks, but neither the training data nor the source code was released. This paper presents the results of a reproduction...
Poster
Full-text available
Semantic text processing faces the challenge of defining the relation between lexical expressions and the world to which they make reference within a period of time. It is unclear whether the current test sets used to evaluate disambiguation tasks are representative of the full complexity considering this time-anchored relation, resulting in seman...
Poster
Full-text available
Current Word Sense Disambiguation systems perform extremely poorly on low-frequency senses, which is mainly caused by the difference in sense distributions between training and test data. The main focus in tackling this problem has been on acquiring more data or selecting a single predominant sense and not necessarily on the meta propertie...
Conference Paper
Full-text available
Semantic text processing faces the challenge of defining the relation between lexical expressions and the world to which they make reference within a period of time. It is unclear whether the current test sets used to evaluate disambiguation tasks are representative of the full complexity considering this time-anchored relation, resulting in seman...
Poster
Full-text available
Semantic text processing faces the challenge of defining the relation between lexical expressions and the world to which they make reference within a period of time. It is unclear whether the current test sets used to evaluate disambiguation tasks are representative of the full complexity considering this time-anchored relation, resulting in seman...
Poster
Full-text available
Entities and events in the world have no frequency, but our communication about them and the expressions we use to refer to them do have a strong frequency profile. Language expressions and their meanings follow a Zipfian distribution, featuring a small number of very frequent observations and a very long tail of low-frequency observations. Since ou...
Poster
Full-text available
Entities and events in the world have no frequency, but our communication about them and the words we use to refer to them do have a strong frequency profile. Language expressions and their meanings follow a Zipfian distribution, featuring a small number of very frequent observations and a very long tail of low-frequency observations. Since our NLP...
Presentation
Full-text available
Word Sense Disambiguation (WSD) systems tend to have a strong bias towards assigning the Most Frequent Sense (MFS), which results in low recall on less frequent senses (LFS). We have strived to address the MFS bias in WSD systems by combining the output from a WSD system with a set of mostly static features to create an MFS classifier to decide wh...
Presentation
Full-text available
We describe Open Dutch WordNet, which has been derived from the Cornetto database, the Princeton WordNet and open source resources. We exploited existing equivalence relations between Cornetto synsets and WordNet synsets in order to move the open source content from Cornetto into WordNet synsets. Currently, Open Dutch WordNet contains 117,914 synse...
Article
Full-text available
In this paper we present an approach to Word Sense Disambiguation based on Topic Modeling (LDA). Our approach consists of two steps: first a binary classifier is applied to decide whether the most frequent sense applies or not, and then another classifier deals with the cases where it does not. An exhaustive evaluation is perf...
Conference Paper
Full-text available
We present in this paper our submission to task 13 of SemEval 2015, which makes use of background information and external resources (DBpedia and Wikipedia) to automatically disambiguate texts. Our approach follows two routes for disambiguation: one route is proposed by a state-of-the-art WSD system, and the other one by the predominant sense inform...
Poster
Full-text available
We present in this paper our submission to task 13 of SemEval 2015, which makes use of background information and external resources (DBpedia and Wikipedia) to automatically disambiguate texts. Our approach follows two routes for disambiguation: one route is proposed by a state-of-the-art WSD system, and the other one by the predominant sense inform...
Conference Paper
Full-text available
In this paper, we present a rich contextual perspective on the lexicon and background knowledge for the purpose of deep semantic parsing. In the project Understanding Language By machine, we address various aspects of semantics in relation to i) reference to entities and event instances, ii) modeling of author and reader perspectives. Le...
Conference Paper
Full-text available
Word Sense Disambiguation is still an unsolved problem in Natural Language Processing. We claim that most approaches do not model the context correctly, by relying too much on the local context (the words surrounding the word in question), or on the most frequent sense of a word. In order to provide evidence for this claim, we conducted an in-depth...
Conference Paper
Full-text available
We present Open Source Dutch Wordnet: an open source version of Cornetto (Vossen et al., 2013). Cornetto is currently not distributed as open source, because a large portion of the database originates from the commercial publisher Van Dale. We use English WordNet 3.0 (Miller, 1995; Fellbaum, 1998) as our basis. This means that we replace the Van D...
Article
Full-text available
We present an analysis of a high-level semantic task, the construction of cross-document event timelines from SemEval 2015 Task 4: TimeLine, to trace errors back to the components of our pipeline system. Event timeline extraction requires many different Natural Language Processing tasks, among which entity and event detection, coreference resolution...
Presentation
Full-text available
Wordnet::Similarity is an important instrument used for many applications. It has been available for a while as a toolkit for English and it has been frequently tested on English gold standards. In this paper, we describe how we constructed a Dutch gold standard that matches the English gold standard as closely as possible. We also re-implemented t...
Article
Full-text available
Wordnet::Similarity is an important instrument used for many applications. It has been available for a while as a toolkit for English and it has been frequently tested on English gold standards. In this paper, we describe how we constructed a Dutch gold standard that matches the English gold standard as closely as possible. We also re-implemented t...
Conference Paper
Full-text available
Repeating experiments is an important instrument in the scientific toolbox to validate previous work and build upon existing work. We present two concrete use cases involving key techniques in the NLP domain for which we show that reproducing results is still difficult. We show that the deviation that can be found in reproduction efforts leads...
