About
38 Publications · 8,048 Reads · 1,428 Citations
Publications (38)
The Resolutions of the Dutch States General (1576-1796) is an archive covering over two centuries of decision making and consists of a heterogeneous series of handwritten and printed documents. The archive, which has recently been digitised, is a rich source for historical research. However, owing to the archive’s heterogeneity and dispersion of in...
The Resolutions of the Dutch States General (1576-1796) is an archive covering over two centuries of decision making and consists of a heterogeneous series of handwritten and printed documents. This archive has rich potential for historical research, but the heterogeneity and dispersion of information make using it for research a challenge. In thi...
In this paper, we present the SharedCanvas model for describing the layout of culturally important, hand-written objects such as medieval manuscripts, which is intended to be used as a common input format to presentation interfaces. The model is evaluated using two collections from CATCHPlus not consulted during the design phase, each with their ow...
In the context of large and ever-growing archives, generating annotation suggestions automatically from textual resources related to the documents to be archived is an interesting option in theory. It could save a lot of work in the time-consuming and expensive task of manual annotation and it could help cataloguers attain a higher inter-annotator...
The Documentalist Support System (DocSS) is developed to suit novel needs of documentalists working within the Dutch archive for Sound and Vision, broadcasters working outside of Sound and Vision and people interested in the Cultural Heritage value of the archive, who want to perform search in context. The documentalists (and to some extent the ot...
This paper shows the current state of development of the manual annotation tool ELAN. It presents usage requirements from three different groups of users and how one annotation model and a number of generic design principles guided the choices made during the development process of ELAN.
In the context of the CATCH research program that is currently carried out at a number of large Dutch cultural heritage institutions, our ambition is to combine and exchange heterogeneous multimedia annotations between projects and institutions. As a first step we designed an Annotation Meta Model: a simple but powerful RDF/OWL model mainly addressing...
Abstract: The see/used-for relation of a thesaurus is often more complex than the (para-)synonymy recommended in the ISO-2788 standard, which describes the content of these controlled vocabularies. The fact that a non-descriptor can be attached to several descriptors, and the fact that only the latter are relevant in the context of indexing...
In this paper, we argue for the value of anchoring Dutch Cultural Heritage controlled vocabularies to WordNet, and demonstrate a reusable methodology for achieving this anchoring. We test it on two controlled vocabularies, namely the GTAA thesaurus, used at the Netherlands Institute for Sound and Vision (the Dutch radio and television archives)...
In this article we report on a user study aimed at evaluating and improving a thesaurus browser. The browser is intended to be used by documentalists of a large public audio-visual archive for finding appropriate indexing terms for TV programs. The subjects involved in the study were documentalists of the institutions involved. The study provid...
Utilization of computer tools in linguistic research has gained importance with the maturation of media frameworks for the handling of digital audio and video. The increased use of these tools in gesture, sign language and multimodal interaction studies has led to stronger requirements on the flexibility, the efficiency and in particular the time a...
The aim of this paper is to explore whether indexing terms for an audiovisual program can be derived from contextual texts automatically. For this we apply natural-language processing techniques to contextual texts of two Dutch TV-programs. We use a Dutch domain thesaurus to derive and rank the metadata. We evaluate the results by comparing them to...
Despite its scientific, political, and practical value, comprehensive information about human languages, in all their variety and complexity, is not readily obtainable and searchable. One reason is that many language data are collected as audio and video recordings, which poses a challenge for document indexing and retrieval. Annotation of multimed...
• field workers from the MPI studying language behavior of different cultures and language acquisition processes by children and adults with the help of longitudinal observations
• researchers of the MPI studying multimodal interactions in various circumstances and from various cultural backgrounds
• researchers of the MPI and within the ECHO proje...
This paper discusses four generations of models for linguistic annotation and evaluates their evolution in relation to the software tools and corpora they are used for. MPI work on models is compared with other recent efforts to design generic models.
In this paper we describe a software framework that supports media annotation and analysis of media related corpora over the internet. We will present the layered architecture of this framework and we will introduce our Abstract Corpus Model with which we isolate corpus specific annotation formats from the annotation and analysis tools. The main se...
For multimodal annotations an exhaustive encoding system for gestures was developed to facilitate research. The structural requirements of multimodal annotations were analyzed to develop an Abstract Corpus Model which is the basis for a powerful annotation and exploitation tool for multimedia recordings and the definition of the XML-based EUDICO An...
This paper describes an experimental integration of two infrastructures (Eudico and GATE) which were developed independently of each other, for different media (video/speech vs. text) and applications. The integration resulted in an in-depth understanding of the functionality and operation of each of the two systems in isolation, and the...
Recently the Browsable Corpus concept was introduced at the Max Planck Institute for Psycholinguistics. Generalization of this concept could help resource discovery and access for Linguistic Resources.
The aim is to improve the availability of Language Resources (LR) on the Intra- and Internet. It is suggested that this can be achieved by creating a browsable and searchable universe of meta-descriptions. This calls for the development of a standard for tagging LRs with meta-data and several conventions agreed within the community.
Technological development makes it possible for the social sciences to build distributed multimedia corpora using Internet technologies. The EUDICO project aims to provide cooperating scientists working at different locations with a uniform, format-independent interface to the various speech- and video-ba...
A web services based architecture for Language Resources utilizing existing technology such as XML, SOAP, WSDL and UDDI is presented. The web services architecture creates a pervasive information infrastructure that enables straightforward access to two kinds of Language Resources: traditional information sources and language processing resources....
Documentation and retrieval processes at the Netherlands Institute for Sound and Vision are organized around a common thesaurus. To help improve the quality of these processes the thesaurus was transformed into an RDF/OWL ontology and extended on the basis of implicit information and external resources. A thesaurus browser web application was designed,...
In this paper we describe a software environment that facilitates media annotation and analysis of media related corpora over the internet. We will describe the general architecture of this environment and we will introduce our Abstract Corpus Model with which we isolate corpora specific formats from the annotation and analysis tools. The main set...
An architecture is presented that provides an integrated framework for managing, archiving and accessing language resources. This architecture was discussed in the DELAMAN network – a world-wide network of archives holding material about endangered languages. Such a framework will be built upon a metadata infrastructure, a mechanism to resolve uniq...
We define collaborative commentary as the involvement of a research community in the interpretive annotation of electronic records. The goal of this process is the evaluation of competing theoretical claims. The process requires commentators to link their comments and related evidentiary materials to specific segments of either transcripts or elect...
Gestures are culture specific forms of arm movements which are used in communication to transfer information to the listener, to guide the planning of the speech production process and to disambiguate the incoming speech. To understand the underlying mechanisms gestures have to be analyzed in cross-linguistic processes. Large projects are necessary...
Semantic access to multimedia content in audiovisual archives is to a large extent dependent on quantity and quality of the metadata, and particularly the content descriptions that are attached to the individual items. However, given the growing amount of materials that are being created on a daily basis and the digitization of existing analogue co...