Marijn Koolen

Koninklijke Nederlandse Akademie van Wetenschappen | KNAW · Huygens Institute for Dutch History (Huygens ING-KNAW)

PhD

About

117
Publications
17,094
Reads
1,264
Citations
Citations since 2016
42 Research Items
681 Citations
Introduction
Research interests:
  • information retrieval and recommendation
  • information behaviour and interaction
  • digital humanities, digital literary studies, digital history
  • digital research methodology (tool criticism and evaluation, data scopes, research process documentation)
  • scholarly web annotation
Additional affiliations
April 2018 - present
Koninklijke Nederlandse Akademie van Wetenschappen
Position
  • Researcher
October 2016 - March 2018
Koninklijke Nederlandse Akademie van Wetenschappen
Position
  • Researcher
January 2016 - August 2016
Netherlands Institute for Sound and Vision
Position
  • Software Engineer

Publications

Article
The past decades have changed the way we deal with archives and archival materials. Archives digitised their inventories and part of their collections, but they were joined by many other parties who published archival collections and archive-worthy materials on the web. The information world is in continuous flux, and the developments have made cle...
Chapter
With the rise of data-driven methods in the humanities, it becomes necessary to develop reusable and consistent methodological patterns for dealing with the various data manipulation steps. This increases the transparency and replicability of the research. Data scopes present a qualitative framework for such methodological steps. In this work we present a...
Article
Full-text available
What is the impact of reading fiction? We analyze online Dutch book reviews to detect overall affective impact, narrative feelings, response to style and reflection. We create a set of rules that analyze the reviews and detect the impact aspects. We evaluate the detection by asking raters about the presence of these aspects in reviews and comparing...
Article
During the past decade, recommender systems have rapidly become an indispensable element of websites, apps, and other platforms that are looking to provide personalized interaction to their users. As recommendation technologies are applied to an ever-growing array of non-standard problems and scenarios, researchers and practitioners are also increa...
Conference Paper
The Resolutions of the Dutch States General (1576-1796) is an archive covering over two centuries of decision making and consists of a heterogeneous series of handwritten and printed documents. This archive has rich potential for historical research, but the heterogeneity and dispersion of information make using it for research a challenge. In thi...
Presentation
Full-text available
The concept of "scholarly primitive" has been widely welcomed both by humanists and system designers in the humanities, due to the fact that it made it possible to have a solid conceptual basis for the operationalization of the essential functionalities required for advancing computer-mediated work in the humanities. It has also helped to prioritiz...
Book
Full-text available
Reading notes in books and other printed matter are of increasing interest in Philology and Cultural History. However, we still lack an understanding of their epistemic foundations. With reference to Thomas Mann’s private library, I suggest viewing the act of annotating with pens itself as an epistemic practice. For this, I introduce the term ‘pen...
Chapter
Full-text available
For third-party annotations in the digital edition to be interoperable, we argue they should not be anchored in web pages but in the edition’s abstract information structure. We propose an ontology for the editorial domain based on FRBRoo. The ontology distinguishes between the editable domain (works that can be edited) and the edited domain (the r...
Conference Paper
In the NWO REPUBLIC project, we are creating digital access to the corpus of the Resolutions of the States General of the Dutch Republic (1576-1796). This corpus contains the decisions made in the States General each day for a 220-year period. The resolutions were recorded using a standard structure and contain many standard formulations for aspect...
Article
Full-text available
Recently, the use of word embedding models (WEM) has received ample attention in the natural language processing community. These models can capture semantic information in large corpora of text by learning distributional properties of words, that is, how often particular words appear in specific contexts. Scholars have pointed out the potential of...
Conference Paper
Over the past decade, recommendation algorithms for ratings prediction and item ranking have steadily matured. However, these state-of-the-art algorithms are typically applied in relatively straightforward and static scenarios: given information about a user's past item preferences in isolation, can we predict whether they will like a new item or r...
Chapter
INEX ran as an independent evaluation forum for 10 years before it teamed up with CLEF in 2012. Even before 2012 there was considerable collaboration between INEX and CLEF, and these collaborations increased in intensity when CLEF moved beyond its traditional cross-lingual focus in 2009/2010 shifting to include all experimental IR. This led to the...
Chapter
With the rapid growth of the video game industry over the past decade, there has been a commensurate increase in research activity focused on a variety of aspects of video games. How people discover the video games they want to play and how they articulate these information needs is still largely unknown, however. A better understanding of video ga...
Conference Paper
Full-text available
The practices of digital humanists are evolving, highly diversified and experimental. There is also a lack of agreement about whether or not digital humanists should have data and programming skills. Thus, their underlying needs for higher levels of flexibility and transparency may be contradicted by their explicit requests for user-friendly graphi...
Conference Paper
Full-text available
We detail and compare the performance of two methods for classifying and predicting bestsellers: a neural network based deep learning approach and a classic statistical “MinMax” approach. We found that the “simple” classic method outperforms machine learning. We explain the possible application of our solution(s) and consider some tentative results...
Article
Full-text available
The term Macroscope has recently been introduced as an instrument to study historical big data using digital tools. In this paper we argue the need for a more elaborate set of concepts to describe and reason about the interactions to select, enrich, connect, analyse and evaluate historical data using digital tools. Interactions change the data an...
Article
Full-text available
In the past decade, an increasing set of digital tools has been developed with which digital sources can be selected, analyzed, and presented. Many tools go beyond keyword search and perform different types of analysis, aggregation, mapping, and linking of data selections, which transforms materials and creates new perspectives, thereby changing t...
Conference Paper
Over the past decade, recommendation algorithms for ratings prediction and item ranking have steadily matured. However, these state-of-the-art algorithms are typically applied in relatively straightforward scenarios. In reality, recommendation is often a more complex problem: it is usually just a single step in the user's more complex background ne...
Conference Paper
Full-text available
For quite some time, humanities scholars have been using digital tools in addition to their established methodology to try and make sense of large and expanding data sources that cannot be handled with traditional methods alone. The digital methods have computer science aspects that may be combined with but do not readily fit into humanities method...
Article
The variety of specialized tools designed to facilitate analysis of audio-visual (AV) media are useful not only to media scholars and oral historians but to other researchers as well. Both Qualitative Data Analysis Software (QDAS) packages and dedicated systems created for specific disciplines, such as linguistics, can be used for this purpose. Sof...
Conference Paper
The goal of this workshop is to serve as a starting point for a community-driven effort to design and implement a platform for the collection, organization, maintenance, and sharing of resources for IIR experimentation. As in all scientific endeavors, progress in IIR research is contingent on the ability to build on previous ideas, approaches, and...
Research
Full-text available
Data scopes are a concept for dealing with composite data in a humanities context. With data scopes we aim to contribute to methodological reflection on, and consolidation of, the collection of methods that many in the humanities already use (often in the form of tools) in addition to existing methods. Data scopes come...
Chapter
Full-text available
Historians argue that tracing early exchange, trade and uses of plant medicines (materia medica) can elucidate dynamics of drug trajectories in the early modern period. However, information on how these drug trajectories have evolved is hidden in large amounts of heterogeneous historical data. These data are in different formats, languages, genres...
Article
This article describes a study of the affordances of a group of information systems (so-called “tools”) that make it possible to do computer-mediated film analyses. Film scholars have started using these tools to annotate and analyze digital film in various ways, yet not much is presently known about the implications of using these computational to...
Conference Paper
Full-text available
Recommendation algorithms for ratings prediction and item ranking have steadily matured during the past decade. However, these state-of-the-art algorithms are typically applied in relatively straightforward scenarios. In reality, recommendation is often a more complex problem: it is usually just a single step in the user's more complex background n...
Conference Paper
Full-text available
Research into recommendation algorithms has made great strides in recent years. However, these algorithms are typically applied in relatively straightforward scenarios: given information about a user's past preferences, what will they like in the future? Recommendation is often more complex: evaluating recommended items never takes place in a vacuu...
Article
Full-text available
There is broad consensus in the field of IR that search is complex in many use cases and applications, both on the Web and in domain-specific collections, and both in our professional and in our daily life. Yet our understanding of complex search tasks, in comparison to simple look up tasks, is fragmented at best. The workshop addressed many open r...
Article
Full-text available
This article reports on the CBRecSys 2016 workshop, the third edition of the workshop on New Trends in Content-based Recommender Systems, co-located with RecSys 2016 in Boston, MA. Content-based recommendation has been applied successfully in many different domains, but it has not seen the same level of attention as collaborative filtering techniqu...
Conference Paper
Full-text available
There is broad consensus in the field of IR that search is complex in many use cases and applications, both on the Web and in domain-specific collections, and both in our professional and in our daily life. Yet our understanding of complex search tasks, in comparison to simple look up tasks, is fragmented at best. The workshop addressed many open r...
Conference Paper
Full-text available
There is broad consensus in the field of IR that search is complex in many use cases and applications, both on the Web and in domain specific collections, and both professionally and in our daily life. Yet our understanding of complex search tasks, in comparison to simple look up tasks, is fragmented at best. The workshop addresses many open resear...
Conference Paper
Full-text available
Annotation has been identified as one of the "scholarly primitives", and plays a pivotal role in facilitating access to audio-visual (AV) media in a scholarly context. However, there is a lack of understanding of scholars' annotation needs and behavior. This paper is part of a group of studies aiming to understand how to improve annotation support...
Conference Paper
While content-based recommendation has been applied successfully in many different domains, it has not seen the same level of attention as collaborative filtering techniques have. However, there are many recommendation domains and applications where content and metadata play a key role, either in addition to or instead of ratings and implicit usage...
Conference Paper
Full-text available
The Social Book Search (SBS) Lab investigates book search in scenarios where users search with more than just a query, and look for more than objective metadata. Real-world information needs are generally complex, yet almost all research focuses instead on either relatively simple search based on queries, or on profile-based recommendation. The goa...
Conference Paper
Full-text available
The Social Book Search (SBS) Lab investigates book search in scenarios where users search with more than just a query, and look for more than objective metadata. Real-world information needs are generally complex, yet almost all research focuses instead on either relatively simple search based on queries or recommendation based on profiles. The goa...
Conference Paper
Full-text available
Users looking for books online are confronted with both professional meta-data and user-generated content. The goal of the Interactive Social Book Search Track was to investigate how users used these two sources of information, when looking for books in a leisure context. To this end participants recruited by four teams performed two different task...
Conference Paper
The book translation market is a topic of interest in literary studies, but the reasons why a book is selected for translation are not well understood. The "Beyond the Book" project investigates whether web resources like Wikipedia can be used to establish the level of cultural bias. This work describes the eScience tools used to estimate the cultu...
Conference Paper
Full-text available
Globalization is a current theme in literary studies. Are authors increasingly writing for a global audience, such that novels appeal to readers in many countries through shared ‘global’ cultural knowledge, e.g. references to concepts people around the globe are familiar with? The Beyond the Book project aims to investigate whether we can measure...
Conference Paper
Full-text available
There is broad consensus in the field of IR that search is complex in many use cases and applications, both on the Web and in domain-specific collections, and both in our professional and in our daily life. Yet our understanding of complex search tasks, in comparison to simple look up tasks, is fragmented at best. The workshop addressed many open re...
Article
Full-text available
While content-based recommendation has been applied successfully in many different domains, it has not seen the same level of attention as collaborative filtering techniques have. However, there are many recommendation domains and applications where content and metadata play a key role, either in addition to or instead of ratings and implicit usage...
Conference Paper
Full-text available
Globalization is an important research topic in many fields, including literary studies. In our project Beyond the Book, funded by the Netherlands eScience Center, we try to find a way to measure how “international” a novel is. The idea behind our project is that the topic and content of a novel from one country may have an appeal to readers in other...
Conference Paper
Full-text available
Real-world information needs are generally complex, yet almost all research focuses on either relatively simple search based on queries or recommendation based on profiles. It is difficult to gain insight into complex information needs from observational studies with existing systems; potentially complex needs are obscured by the systems’ limitatio...
Conference Paper
There is broad consensus in the field of IR that search is complex in many use cases and applications, both on the Web and in domain specific collections, and both professionally and in our daily life. Yet our understanding of complex search tasks, in comparison to simple look up tasks, is fragmented at best. The workshop addressed the many open re...
Conference Paper
Full-text available
While content-based recommendation has been applied successfully in many different domains, it has not seen the same level of attention as collaborative filtering techniques have. However, there are many recommendation domains and applications where content and metadata play a key role, either in addition to or instead of ratings and implicit usage...
Conference Paper
Full-text available
Users looking for books online are confronted with both professional meta-data and user-generated content. The goal of the Interactive Social Book Search Track was to investigate how users used these two sources of information, when looking for books in a leisure context. To this end participants recruited by four teams performed two different ta...
Conference Paper
Full-text available
Online book search services allow users to tag and review books but do not include such data in the search index, which only contains titles, author names and professional subject descriptors. Such professional metadata is a limited description of the book, whereas tags and reviews can describe the content in more detail and cover many other aspect...
Conference Paper
Full-text available
INEX investigates focused retrieval from structured documents by providing large test collections of structured documents, uniform evaluation measures, and a forum for organizations to compare their results. This paper reports on the INEX 2014 evaluation campaign, which consisted of three tracks: The Interactive Social Book Search Track investigat...
Article
Full-text available
INEX investigates focused retrieval from structured documents by providing large test collections of structured documents, uniform evaluation measures, and a forum for organizations to compare their results. This paper reports on the INEX 2013 evaluation campaign, which consisted of four activities addressing three themes: searching professional an...
Conference Paper
Full-text available
INEX investigates focused retrieval from structured documents by providing large test collections of structured documents, uniform evaluation measures, and a forum for organizations to compare their results. This paper reports on the INEX 2013 evaluation campaign, which consisted of four activities addressing three themes: searching professiona...
Article
Full-text available
In this paper we describe our participation in the INEX 2013 Social Book Search Track. We compare the impact of different query representations for book search topics derived from the LibraryThing discussion forums, including the title and full narrative provided by the topic creator, the name of the discussion group in which the topic was posted,...
Article
Full-text available
INEX investigates focused retrieval from structured documents by providing large test collections of structured documents, uniform evaluation measures, and a forum for organizations to compare their results. This paper reports on the INEX’12 evaluation campaign, which consisted of five tracks: Linked Data, Relevance Feedback, Snippet Retrieval,...
Conference Paper
Full-text available
The Web and social media give us access to a wealth of information, not only different in quantity but also in character: traditional descriptions from professionals are now supplemented with user-generated content. This challenges modern search systems based on the classical model of topical relevance and ad hoc search: How does their effectivene...
Article
Full-text available
Article available online: http://people.mpi-inf.mpg.de/~amishra/papers/bell-over13.pdf
Article
Full-text available
INEX investigates focused retrieval from structured documents by providing large test collections of structured documents, uniform evaluation measures, and a forum for organizations to compare their results. This paper reports on the INEX 2011 evaluation campaign, which consisted of five active tracks: Books and Social Search, Data Centric, Quest...
Article
Full-text available
In this paper, we document our participation in the TREC 2011 Web Tracks. We had multiple aims: This year, tougher topics were selected for the Web Track, for which there is less popularity information available. We look at the relative value of anchor text for these less popular topics, and at the impact of spam priors. Full-text retrieva...
Article
Full-text available
INEX investigates focused retrieval from structured documents by providing large test collections of structured documents, uniform evaluation measures, and a forum for organizations to compare their results. This paper reports on the INEX 2011 evaluation campaign, which consisted of five active tracks: Books and Social Search, Data Centric, Quest...