International Journal on Digital Libraries
Evaluating a Digital Humanities Research Environment:
The CULTURA Approach
Christina M. Steiner · Maristella Agosti · Mark S. Sweetnam · Eva-C. Hillemann ·
Nicola Orio · Chiara Ponchia · Cormac Hampson · Alexander Nussbaumer · Dietrich Albert ·
Owen Conlan
Abstract: Digital humanities initiatives play an important role in making cultural heritage collections accessible to the
global community of researchers and general public for the first time. Further work is needed to provide useful
and usable tools supporting users in working with those digital contents in virtual environments. The CULTURA
project is developing a corpus agnostic research environment integrating innovative services that guide, assist and
empower a wide spectrum of users in their interaction with cultural artefacts. This article introduces the
CULTURA system and services and the two collections that have been used for testing and evaluating the digital
humanities research environment. An evaluation model is outlined that builds a common ground for systematic
evaluations of the CULTURA environment in the context of the two collections, and for implementing different
evaluation strategies. The consolidated evaluation outcomes indicate a positive perception of CULTURA.
Furthermore, a range of useful suggestions for future improvement were collected and fed back to the
development of the final implementation of the research environment.
Key words: Virtual research environment, digital humanities, cultural heritage, evaluation.
1 INTRODUCTION
The interdisciplinary field of digital humanities is
concerned with the intersection of computer
science, knowledge management, and a wide
range of humanities disciplines.
C. M. Steiner (contact person - email: christina.steiner@tugraz.at) ·
E.-C. Hillemann • A. Nussbaumer • D. Albert
Knowledge Technologies Institute, Graz University of Technology,
Austria
M. Agosti
Dept of Information Engineering, University of Padua, Italy
M. S. Sweetnam
Dept of History, Trinity College Dublin, Ireland
N. Orio · C. Ponchia
Dept of Cultural Heritage, University of Padua, Italy
C. Hampson • O. Conlan
Knowledge and Data Engineering Group, Trinity College Dublin,
Ireland
Recent large-scale digitisation initiatives have
made many important cultural heritage collections
available online. This makes them accessible to
the global research community and interested
public for the first time. However, the full value
of these heritage treasures is not being realised.
Digital collections often lack features for deeper
quantitative and qualitative analysis, and even
very useful functions, such as the ability to
annotate or bookmark content, are often not
supported. After digitisation, these collections are
typically monolithic, difficult to navigate, and can
contain text which is of variable quality in terms
of language, spelling, punctuation, and
consistency of terminology. Although digital
content, tools, and services are available, they
are not necessarily useful or usable. “Until
analytical tools and services are more
sophisticated, robust, transparent, and easy to use
for the motivated humanities researcher, it will be
difficult to attract a broad base of interest…” [1].
As a result, digital collections often fail to attract
and sustain broad user engagement leading to
limited communities of interest. Thus, there still
remain important challenges in the presentation of
new digital humanities artefacts to the end user.
Movement beyond self-contained, independent
projects is needed in order to create flexible
infrastructures usable across different
collections.
Simple “one size fits all” web access is, in
many cases, not appropriate in the digital
humanities, due to the size and complexity of the
artefact collections. Furthermore, different types
of users have considerable differences in their
knowledge of the collections, requiring varying
levels of support, and every individual user has
their own particular interests and priorities.
Personalised and adaptive systems are thus
important in helping users achieve optimum
engagement with these new digital humanities
assets. Improved quality of access to cultural
collections, especially those collections which are
not exhibited physically, is a key objective of the
CULTURA project (http://cultura-project.eu/)
[2,3]. Moreover, CULTURA
supports a wide spectrum of users, ranging from
members of the general public with specific
interests, to users who may have a deep
engagement with the cultural artefacts, such as
professional and trainee researchers. To this end,
CULTURA is delivering a corpus agnostic
environment with a suite of services to provide
the necessary supports and features required for
such a diverse range of users.
The CULTURA system has been tested and
evaluated with two contrasting digital humanities
collections involving, respectively, textual
material and images. This paper introduces the
first integrated version of the environment and
presents the two testbed collections used (Section
2). Taking into account the novel methods and
technologies of the environment, as well as the
intended reusability of it with different
collections, an evaluation model has been
established and is described in Section 4, after a
review of related work (Section 3). In the context
of the two digital collections, formative evaluation
studies on the environment have been conducted,
applying two complementary evaluation strategies
in line with the model (Section 5). The results
from both evaluation strands have been
consolidated to derive implications for further
development. Finally, an outlook on future
research is given (Section 6).
2 THE CULTURA SYSTEM
The CULTURA system consists of multiple
distinct services, all accessed via the CULTURA
portal. The services available are shown in Figure
1 and include personalised search tools, faceted
search tools, annotators, social network
visualisation tools, and recommenders. A service
is triggered by a user’s interaction with the
CULTURA portal, with requests sent from the
presentation layer to the service via its API. For
example, when a person using the system is
looking at one of the 1641 Depositions (see
Section 2.1), entities from that document (people,
places etc.) are extracted from the data layer, and
recommended depositions based on these entities
are calculated in the control layer by the
recommender widgets. These recommendations
are then rendered in the presentation layer for the
user to view (see Figure 2).
The CULTURA portal utilises Drupal
(http://www.drupal.org/), as it
provides numerous services that, while essential
to CULTURA, are not core research elements,
such as user authentication and system-wide
logging. Drupal also has an extensible
architecture that allows new modules to be
developed in order to extend or replace system
functions. Hence, all services developed by
CULTURA are implemented as Drupal modules,
and when accessed by users, the responses from
these services are displayed in an appropriate
form, e.g. recommendations for related content as
seen in Figure 2.
The service oriented architecture approach
adopted by the CULTURA environment
simplifies the integration process. This is greatly
beneficial considering that services are being
developed by several different CULTURA
partners. Figure 1 highlights the various services
that need to be integrated in the control layer of
the architecture. All that is required for each of
these services is a well-defined API. In terms of
the presentation layer, a user interacting with the
entity oriented search will be communicating with
the entity oriented search module in Drupal,
which in turn accesses its bespoke API.
Figure 1. The CULTURA architecture.

Furthermore, because these tools support a
parameterised launch, it also greatly simplifies the
integration process. This is because all
CULTURA services (visualisations, searches etc.)
can be rendered to users with the appropriate
information, purely by passing the relevant
identifiers to the service in the form of URL
parameters. Sections 2.1 and 2.2 describe the two
cultural collections integrated into CULTURA, as
well as the various services that the archives can
access.
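The parameterised launch can be sketched in a few lines; the portal address, service name, and parameter names below are hypothetical illustrations, not part of the actual CULTURA API:

```python
from urllib.parse import urlencode

def build_service_url(base_url, service, identifiers):
    """Compose a launch URL for a service by passing the relevant
    identifiers as URL query parameters, as described above."""
    params = {"service": service, **identifiers}
    return base_url.rstrip("/") + "/launch?" + urlencode(params)

url = build_service_url(
    "http://example.org/cultura",  # hypothetical portal address
    "entity-search",
    {"doc": "deposition-812", "entity": "person:John_Smith"},
)
```

Because every service accepts such a URL, the portal can render any visualisation or search with the appropriate information simply by constructing the corresponding link.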
Figure 2. An example of the recommended content displayed
to users within the CULTURA portal.
2.1 CULTURA and the 1641
Depositions Collection
The 1641 Depositions collection is held in the
Library of Trinity College Dublin. It comprises
more than 8,000 statements from witnesses and
victims of the violence and atrocity that took
place in the aftermath of the outbreak of the 1641
Rebellion in Ireland. The Depositions were
recorded by government appointed commissions,
and primarily record the experience of Protestant
English settlers, the events they saw, or heard of,
and the losses of property, possessions, and
money that they sustained. The Depositions are
unparalleled in early modern Europe, and provide
a unique window not only on the appalling events
of the Rebellion, but on the everyday and intimate
lives of ordinary people, and their efforts to make
sense of the devastating disintegration of social
order and neighbourly relations.
As part of a three-year project, which
commenced in 2007 and was funded by the Arts
and Humanities Research Council (UK), the Irish
Research Council for the Humanities and Social
Sciences (Ireland), and the Library of Trinity
College Dublin, the Deposition volumes, which
were in a parlous condition, were conserved. High
quality digitisations of the Depositions were
produced, and a team of three researchers
transcribed the Depositions and captured
extensive manually generated metadata,
describing the occupation and address of the
deponents, and the nature of the events recorded
in each deposition (http://1641.tcd.ie/).
From a technological perspective, the
Depositions represent a textually-rich digital
humanities collection, which is characterised by
noisy text, inconsistent sentence structure,
grammar and spelling. The English language
manuscripts contain rich metadata and
descriptions of individuals, locations, events,
social structures and contrasting/conflicting
narratives. The digitised text of the Depositions
and its associated metadata are stored in a
MySQL (http://www.mysql.com/)
database that is accessed locally by the
CULTURA Drupal environment. Because of the
noisy text that is associated with the 1641
Depositions, a text normalisation process took
place a priori [4]. The output of the normalisation
process was added to the Depositions MySQL
database and used to power normalised search
over the depositions, and to improve the entity
relationship extraction performed using IBM’s
LanguageWare [5]. The process of entity
relationship extraction created a graph of people,
places, dates, and events relating to the 1641
Depositions. Importantly this entity graph was
used in a number of key CULTURA services
including social network analysis tools and
visualisations, recommenders [6] and entity
oriented search [7]. Other important CULTURA
services that operate over the 1641 Depositions
include an annotation tool [8], which enables
individuals and groups to create and share
annotations.
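As an illustration of how such an entity graph can drive recommendations, the following sketch ranks documents by the number of extracted entities they share with the current document. The toy data and the overlap scoring are assumptions for illustration, not the algorithm used by the CULTURA recommenders [6]:

```python
def recommend(doc_id, entity_index, top_n=3):
    """Rank other documents by the number of entities (people, places,
    dates) they share with the given document."""
    target = entity_index[doc_id]
    scores = {
        other: len(target & entities)
        for other, entities in entity_index.items()
        if other != doc_id
    }
    # Sort by descending overlap, breaking ties alphabetically.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [doc for doc, score in ranked[:top_n] if score > 0]

# Toy entity graph: deposition id -> set of extracted entities.
index = {
    "dep-1": {"Dublin", "John Smith", "1641-10-23"},
    "dep-2": {"Dublin", "John Smith"},
    "dep-3": {"Cork"},
}
```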
2.2 CULTURA and the IPSA
Collection
The Imaginum Patavinae Scientiae Archivum
(IPSA, Archive of images to support the study of
scientific research at Padua University) collection
is a digital archive of illuminated medieval and
Renaissance codices, dating from the 11th
century. It contains astrological manuscripts and
herbals with Latin, Paduan, and Italian language
commentaries. In particular, herbals are
manuscripts which contain hand-drawn depictions
of plants, such as trees, bushes or shrubs, and their
parts, such as flowers or leaves. The IPSA
collection contains mainly manuscripts written
and illustrated by the Paduan School, and
successive manuscripts produced in Europe under
its influence. The online archive was created
specifically for professional researchers in History
of Illumination to allow them to compare the
illuminated images held in the collection and to
verify the development of a new realistic way of
painting closely associated with the new scientific
studies that were flourishing at the University of
Padua in the 14th century, particularly thanks to
the teaching of Pietro d’Abano [9].
Such manuscripts have the rare characteristic
of containing high quality and very realistic
illustrations, because they were drawn from live
specimens. IPSA is a combination of digitised
images of the manuscripts and related metadata
descriptions (http://ipsa.dei.unipd.it/en_GB/home).
The user requirements analysis for the design
and development of IPSA was conducted in 2002.
A first complete prototype was made available to
researchers in March 2003; a consolidated final
version, revised on the basis of user comments,
was released in July 2003 [10]. With the
involvement in the CULTURA project, it was
decided to open the archive to other categories of
users, such as non-domain professional
researchers, student communities and the general
public. This new task required the identification
of the needs, wishes and preferences of these new
categories in order to define the required changes
and improvements to IPSA.
Within CULTURA, IPSA metadata has been
shared in XML format, while high-resolution
images of the illustrations are loaded from an
external server due to copyright issues.
collection can be browsed using a keyword search
or via a faceted browsing interface, both operating
over the XML metadata. In addition, both the
annotation tool and the social network
visualisations (see Figure 3) operate over the
IPSA collection in the context of the CULTURA
system.
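A faceted filter operating over XML metadata of this kind can be sketched with the standard library; the element and attribute names below are invented for illustration and do not reflect the actual IPSA schema:

```python
import xml.etree.ElementTree as ET

# Toy metadata records, standing in for the IPSA XML export.
XML = """
<records>
  <record id="r1"><subject>herbal</subject><century>14</century></record>
  <record id="r2"><subject>astrology</subject><century>14</century></record>
  <record id="r3"><subject>herbal</subject><century>11</century></record>
</records>
"""

def facet_filter(xml_text, **facets):
    """Return ids of records whose child-element text matches every facet."""
    root = ET.fromstring(xml_text)
    hits = []
    for rec in root.findall("record"):
        if all(rec.findtext(name) == value for name, value in facets.items()):
            hits.append(rec.get("id"))
    return hits
```

Each additional keyword argument narrows the result set, which is the essence of faceted browsing.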
2.3 The Two Collections in
CULTURA
The aim of the CULTURA project is to pioneer
the development of personalised information
retrieval and presentation, contextual adaptivity
and social analysis in a digital humanities context.
This is motivated by the desire to provide a
fundamental change in the way digital cultural
heritage is experienced, analysed, and contributed
to by communities of interested individuals.
These communities typically comprise a diverse
mixture of professional researchers, apprentice
researchers (e.g. students of history and art
history), informed users (e.g. users belonging to
relevant societies, interest groups, or cultural
authorities), and interested members of the
general public.
From a technical perspective, IPSA and the
1641 Depositions represent two very different
kinds of digital humanities collections. While the
1641 Depositions are basically textual documents,
the IPSA collection is primarily image based, with
substantial metadata available, which is also
historically valuable as it captures the scientific
processes that were prevalent during the
creation of the original collection.
Notwithstanding these differences, the two
collections share most of the CULTURA services.
For instance, the annotation tool, which is used to
annotate text within the 1641 Depositions, is used
to annotate images and parts of images within the
IPSA collection. These annotations can be used
by an individual or made public to a specific
group who may be working on the same topic.
The social network analysis visualisations (see
Figure 3) used with the 1641 Depositions
collection in CULTURA are also utilised with the
IPSA archive, which highlights the generic nature
of these Drupal modules and the effectiveness of a
service oriented architecture.
Figure 3. Visualisations based on social network
analysis within the IPSA collection.
The contrast in knowledge domain and
structure between the IPSA and 1641 content
collections demonstrates the broad applicability of
the CULTURA methodology. Moreover, it highlights
how the techniques delivered in CULTURA are
not specific to an individual domain or collection
but can be of benefit for a wide range of digital
humanities collections.
3 RELATED WORK
The tools and techniques of digital humanities
allow massive cultural collections to be digitised,
indexed, searched, and combined with other
digital libraries. Examples are the creation of
digital archives of the transcribed 1641
Depositions testimony documents [11,12] and,
respectively, of illuminated manuscripts with
botanical illustrations included in the IPSA
collection [10], as introduced in the previous
section.
Recently, increasing efforts have been made
not only to make digital contents available,
but also to create virtual research environments that
provide interpretative frameworks for making
sense of cultural artefacts [13]. Such
environments support conceptualising,
visualising, and analysing information, and
collaboratively working on it. They usually do not
consist of one monolithic technology, but cover a
collection of tools assembled in one place in order
to assist research tasks and processes. Examples
are Aus-e-Lit, a portal for the study of Australian
literature [14], or the TextGrid environment for
supporting researchers in the arts and humanities
[15]. These environments incorporate tools for
text search and analysis, archiving and reuse,
collaboration and annotation. Commonly, they are
developed for a particular digital collection and
address a specific target audience, like
professional researchers, research projects or
institutions, or teaching communities.
CULTURA aims at building a novel type of
virtual research environment incorporating
innovative information retrieval technologies and
multidimensional adaptivity. The CULTURA
system integrates a suite of intelligent services for
guiding, assisting and empowering interaction
with cultural artefacts. This provides flexibility
both in terms of usage by a wide spectrum of user
groups with their specific needs and in terms of
reusability with different digital collections,
which characterises the innovative
nature of this research environment.
Hand in hand with the development of capable
electronic information services and research
environments for digital humanities, there is a
need for comprehensive and scientifically sound
evaluation of the quality of such information
systems, in order to ensure that user needs are met
and to inform further development. Current
evaluation approaches can be categorised into
three main types highlighting the targeted
evaluation themes: user-oriented, system-oriented,
and systematic [16,17]. A user-oriented
evaluation approach pays attention to the user by
examining users’ requirements, behaviours and
preferences, and their interaction, use, acceptance,
and satisfaction with the digital humanities or
library system in question. The main purposes of
this type of evaluation are: verifying the quality of
a product, detecting problems, and supporting
decisions [18]. System-oriented evaluation
approaches focus on technological aspects and
aim at investigating how well advanced
technology can be used for digital information
representation and retrieval, measured for instance
in terms of precision, recall, and search time. This
type of evaluation examines what happens in the
informational environment external to the
individual, while user-oriented evaluation
examines the individuals’ psychological and
cognitive necessities and perceptions and how
they affect information search and use. Systematic
approaches address various levels or dimensions
and thus may include user-oriented as well as
system-oriented evaluation goals. A range of
evaluation schemes and frameworks have been
proposed in the literature, integrating a mix of
dimensions and criteria from different disciplines
(e.g. digital libraries, information retrieval,
human-computer-interaction) and topics (e.g.
content, engineering, user, environmental).
Examples are the models suggested by Saracevic
[19], Kovács and Micsik [20], or Zhang [21]. The
so-called Interaction Triptych model [22]
distinguishes three main components of the
interaction process with a research environment:
system, content, and user. On the axes between
them, the evaluation criteria can be identified:
performance (system-content axis), usefulness
(content-user axis), and usability (system-user
axis). Fuhr et al. [23] established a framework for
evaluation that integrates three existing evaluation
models [19,20,22] under the umbrella of four
categories (construct, context, criteria, and
methodology) adapted from [19], which served
for structuring and describing the evaluation
process in a holistic manner.
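To make the system-oriented measures mentioned above concrete, precision and recall can be computed directly from the sets of retrieved and relevant items; a minimal worked example with invented document identifiers:

```python
def precision_recall(retrieved, relevant):
    """Set-based IR measures: precision is the fraction of retrieved
    items that are relevant; recall is the fraction of relevant items
    that were retrieved."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# 4 documents retrieved, 3 relevant overall, 2 of them retrieved.
p, r = precision_recall(retrieved=["d1", "d2", "d3", "d4"],
                        relevant=["d1", "d3", "d5"])
# p = 2/4 = 0.5, r = 2/3
```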
4 THE CULTURA EVALUATION
MODEL
The novel technology of the CULTURA
environment and its openness to a wide variety of
content and users make evaluation in CULTURA
a challenging and multi-faceted task. On the one
hand, CULTURA incorporates a range of
different services, which necessitate specific
consideration in evaluation to get comprehensive
outcomes on the quality of the system, and to
identify aspects for further refinement of its
individual components. On the other hand, the
reusability of the methods and technology with
different collections and the diversity of users
taken into account necessitate an evaluation
approach that allows researchers to select suitable
methods for a specific evaluation task, while
maintaining an appropriate level of comparability
and generalisability of evaluation results. This
required a systematic approach: defining an
evaluation model based on the existing state of the
art and tailoring it to CULTURA.
The Interaction Triptych model [22] has been
used as a starting point for a conceptual analysis
of the components of the interaction process in the
CULTURA environment. The ‘system’ consists
of the intelligent services as individual system
components, and of the system as a whole. The
‘content’ is given by the test bed collections, the
1641 Depositions and IPSA; but in principle this
component is open also to other digital
collections. With respect to ‘users’, in CULTURA
four different user groups along the dimension of
expertise are distinguished and addressed:
professional researchers, apprentice researchers,
informed users, and members of the general
public [24]. The original model was extended for
CULTURA (see Figure 4) to address the quality
axes specific to the research environment and its
services and to form the common ground for
evaluation studies.
Usefulness of content refers to the interaction
between content and user: Is the content relevant
and suitable for the user? This relates to the
question whether the digital collection supports
the user’s personal information needs and/or the
information needs of the user group. A certain
level of content usefulness is necessary for a
meaningful evaluation of the other qualities.
Figure 4. The CULTURA evaluation model.
Usability refers to the interaction axis between
system and user: Does the system allow users to
effectively, efficiently, and satisfactorily
accomplish their tasks? This relates to whether the
communication and interaction between user and
system are smooth and whether the system is easy
to use and learn. It also includes aspects of the
learnability, navigation, and complexity of the
system.
User acceptance complements the aspect of
usability on the system-user axis of the evaluation
model: Do users consider the research
environment and its services acceptable? Users
may not necessarily have a positive attitude to the
system, even if it is technologically sound.
Commonly, the following user acceptance aspects
are distinguished [25]: perceived ease of use
(which is related to usability aspects), perceived
usefulness (this refers to the usefulness of the
system and is to be distinguished from usefulness
of content), and behavioural intention to use.
Adaptation quality refers to the interaction
between system, content, and user: Is the
adaptation provided by the CULTURA system
appropriate and useful? This quality addresses
users’ perceptions of the helpfulness and benefit
of system adaptation/recommendation received
[26]. It can also be related to layered evaluation of
adaptation [27], examining (a) whether user
variables are correctly inferred and (b) whether
adaptation decisions are appropriately taken.
Visualisation quality also addresses the
interaction between all three model components:
How do users feel about the visualisations
provided by the research environment? In the
context of CULTURA this applies to the social
network visualisations. Visualisation quality
relates to users’ perceptions about the benefit of
the visualisations provided and the user-
friendliness of the visualisation tools.
Collaboration support is another quality at the
centre of the evaluation model, relating to the
collaboration between the users of a research
environment. It refers to the extent to
which users feel supported by the system in
getting in contact with each other and in exchanging
information on the collection content.
The aspect of performance (system-content
axis) is usually not directly visible to the users.
Since it is often difficult to evaluate via user
feedback, it is commonly measured by means of
system-oriented evaluation. In CULTURA this
evaluation axis was assessed in terms of
normalisation quality and network quality.
Normalisation quality refers to text normalisation
as well as entity extraction from text, i.e. to the
quality and accuracy of the output of these
processes. Network quality investigates the
accuracy of the data visualisations and the
occurrence of probable inconsistencies between
the entity data and the network visualisation.
The evaluation model enables a systematic, in-
depth examination and validation of the services
integrated in CULTURA, in addition to traditional
evaluation topics on the overall system and topics
of general interest. The model formed the common
underlying basis for the evaluation studies in
CULTURA over different collections and user
groups. The evaluation of the CULTURA system,
as described in the remainder of this paper,
represents a concrete application of the evaluation
model for the research environment described in
Section 2.
5 EVALUATION OF THE
CULTURA SYSTEM
Since the CULTURA system is intended as a
corpus agnostic research environment suitable for
different types of users, its evaluation needs to
prove its benefits over different collections and
user groups. Consolidating the results over
different evaluation studies allows researchers to
discover issues and implications of general
interest, as well as insights into the overall
quality of the environment. The evaluation model
thereby constitutes a common reference point for
specifying the data collection instruments and for
comparing and generalising results.
In the following, the empirical evaluations
conducted on the two instantiations of the system
are presented. CULTURA is designed to address the
needs of a spectrum of users, ranging from the
general public, who may be encountering the
collections for the first time, to professional
researchers who have worked extensively on the
subject. Evaluation of the successive versions of
the CULTURA implementation for the two
collections took place over the three years of the
project, and researchers worked closely with users
from across the user spectrum. While evaluations
with all user groups were conducted, we present
the evaluation methodology used and results
gained with apprentice researchers as a selected
user group. This group, together with professional
researchers, is able to give the most detailed and
comprehensive feedback on the system with
respect to the qualities of the evaluation model.
While in the case of both managed collections the
same general evaluation approach was taken, two
different but complementary evaluation strategies
were applied; these are described next.
5.1 General Evaluation Approach
A multi-method approach defined in line with the
evaluation model was utilised with data being
taken from a variety of both quantitative and
qualitative data sources. Data collection was
carried out in three different ways: a questionnaire
covering items or scales on all evaluation
qualities, discussion, as well as interaction logs
that provided quantitative data complementing
participants' self-reports. The questionnaire was
administered in separate online surveys for 1641
Depositions and IPSA, which were accessible via
links integrated in the CULTURA environment.
Questionnaire. The survey instrument was
developed in line with the evaluation model
covering rating-scale items on all relevant
evaluation qualities.
Usefulness of content was measured with two
items on the relevance of the digital collection at the individual and the user-group level.
For a general usability assessment, the System
Usability Scale (SUS) [28] covering 10 items was
used.
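The SUS [28] yields a single 0-100 score from its ten alternating positively and negatively worded items. The scoring procedure is not spelled out here, but the standard computation can be sketched as follows (a minimal illustration, not the project's actual analysis code):

```python
def sus_score(responses):
    """Convert ten 1-5 Likert responses into a 0-100 SUS score.

    Odd-numbered items (index 0, 2, ...) are positively worded:
    contribution = response - 1. Even-numbered items are negatively
    worded: contribution = 5 - response. The summed contributions
    (0-40) are scaled by 2.5 to yield a 0-100 score.
    """
    assert len(responses) == 10
    total = 0
    for i, r in enumerate(responses):
        total += (r - 1) if i % 2 == 0 else (5 - r)
    return total * 2.5

# A neutral respondent (all 3s) lands at exactly 50.
print(sus_score([3] * 10))  # -> 50.0
```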
With respect to user acceptance, a 10-item scale covering the main aspects of user acceptance according to the technology acceptance model [25], which had already been applied in user acceptance research on digital libraries [29], was adapted.
Adaptation quality was assessed by eight items
on usage, usability aspects, and perceived benefits
of adaptive content recommendations provided by
the CULTURA environment to users. As this
functionality was only used for the 1641
Depositions collection, the related questions were
not used in case of the IPSA evaluation.
Visualisation quality, similarly to adaptation
quality, was captured by nine questions on usage,
aspects of the usability of the visualisation tools,
and their perceived benefits for users.
Collaboration support was measured with
three items investigating the perceived support of
collaboration with other users and participants' opinion of the annotation tool.
In order to capture user-centred aspects of
normalisation quality, two items asking for
general level feedback on normalised search and
entity oriented search were used.
In addition, five open questions were
presented to collect qualitative feedback on the
perception of the CULTURA system and its features in written form.
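Scale and subscale scores of this kind are typically computed as item means. The following sketch illustrates the aggregation for one evaluation quality on a 1-7 scale; the item identifiers and grouping are hypothetical, since the actual questionnaire items are not reproduced here:

```python
import statistics

# Hypothetical item identifiers and grouping: three subscales of
# one evaluation quality (e.g. adaptation quality), items rated 1-7.
subscales = {
    "usage":     ["use_1", "use_2"],
    "usability": ["usab_1", "usab_2", "usab_3"],
    "benefit":   ["ben_1", "ben_2", "ben_3"],
}

def score(answers, subscales):
    """Mean score per subscale plus an overall mean over all items."""
    result = {name: statistics.mean(answers[item] for item in items)
              for name, items in subscales.items()}
    every_item = [answers[i] for items in subscales.values() for i in items]
    result["overall"] = statistics.mean(every_item)
    return result

# One (invented) respondent's answers.
answers = {"use_1": 4, "use_2": 2, "usab_1": 5, "usab_2": 4,
           "usab_3": 3, "ben_1": 4, "ben_2": 4, "ben_3": 4}
print(score(answers, subscales))
```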
Discussion. The moderated discussion with
participants, in the form of interviews or focus groups, was designed in a semi-structured manner.
questions in line with the CULTURA services and
evaluation qualities were defined in order to add
to questionnaire data and to gather more detailed
user feedback on perceived benefits and on
potential issues for further improvement.
Log Data. Users’ interactions with the
CULTURA system were logged and examined. In the case of the 1641 Depositions user trial, for
technical reasons, only data from the last month
was available for further analysis. Log data
analysis was done in accordance with the
CULTURA evaluation model and its underlying
evaluation qualities. The obtained objective data
complements participants’ self-reports.
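A log analysis of this kind largely amounts to counting interaction records per user and per page type. The sketch below assumes a hypothetical log schema, since the actual CULTURA log format is not described here:

```python
from collections import Counter

# Hypothetical log format: one record per interaction, with a
# user id, a page URL, and a page type. The ids and URLs below
# are invented for illustration.
log = [
    {"user": "s01", "page": "/deposition/a", "type": "content"},
    {"user": "s01", "page": "/search?q=armagh", "type": "search"},
    {"user": "s02", "page": "/deposition/b", "type": "content"},
    {"user": "s01", "page": "/deposition/a", "type": "content"},
]

# Per-user totals of page visits and of content-page visits, the
# two figures reported in the evaluation (e.g. pages visited per
# participant, of which some present depositions).
visits = Counter(rec["user"] for rec in log)
content_visits = Counter(rec["user"] for rec in log
                         if rec["type"] == "content")

print(visits["s01"], content_visits["s01"])  # -> 3 2
```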
The evaluation of the CULTURA system was
carried out in the context of university courses
following a task-based approach. In these user
trials, students (as apprentice investigators) were
first introduced to the CULTURA environment
and its functions. Subsequently, they were
assigned a task and the CULTURA system was
used to work on it. After task completion, students
filled in the online survey and took part in the
focus group discussion.
5.2 Evaluation of
1641Depositions@CULTURA
5.2.1 Method
Participants. The evaluation study in the context
of the 1641 Depositions trial with apprentice
investigators involved in total 14 students, with
only 11 (4 male, 7 female) of them completing the
evaluation questionnaire. Participants were
undergraduate as well as masters students of
History, Public History and Cultural Heritage, and
Digital Humanities and Culture. The average age
(n = 11) was 33.90 years (SD = 11.09), with
individual ages ranging from 22 to 59 years.
In general, students had advanced knowledge and experience of computers and computer applications.
Procedure. Evaluation feedback was gathered
from apprentice investigators who had spent
twelve weeks working with the CULTURA
system for the 1641 Depositions, and who had
utilised it in the preparation of a number of
different research exercises. In using the 1641
Depositions collection, the following features are
available to the users: content recommendations,
social network visualisations, keyword search,
search over normalised contents, faceted browsing
interface, entity oriented search, and the
annotation tool.
Following the general procedure described in
Section 5.1, a mix of different methods and
approaches was used, involving the gathering of
both qualitative and quantitative feedback from
users in accordance with the evaluation model and
its evaluation qualities.
Qualitative feedback was obtained through a
mix of focus groups, debriefing sessions, and one-
on-one interviews. Quantitative feedback was
gathered through the use of rating scales in the
online surveys. These surveys were made
available to users through a personalised link
within the CULTURA environment, and users completed them after they had spent time using the environment and its features.
5.2.2 Results
Usefulness of Content. The usefulness of the
1641 Depositions collection content was rated very highly, with an average score of M = 6.09 (SD
= 0.58, Median = 6.0) on a scale with a possible
score range from 1 to 7 (Note: Due to the small
sample size medians are reported in addition to
arithmetic means).
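Descriptive statistics of this kind (M, SD, Md) can be computed directly with Python's statistics module; the ratings below are invented for illustration only:

```python
import statistics

# Invented 1-7 usefulness ratings for n = 11 respondents.
ratings = [6, 7, 5, 6, 6, 7, 6, 5, 7, 6, 6]

m = statistics.mean(ratings)     # arithmetic mean (M)
sd = statistics.stdev(ratings)   # sample standard deviation (SD)
md = statistics.median(ratings)  # median (Md)

print(round(m, 2), round(sd, 2), md)  # -> 6.09 0.7 6
```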
Participants visited on average 39 pages (SD = 45.51, Median = 18.0), of which on average M = 26.42 (SD = 39.71, Md = 5.0) were content pages presenting depositions. There was very high variation in the total number of page visits: some students logged into the system without visiting any content pages, while others made quite extensive use of the system, with more than 100 page visits.
Usability. General usability scored moderately
with an average score of M = 56.59 (SD = 12.81,
Md = 55.0) on a score scale ranging from 0 to 100
with higher values indicating better results. Scores
on individual usability items (possible score range 0 to 4, with higher values indicating better usability) revealed that students did not feel the need for a technical person to help them use the system (M = 3.36, SD = 0.92, Md = 3.0). In line with this, the system
was also perceived as relatively quick to learn (M
= 2.73, SD = 0.90, Md = 3.0). Most critically, the
integration of the functions and tools was
perceived as not very well achieved (M = 1.55, SD = 0.93, Md = 1.0) and not highly consistent
(M = 1.72, SD = 0.90, Md = 1.0).
Open feedback was, in general, very positive
and issues identified were concerned with the
technical implementation, rather than with the
conceptual underpinnings of the tools. Almost all
participants indicated that they would recommend
CULTURA to a friend, mentioning usability aspects and research facilitation as the main reasons.
User Acceptance. User acceptance, with its three main aspects of perceived usefulness, perceived ease of use, and behavioural intention, was assessed quite positively with scores ranging
from 4.91 to 5.43 (on a 1-7 scale, in each case).
Results are depicted in Figure 5. The best result
could be obtained for perceived usefulness (M =
5.43, SD = 1.24, Md = 5.5) indicating that
participants perceived the CULTURA system as
reasonably useful for supporting their research.
This is in line with interview feedback, where nearly all students indicated that they saw a potential benefit in using CULTURA for their research.
Ease of use scored somewhat lower (M = 4.91, SD = 0.75, Md = 4.75), but was still reasonably good.
This aspect of user acceptance was significantly and positively correlated with the overall usability score (r = 0.66, p < 0.05), as
would be expected. For behavioural intention, an
average score of about 5 could also be identified
(M = 4.95, SD = 1.69, Md = 5.50).
Qualitative feedback confirmed this result.
Although some users stated that they were
unlikely to continue to work on the Depositions,
they still felt that the CULTURA model had high value and could usefully be extended to other corpora.
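A correlation of this kind can be checked with a standard Pearson product-moment coefficient over two aligned per-participant score lists. A minimal sketch with invented scores (not the study data):

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation between two aligned lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Illustrative (invented) per-participant scores: SUS usability
# on the 0-100 scale and ease of use on the 1-7 scale.
sus = [55, 70, 45, 80, 60]
ease = [4.5, 5.5, 4.0, 6.0, 5.0]
print(round(pearson_r(sus, ease), 2))  # -> 0.99
```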
Adaptation Quality. For adaptation quality an
overall score was calculated; the item responses
were also used to compute subscores on the
estimated usage of content recommendations,
their usability, and their perceived benefit
(possible score range 1-7). Results of all aspects
are shown in Figure 6. An average overall
adaptation quality of M = 3.74 (SD = 0.82, Md =
3.71) could be found, indicating a moderate to
low result. Usability of the recommenders was
assessed with medium quality, with an average
score of M = 3.93 (SD = 0.89, Md = 4.0). The
perceived benefit of adaptive recommendations
was medium to rather low with M = 3.68 (SD =
0.62, Md = 3.75). The extent of use of the
recommendations was scored by participants as
medium to rather low (M = 3.61, SD = 2.01, Md =
4.0), but there was considerable variance between individual respondents.
Examining the log data, it became obvious that the recommendations provided by the system were used quite rarely by the students: half of the participants did not visit a single recommended deposition, and on average only M = 2.14 (SD = 5.52) content pages were visited via the system's content recommendations.
Nevertheless, qualitative feedback gathered in
discussions confirmed the usefulness of
recommendations as part of the research process,
especially at the beginning, when users still have little knowledge of the collection. However, students stressed the need
for transparency by making explicit what was
being recommended and why.
Figure 6. Results (mean scores and SD) on adaptation and
visualisation quality for 1641Depositions@CULTURA.
Visualisation Quality. Similar to adaptation
quality, for visualisation quality subscores (with
possible score range 1-7) on estimated usage,
usability, and perceived benefit of the
visualisation service were calculated from the
questionnaire responses, and an overall score was
derived. Results are shown in Figure 6.
Overall visualisation quality scored M = 4.38
(SD = 1.05, Md = 4.0), which indicates a
moderately good quality. The average estimated
usage of the visualisation service was rather low
with M = 3.20 (SD = 1.99, Md = 2.5). This result
could also be confirmed by the log data showing
that only 6 out of 14 students made use of the
visualisations at all, most of them only a few
times. The result for usability was satisfactorily good, with half of the participants scoring 4.5 or higher (i.e. the median is 4.5) and an average of M = 4.58 (SD = 0.71). The best result was found
for the perceived benefit with a mean score of M
= 5.38 (SD = 0.79, Md = 5.25).
Figure 5. Results (mean scores and SD) on user acceptance
aspects for 1641Depositions@CULTURA.
Interviews confirmed these results: on the one hand, users were intrigued by the visualisations and expressed a keen sense of their potential usefulness. On the other hand, they pointed to the need for more flexible visualisations, in particular the possibility of moving between several texts and visualisations at the same time.
Collaboration Support. For collaboration
support a medium quality could be identified (M =
4.20, SD = 0.63, Md = 4.0). The explicit
assessment of the annotation tool and its
usefulness (as an indicator for collaboration
support) indicated a good quality with M = 5.20
(SD = 1.03, Md = 5.0). However, log data showed
that there were only three persons who had
annotated content pages; the number of annotations made by individual participants ranged from 1 to 6. Nevertheless, when explicitly
asked to identify the single most useful feature of the CULTURA environment in the open feedback section of the questionnaire, students named the annotation feature, alongside the ability to search
over normalised text, the faceted search, and the
visualisations.
Normalisation Quality. The ratings on
normalised search and entity oriented search were positive, resulting in an average score of M = 5.4
(SD = 1.17, Md = 5.5) for entity oriented search
and M = 5.90 (SD = 1.07, Md = 7.0) for search
over normalised contents. On average, students
made about 10 searches (M = 9.79, SD = 12.67), about half of which (M = 5.14, SD = 8.95) were carried out over normalised contents.
Open feedback from discussions showed that
in practice, users were very pleased with the
ability to search over normalised data. Concerning
the entity oriented search interface, though users
saw high potential and value in this feature, they
pointed to two problems: the accuracy of the automatically generated metadata, and a confusing and slow interface.
5.3 Evaluation of IPSA@CULTURA
5.3.1 Method
Participants. In total, 110 apprentice investigators took part in this study and completed the evaluation questionnaire. This sample includes undergraduate as well as master's students attending university courses in History and
Preservation of Cultural Heritage and of
Management of Archival and Bibliographic
Heritage. Gender distribution was 81 (74%)
female and 29 (26%) male students. The average
age of participants was 21.57 years (SD = 2.38;
Median = 22) with a range from 18 to 29 years.
The majority of students had average computer
literacy and experience.
Procedure. In contrast to the evaluation of 1641Depositions@CULTURA, where the user trial not only involved a small group of apprentice investigators but also consisted of a longer period of contact and interaction with the CULTURA system, the evaluation of IPSA@CULTURA was characterised by a shorter-term interaction and engagement with the system by a large sample of
students. In using the IPSA collection, the system
features included social network visualisations,
faceted browsing interface, and the annotation
tool.
Data collection followed the general procedure
described in Section 5.1 using a mixed-method
design including questionnaires, focus group
discussion, and log data.
5.3.2 Results
Usefulness of Content. Content usefulness was
assessed at a medium level with M = 4.26 (SD =
1.46). This may be due to the fact that the IPSA
collection covers a highly specific topic and these
young researchers likely have not yet specialised in a particular field. This is in line with results
obtained from the focus group discussions, where
students pointed to their limited knowledge on the
content.
Students visited on average 20 pages (M = 20.44, SD = 12.41), of which on average 7 (M = 7.40, SD = 6.76) were content pages presenting
illuminations.
Usability. For the responses collected in the
evaluation of IPSA@CULTURA an average
usability score of M = 65.39 (SD = 14.17) could
be calculated, indicating satisfactorily good
usability. In addition, scores on the individual
usability items were considered. The item
querying the consistency of the system and the
need for technical support scored best with M =
3.10 (SD = 0.86) for consistency and M = 3.07
(SD = 0.90) for technical support. Lower scores
resulted for the item on potential future use with
M = 1.55 (SD = 0.93). It seems that students did
not see high relevance in using the system for their research or studies (which may be related to the perceived relevance of the collection content), or at least did not recognise the potential support the system could offer. This is in
line with results obtained in the focus group
discussion, where some students pointed out that
providing a tutorial explaining the most important
CULTURA functions would be useful in order to
better understand the potential relevance and
usefulness of using the CULTURA system for
supporting their research.
User Acceptance. Ratings on perceived usefulness, perceived ease of use, and behavioural intention to use were collected as aspects of
user acceptance. An overview of the mean scores
resulting for these subscales is given in Figure 7.
The best result was obtained for ease of use with
M = 4.95 (SD = 1.26), which is also closely linked
with usability, resulting in a positive correlation
of r = 0.67 (p<0.01) between the two scores.
Students who judged usability as high also
assessed the ease of use in the user acceptance
scale as being high. Perceived usefulness was
rated with an average score of M = 4.60 (SD =
1.13), thus indicating a medium to good result.
This is in line with results obtained from
qualitative feedback, where students highlighted
that CULTURA is a useful tool for their research,
especially with regard to cultural heritage
artefacts and their availability. Regarding behavioural intention to use the system, a mean score of M = 4.21 (SD = 1.41) resulted, indicating
that students had only a moderately high intention
to use the system. This may be due to the fact that
students, as expressed in open comments, had
limited knowledge of the content and therefore
did not see high benefit in using the system in the
future. They further suggested providing training
material or a tutorial explaining the most
important CULTURA functions in more detail.
Visualisation Quality. Figure 8 gives an
overview of the results derived for the three
subscales of visualisation quality and the overall
visualisation quality score calculated from them.
Overall visualisation quality scored with a mean
of M = 4.75 (SD = 0.80) indicating a medium to
good result. The extent of estimated usage of the
visualisations provided by the CULTURA
environment was modest with a mean score of M
= 3.57 (SD = 1.33). This corresponds to the log
data from the user trial, which show that
visualisations for illuminations were accessed
rather rarely, about 4 times on average (M = 4.07,
SD = 3.53). Visualisations for the same
illuminations were visited repeatedly, leading to
an average number of only M = 1.80 (SD = 1.08)
accesses of visualisations for unique content
pages. With respect to the usability of the
visualisation tool, a mean score of M = 5.05 (SD =
1.00) was identified, arguing for satisfactorily
good usability. The perceived benefit of the
visualisations was also quite good and reached a
mean score of M = 5.09 (SD = 1.02).
Open responses on the visualisations indicate
that this feature was appreciated by students and
was perceived as useful. However, users wished
for a more detailed explanation in order to better understand the graphical representation.
Collaboration Support. The mean score for
collaboration support with M = 4.85 (SD = 1.01)
was satisfactorily good. Participants considered
the CULTURA system as capable to support
collaboration among users to a sufficient extent.
The annotation tool was assessed as being of good
quality with M = 5.16 (SD = 1.20).
Figure 8. Results (mean scores and SD) on visualisation quality for IPSA@CULTURA.
Figure 7. Results (mean scores and SD) on user acceptance aspects for IPSA@CULTURA.
Students created on average more than 5 annotations (M = 5.30, SD = 2.55). Commonly, more than
one annotation was made within a single IPSA
illumination (M = 2.40, SD = 1.08 for unique
annotated documents).
Open responses on collaboration support and
the annotation tool show that the possibilities of
these features were perceived very positively.
However, further improvements were suggested, namely identifying the authors of annotations and introducing a rating system for annotations, which would improve both their interpretation and their validity.
5.4 Lessons Learned from the
Evaluation Strategies
The evaluations in the context of the two
collections were based on the same common
evaluation approach and model, but were
obviously quite different in nature. In both cases,
students as apprentice investigators were involved
in the evaluation studies, with the user trials being
integrated into the students' curricula. In the case of the 1641 Depositions collection, the user trial consisted of a longer-term involvement of a small group of users and intensive engagement with the CULTURA system. In the context of the IPSA collection, the user trial consisted of a shorter contact with a large number of participants. In
addition to this, users differed in their age, with
participants from the first being considerably
older than those from the second evaluation. This
difference was also reflected in users' expectations regarding future use of the digital collections and CULTURA: students in the first year of their studies, as in the IPSA trial, commonly do not yet have a clear idea of the tools they will need for their careers.
Another important difference between the two
samples was that English was the participants’
native language only in the evaluation of
1641Depositions@CULTURA, while participants
in the user trial on IPSA@CULTURA spoke
Italian. This might have had an influence on
users’ assessment of the CULTURA environment,
which was available in English only.
An evaluation strategy as used with 1641
Depositions provides the advantage that users
acquire sufficient knowledge of the system and,
after intensive interaction, are able to provide in-
depth feedback on their perception of and
experience with the research environment. Such
intensive engagement with users is only possible
with small numbers of users, which in turn
complicates valid quantitative data analysis. This
kind of approach demands a high level of workload and time from both evaluators and participants; such a strategy might therefore not always be feasible in evaluation practice. An evaluation strategy as
used in the context of the IPSA collection, in
contrast, is easier to organise and deploy. A more
short-term interaction with the system, though,
might be considered more as a snapshot of user
experience and feedback and may lead to users’
assessments being somewhat confounded with
learnability, in case users have not had the chance
to gather sufficient experience in handling the
system. If an appropriate exposure to the system
is possible even in the context of a short term user
trial, participants will be able to provide valuable
and extensive feedback too, though probably not at the level of detail achieved with the first evaluation strategy. Since a large sample makes
very intensive interaction and exchange with
individual participants (e.g. via interviews)
difficult, the evaluation of IPSA@CULTURA was carried out by dividing the students into smaller groups of about 20-30.
Overall, the differences in the evaluation
approaches presented make them complementary
in terms of a comprehensive and mixed-method
evaluation of CULTURA, in general. By bringing
together the outcomes from both studies and
strategies, more conclusive evidence can be
drawn on the overall quality of the system and
services. In addition, aspects for future
improvement of the research environment can be
identified, which have the potential to support
users independent of the digital collection.
5.5 Discussion
By consolidating and comparing the outcomes of
the two evaluations, aspects of the CULTURA
system that are generally positively perceived or
looked at critically by the involved user cohorts
can be examined.
Content usefulness of the digital collection
provided via the CULTURA environment was
perceived quite differently in the two user trials: while apprentice investigators assessed the 1641
Depositions collection contents as highly relevant
and interesting to them, students in the user trial
on the IPSA collection perceived the collection
content only as moderately relevant.
For both cohorts overall usability of the
CULTURA environment turned out to be
satisfactory, with students in the IPSA trial
assessing the system more positively. Considering
the most critically assessed individual usability items, a somewhat different picture emerged for the two studies: in the IPSA trial, users seemed to have particular difficulty envisaging frequent future use of CULTURA, which also reflects the moderate perceived relevance of the collection. In the 1641 Depositions trial, the integration and consistency of the system in particular proved to be a critical issue needing further improvement. In this regard it has to be taken into
account that for the 1641 Depositions collection
the system makes available a broader range of
different services than those available for the
IPSA collection. This is largely due to IPSA being
an image collection rather than a textual
collection like the 1641 Depositions.
A comparison of results for user acceptance
aspects, which in principle all scored moderate to
good, confirms the previous outcomes: in the IPSA trial, the system was evaluated as easier to use, but behavioural intention to use the system was rather low; in the 1641 Depositions trial, by contrast, perceived usefulness and behavioural intention to use were slightly more pronounced.
With respect to the evaluation of the
visualisation tools provided by the CULTURA
environment, a moderate to good quality could be
identified in both evaluations. Interestingly, in
both groups apprentice investigators did not
extensively use the visualisations, but
nevertheless evaluated their usability and,
especially, the potential benefits users can derive from the visualisations, as good. In the context of
the 1641 Depositions collection it became clear
that students would be interested in further
extension of the visualisations in order to increase
their added value for gaining insight into the research process.
The annotation tool was consistently assessed positively. Collaboration support was perceived as medium to good by apprentice investigators working with the IPSA collection, and therefore better than for the 1641 Depositions collection, which obtained only a medium result. In total, these results indicate that
there is potential to further extend the system by
additional functionality supporting collaboration,
e.g. by chat or forum features. The possibility of
searching in the 1641 Depositions collection over
normalised contents was highly appreciated,
which underlines the importance of normalisation
and argues for normalisation quality from a user-
centred perspective.
Qualitative feedback was overall quite positive
and students found the CULTURA environment
interesting. In both trials, students expressed their
interest in having CULTURA available for other
types of collections. Suggestions for additional features made by students in the IPSA context included, among others, integration with social networks, different language versions of the environment, and additional help/tutorials; students' suggestions in the 1641 Depositions trial mainly addressed the possibility of organising and exporting their data, such as bookmarks, annotations, and searches.
The recorded and analysed log data for both
studies reflects well the two different evaluation
approaches taken in the two trials. Although for
1641Depositions@CULTURA log data was available only for a limited time span of the overall duration of the user trial, the analysed data
set clearly shows that this user trial implemented a
more intensive interaction with the system (in
terms of page visits, content pages accessed, and
searches conducted), with a smaller group of users
over a longer period of time. This was in contrast
to a shorter and less intensive interaction with a
large number of participants for
IPSA@CULTURA. Despite this, individual features, such as the visualisations and bookmarking, were used similarly rarely in both trials. The
annotation functionality was used more intensely
in the IPSA trial, which may be explained by the
fact that the tasks in this study explicitly requested
the creation of annotations.
6 CONCLUSION AND OUTLOOK
TO FUTURE RESEARCH
This paper has introduced the CULTURA
research environment, which provides innovative
functions to support research and exploration of
digital heritage collections. Two evaluation
studies are presented that were conducted in the
context of the two testbed collections using the
CULTURA system. The evaluation methodology adopted in both studies incorporates different, complementary evaluation strategies, which were aligned to a common underlying evaluation model, ensuring general comparability of results.
The user trials presented involved apprentice
investigators; this user group and professional researchers are considered important
target audiences who are able to provide
evaluation feedback with a high level of detail.
User-centred evaluations involving the user
groups actually addressed by a research
environment are key, because only those people
who are in a position to benefit directly from such
a research environment or information service are able to take this field of research and development
forward [1]. Evaluation studies other than the two
presented here also involved the other user groups
considered in CULTURA. One study, for
example, involved secondary school students as
members of the general public. This is especially
interesting, since increasing potential is seen in using virtual research environments in the digital humanities as teaching instruments [8], which is relevant not only for
academic, but also for elementary and high school
education.
The evaluation outcomes on the different
qualities of the underlying evaluation model
provide targeted information on benefits and on
potential for further refinement or extension of
specific services and features of the CULTURA
environment. The results from the presented
studies, together with those obtained from other
user groups were consolidated to derive
implications for further development. A range of
changes has since been implemented.
These have been evaluated positively in more
recent user trials and were singled out as being
especially valuable in terms of building users’
comfort with and confidence in the CULTURA
environment. The evaluation results of the studies
presented in this paper also provide benchmark
data for the final evaluation round. The final
version of the CULTURA system is currently
being evaluated and updates or extensions of the
individual services are addressed and investigated
in detail. A comparison of results between the
intermediate and the final implementation will
demonstrate progress made and the overall
benefits and quality of CULTURA as a novel
research environment.
The CULTURA system was designed from the outset to be a corpus-agnostic research environment, independent of any specific collection while working alongside it. This
means that, although CULTURA was evaluated
within the context of the two digital collections, it
is not designed specifically with either of these in
mind, but rather as a flexible and supporting
framework capable of integration with a wide
range of digital collections. The environment was developed with a particular set of services, but not all of those capabilities may be implemented ideally, or used at all, in an application to a specific digital collection. This may be due
to the nature and domain of the digital collection,
as well as to technical, pedagogical, or pragmatic
reasons. This has to be taken into account when
considering individual evaluation studies in a very
concrete application setting, given a certain
collection managed by the CULTURA system,
and a selected user group. In their totality,
the different evaluation studies on CULTURA are intended to demonstrate its general usefulness and significance for
empowering and guiding users in their interaction
with digital humanities collections.
Acknowledgments. The work reported has been partially supported by the CULTURA project (http://www.cultura-strep.eu/), as part of the Seventh Framework Programme of the European Commission, Area “Digital Libraries and Digital Preservation” (ICT-2009.4.1), grant agreement no. 269973, and could not be realised without the close collaboration between all CULTURA partners (http://www.cultura-strep.eu/partners).
REFERENCES
1. Borgman, C.L. (2009) The digital future is now: A
call to action for the humanities. Digital
Humanities Quarterly 3(4) Accessed 3 October
2013 http://www.digitalhumanities.org/dhq/vol/
3/4/000077/000077.html
2. Hampson, C., Agosti, M., Orio, N., Bailey, E., Lawless, S., Conlan, O., Wade, V. (2012) The CULTURA project: supporting next generation interaction with digital cultural heritage collections. In: Proceedings of the 4th International Euromed Conference. Springer, Heidelberg, pp. 668-675
3. Hampson, C., Lawless, S., Bailey, E., Yogev, S.,
Zwerdling, N., Carmel, D. (2012) CULTURA: A
metadata-rich environment to support the
enhanced interrogation of cultural collections. In:
Proceedings of the 6th Metadata and Semantics
Research Conference. Springer, Heidelberg, pp.
227-238
4. Lawless, S., Hampson, C., Mitankin, P.,
Gerdjikov, S. (2013) Normalisation in historical
text collections. In: Proceedings of Digital
Humanities 2013, Lincoln, Nebraska, pp. 507-509
5. Yogev, S., Roitman, H., Carmel, D. Zwerdling, N.
(2012) Towards expressive exploratory search
over entity-relationship data. In: Proceedings of
the 21st International Conference Companion on
World Wide Web, New York, pp. 83-92
6. Hampson, C., Bailey, E., Munnelly, G., Lawless,
S., Conlan, O. (2013) Dynamic personalisation for
digital cultural heritage collections. In:
Proceedings of the 6th International Workshop on
Personalized Access to Cultural Heritage, Rome,
Italy
7. Carmel, D., Zwerdling, N., Yogev, S. (2012)
Entity oriented search and exploration for cultural
heritage collections: the EU CULTURA project.
In: Proceedings of the 21st International
Conference Companion on World Wide Web, New
York, pp. 227-230
8. Agosti, M., Conlan, O., Ferro, N., Hampson, C.,
Munnelly, G. (2013) Interacting with digital
cultural heritage collections via annotations: the
CULTURA approach. In: Proceedings of the 2013
ACM symposium on Document engineering.
ACM, New York, pp. 13-22
9. Mariani Canova, G. (2011-2012) Per Cultura: le
immagini dei manoscritti della scienza a Padova
dal Medioevo al Rinascimento. In: Atti e Memorie
dell’Accademia Galileiana di Scienze, Lettere ed
Arti. Vol. CXXIV, pp. 81-90
10. Agosti, M., Benfante, L., Orio, N. (2003) IPSA: A
digital archive of herbals to support scientific
research. In: Sembok T.M.T. (ed.) Proceedings of
the International Conference on Asian Digital
Libraries ICADL 2003. LNCS vol. 2911. Springer, Berlin, pp. 253-264
11. Clarke, A. (1986) The 1641 depositions. In: Fox P.
(ed.) Treasures of the Library. Trinity College
Dublin, pp. 111-122
12. O’Regan, D., Sweetnam, M., Fennell, B., Lawless,
S. (2011) A collaborative linguistic research
interface for the 1641 depositions. Poster
presented at Digital Humanities 2011, Stanford,
California
13. Bellamy, C. (2012) The sound of many hands
clapping: Teaching the digital humanities through
virtual research environments (VREs). Digital
Humanities Quarterly 6(2). Accessed 3 October
2013 http://www.digitalhumanities.org/
dhq/vol/6/2/000119/000119.html
14. Hunter, J., Gerber, A. (2010) The Aus-e-Lit
project: Advanced eResearch services for scholars
of Australian literature. VALA 2010, Melbourne,
Australia. Accessed 3 October 2013
http://www.itee.uq.edu.au/
eresearch/papers/2010/VALA2010_Hunter.pdf
15. Neuroth, H., Lohmeier, F., Smith, K.M. (2011)
TextGrid Virtual research environment for the
humanities. The International Journal of Digital
Curation 6(2): 222-231
16. Saracevic, T. (2000) Digital library evaluation:
toward an evolution of concepts. Library Trends,
49(2): 350-369
17. Zhang, Y. (2007) Developing a holistic model for
digital library evaluation. Dissertation, Rutgers, The State University of New Jersey
18. De Jong, M., Schellens, P. J. (1997) Reader-
focused text evaluation: An overview of goals and
methods. Journal of Business and Technical
Communication 11(4): 401-432
19. Saracevic, T. (2004) Evaluation of digital libraries:
An overview. In: Agosti M., Fuhr N. (eds.)
DELOS Workshop on the Evaluation of Digital
Libraries. Accessed 7 October 2013
http://www.scils.rutgers.edu/~tefko/DL_evaluatio
n_Delos.pdf
20. Kovács, L., Micsik, A. (2004) The evaluation
computer: A model for structuring evaluation
activities. In: Agosti M., Fuhr N. (eds.) DELOS
Workshop on the Evaluation of Digital Libraries
21. Zhang, Y. (2010) Developing a holistic model for
digital library evaluation. Journal of the American
Society for Information Science and Technology
61(1): 88-110
22. Tsakonas, G., Papatheodorou, C. (2006) Analysing
and evaluating usefulness and usability in
electronic information services. Journal of
Information Science 32(5): 400-419
23. Fuhr, N., Tsakonas, G., Aalberg, T., Agosti, M.,
Hansen, P., Kapidakis, S. Klas, C.-P. , Kovács, L.,
Landoni, M., Micsik, A., Papatheodorou, C.,
Peters, C., Sølvberg, I. (2007) Evaluation of digital
libraries. International Journal on Digital
Libraries 8(1): 21-38
24. Sweetnam, M., Agosti, M., Orio, N., Ponchia, C.,
Steiner, C., Hillemann, E.-C., Ó Siochrú, M.,
Lawless, S. (2012) User needs for enhanced
engagement with cultural heritage collections. In:
Zaphiris, P., Buchanan, G., Rasmussen, E., Loizides, F. (eds.) Theory and practice of digital libraries. TPDL 2012. LNCS vol. 7489. Springer, Berlin, pp. 64-75
25. Davis, F.D., Bagozzi, R.P., Warshaw, P.R. (1989)
User acceptance of computer technology: A
comparison of two theoretical models.
Management Science 35(8): 982-1003
26. Steiner, C.M., Albert, D. (2012) Tailor-made or unfledged? Evaluating the quality of adaptive eLearning. In: Psaromiligkos, Y., Spyridakos, A., Retalis, S. (eds.) Evaluation in e-learning. Nova Science, New York, pp. 111-143
27. Brusilovsky P., Karagiannidis C., Sampson D.
(2004) Layered evaluation of adaptive learning
systems. International Journal of Continuing
Engineering Education and Life-Long Learning
14: 402-421
28. Brooke, J. (1996) SUS: a “quick and dirty” usability scale. In: Jordan, P.W., Thomas, B., Weerdmeester, B.A., McClelland, I.L. (eds.) Usability evaluation in industry. Taylor and Francis, London, pp. 189-194
29. Thong, J.Y.L., Hong, W., Tam, K.-Y. (2002)
Understanding user acceptance of digital libraries:
what are the roles of interface characteristics,
organizational context, and individual differences?
International Journal of Human-Computer Studies
57: 215-242
... In digital humanities studies, usability evaluation of tools and services is seen as a key part of the research (Bulatovic et al., 2016), and is published and discussed in detail (e.g. Steiner et al., 2014;Bartalesi et al., 2016;Hu, 2018). In archaeology specifically, usability studies are less routinely performed (or at least not often published), and seem to be limited to the fields of virtual reality and digital museums (Karoulis et al., 2006;Pescarin et al., 2014). ...
Thesis
Full-text available
The archaeology domain produces large amounts of texts, too much to effectively read or manually search through for research. To alleviate this problem, we created a search system (called AGNES), which combines full text search with entity and geographical search. We first created a manually labelled data set to train a Named Entity Recognition model, which is used to extract entities from text. We also did a user requirement study, and usability evaluation on the system, to make sure it is suitable for archaeological research. In a case study on Early Medieval cremations, we show that using AGNES leads to a knowledge increase when compared to the knowledge of experts, gathered using previously available search engines. This shows that this kind of intelligent search system can help with literature research, find more relevant data, and lead to a better understanding of the past.
... A CAS "allows users to add valued information, share ideas and create knowledge" in a digital environment. Although several examples of CAS can be found in the literature [6][7][8][9], we can identify three main shortcomings of these works that inspired us to perform the work presented in this paper: first, existing approaches do not adopt specific strategies to communicate uncertainty, which is important in a DH research context. We partially attribute this to the fact that these systems do not incorporate specific models for uncertainty, making it hard to capture, store, and communicate it. ...
Article
Full-text available
The capture, modelling and visualisation of uncertainty has become a hot topic in many areas of science, such as the digital humanities (DH). Fuelled by critical voices among the DH community, DH scholars are becoming more aware of the intrinsic advantages that incorporating the notion of uncertainty into their workflows may bring. Additionally, the increasing availability of ubiquitous, web-based technologies has given rise to many collaborative tools that aim to support DH scholars in performing remote work alongside distant peers from other parts of the world. In this context, this paper describes two user studies seeking to evaluate a taxonomy of textual uncertainty aimed at enabling remote collaborations on digital humanities (DH) research objects in a digital medium. Our study focuses on the task of free annotation of uncertainty in texts in two different scenarios, seeking to establish the requirements of the underlying data and uncertainty models that would be needed to implement a hypothetical collaborative annotation system (CAS) that uses information visualisation and visual analytics techniques to leverage the cognitive effort implied by these tasks. To identify user needs and other requirements, we held two user-driven design experiences with DH experts and lay users, focusing on the annotation of uncertainty in historical recipes and literary texts. The lessons learned from these experiments are gathered in a series of insights and observations on how these different user groups collaborated to adapt an uncertainty taxonomy to solve the proposed exercises. Furthermore, we extract a series of recommendations and future lines of work that we share with the community in an attempt to establish a common agenda of DH research that focuses on collaboration around the idea of uncertainty.
... In digital humanities studies, usability evaluation of tools and services is seen as a key part of the research (Bulatovic et al., 2016), and is published and discussed in detail (e.g. Steiner et al., 2014;Bartalesi et al., 2016;Hu, 2018). In archaeology specifically, usability studies are less routinely performed (or at least not often published), and seem to be limited to the fields of virtual reality and digital museums (Karoulis et al., 2006;Pescarin et al., 2014). ...
Preprint
Full-text available
This paper presents AGNES, the first information retrieval system for archaeological grey literature, allowing full-text search of these long archaeological documents. This search system has a web interface that allows archaeology professionals and scholars to search through a collection of over 60,000 Dutch excavation reports, totalling 361 million words. We conducted a user study for the evaluation of AGNES's search interface, with a small but diverse user group. The evaluation was done by screen capturing and a think aloud protocol, combined with a user interface feedback questionnaire. The evaluation covered both controlled use (completion of a pre-defined task) as well as free use (completion of a freely chosen task). The free use allows us to study the information needs of archaeologists, as well as their interactions with the search system. We conclude that: (1) the information needs of archaeologists are typically recall-oriented, often requiring a list of items as answer; (2) the users prefer the use of free-text queries over metadata filters, confirming the value of a free-text search system; (3) the compilation of a diverse user group contributed to the collection of diverse issues as feedback for improving the system. We are currently refining AGNES's user interface and improving its precision for archaeological entities, so that AGNES will help archaeologists to answer their research questions more effectively and efficiently, leading to a more coherent narrative of the past.
Article
Purpose Digital humanities research platform for biographies of Malaysia personalities (DHRP-BMP) was collaboratively developed by the Research Center for Chinese Cultural Subjectivity in Taiwan, the Federation of Heng Ann Association Malaysia, and the Malaysian Chinese Research Center of Universiti Malaya in this study. Using The Biographies of Malaysia Henghua Personalities as the main archival sources, DHRP-BMP adopted the Omeka S, which is a next-generation Web publishing platform for institutions interested in connecting digital cultural heritage collections with other resources online, as the basic development system of the platform, to develop the functions of close reading and distant reading both combined together as the foundation of its digital humanities tools. Design/methodology/approach The results of the first-stage development are introduced in this study, and a case study of qualitative analysis is provided to describe the research process by a humanist scholar who used DHRP-BMP to discover the character relationships and contexts hidden in The Biographies of Malaysia Henghua Personalities . Findings Close reading provided by DHRP-BMP was able to support humanities scholars on comprehending full text contents through a user-friendly reading interface while distant reading developed in DHRP-BMP could assist humanities scholars on interpreting texts from a rather macro perspective through text analysis, with the functions such as keyword search, geographic information and social networks analysis for humanities scholars to master on the character relationships and geographic distribution from personality biographies, thus accelerating their text interpretation efficiency and uncovering the hidden context. 
Originality/value At present, a digital humanities research platform with real-time characters’ relationships analysis tool that can automatically generate visualized character relationship graphs based on Chinese named entity recognition (CNER) and character relationship identification technologies to effectively assist humanities scholars in interpreting characters’ relationships for digital humanities research is still lacking so far. This study thus presents the DHRP-BMP that offers the key features that can automatically identify characters’ names and characters’ relationships from personality biographies and provide a user-friendly visualization interface of characters’ relationships for supporting digital humanities research, so that humanities scholars could more efficiently and accurately explore characters’ relationships from the analyzed texts to explore complicated characters’ relationships and find out useful research findings.
Article
Purpose The purpose of this study is to examine how different feminist Facebook groups in Israel operate in order to better understand the main issues in their discussions about feminism in Israel. The study will also identify the variances between the different subgroups. A secondary research question examined was whether Voyant Tools can be used as an effective content text analysis tool in general and in Hebrew in particular. Design/methodology/approach The study's research method analyzes the content of Facebook posts using the Voyant Tools online toolkit to quantitatively analyze and visualize the results of text mining and data visualization. The sample consists of the texts of posts of three groups representing different currents in Israeli feminism, gathered over a period of three months. Findings The results show that there are high-frequency words occurring in all groups, each group has its unique words, which distinguish it from the other groups. Feminist and Halachic Feminist groups had few words in common, while the Religious Feminist groups had more words in common with both the Feminist and the Halachic Feminist groups and more so with the latter group. While all groups discussed the issue of violence against women, especially sexual violence, the degree of engagement varied greatly between the groups. In addition, there were clear differences in the prominent issues concerning the various groups. This paper demonstrates the possibility of using Voyant Tools for text mining and analysis. Originality/value This paper demonstrates the possibility of using Voyant Tools for text mining and analysis. Voyant Tools shed light on common concepts, their location and prevalence in the text.
Article
Digital library evaluation has become increasingly important in information science, yet there has been minimal evaluative work focusing on digital cultural heritage. We report on a comprehensive review of methodologies and frameworks used in the evaluation of cultural heritage digital libraries and archives. Empirical studies are examined using Tefko Saracevic's digital library evaluation framework to identify models, frameworks, and methodologies in the literature and to categorize these past evaluative approaches. Through the classification and critique of evaluative types and trends, we aim to develop a set of recommendations for the future evaluation of cultural heritage digital libraries and archives.
Article
Purpose Digital humanities aim to use a digital-based revolutionary new way to carry out enhanced forms of humanities research more effectively and efficiently. This study develops a character social network relationship map tool (CSNRMT) that can semi-automatically assist digital humanists through human-computer interaction to more efficiently and accurately explore the character social network relationships from Chinese ancient texts for useful research findings. Design/methodology/approach With a counterbalanced design, semi-structured in-depth interview, and lag sequential analysis, a total of 21 research subjects participated in an experiment to examine the system effectiveness and technology acceptance of adopting the ancient book digital humanities research platform with and without the CSNRMT to interpret the characters and character social network relationships. Findings The experimental results reveal that the experimental group with the CSNRMT support appears higher system effectiveness on the interpretation of characters and character social network relationships than the control group without the CSNRMT, but does not achieve a statistically significant difference. Encouragingly, the experimental group with the CSNRMT support presents remarkably higher technology acceptance than the control group without the CSNRMT. Furthermore, use behaviors analyzed by lag sequential analysis reveal that the CSNRMT could assist digital humanists in the interpretation of character social network relationships. The results of the interview present positive opinions on the integration of system interface, smoothness of operation, and external search function. 
Research limitations/implications Currently, the system effectiveness of exploring the character social network relationships from texts for useful research findings by using the CSNRMT developed in this study will be significantly affected by the accuracy of recognizing character names and character social network relationships from Chinese ancient texts. The developed CSNRMT will be more practical when the offered information about character names and character social network relationships is more accurate and broad. Practical implications This study develops an ancient book digital humanities research platform with an emerging CSNRMT that provides an easy-to-use real-time interaction interface to semi-automatically support digital humanists to perform digital humanities research with the need of exploring character social network relationships. Originality/value At present, a real-time social network analysis tool to provide a friendly interaction interface and effectively assist digital humanists in the digital humanities research with character social networks analysis is still lacked. This study thus presents the CSNRMT that can semi-automatically identify character names from Chinese ancient texts and provide an easy-to-use real-time interaction interface for supporting digital humanities research so that digital humanists could more efficiently and accurately establish character social network relationships from the analyzed texts to explore complicated character social networks relationship and find out useful research findings.
Article
Purpose - The purpose of this paper is to study the relevance of heritage collections and the convergence of methodologies and standards traditionally linked to Library and Information Science (LIS) in the development of digital humanities (DH) research in Spain. Design/methodology/approach - This paper is based on a systematic review of scientific publications that are representative of DH in Spain and were published between 2013 and 2018. The analysis considered doctoral theses, journal articles and conference papers. Findings - The results highlight the synergies between documentary heritage, LIS and DH. However, it appears that there is a scarcity of scientific literature to support the confluence of LIS and DH and a limited formal connection between heritage institutions and the areas of academia that reuse and enrich these source collections. Research limitations/implications - The review of representative scholarly DH publications was mainly based on the metadata that describe the content of articles, thesis and conference papers. This work relies on the thematic indexing (descriptors and keywords) of the analysed documents but their level of quality and consistency is very diverse. Originality/value - The topic of the study has not been explored before and this work could contribute to the international debate on the interrelation and complementarity between LIS and DH. In addition, this paper shows the contribution that standards and documentary methodologies make to projects in which technology is applied to humanities disciplines. The authors propose that there is an urgent need to strengthen the “scientific relationships” between heritage institutions, as well as enhancing links between the academic field of DH and LIS in order to improve teaching and research strategies in conjunction.
Thesis
Computational Thinking is an idea of skills of universal competence that everyone should have in the 21st Century. There is gender gap in STEM and particularly Computer Science education. Over the last decades there has been an interest in developing computational thinking and also focus on how to integrate Computational thinking in the curriculum as a main concern. The research paper seeks to find the digital literacy of students, the use of computer science unplugged activities and plugged activities such as Scratch to develop the computational thinking. The research used a total of 201 sampled from 244 students of T.I Ahmadiyya Girls’ Senior High School in the Sekeyere East District of Ashanti Region Ghana .The paper used a diagnostic test Microsoft digital literacy test to determine the digital literacy. CS unplugged activities were adapted and used for weeks followed by the use of Scratch plugged activities. The scores were the dependent variable and students were independent variable for the study. The study employed Post-test Pre-test Experimental design .Data collected indicated that digital literacy levels were moderate and developing. The data of CS unplugged scores were collected and showed high levels of Computational Thinking. Dr. Scratch was used to collect data of students Scratch scores and it indicated that their CT had improved during the activities. Data also indicated a correlation between ICT scores and Math and Science. Hypothesis was tested and there was a significant difference in ICT scores after the Intervention. In view of these results, it was recommended that stakeholders of education should appropriately supply and provide plugged and unplugged activities as part of the curriculum and implement it effectively.
Conference Paper
Full-text available
This paper presents the evaluation approach taken for an innovative research environment for digital cultural heritage collections in the CULTURA project. The integration of novel services of information retrieval to support exploration and (re)search of digital artefacts in this research environment, as well as the intended corpus agnosticism and diversity of target users posed additional challenges to evaluation. Starting from a methodology for evaluating digital libraries an evaluation model was established that captures the qualities specific to the objectives of the CULTURA environment, and that builds a common ground for empirical evaluations. A case study illustrates how the model was translated into a concrete evaluation procedure. The obtained outcomes indicate a positive user perception of the CULTURA environment and provide valuable information for further development.
Article
Full-text available
The number of digital collections in the cultural heritage domain is increasing year on year. Improved quality of access to cultural collections, especially those collections which are not exhibited physically is a key objective of the digitisa-tion process. Despite some successes in this area, many digitised collections struggle to attract users or to maintain their interest over a prolonged period. One of the key reasons for this is that users of these archives vary in expertise (from professional researchers to school children) and have different tasks and goals that they are trying to accomplish. This paper describes CULTURA, an FP7 funded project that is addressing this specific issue through its four-phase personalisation approach and accompanying suite of services. By employing such personalisation techniques, CULTURA is helping the exploration of, link-ing to, and collaboration around cultural heritage collections.
Conference Paper
Full-text available
The increased digitisation of cultural collections, and their availabil-ity on the World Wide Web, has made access to these valuable documents much easier than ever before. However, despite the increased availability of ac-cess to cultural archives, curators still struggle to instigate and enhance en-gagement with these resources. The CULTURA project is actively addressing this issue through the development of a metadata-driven personalisation envi-ronment for navigating cultural collections and instigating collaborations. The corpus agnostic CULTURA environment also supports a full spectrum of users: ranging from professional researchers seeking patterns in the data and trying to answer complex queries; to interested members of the public who need help navigating a vast collection of resources. This paper discusses the state of the art in this area and the various innovative approaches used in the CULTURA project, with a special focus on how the underlying metadata helps facilitate its semantically rich environment.
Conference Paper
Full-text available
This paper introduces the main characteristics of the digital cultural collections that constitute the use cases presently in use in the CULTURA environment. A section on related work follows giving an account on efforts on the management of digital annotations that are pertinent and that have been considered. Afterwards the innovative annotation features of the CULTURA portal for digital humanities are described; those features are aimed at improving the interaction of non-specialist users and general public with digital cultural heritage content. The annotation functions consist of two modules: the FAST annotation service as back-end and the CAT Web front-end integrated in the CULTURA portal. The annotation features have been, and are being, tested with different types of users and useful feedback is being collated, with the overall aim of generalising the approach to diverse document collections and not only the area of cultural heritage.
Article
Full-text available
The increased digitisation of cultural collections and their availability on the World Wide Web has made access to these valuable documents much easier than ever before. However, despite the increased availability of access to cultural archives, curators still struggle to instigate and enhance engagement with these resources. The CULTURA project is actively addressing this issue through the development of a metadata-driven personalisation environment for exploring cultural collections and instigating collaborations. This paper discusses the state of the art in this area and the various innovative approaches used in the CULTURA project, with a special focus on how the underlying metadata helps to facilitate its semantically rich environment. An evaluation of the CULTURA project with students is detailed, highlighting its relevance for those without domain expertise. This is vital as the system needs to adapt and support a full spectrum of users, from professional researchers to interested members of the public.
Chapter
Adaptive learning systems (ALE) are designed towards the main objective of tailoring learning content, system feedback and appearance to individual users - according to their preferences, goals, knowledge, and other characteristics. The implementation of a successful adaptation process is of course demanding. Adaptation is not good per se and poor realisations of adaptation may lead to disappointed users who may reject or disable adaptation mechanisms. This is why the evaluation of ALE needs to be a fundamental and integral part of their development. Evaluation should address the questions whether adaptation works on principle, whether it really improves the system, whether it leads to more effective learning, whether users prefer the adaptive features, etc. The main challenge in evaluating ALE lies in their core characteristic - adaptivity, which results in individual experiences and interactions with the system for each individual user. The attempts of dealing with this challenge are diverse. This chapter provides an overview on existing and suggested methods for evaluating adaptive e-learning. Strengths and weaknesses of the current evaluation approaches are elaborated and relevant topics in the user-centred evaluation of ALE are discussed. The evaluation methodology developed in GRAPPLE, an EC-funded project aiming at developing a generic responsive adaptive personalised learning environment, is outlined as a case study on evaluating adaptive e-learning. In GRAPPLE the concept of 'adaptation quality' is adopted and conceptualized in terms of covering different aspects of adaptive elearning experiences that are addressed for a holistic evaluation of adaptive e-learning. In sum, the chapter aims at increasing awareness for the importance of careful and properly designed evaluation of ALE. 
We believe that the thorough consideration of the quality of adaptation and the use of evaluation approaches on the basis of mathematicalpsychological models and expertise are the ingredients for a sound investigation of the benefits of adaptive e-learning and thus, for contributing to the overall notice, growth, and spread of ALE.
Article
Improved full-text search, named-entity recognition and relationship extraction are all key research topics across many areas of technology, with emerging applications in the intelligence, healthcare and financial fields amongst many others [1] . In Digital Humanities, there is a growing interest in the application of such Natural Language Processing (NLP) approaches to historical texts [2] with a view to improving how a user can explore and analyse these collections [3] [4] [5] [6] . However, the text contained in handwritten historical manuscript collections can often be 'noisy' in nature — with variation in spelling, punctuation, word form, sentence structure and terminology. This is particularly the case with collections written in archaic language forms, such as Early-Modern English. Multiple studies have concluded that the applicability of modern NLP tooling to such historical texts has been very limited due to this inherent noisiness in the texts. This historical language barrier hinders the accessibility and thus the potential exploration and analysis of many significant historical text collections. This paper will discuss the normalisation of historical texts as a solution to this problem and examine how normalisation can improve the analysis, interpretation and exploration of these collections. Normalisation is the process of transforming text into a single canonical form, in this case, the modern equivalent of the language. Once this has been completed, the texts can be processed using current NLP techniques and technologies. However, the normalisation of historical texts presents a difficult challenge in itself. Much research has been undertaken in an attempt to cope with the correction and normalisation of text produced by Optical Character Recognition (OCR), speech recognition, instant messaging etc. which show similar characteristics to those of historical texts. 
One technique which has been applied is the use of a historical lexicon, supplemented by computational tools and linguistic models of variation. However, because of the absence of language standards, multiple orthographic variants of a given word or expression can be found in a collection of material, even within the same document. As a result, the quality of the results achieved, even after normalisation, has not been satisfactory. Researchers have also noted a general lack of tools and resources specialised to this domain. This paper presents the normalisation research conducted as part of the CULTURA project, which has developed techniques for the normalisation of a 17th-century manuscript collection written in Early Modern English, the 1641 Depositions [7]. CULTURA analyses the artefacts and, through the application of novel linguistic models of variation, enables normalisation techniques to remove inconsistencies in spelling, grammar and punctuation. The technologies developed and applied have had to address issues arising from noisy inputs, the impact that noise can have on downstream applications, and the demands that noisy information places on document analysis. The normalisation of texts in Early Modern English can be interpreted as a special (restricted) case of translation. Building on this intuition, a methodology was developed based upon statistical machine translation models. The key ingredient of this approach is a new translation module that further develops known OCR correction techniques.
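The lexicon-plus-variation-model idea can be sketched by matching a noisy historical spelling against a modern wordlist by string similarity. This is a simplified stand-in for the statistical models used in the project; the wordlist and similarity cutoff below are illustrative assumptions.

```python
import difflib
from typing import Optional

# Tiny illustrative modern wordlist; a real system would use a full lexicon.
MODERN_LEXICON = ["deposition", "examination", "rebellion", "soldier", "castle"]

def best_modern_form(historical: str, cutoff: float = 0.7) -> Optional[str]:
    """Return the closest modern lexicon entry above the similarity
    cutoff, or None when no candidate is close enough."""
    matches = difflib.get_close_matches(historical.lower(), MODERN_LEXICON,
                                        n=1, cutoff=cutoff)
    return matches[0] if matches else None

print(best_modern_form("souldier"))  # close variant of "soldier"
print(best_modern_form("castel"))    # close variant of "castle"
print(best_modern_form("xyzzy"))     # no plausible modern form -> None
```

Similarity matching alone cannot disambiguate variants whose modern targets depend on context, which motivates the translation-model framing described in the abstract.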
Conference Paper
Evaluation is an important task for digital libraries, because it reveals relevant information about their quality. This paper presents a conceptual and technical approach to supporting the systematic evaluation of digital libraries, together with a system that assists throughout the entire evaluation process in three ways. First, it allows the evaluation goals to be formally modelled and the evaluation process to be designed. Second, it supports data collection in continuous and non-continuous, invasive and non-invasive ways. Third, it automatically creates reports based on the defined evaluation models. An example evaluation illustrates how the evaluation process can be designed and supported with this system.
Conference Paper
This paper presents research carried out to elicit user needs for the design and development of a digital library and research platform intended to enhance user engagement with cultural heritage collections. It identifies a range of user constituencies for this digital library, presents a taxonomy of its intended users, and describes in detail the characteristics and requirements of these users for the facilitation and enhancement of their engagement with and use of textual and visual cultural artefacts.