The VAST Collaborative Multimodal Annotation
Platform: Annotating Values
Georgios Petasis1, Martin Ruskov2, Anna Gradou1, and Marko Kokol3
1Institute of Informatics and Telecommunications,
National Centre for Scientific Research (N.C.S.R.) “Demokritos”
GR-153 10, P.O.BOX 60228, Aghia Paraskevi, Athens, Greece,
petasis@iit.demokritos.gr, agradou@iit.demokritos.gr
2Department of Computer Science, Università degli Studi di Milano,
Via Celoria 18, 20133 Milano, Italy,
martin.ruskov@unimi.it
3Semantika Research, Semantika d.o.o., Zagrebška 40a, 2000 Maribor, Slovenia,
marko.kokol@semantika.eu
Abstract. In this paper, we present the VAST Collaborative, Multi-
modal, Web Annotation Tool. It is a collaborative, web-based annota-
tion tool built upon the Ellogon infrastructure, adapted to the content
creation and annotation needs of digital cultural heritage. With the help
of an annotation methodology and guidelines, the tool has been used to
analyse and annotate intangible artifacts (mainly narratives) with moral
values. This paper presents the tool and its capabilities, along with an evaluation study assessing its usability.
Keywords: annotation tools, inter-annotation reliability, collaborative
annotation, web-based annotation, moral values.
1 Introduction
The values passed on through literary heritage are widely spoken of. However, being specific about how these values are expressed in intangible artifacts (e.g. narratives such as historical texts) is not straightforward. One widely adopted technique to externalise implicit content in text is qualitative content analysis, where experts annotate a text, assigning labels that are not necessarily visible in the text itself.
In this paper, we present the VAST Collaborative, Multimodal, Web An-
notation Tool. It is a collaborative, web-based annotation tool built upon the
Ellogon infrastructure, adapted to the content creation and annotation needs of
digital cultural heritage. Being based on a generic annotation platform that has been in development for many years offers a number of advantages, such as the ability to support a wide range of annotation tasks and annotation schemata, robustness, and cross-domain features like artifact/resource management, security, and storage. On the other hand, domains like digital cultural heritage may have specialised requirements. In order to support content creation and annotation in the cultural heritage domain, the following methodology has been followed:
a) collect requirements from cultural heritage professionals, with an emphasis
on narratives and annotation with values; b) perform an analysis of state-of-the-art
annotation tools and the percentage of requirements they support; c) adapt the
selected tool to fulfil all requirements; d) perform content creation and annota-
tion tasks, following defined methodologies and guidelines; and e) evaluate the
performance of the tasks.
Innovative aspects of the VAST Tool include a) support for all modalities;
b) ability to annotate long documents (either texts, audios, or videos); c) real-
time annotation schema extension; and d) extensive support for annotation qual-
ity monitoring.
The structure of this paper is as follows: Section 2 presents work related to annotation tools and annotation types for various modalities and domains, focusing on tools for cultural heritage, along with an overview of usability evaluation of annotation tools. Section 3 describes the requirements gathered from cultural heritage professionals, while Section 4 provides an overview of the VAST Tool and its main features. Section 5 presents the evaluation phase, while Section 6 concludes this paper and presents some future directions.
2 Related Work
An annotation tool is a specialised application that aims to help annotators en-
rich multimedia artifacts, through the creation of additional metadata. These
metadata (or “annotations”) can be classified into two main categories, characterised by the granularity of their application: 1) “properties”, which are associated with an entire artifact, characterising it as a whole. Typical annotations of this category include labels at the document or image level, e.g. classification categories associated with documents. 2) “annotations”, which typically associate labels with
segments (parts) of an artifact. Typical annotation types of this category include
a) textual annotations, labels associated with specific parts/segments of a text
(e.g. words, sentences, paragraphs, etc.), b) spatial annotations, labels associated
with areas of images and videos (e.g. points, landmarks, 2D/3D bounding boxes,
polygons, etc. [1]), and c) temporal annotations, which can associate labels with
temporal segments that are determined by beginning and end timestamps, in au-
dio and video. Depending on the goal of data annotation, spatial and temporal
annotation can be combined on the same artifact.
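To make the distinction between the two categories concrete, the following minimal Python sketch models whole-artifact properties alongside segment-level annotations with textual, spatial, or temporal anchoring. The class and field names are illustrative assumptions made for this paper and do not reflect the internal data model of any of the tools discussed below.

from dataclasses import dataclass, field
from typing import Dict, List, Union

@dataclass
class Property:
    """Whole-artifact metadata, e.g. a classification category for a document."""
    label: str
    value: str

@dataclass
class TextSpan:
    start: int  # character offset where the segment begins
    end: int    # character offset where the segment ends

@dataclass
class BoundingBox:
    x: float
    y: float
    width: float
    height: float

@dataclass
class TimeInterval:
    begin: float  # seconds from the start of the audio/video stream
    end: float

@dataclass
class Annotation:
    """A label attached to a segment (part) of an artifact."""
    label: str
    segment: Union[TextSpan, BoundingBox, TimeInterval]
    attributes: Dict[str, str] = field(default_factory=dict)

@dataclass
class Artifact:
    uri: str
    properties: List[Property] = field(default_factory=list)
    annotations: List[Annotation] = field(default_factory=list)

# A document-level property plus a segment-level textual annotation.
doc = Artifact(uri="corpus/cinderella.txt")
doc.properties.append(Property(label="genre", value="fairytale"))
doc.annotations.append(Annotation(label="Dignity", segment=TextSpan(start=120, end=168)))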
Historically, annotation tools started as desktop applications. The user needed
to install special software, and the annotation process took place locally with the
documents that were stored on a local machine. Nowadays, an increasing number of on-line tools has been released, typically running within a Web browser
and offering capabilities like collaborative annotation by multiple annotators.
Regarding availability, options at one’s disposal range from open source tools
that can be adapted by developers, to commercial applications.
Over the years, a plethora of annotation tools has been presented, mainly
driven by applications of annotated data, such as machine learning. The vast
majority of annotation tools has been driven by the needs of research areas such
as natural language processing (NLP) and image analysis, since annotation tools
are among the primary means for transferring human knowledge to artificial in-
telligence models through the assignment of labels to data [1]. Several surveys
try to organise and compare annotation tools along dimensions that relate to
features, tasks and modalities. A recent overview of text annotation tools is presented in [15], while a fairly recent extensive review and comparison of several annotation tools for manual text annotation can be found in [14]. Several
image annotation tools are surveyed in [1,16], while surveys about audio and
video annotation tools can be found in [10,6]. Finally, a recent survey regarding
requirements and use of annotation tools can be found in [19].
In the cultural heritage domain, the function of annotation tools remains the
same, aiming at digitising human knowledge from professionals, scholars, and quite often the crowd, who are involved in the documentation, curation, restoration
and enhancement of the cultural assets. While generic annotation tools can also
support this domain, there are domains of application that require more spe-
cialised tools, such as the “ART3mis” annotation tool for 3D objects [3] (https://warmestproject.eu/tools-survey/art3mis/) and the “Music Scholars Score Annotator” (https://trompamusic.github.io/music-scholars-annotator/), which allows users to label digital musical scores [22]. A frequent requirement in this domain is the annotation of artifacts with structured knowledge, typically in the form of an ontology or a vocabulary encoded in Semantic Web technologies, such as “Culto” [9] and “CulHIAT” [20], along with the need to annotate 3D objects and scenes [2].
Usability research offers tools and techniques to measure software quality. An
important overarching perspective is given by the technology acceptance model
which identifies a separation between perceived usefulness and perceived ease of
use [11]. The System Usability Scale (SUS) is the most widely used quantitative
measure of perceived ease of use [5,13]. It consists of 10 questions on a 5-point
Likert-scale and produces an overall score in the range of 0-100 with higher
numbers meaning better usability. It has been shown to be broadly equivalent to other popular measures, yet very efficient, achieving good statistical convergence over samples as small as 12 participants. A large body of data indicates an average score of 68, with the extreme values of 0 and 100 actually obtained by individual users. Generally, a score of 80 is representative of an above-average user experience. Yet, average scores vary by type of application, user experience,
and application complexity among others. Research also confirms that slight
variations of the questions – to better address the task at hand – typically do
not compromise the results [13].
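As an illustration of the scoring rule summarised above, the following short Python sketch computes a SUS score from the ten Likert responses of a single respondent. It follows the standard published scoring procedure; it is a generic sketch and not code from the study reported later in this paper.

def sus_score(responses):
    """Compute the System Usability Scale score for one respondent.

    responses: list of 10 Likert answers (1 = strongly disagree, ...,
    5 = strongly agree), in questionnaire order. Odd-numbered items are
    positively worded, even-numbered items are negatively worded.
    """
    assert len(responses) == 10
    total = 0
    for i, r in enumerate(responses, start=1):
        total += (r - 1) if i % 2 == 1 else (5 - r)
    return total * 2.5  # rescale the 0-40 sum to the familiar 0-100 range

# A respondent answering 4 to every positive item and 2 to every negative
# item obtains a score of 75.0, slightly above the empirical average of 68.
print(sus_score([4, 2, 4, 2, 4, 2, 4, 2, 4, 2]))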
3 Values in the VAST Project
VAST is a European H2020 Research project that aims to bring (moral) values
to the forefront of advanced digitisation, and to investigate the transformation
of core European Values, including freedom, democracy, equality, the rule of law,
tolerance, dialogue, dignity, etc. across space and time. Through the analysis and
the awareness raising on moral values, VAST wants to contribute to the public
discourse about them and to understand how they are perceived.
Having as a starting point that morality is an individual construct, influenced
and shaped by any aspect of a person’s social life, VAST wants to study val-
ues in the context of social interactions that relate to arts (focusing on theatre
through ancient Greek Drama of the 5th century BCE), folklore (focusing on
folktales/fairytales of the 19th century), science (focusing on Scientific Revolu-
tion and natural-philosophy documents of the 17th century) and education.
VAST aims to research existing collections of intangible assets (expressed
in natural language, from different places and from significant moments in Eu-
ropean history) and trace and inter-link the values emerging from them. For
these purposes, VAST has developed a collaborative semantic annotation platform, the “VAST Semantic Annotation Platform” (https://platform.vast-project.eu/). It allows the collaborative, multimodal analysis and annotation of artifacts (primarily with values, but not limited to them). These services facilitate professionals like scholars analysing
narratives, or museum curators who want to extend the available metadata on
their collections.
3.1 VAST Semantic Annotation Platform Requirements
In order to collect a set of functional requirements related to the annotation tool
used within the VAST Platform, an online survey has been conducted, involving
mainly scholars, theatre and museum professionals (∼80 professionals across
Europe provided answers). Important features obtained through this survey are:
Artifact management: A comprehensive way of managing artifacts is re-
quired, supporting file types from multiple modalities (texts, images, audios,
and videos), management of resources (such as importing/exporting, copying/
cloning, moving, sharing), and security (supporting both private and shared ar-
tifacts and annotations, network storage with automated backup procedures,
provision of data privacy within the context of the annotation tool).
Multiple Annotation Methods: Multiple methods and capabilities in applying
labels to artifacts are required. Users requested the ability for document-level annotations, multi-label annotations, and annotation of relationships. The ability to annotate with multiple terminologies and ontologies is also considered important by the users. Pre-annotations and machine-learning-assisted annotation are not considered as important. Regarding labelling capabilities, annotation
of (overlapping) arbitrary text segments is required, along with bounding boxes
in images, and temporal annotations in audios and videos. Finally, the ability
to support multiple annotation schemes and many annotation tasks has been
characterised as important.
Annotation of Long Documents: Users emphasise the need to support the
annotation of lengthy documents, and to visualise the document in its entirety,
e.g. being able to visualise and annotate a whole theatrical play or a scientific
manuscript. In addition, support for various languages has been requested.
Quality Control: The ability to inspect annotations, compare annotations/documents/collections, and acquire various inter-rater/inter-annotation metrics has been characterised as important (a sketch of one such metric is given after these requirements).
Partial Annotation: The ability to partially save documents has been re-
quested.
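To illustrate the kind of inter-rater metric referred to under Quality Control, the sketch below computes Cohen's kappa for two annotators who labelled the same segments. This is a generic, textbook formulation of the measure, not the platform's implementation, and the example labels are purely illustrative.

from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa between two annotators labelling the same items."""
    assert len(labels_a) == len(labels_b) and labels_a
    n = len(labels_a)
    # Observed agreement: fraction of items that received identical labels.
    p_observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected chance agreement, based on each annotator's label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    p_expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if p_expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (p_observed - p_expected) / (1.0 - p_expected)

# Two annotators assigning value labels to five text segments.
first = ["Freedom", "Dignity", "Freedom", "Equality", "Dignity"]
second = ["Freedom", "Dignity", "Equality", "Equality", "Dignity"]
print(round(cohens_kappa(first, second), 2))  # -> 0.71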
Based on the aforementioned requirements, along with additional constraints
(e.g. availability under an open source license, support for real-time collaborative
annotation, etc.), a literature review has been conducted. The “Ellogon Anno-
tation Platform” [15] has been selected as the most promising infrastructure on which to base the VAST Semantic Annotation Platform. The feature comparison dimensions can be seen in Table 1 (adapted from [15]).
Feature                              | BRAT [17] | Clarin-EL [12] | Ellogon [15] | GATE Teamware [4] | Label Studio [21] | WebAnno [7]
Open Source                          | Yes       | Yes            | Yes          | Yes               | Yes¹              | Yes
Collaborative Annotation (Real-time) | Yes (Yes) | Yes (Yes)      | Yes (Yes)    | Yes (No)          | Yes (No)          | Yes (No)
Role Management                      | Basic     | Basic          | Basic        | Advanced          | Advanced²         | Advanced
Progress Monitoring                  | No        | No             | No           | Yes               | Yes               | Yes
Annotation Statistics                | No        | No             | Yes          | Yes               | Yes               | Yes
Automatic Annotation                 | Yes       | No             | Yes          | Yes               | Yes               | Yes
Inter-annotator Agreement            | Plugin    | No             | Yes          | No                | Enterprise (Paid) | No
Annotation Comparison                | Partial   | No             | Yes          | No                | Enterprise (Paid) | Partial
Long Document Annotation             | No        | Yes            | Yes          | No                | No                | No
Real-time Schema Extension           | No        | Yes            | Yes          | No                | No                | No
Table 1: Feature comparison of existing annotation solutions.
¹ Enterprise Features may not be included.
² Advanced in Enterprise (Paid).
4 The VAST Semantic Annotation Platform
The annotation tool that is included in the VAST Semantic Annotation Platform is based on the Ellogon Web Annotation Tool [15], and extends it along
the following dimensions:
Artifact management:
–Support for managing artifacts beyond texts, implementing support for im-
age, audio, and video files. The management of artifacts has the same capa-
bilities across modalities.
–Extended REST API, to support image, audio and video artifacts.
Multiple Annotation Methods:
–Support for spatial annotations (bounding boxes), for image annotations.
–Support for temporal annotations, when annotating audio/video artifacts.
Annotation of Long Documents:
–Support for large audio and video artifacts.
With these extensions, the VAST Semantic Annotation Platform satisfies
all the user requirements. The VAST Semantic Annotation Platform is available both a) as a cloud service (https://platform.vast-project.eu/), and b) as open-source software under the Apache license (https://github.com/vast-project/ellogon-annotation-tool). More details about the Ellogon Web Annotation Tool can be found in [15].
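As an indication of how the extended REST API might be used by an external client to create and annotate a non-textual artifact, consider the following Python sketch. The endpoint paths, payload fields, and authentication scheme are illustrative assumptions made for this example, not documented calls of the VAST platform, and the actual API may differ.

import requests

BASE = "https://platform.vast-project.eu/api"  # hypothetical API base URL
HEADERS = {"Authorization": "Bearer <token>"}  # placeholder credentials

# 1. Upload an audio artifact (endpoint and field names are assumptions).
with open("interview.mp3", "rb") as audio:
    created = requests.post(f"{BASE}/artifacts", headers=HEADERS,
                            files={"file": audio},
                            data={"modality": "audio", "title": "Pilot interview"})
artifact_id = created.json()["id"]

# 2. Attach a temporal annotation to a segment of the recording.
annotation = {
    "artifact": artifact_id,
    "label": "Tolerance",  # one of the value labels of the annotation schema
    "segment": {"type": "temporal", "begin": 42.0, "end": 57.5},  # seconds
}
requests.post(f"{BASE}/annotations", headers=HEADERS, json=annotation)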
In Figure 1, we present a typical example of text annotation in the context
of the VAST project. On the right panel, the user interface (UI) is adjusted to
the used annotation schema. The user can add additional, custom labels and
enrich the annotation schema by using the label creation button at the bottom
of the screen. On the left panel, the text of the “Cinderella” tale is displayed, along with some annotations. The user can read the text in its entirety and click on the coloured segments to highlight them, in order to see the annotation data and edit their details. An important feature of this application is the navigation through overlapping annotations (if there are any) by using the combo-box that exists at the bottom of the UI.
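The label hierarchy visible in the right panel can be thought of as a simple grouped schema. The following much-abridged Python sketch shows one hypothetical way such a schema could be represented, organised under the three categories named in the caption of Figure 1 (Key/Main Concepts/Values, Expanded Concepts, and Bi-polarities); the actual label set and storage format of the platform are not reproduced here, and the example labels are only illustrative.

# Hypothetical, abridged representation of a values annotation schema,
# grouped into the three categories of Figure 1. Labels are illustrative.
values_schema = {
    "Key/Main Concepts/Values": ["Freedom", "Democracy", "Equality", "Dignity", "Tolerance"],
    "Expanded Concepts": ["Dialogue", "Rule of law"],
    "Bi-polarities": ["Obedience vs. disobedience"],
}

def flat_labels(schema):
    """Return every label in the schema as (category, label) pairs."""
    return [(category, label) for category, labels in schema.items() for label in labels]

for category, label in flat_labels(values_schema):
    print(f"{category}: {label}")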
5 Evaluation
An evaluation with 27 users was performed in order to assess the usability of
the annotation tool. Evaluators were invited to an in-person group session where
each of them worked individually and anonymously. They were instructed ver-
bally about the process and asked to read and sign the written information
and consent form. Then they were left to work independently with a researcher
available for clarifications. For the purposes of the evaluation, the English trans-
lations of three one-page excerpts per pilot were selected. For Pilot 1 these were
Sophocles’ Antigone, Euripides’ Hecuba and Aristophanes’ Peace. For Pilot 2, On the Revolutions of the Celestial Spheres by Copernicus, Micrographia by Hooke and Mathematical Principles of Natural Philosophy by Newton were selected. For Pilot 3, Grimm fairytales were chosen: Faithful Johannes, Juniper Tree and Little Snow White.
Fig. 1: Annotating a tale with values. The annotation scheme in use includes a set of annotation labels that represent concepts, moral values and ideas that are being traced in the texts. Labels are organised under three categories: Key/Main Concepts/Values, Expanded Concepts, and Bi-polarities.
Evaluators were asked - whenever possible - to find and annotate five values:
two main concepts, two expanded concepts, and one bi-polarity. Upon completion
of the tasks, evaluators were asked to complete an evaluation questionnaire. It contained three sections (responses to all questions were required): demographics, the System Usability Scale (SUS), and the following open exploratory questions:
1. List one or more aspects that positively impressed you in using the VAST
Annotation Tool.
2. List one or more aspects that negatively affected your annotation experience.
3. Do you have suggestions on how to improve the usability of the tool?
4. Do you have suggestions about possible functionalities that can be added to
the tool?
5. Is there any question that you had and remained unanswered or difficult to
answer when using the tool?
The group of evaluators was balanced in terms of gender. It was mainly composed of young, educated people, a consequence of recruitment in a university setting.
To probe the technical proficiency of evaluators, a list of 5 generic – yet broadly relevant to the task – technical skills was included in the demographics questions as a form of self-assessment. It confirmed a generally very good level of technical proficiency. The large majority of them had no prior experience with annotation, with only 19% reporting repeated experience with this type of task.
The overall SUS score was 69.6 (standard deviation 20.0), which can be considered satisfactory given that we are evaluating a piece of professional software [13]. In the responses to the open questions, too, the overall sentiment towards the annotation tool was positive. Evaluators indicated that the tool was “clear and
well organized”, that “the interface is user-friendly”, and that “there were clear
instructions for each step of annotation”. They explicitly listed the highlighting
paradigm, colour-coding and ease of value selection as positive features. One
participant summarised this with “It’s very important that this tool is similar
to Word”. When asked to elaborate on their annotation experience, participants
offered opinions that we report in three groups: divergent perceptions of features,
suggestions for improvement, and complexity.
Divergent perceptions of features: Some evaluators highlighted criticisms that
others stressed as positive aspects. A typical example of this is the interaction
to define the matches between text selections and values. Some evaluators, in-
cluding ones without prior technical experience, found the process to be very
intuitive. Others reported getting confused about the order of selection. One
particular evaluator without annotation experience perceived it more intuitive
to first select values and then the text. They wrote “when you select the attribute
if you accidentally still have the text from before it changes that attribute” and
made two particular suggestions: asked for the possibility to match starting from
the value and suggested that after a match the program should automatically
deselect it. However, others asked for the possibility to allow for multiple values
for the same selection.
Suggestions for improvement: Some suggestions for improvement were more
conceptual, while others were practical. Two evaluators expressed the need to
add personal notes, one of them illustrating the suggestion by referring to the
comment feature of Microsoft Word. A theme that emerged as a recurring chal-
lenge is the visualisation of overlapping annotations. One evaluator proposed as
a possible solution to add filtering to visualise only a sublist of values of cur-
rent interest. Another hypothesised that there could be a way to use colouring
to show overlapping selections. Other ideas included adapting selections to disregard differences in punctuation, an autosave option, and an indication of used or unused annotations.
Complexity: Some evaluators commented on the learning curve of working with the software. One explained it like this: “At first, it was hard to find the texts. It’s not an intuitive software and it takes quite some time to master it”.
Another evaluator provided emotional feedback by saying “It was difficult in the
beginning but then I enjoyed it.” Other evaluators commented on the difficulty of
the texts. One wrote “It was easy to use the tool. The text was harder considering
it was a translation of the original Greek text.”
6 Conclusions and Future Work
The VAST Annotation Tool was designed as an integral part of the VAST Platform and as an instrument addressing the specific task of annotating historical documents. Results from this first evaluation suggest that it already matches the expectations of typical software intended for use in a specific professional context.
With this in mind, the answers to the exploratory questions suggest that the annotation process could benefit from an in-depth analysis of the cognitive load experienced by annotators, possibly understanding what constitutes intrinsic and extraneous cognitive load in the annotation activity and looking for ways to reduce the extraneous part [18].
The evaluation presented here covers only the annotation of text. Further
studies are needed to assess other types of annotation. Aspects of quality control could also be the subject of further evaluations. In particular, it
could be of interest to explore and evaluate the possibilities to compare across
documents and collections or across annotators. Some work regarding the first of
these directions has already been done by subjecting the resulting data to further
qualitative analysis [8]. However, this research is still in progress and needs to be expanded, and its results need to be critically interpreted by professionals in the humanities.
7 Acknowledgments
The research leading to these results has received funding from the European
Union’s Horizon 2020 research and innovation programme, in the context of
VAST project, under grant agreement No 101004949. This paper reflects only
the view of the authors and the European Commission is not responsible for any
use that may be made of the information it contains.
References
1. Aljabri, M., AlAmir, M., AlGhamdi, M., Abdel-Mottaleb, M., Collado-Mesa,
F.: Towards a better understanding of annotation tools for medical imag-
ing: a survey. Multimedia Tools and Applications 81(18), 25877–25911 (2022).
https://doi.org/10.1007/s11042-022-12100-1
2. Apollonio, F.I., Gaiani, M., Bertacchi, S.: Managing cultural heritage with in-
tegrated services platform. The International Archives of the Photogrammetry,
Remote Sensing and Spatial Information Sciences (2019)
3. Arampatzakis, V., Sevetlidis, V., Arnaoutoglou, F., Kalogeras, A., Koulamas, C.,
Lalos, A., Kiourt, C., Ioannakis, G.A., Koutsoudis, A., Pavlidis, G.: Art3mis: Ray-
based textual annotation on 3d cultural objects. In: CAA 2021 International Con-
ference “Digital Crossroads” (2021)
4. Bontcheva, K., Cunningham, H., Roberts, I., Roberts, A., Tablan, V., Aswani, N.,
Gorrell, G.: Gate teamware: a web-based, collaborative text annotation framework.
Language Resources and Evaluation 47(4), 1007–1029 (2013)
5. Brooke, J.: SUS: a retrospective. Journal of Usability Studies 8(2), 29–40 (2013)
6. Cassidy, S., Schmidt, T.: Tools for Multimodal Annotation, pp. 209–227. Springer
Netherlands, Dordrecht (2017). https://doi.org/10.1007/978-94-024-0881-2_7
7. de Castilho, R.E., Biemann, C., Gurevych, I., Yimam, S.M.: Webanno: a flexible,
web-based annotation tool for clarin. Proceedings of the CLARIN Annual Confer-
ence (CAC) 2014 (2014)
8. Ferrara, A., Montanelli, S., Ruskov, M.: Detecting the semantic shift of values in
cultural heritage document collections (short paper). In: Proceedings of the 1st
Workshop on Artificial Intelligence for Cultural Heritage. pp. 35–43. No. 3286 in
CEUR Workshop Proceedings, Aachen (2022), https://ceur-ws.org/Vol-3286/04_paper.pdf
9. Garozzo, R., Murabito, F., Santagati, C., Pino, C., Spampinato, C.: Culto: An
ontology-based annotation tool for data curation in cultural heritage. ISPRS -
International Archives of the Photogrammetry, Remote Sensing and Spatial Infor-
mation Sciences 42, 267–274 (2017)
10. Gaur, E., Saxena, V., Singh, S.K.: Video annotation tools: A review.
In: 2018 International Conference on Advances in Computing, Com-
munication Control and Networking (ICACCCN). pp. 911–914 (2018).
https://doi.org/10.1109/ICACCCN.2018.8748669
11. Hornbæk, K., Hertzum, M.: Technology acceptance and user experience: A review of the experiential component in HCI. ACM Transactions on Computer-Human Interaction 24(5) (2017). https://doi.org/10.1145/3127358
12. Katakis, I.M., Petasis, G., Karkaletsis, V.: CLARIN-EL web-based annotation tool.
In: Proceedings of the 10th International Conference on Language Resources and
Evaluation (LREC’16). pp. 4505–4512. ELRA, Portorož, Slovenia (2016), https:
//aclanthology.org/L16-1713
13. Lewis, J.R.: The system usability scale: Past, present, and future. Inter-
national Journal of Human–Computer Interaction 34(7), 577–590 (2018).
https://doi.org/10.1080/10447318.2018.1455307
14. Neves, M., Ševa, J.: An extensive review of tools for manual anno-
tation of documents. Briefings in Bioinformatics 22(1), 146–163 (2019).
https://doi.org/10.1093/bib/bbz130
15. Ntogramatzis, A.F., Gradou, A., Petasis, G., Kokol, M.: The ellogon web anno-
tation tool: Annotating moral values and arguments. In: Proceedings of the 13th
Language Resources and Evaluation Conference. pp. 3442–3450. ELRA, Marseille,
France (2022), https://aclanthology.org/2022.lrec-1.368
16. Pande, B., Padamwar, K., Bhattacharya, S., Roshan, S., Bhamare, M.: A review
of image annotation tools for object detection. In: 2022 International Conference
on Applied Artificial Intelligence and Computing (ICAAIC). pp. 976–982 (2022).
https://doi.org/10.1109/ICAAIC53929.2022.9792665
17. Stenetorp, P., Pyysalo, S., Topić, G., Ohta, T., Ananiadou, S., Tsujii, J.: Brat: a
web-based tool for nlp-assisted text annotation. In: Proceedings of the Demonstra-
tions at the 13th Conference of the European Chapter of the ACL. pp. 102–107.
ACL (2012)
18. Sweller, J., van Merriënboer, J.J.G., Paas, F.: Cognitive Architecture and Instruc-
tional Design: 20 Years Later. Educational Psychology Review 31(2), 261–292
(2019). https://doi.org/10.1007/s10648-019-09465-5
19. Tan, L.: A survey of NLP annotation platforms. https://github.com/alvations/annotate-questionnaire (2020)
20. Theodosiou, Z., Georgiou, O., Tsapatsoulis, N., Kounoudes, A., Milis, M.: Annota-
tion of cultural heritage documents based on XML dictionaries and data clustering.
In: Digital Heritage - 3rd International Conference. LNCS, vol. 6436, pp. 306–317.
Springer (2010). https://doi.org/10.1007/978-3-642-16873-4_23
21. Tkachenko, M., Malyuk, M., Holmanyuk, A., Liubimov, N.: Label Studio: Data
labeling software (2020-2022), https://github.com/heartexlabs/label-studio
22. Tomašević, D., Wells, S., Ren, I.Y., Volk, A., Pesek, M.: Exploring
annotations for musical pattern discovery gathered with digital annota-
tion tools. Journal of Mathematics and Music 15(2), 194–207 (2021).
https://doi.org/10.1080/17459737.2021.1943026