Digital Humanities 2022
Book Barcoding
Finally, we named the proposed algorithm ‘book
barcoding,’ a name inspired by ‘DNA barcoding’
(Moritz and Cicero, 2004). A DNA barcode is a DNA
sequence that is specific to a species. To determine
the species of a particular DNA sequence, the sequence is
compared against a DNA barcode library, such as BOLD
(Barcode of Life Data Systems), which allows sequences
from unknown species to be identified. Based on a similar
framework, we plan to establish a general collation platform
for printed books, where keypoints specific to a book help to
identify the phylogenetic relationship of unknown books.
Acknowledgment
The author thanks Mr. Jun Homma for his significant
contribution to vdiff.js. He also thanks Prof. Kumiko
Fujizane and Prof. Kazuaki Yamamoto of the National
Institute of Japanese Literature for helpful comments on the
research. Part of the research is based on the work of Mr.
Thomas Leyh, who contributed to this project while he was
an NII internship student. This work is partially supported
by JSPS KAKENHI Grant Number JP19H01141.
Bibliography
Alcantarilla, P. F., Nuevo, J., and Bartoli, A. (2011) Fast
explicit diffusion for accelerated features in nonlinear scale
spaces. Trans. Pattern Anal. Machine Intell, 34:7, 1281–
1298.
Fischler, M. A., and Bolles, R. C. (1981) Random
sample consensus: a paradigm for model fitting with
applications to image analysis and automated cartography.
Commun. ACM 24:6, 381–395.
Gale, D. and Shapley, L. S. (1962) College Admissions
and the Stability of Marriage. The American Mathematical
Monthly, 69:1, 9-15.
Kitamoto, A., Horii, H., Horii, M., Suzuki, C.,
Yamamoto, K., Fujizane, K. (2018) Differential Reading by
Image-based Change Detection and Prospect for Human-
Machine Collaboration for Differential Transcription,
Digital Humanities Conference.
Leyh, T., Kitamoto, A. (2020) Computer Vision-
based Comparison of Woodblock-printed Books and its
Application to Japanese Pre-modern Text, Bukan. Tenth
Conference of Japanese Association for Digital Humanities
(JADH2020), 53-59.
Moritz, C., Cicero, C. (2004) DNA Barcoding: Promise
and Pitfalls. PLoS Biol 2:10, e354.
Emotions and Literary Periods
leonard.konle@uni-wuerzburg.de
Julius-Maximilians-Universität Würzburg, Germany
merten.kroencke@uni-goettingen.de
Georg-August-Universität Göttingen, Germany
fotis.jannidis@uni-wuerzburg.de
Julius-Maximilians-Universität Würzburg, Germany
simone.winko@phil.uni-goettingen.de
Georg-August-Universität Göttingen, Germany
“Longing, resignation, derision, disillusionment,
weary smiles, these are the five basic tones
of the modern scale of emotions.” (Servaes 1896)
Introduction
Periodization has been neglected in computational
literary studies despite some early discussions (Underwood
2013). In literary studies, periods are usually constructed
on the basis of differences in the choice of topics, in style,
or in non-literary aspects, while differences in the
representation of emotions are under-researched. This is the
case even though recent approaches in literary studies ascribe
epoch-specific relevance to the literary representation of
emotions. Our paper focuses on how quantitative methods can
be used to study emotions in literary texts and to describe
the differences between periods. Our use case is the
difference between realism and early modernism in German
literary history, with a focus on poetry. In a first step,
a group of domain experts manually annotated around 1,000
poems, highlighting phrases according to the emotions
they represented. In a second step, a machine learning
model was trained on these annotations, and in a third step
this model was used to annotate a collection of more than
6,000 poems from anthologies representing either realism or
early modernism. Lastly, we analyzed the main differences
between these periods based on the trends we found. 1
Resources
Our corpus 2 consists of 6249 poems from 20
anthologies. Twelve of these anthologies, published between
1885 and 1911, are explicitly intended by their editors to
contain ‘modern’ poetry. 3 The other anthologies were published
between 1859 and 1882 and represent the earlier poetry of
realism.
We gathered emotion annotations for 1278 poems. The
goal was not to annotate readers’ emotions, but rather the
emotions represented in the text itself. The annotators used
a list of 40 discrete emotions (see Table 1), the selection
of which was based both on existing emotion models (e.g.
Ekman 1992, 1999; Plutchik 1980a, 1980b, 2001) and on
the emotions that were regularly represented in the poems
of our corpus. We categorized the emotions into 6 groups,
inspired by the emotion hierarchy in Shaver et al. (1987).
First, each poem was annotated independently by two
annotators, who then manually merged their annotations into
a consensus annotation. Their agreement, measured with γ
(Mathet et al. 2015), was 0.6445 for individual emotions and
0.7491 for the emotion groups.
The emotion groups are not evenly represented (see Fig.
1). This distribution could be specific to our corpus and will
very probably change with other genres.
Emotion Classification
We model emotion classification as a series of binary
classifications to avoid the complexity of a multi-label
task. The basis of our classification experiment is the German
BERT (Devlin et al. 2018) model gbert-large (Chan et al.
2020). Because gbert is trained on contemporary web text,
we continue its pre-training 4 on poetry to adapt it to our
target domain. Subsequently, we fine-tune it on the
binary emotion classification tasks. To overcome the class
imbalance, we apply undersampling by randomly sampling
examples from the majority class in every epoch. While
the classification of single emotions leads to a large spread
in predictive quality 5, the grouped emotions (Table 1)
lead to more stable performance at an acceptable level of
uncertainty (Table 2).
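The decomposition into binary tasks and the per-epoch undersampling described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the toy data, and the 1:1 ratio between minority and majority class are assumptions, since the text only states that majority-class examples are randomly sampled in every epoch.

```python
import random


def binary_labels(annotations, target_group):
    """Turn multi-label annotations into 0/1 labels for one emotion group."""
    return [1 if target_group in groups else 0 for groups in annotations]


def undersample_epoch(texts, labels, rng):
    """Return a class-balanced sample for one training epoch.

    All minority-class examples are kept; an equally sized random subset
    of the majority class is drawn anew on every call (assumed 1:1 ratio).
    """
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    chosen = minority + rng.sample(majority, len(minority))
    rng.shuffle(chosen)
    return [(texts[i], labels[i]) for i in chosen]


# Hypothetical toy data: eight text units with sets of emotion groups.
verses = [f"verse {i}" for i in range(8)]
annotations = [{"joy"}, set(), {"sadness"}, set(),
               {"joy", "love"}, set(), set(), {"anger"}]

rng = random.Random(0)
joy_labels = binary_labels(annotations, "joy")       # one binary task
epoch_sample = undersample_epoch(verses, joy_labels, rng)
```

Calling `undersample_epoch` once per epoch means the model sees a different subset of the majority class each time, so less of the majority data is discarded over the whole training run than with a single fixed undersampled set.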
Analysis
Our results (Fig. 2) show that modernist poetry as a
whole represents emotions slightly less frequently than
realist poetry, but the effect sizes are small. 9% of realist
poems and 12% of modernist poems do not represent any
emotion. The probability that a verse contains an emotion
is 47% in realism and 42% in modernism. The decrease in
emotionality from realism to modernism is mainly due to
the emotion group joy, i.e. positive emotions.
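The two figures reported above (the share of poems without any emotion and the probability that a verse contains an emotion) can be derived from verse-level predictions roughly as follows. This is a hedged sketch: the data layout, the function name `emotion_stats`, and the toy input are assumptions for illustration, and the authors' actual aggregation may differ.

```python
def emotion_stats(poems):
    """Compute corpus-level emotion frequencies.

    poems: non-empty list of poems; each poem is a list of verses; each
    verse is the set of emotion groups predicted for it (possibly empty).
    """
    n_poems = len(poems)
    n_verses = sum(len(poem) for poem in poems)
    # A poem counts as emotion-free if none of its verses carries a label.
    poems_without = sum(1 for poem in poems if not any(poem))
    verses_with = sum(1 for poem in poems for verse in poem if verse)
    return {
        "share_poems_without_emotion": poems_without / n_poems,
        "p_verse_has_emotion": verses_with / n_verses,
    }


# Hypothetical toy corpus of three poems (not data from the study).
toy_poems = [
    [{"joy"}, set()],
    [set(), set()],
    [{"sadness"}, {"love", "joy"}, set(), set()],
]
stats = emotion_stats(toy_poems)
```

Running the same function separately on the realist, modernist, and canonical subcorpora would give comparable numbers for each group.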
If only canonical modernist authors 6 are considered,
the tendency to represent fewer emotions is much stronger.
The probability that a poem from a canonical author does
not represent any emotion is 14%, and the probability that a
verse from the canonical subcorpus contains an emotion is
39%. Not only joy, but also anger, sadness, and especially
love become less frequent compared to the poetry of
realism. Again, the decrease is most pronounced for positive
emotions.
Discussion
Some literary scholars claim that German modernist
poetry, in contrast to the more traditional poetry of realism,
tends toward a sober, matter-of-fact, and non-emotional
mode of expression (cf. e.g. Andreotti 2014). Others argue
that modernist poetry does indeed represent emotions
frequently, albeit in a modified way (cf. e.g. Winko 2003).
Our results support the view that modernist poetry as a
whole continues to represent emotions frequently, that is,
almost as frequently as the poetry of realism. The decrease
in emotionality is much more pronounced, however,
when only canonical authors are considered. This suggests
that the contradicting views in literary studies regarding
the emotionality or non-emotionality of modernist poetry
could be explained, at least in part, by different objects
of study. The scholars who support the non-emotionality
thesis might have focused more than the others on canonical
authors. These observations highlight the importance of
selection processes and corpus formation in literary history.
Future research could examine further selection criteria and
categories, such as gender or class.
The trend to represent emotions less frequently applies
especially to positive emotions. As a result, negative
emotions make up a larger proportion of the remaining
emotions and modernist poetry appears more negative
overall. This is an interesting topic for further research.
Moreover, it seems instructive to investigate later literary
periods such as expressionism. In addition, it would
be interesting to examine mixed emotions. Finally, it is
desirable to analyze not only the frequency of emotions
but also the way they are represented, e.g. in explicit or
implicit modes, which is especially important when dealing
with literature.
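The point made above, that negative emotions gain in proportion when positive emotions become rarer, is a purely compositional effect and can be illustrated with a small calculation. The counts below are hypothetical and chosen only for illustration; they are not results from the study.

```python
def shares(counts):
    """Convert absolute counts into proportions of the total."""
    total = sum(counts.values())
    return {name: value / total for name, value in counts.items()}


# Hypothetical counts: positive emotions drop, negative ones stay constant.
earlier_period = {"positive": 60, "negative": 40}
later_period = {"positive": 40, "negative": 40}

earlier_shares = shares(earlier_period)
later_shares = shares(later_period)
```

In this toy case the number of negative emotions is unchanged, yet their share rises from 40% to 50% simply because fewer positive emotions remain.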
Bibliography
Andreotti, Mario. Die Struktur der modernen Literatur.
Neue Formen und Techniken des Schreibens: Erzählprosa
und Lyrik. P. Haupt, 5th edition 2014.
Chan, Branden; Schweter, Stefan and Möller, Timo
(2020): German’s next language model. In: Proceedings
of the 28th International Conference on Computational
Linguistics, Barcelona, Spain (Online), pp. 6788–6796.
URL: https://aclanthology.org/2020.coling-main.598. doi:
10.18653/v1/2020.coling-main.598.
Devlin, Jacob; Chang, Ming-Wei; Lee, Kenton
and Toutanova, Kristina (2018): BERT: Pre-training of deep
bidirectional transformers for language understanding.
arXiv preprint arXiv:1810.04805.
Ekman, Paul. “An Argument for Basic Emotions.”
Cognition and Emotion, vol. 6, no. 3-4, 1992, pp. 169–200.
Ekman, Paul. “Basic Emotions.” Handbook of Cognition
and Emotion, edited by Tim Dalgleish and Mick J.
Power. Wiley, 1999, pp. 45–60.
Gururangan, Suchin; Marasović, Ana; Swayamdipta,
Swabha; Lo, Kyle; Beltagy, Iz; Downey, Doug; and Smith,
Noah A. (2020): Don't stop pretraining: adapt language
models to domains and tasks. In: Proceedings of the 58th
Annual Meeting of the Association for Computational
Linguistics.
Konle, Leonard and Jannidis, Fotis (2020): Domain
and Task Adaptive Pretraining for Language Models. In: CHR
2020: Workshop on Computational Humanities Research,
November 18–20, 2020, Amsterdam, The Netherlands.
CEUR Workshop Proceedings, http://ceur-ws.org, ISSN 1613-0073.
Mathet, Yann; Widlöcher, Antoine; Métivier,
Jean-Philippe (2015): The Unified and Holistic Method Gamma
(γ) for Inter-Annotator Agreement Measure and Alignment.
Computational Linguistics, MIT Press, September 2015,
Vol. 41, No. 3: 437–479.
Plutchik, Robert. Emotion: A Psychoevolutionary
Synthesis. Harper & Row 1980a.
Plutchik, Robert. “A general psychoevolutionary theory
of emotion.” Emotion: Theory, Research and Experience.
Theories of Emotion, edited by Robert Plutchik and Henry
Kellerman. Academic Press, 1980b, vol. 1, pp. 3–33.
Plutchik, Robert. “The Nature of Emotions.” American
Scientist, vol. 89, no. 4, 2001, pp. 344–350.
Servaes, Franz. Goethe am Ausgang des Jahrhunderts.
In: Neue deutsche Rundschau (1896), pp. 1073-1090
(translation by FJ/SW).
Shaver, Phillip, et al. “Emotion Knowledge: Further
Exploration of a Prototype Approach.” Journal of
Personality and Social Psychology, vol. 52, no. 6, 1987, pp.
1061–1086.
Underwood, Ted. Why Literary Periods Mattered.
Stanford University Press, 2013.
Winko, Simone. Kodierte Gefühle. Zu einer Poetik der
Emotionen in lyrischen und poetologischen Texten um
1900. Erich Schmidt, 2003.
Notes
1. CRediT Roles: Leonard Konle: Investigation, Data
Curation, Writing – original draft; Merten Kröncke:
Data Curation, Writing – original draft; Fotis Jannidis:
Conceptualization, Supervision, Writing – review &
editing; Simone Winko: Conceptualization, Writing –
review & editing.
2. Code and data: https://github.com/LeKonArD/Emotions-and-Literary-Periods
Corpus release: https://doi.org/10.5281/zenodo.6053952
3. Given the publication dates, we are limited in our
analysis to the poetry of modernism.
4. Hyperparameters: 500 steps, batch size 30, learning rate
2e-5 (see Konle and Jannidis 2020, Gururangan et al.
2020).
5. Very frequent emotions like longing (f1: 0.73) or
suffering (f1: 0.72) yield sufficient classifiers, but less
frequent ones like calmness or desire lead to results
similar to a random baseline.
6. In our study, in accordance with German literary
histories, Stefan George (22 poems), Rainer Maria
Rilke (37 poems), Hugo von Hofmannsthal (31
poems), and Arno Holz (50 poems) represent
canonical modernism.
Accuracy is not all you need
rdkm@cas.au.dk
Aarhus University, Denmark
idamarie@cas.au.dk
Aarhus University, Denmark
kenneth.enevoldsen@cas.au.dk
Aarhus University, Denmark
lasse.hansen@clin.au.dk
Aarhus University, Denmark
kln@cas.au.dk
Aarhus University, Denmark
Discussions around diversity and bias in language
representations are a hot topic in contemporary natural
language processing. Countless papers have pointed
out that these representations can be shown to contain
specific biases, such as in the case of both so-called static
281