Feelings from the Past—
Adapting Affective Lexicons for Historical Emotion Analysis
Sven Buechel¹, Johannes Hellrich², Udo Hahn¹
¹ Jena University Language & Information Engineering (JULIE) Lab
http://www.julielab.de
² Graduate School ‘The Romantic Model’
http://www.modellromantik.uni-jena.de
Friedrich-Schiller-Universität Jena, Jena, Germany
Abstract
We describe a novel method for measuring affective language in historical texts by expanding an
affective lexicon and jointly adapting it to prior language stages. We automatically construct a
lexicon for word-emotion association of 18th and 19th century German which is then validated
against expert ratings. Subsequently, this resource is used to identify distinct emotional patterns
and trace long-term emotional trends in different genres of writing spanning several centuries.
1 Introduction
For more than a decade, computational linguists have endeavored to decode affective information¹ from
textual documents, such as personal value judgments or emotional tone (Turney and Littman, 2003; Alm
et al., 2005). Despite the achievements made so far, the majority of work in this area is limited in at least
two ways. First, employing simple positive-negative polarity schemes fails to account for the diversity of
affective reactions (Sander and Scherer, 2009) and, second, in contrast to the humanities where numerous
contributions focus on emotion expression and elicitation (Corngold, 1998), very little work has been
conducted by computational linguists to unravel affective information in historical sources.
Arguably the main problem here relates to the availability of language resources for detecting affect.
Algorithms for measuring semantic polarity (positive vs. negative) or emotion typically rely on either
annotated corpora, lexical resources (storing the affective meaning of individual words) or a combination
of both (Liu, 2015). To ensure proper affect prediction, these resources must accurately represent the target
domain but speakers of historical language stages (19th century and earlier) can no longer be recruited for
data annotation. Prior work aiming to detect affect in historical text ignored this problem and relied on
contemporary language resources instead (Acerbi et al., 2013; Bentley et al., 2014).
Using word embeddings, we tackle this problem by jointly adapting a contemporary affective lexicon
to historical language and expanding it in size. Collecting ratings from historical language experts, we
successfully validate our method against human judgment. In contrast to previous work based on the
categorical notion of polarity (Cook and Stevenson, 2010), we employ the more expressive dimensional
Valence-Arousal-Dominance (VAD; Bradley and Lang (1994)) model of affect. As a proof of
concept, we apply this method to a collection of historical German texts, the main corpus of the ’Deutsches
Textarchiv’ (DTA) [German Text Archive], in order to demonstrate the adequacy of our approach. Our
data indicate that, at least for historical texts, academic writing and belles lettres, as well as respective
subgenres, strongly differ in their use of affective language. Furthermore, we find statistically significant
affect change patterns between 1741 and 1900 for these genres.
2 Related Work
Prior computational studies analyzing affect in non-contemporary text are very rare. To the best of our
knowledge, the work by Acerbi et al. (2013) and Bentley et al. (2014) constitutes the first of this kind. They
construct a literary misery index by comparing the frequency of joy-indicating vs. sadness-indicating words
in the Google Books Ngram corpus (see below) and find correlations with major socio-political events
(such as WWII), as well as the annual U.S. economic misery index in the 20th century.
This work is licensed under a Creative Commons Attribution 4.0 International License. License details:
http://creativecommons.org/licenses/by/4.0/
¹ We here use affect as an umbrella term for both semantic polarity and emotion.
As stated above, most prior work focused on the bi-polar notion of semantic polarity, a rather simplified
representation scheme given the richness of human affective states (a deficit increasingly recognized in
sentiment analysis (Strapparava, 2016)). In contrast to this representationally restricted format, the VAD
model of emotion (Bradley and Lang, 1994), which we employ here, is a well-established approach in
psychology (Sander and Scherer, 2009) which also increasingly attracts interest in the NLP community
(see among others Köper and Schulte im Walde (2016), Yu et al. (2016), and Wang et al. (2016)).
It assumes that affective states can be characterized relative to three affective dimensions: Valence
(corresponding to the concept of polarity), Arousal (the degree of calmness or excitement) and Dominance
(the degree to which one feels in control of a social situation). Formally, the VAD dimensions span a
three-dimensional real-valued space which is illustrated in Figure 1, the prediction of such values being a
multi-way regression problem (Buechel and Hahn, 2016).
[Figure 1 plot; axes: Valence, Arousal, Dominance; plotted emotions: Anger, Surprise, Disgust, Fear, Sadness, Joy]
Figure 1: The three-dimensional space spanned by the VAD dimensions. For a more intuitive
explanation, we display the position of six basic emotions (Ekman, 1992; Russell and Mehrabian, 1977).
Thanks to the popularity of the VAD scheme in psychology, plenty of resources have already been
developed for different languages. For English, the Affective Norms for English Words (ANEW; Bradley
and Lang (1999)) incorporate 1,034 words paired with experimentally determined affective ratings using a
9-point scale for Valence, Arousal and Dominance, respectively (see Table 1 for an illustration of the
structure of such a lexicon). Warriner et al. (2013) provided an extended version of this resource (14k
entries) employing crowdsourcing. As far as German-language emotion lexicons are concerned, ANGST
(Schmidtke et al., 2014) is arguably the most important one for NLP purposes: it was only recently
constructed (comprising 1,003 lexical entries) and replicates ANEW’s methodology very closely (see Köper
and Schulte im Walde (2016) for a more complete overview of German VAD resources).
As the manual creation of affective lexicons (polarity or VAD) is expensive, their automatic extension has
been an active field of research for many years (Turney and Littman, 2003; Rosenthal et al., 2015). Typically,
unlabeled words are attributed affective values given a set of seed words with known affect association, as
well as similarity scores between seed and unlabeled words. Concerning emotions in VAD representation,
Bestgen (2008) presented an algorithm based upon a k-Nearest-Neighbor methodology which expands the
original lexicon by a factor of 17 (Bestgen and Vincze, 2012). Cook and Stevenson (2010) were the first to
induce a polarity lexicon for non-contemporary language from historical corpora by employing a pointwise
mutual information (PMI) metric to determine word similarity and the well-received algorithm by Turney
and Littman (2003) for polarity induction. PMI is, like latent semantic analysis (LSA; Deerwester et al.
(1990)), an early form of distributional semantics which has since been superseded by singular
value decomposition with positive pointwise mutual information (SVD_PPMI; Levy et al. (2015)) and
skip-gram negative sampling (SGNS; Mikolov et al. (2013)). Recent evidence suggests that SGNS
behaves more robustly than SVD_PPMI (Hamilton et al., 2016).
Most prior studies covering long time spans (e.g., Acerbi et al. (2013)) rely on the Google Books Ngram
corpus (GBN; Michel et al. (2011), Lin et al. (2012)). However, this corpus might be problematic for
Digital Humanities research because of digitization artifacts and its opaque and unbalanced sampling
(Pechenick et al., 2015; Koplenig, 2016). For German, we use the DTA² (Geyken, 2013; Jurish, 2013),
which consists of books transcribed with double-keying and selected for their representativeness. The
DTA aims for genre balance and provides a range of metadata for each document, e.g., authors, year,
classification (like belles lettres and academic texts) and sub-classification (e.g., poem, biology, medicine).
² TCF version from May 11, 2016, available via www.deutschestextarchiv.de/download
3 Methods
Our methodology consists of two main parts. First, we adapt a contemporary VAD emotion lexicon to
historical language and expand it jointly in size, and, second, we use this expanded lexicon to analyze
emotions in historical language stages.
3.1 Inducing Historical VAD Lexicons
One of the most commonly used algorithms for affect lexicon induction was proposed a decade ago by
Turney and Littman (2003) and put into practice for historical language by Cook and Stevenson (2010).
Unfortunately, this procedure expects seed words of discrete polarity classes, a format we consider less
informative for affective language analysis. For VAD vectors, we here employ the induction algorithm
introduced by Bestgen (2008) instead. Bestgen’s algorithm computes the affective score of the word $w$,
$\bar{e}(w)$, given the set of the $k$ nearest neighboring words to $w$ from a seed lexicon, $\mathrm{NEAREST}(k, w)$, as
$$\bar{e}(w) := \frac{1}{k} \sum_{v \in \mathrm{NEAREST}(k, w)} e(v) \tag{1}$$
where $e(v)$ is the emotion value of the word $v$, a three-dimensional VAD vector (see Table 1 and Figure
1 for illustration). We modify Bestgen’s method by replacing LSA with SGNS for determining word
similarity. In order to account for word-emotion association as present in historical language stages, we
use word embeddings derived directly from the target language stage instead of contemporary ones. Seed
values for the induction are taken from the contemporary ANGST lexicon (Schmidtke et al., 2014). This
method results in a hybrid lexicon whose seed VAD values are empirically determined by contemporary
speakers, whereas the similarity of words (and therefore the set of words taken into account when
computing emotion values for words not in the seed lexicon) is determined from historical corpora.
Although the emotion values computed in this way might be somewhat biased towards the contemporary
language stage, such a hybrid lexicon should be more suitable for a historical analysis than lexicons with
contemporary information only.
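The induction step of Equation 1 can be illustrated with a short Python sketch. It is only a minimal illustration under stated assumptions: cosine similarity over embedding vectors serves as the similarity measure, the word itself is excluded from its own neighborhood, and the two-dimensional vectors and seed values are made-up toy data. The function names `cosine` and `induce_vad` are hypothetical and not taken from any released code.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def induce_vad(word, embeddings, seed_lexicon, k):
    """Average the VAD vectors of the k seed words most similar to `word` (Equation 1)."""
    # Assumption: a word never counts as its own neighbor.
    candidates = [s for s in seed_lexicon if s in embeddings and s != word]
    nearest = sorted(candidates,
                     key=lambda s: cosine(embeddings[word], embeddings[s]),
                     reverse=True)[:k]
    if not nearest:
        raise ValueError("no seed word available for induction")
    # Equals division by k whenever at least k seed words are available.
    return tuple(sum(seed_lexicon[s][i] for s in nearest) / len(nearest)
                 for i in range(3))

# Toy data (hypothetical): embeddings stand in for vectors trained on historical text,
# seed values stand in for contemporary VAD ratings.
embeddings = {
    "gift":   (0.9, 0.1),
    "giftig": (1.0, 0.0),
    "krise":  (0.8, 0.3),
    "mutter": (0.0, 1.0),
}
seed_lexicon = {
    "giftig": (-2.0, 2.0, -1.0),
    "krise":  (-2.0, 1.0, -1.0),
    "mutter": (2.0, -1.0, -1.0),
}

print(induce_vad("gift", embeddings, seed_lexicon, k=2))  # → (-2.0, 1.5, -1.0)
```

In this toy setting the two nearest seed words of "gift" are "giftig" and "krise", so the induced vector is simply their mean.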
3.2 Measuring Textual Emotion
Building on an adapted lexicon, it is possible to (more) accurately determine the emotion values of
historical texts. For this task, we use the Jena Emotion Analysis System³ (JEMAS; Buechel and Hahn
(2016)) since it has been (as one of the first tools for VAD prediction) thoroughly evaluated and is, to
the best of our knowledge, currently the only tool for this purpose freely available. The lexicon-based
approach it employs yields reasonable performance (Staiano and Guerini, 2014; Buechel and Hahn, 2016)
and is easily adaptable to other domains by replacing the lexicon—a feature most valuable for historical
applications as well. Basically,⁴ it calculates the emotion value of a document $d$ (a bag of words), $\bar{e}(d)$, as
the weighted average of the emotion values of the words in $d$, $\bar{e}(w)$, as computed by Equation 1:
$$\bar{e}(d) := \frac{\sum_{w \in d} \lambda(w, d) \times \bar{e}(w)}{\sum_{w \in d} \lambda(w, d)} \tag{2}$$
where $\bar{e}(w)$ is defined as the vector representing a neutral emotion if $w$ is not covered by the lexicon,
and $\lambda$ denotes some term weighting function. Here, we use absolute term frequency as the resulting
performance is among the best for automatically expanded lexicons (Buechel and Hahn, 2016).
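Equation 2 with absolute term frequency as the weighting function can be sketched as follows. This is a loose illustration rather than the JEMAS implementation: the function name `document_emotion` is hypothetical, the lexicon entries are toy values, and the neutral vector is assumed to be (0, 0, 0), which only holds for a lexicon whose scales are centered at zero.

```python
from collections import Counter

NEUTRAL = (0.0, 0.0, 0.0)  # assumption: VAD scales centered at zero

def document_emotion(tokens, lexicon, neutral=NEUTRAL):
    """Weighted average of word emotions (Equation 2), weighted by absolute term frequency."""
    freq = Counter(tokens)            # lambda(w, d): how often w occurs in d
    total = sum(freq.values())        # denominator of Equation 2
    if total == 0:
        return neutral
    sums = [0.0, 0.0, 0.0]
    for word, weight in freq.items():
        vad = lexicon.get(word, neutral)   # words missing from the lexicon count as neutral
        for i in range(3):
            sums[i] += weight * vad[i]
    return tuple(s / total for s in sums)

# Toy lexicon and document (hypothetical values)
lexicon = {"krise": (-2.0, 2.0, -1.0), "mutter": (2.0, -1.0, -1.0)}
doc = ["die", "krise", "der", "mutter", "krise", "krise"]
valence, arousal, dominance = document_emotion(doc, lexicon)
print(round(valence, 3), round(arousal, 3), round(dominance, 3))
```

Because uncovered words enter the average as neutral vectors, a document with many out-of-lexicon words is pulled towards the neutral point, which is one motivation for expanding the lexicon before applying it.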
³ https://github.com/JULIELab/JEmAS
⁴ For brevity, we only give a loose formal specification. See Buechel and Hahn (2016) for a more elaborate definition.
4 Experiments
4.1 Gold Standard
One considerable difficulty concerning lexicons for historical language stages relates to their proper
validation, since we lack native speakers for data annotation. Hence, to assess the quality of our results,
we constructed a small gold standard of 20 words annotated by seven doctoral students from various
humanities fields. Their areas of expertise strongly overlap with the time periods covered by the slice
of the DTA we are investigating. The instructions and rating scales follow the design of Warriner et al.
(2013), yet with one crucial exception—subjects were requested to put themselves in the position of a
person living between 1741 and 1900. We used such a wide temporal range since we expected different
raters to ground their rating decisions on different time spans, varying with their historic expertise and
acquaintance with a specific period. When averaging the different ratings, these biases should level off
resulting in valid ratings relative to the entire time span. Our 20 stimulus words were randomly selected
from words present within both the ANGST seed lexicon and the subset of the 1741–1900 DTA corpus,
thus avoiding any “noisy” words such as annual figures. Table 1 provides some sample entries.
For comparison with existing resources, we measure inter-annotator agreement (IAA) by calculating
the standard deviation between all given ratings for each word and dimension and then averaging these
values for every VAD dimension (Average Standard Deviation; ASD). Our raters achieved an ASD of
1.61, 1.85, and 1.83 for Valence, Arousal, and Dominance, respectively. These values are lower, and thus indicate
higher agreement, than the ASDs reported by Warriner et al. (2013), namely 1.68, 2.30, and 2.16, suggesting that our
experts are able to consistently rate non-contemporary word emotions.
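The ASD measure described above can be computed as in the following sketch. The ratings are invented for illustration, the function name `average_standard_deviation` is hypothetical, and the use of the sample standard deviation (n - 1 in the denominator) is an assumption, since the text does not state which variant was used.

```python
from statistics import mean, stdev

def average_standard_deviation(ratings):
    """ASD for one VAD dimension.

    ratings maps each word to the list of scores given by the individual raters.
    """
    # Standard deviation across raters per word, then averaged over all words.
    return mean(stdev(scores) for scores in ratings.values())

# Invented Valence ratings by four raters on a 9-point scale
valence_ratings = {
    "mutter": [8, 9, 7, 8],
    "krise":  [2, 1, 3, 2],
}
print(round(average_standard_deviation(valence_ratings), 3))
```

The same function would be applied once per dimension, giving one ASD each for Valence, Arousal, and Dominance; lower values mean that the raters agree more closely.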
4.2 Lexicon Expansion and Historical Adaptation
Lemma                    Valence   Arousal   Dominance
“Mutter” (mother)           2.00     -1.14       -1.29
“Erholung” (recovery)       0.86     -2.29        0.57
“giftig” (poisonous)       -2.29      1.86       -0.71
“Krise” (crisis)           -2.00      2.00       -0.86
Table 1: Sample entries from the historical gold standard relative to their empirically determined
Valence-Arousal-Dominance (VAD) values.
As mentioned before, we operate on the 1741–1900 part of the DTA. One text from this period
written in Latin was excluded, leaving us with 1,022 texts. To ensure matches between this corpus
and our VAD seed lexicon, we preprocessed ANGST (Schmidtke et al., 2014) with the CAB⁵
lemmatization system used by the DTA (Jurish, 2013), without further filtering or modification of
these entries. We then trained 200-dimensional SGNS embeddings⁶ on this corpus.
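The training setup for these embeddings (200 dimensions plus the hyperparameters listed in the accompanying footnote) might be expressed with Gensim roughly as follows. This is a sketch under assumptions: it uses the parameter names of Gensim 4.x (`vector_size`, `epochs`), reads the downsampling threshold as 1e-3, and maps the learning rate schedule onto Gensim's `alpha` and `min_alpha` arguments, which decay the rate over the whole training run. The names `SGNS_PARAMS` and `train_sgns` are hypothetical.

```python
# Keyword arguments for gensim.models.Word2Vec (Gensim 4.x parameter names).
SGNS_PARAMS = {
    "vector_size": 200,   # 200-dimensional embeddings
    "window": 10,         # context window of up to 10 neighboring words
    "min_count": 10,      # minimum word frequency
    "sg": 1,              # skip-gram architecture
    "hs": 0,              # no hierarchical softmax
    "negative": 5,        # negative sampling with 5 noise words
    "sample": 1e-3,       # downsampling threshold (assumed reading of the footnote)
    "epochs": 5,          # number of training epochs
    "alpha": 0.025,       # initial learning rate
    "min_alpha": 0.0001,  # final learning rate (assumed mapping of the schedule)
}

def train_sgns(sentences):
    """Train SGNS embeddings on an iterable of token lists (e.g. lemmatized sentences)."""
    # Imported lazily so that the configuration can be inspected without Gensim installed.
    from gensim.models import Word2Vec
    return Word2Vec(sentences=sentences, **SGNS_PARAMS)
```

A call such as `train_sgns(corpus_sentences)` would return a model whose `wv` attribute holds the word vectors; `corpus_sentences` is a placeholder for the lemmatized corpus.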
We ran the modified version of Bestgen’s expansion algorithm (see above) on these word embeddings
using ANGST as seed lexicon. The $k$-parameter was determined by running the process for each integer
$k \in [1, 50]$, measuring Pearson’s $r$ between original and induced values at each step. The correlation was
highest for $k = 16$ ($r = 0.681$; average correlation over all three dimensions) which was thus employed
to induce the final lexicon.
Our expanded and historically adapted lexicon comprises 143,677 word-emotion pairs. The correlation
between these induced values and our historical gold standard amounts to $r = 0.75$, $0.64$, and $0.56$ for
Valence, Arousal, and Dominance, respectively (the differences between the dimensions are consistent
with prior work (Bestgen and Vincze, 2012)). Hence, our performance on historical data is even higher
than the performance Bestgen and Vincze (2012) reported when using Bestgen’s original algorithm to
predict contemporary word emotions. We take this as a hint that our modifications (e.g., using SGNS
instead of LSA) more than compensate for the additional difficulty of inducing historical word emotions.
4.3 Application to the DTA Historical Corpus
In order to demonstrate the potential of our approach for the Digital Humanities, we now examine the
distribution of emotions in the DTA corpus relative to different categories of metadata. First, we take into
account the genres of a document considering the whole study period. Second, we look at changes in
emotions over time (also taking genre differences into account). Furthermore, of the three main genres
distinguished within the DTA (belles lettres, academic texts and functional texts), we focus on the first
two because, upon inspection of the texts present in each category, they tend to be much more distinctively
defined than the latter which, in our view, remains quite opaque.
Figure 2: Distribution of two of the main document classes (belles lettres and academic texts) relative to Valence and Dominance.
Figure 3: Distribution of subclasses of belles lettres (lyric, narratives, and drama) relative to Arousal and Dominance.
Figure 4: Distribution of five subclasses of academic texts (law, philosophy, mathematics, technology, and physics) relative to Arousal and Valence.
⁵ Available via www.deutschestextarchiv.de/demo/cab/
⁶ We used the Python-based Gensim implementation (accessible from https://radimrehurek.com/gensim/),
with the following parameters: context window of up to 10 neighboring words, minimum word frequency of 10, negative
sampling with 5 noise words and downsampling for words with a frequency of $10^{-3}$ or higher. We trained for 5 epochs,
decreasing the learning rate from initial 0.025 down to 0.0001 in each of them.
For the following experiments, we processed the documents of the study period via JEMAS (see Section
3.2) employing the newly constructed historically adapted emotion lexicon. The VAD output of this system
was standardized so that mean $M = 0$ and standard deviation $SD = 1$ for each dimension. Subsequently,
we visualize our data with 2-D scatterplots each displaying two of the three VAD dimensions (Figures 2–4).
Of the three possible plots (one for each pair of VAD dimensions) we here include the most illustrative
ones for each comparison.
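The standardization step amounts to a z-transformation per VAD dimension. A minimal sketch, assuming the population standard deviation is used (the text does not say which variant was applied) and using invented document values, could look like this; the function name `standardize_vad` is hypothetical.

```python
from statistics import mean, pstdev

def standardize_vad(doc_vads):
    """z-standardize each VAD dimension so that M = 0 and SD = 1 across documents."""
    z_columns = []
    for column in zip(*doc_vads):          # one column per dimension
        m, sd = mean(column), pstdev(column)
        if sd == 0:
            raise ValueError("dimension has no variance and cannot be standardized")
        z_columns.append([(x - m) / sd for x in column])
    return [tuple(row) for row in zip(*z_columns)]

# Invented raw VAD outputs for three documents
raw = [(1.0, 2.0, 0.0), (3.0, 2.5, 1.0), (5.0, 3.0, 2.0)]
for vad in standardize_vad(raw):
    print(tuple(round(x, 3) for x in vad))
```

After this transformation, all values are expressed in standard deviations, which is what makes medians and regression slopes comparable across the three dimensions.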
4.3.1 Distinction of Text Genres and Domains
Comparing belles lettres to academic texts, Figure 2 depicts their distribution relative to Valence and
Dominance so that each data point relates to one document. The two genres are clearly separated⁷ and their
clusters show only little overlap. These observations suggest that the VAD values our system generates
reflect the genre membership of a text. They may also indicate that our method is valid insofar
as it captures some relevant intrinsic characteristics of the processed documents. To further illustrate
the usefulness of our work for, e.g., literary studies or history of mind, we statistically tested whether
these classes differ relative to the three emotional dimensions (using non-parametric tests; the data are,
in general, not normally distributed) and give median (Md) values. In this setting, belles lettres display
significantly higher Valence (Md = 0.39) and lower Dominance (Md = -0.40) than academic texts (Md = 0.22
and 0.42, respectively; p < .05 using a Mann-Whitney U test). This may reflect the technical
nature of academic writing, e.g., explaining certain methodologies, and therefore expressing more control
(which is closely related to Dominance).⁸ Differences in Arousal were not significant.
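For readers unfamiliar with the Mann-Whitney U test, the following sketch shows how such a comparison of two genres could be computed. It is a simplified illustration: the p-value uses the normal approximation without tie or continuity correction, which is rough for small samples, and the Dominance values are invented. In practice, a statistics library (for instance `scipy.stats.mannwhitneyu`) would normally be preferred. The function names are hypothetical.

```python
import math

def average_ranks(values):
    """Return 1-based ranks; tied values receive the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1              # positions i..j correspond to ranks i+1..j+1
        for idx in order[i:j + 1]:
            ranks[idx] = avg
        i = j + 1
    return ranks

def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test using the normal approximation."""
    n1, n2 = len(x), len(y)
    ranks = average_ranks(list(x) + list(y))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    u = min(u1, n1 * n2 - u1)
    mu = n1 * n2 / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mu) / sigma
    p = math.erfc(abs(z) / math.sqrt(2))   # two-sided p-value under the normal approximation
    return u, p

# Invented standardized Dominance values for two genres
academic = [0.9, 1.1, 0.4, 0.7, 1.3, 0.6]
belles_lettres = [-0.2, 0.1, -0.5, 0.3, -0.1, 0.0]
u, p = mann_whitney_u(academic, belles_lettres)
print(u, p < 0.05)
```

Since the test only relies on ranks, it does not require the VAD values to be normally distributed, which is the reason for choosing a non-parametric test here.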
Concerning the subgenres of belles lettres, we compared the predefined classes lyric and drama, and
also narratives, a subclass we defined for this experiment subsuming different fine-grained distinctions
between German terms for novels, novellas and tales. Again, a visual examination (see the Dominance-Arousal
plot in Figure 3) reveals good separability. Dramas show lower Valence (Md = 0.00) than lyric and
narratives (Md = 0.43 and 0.45, respectively), whereas lyric stands out with high Arousal (Md = 0.43) in contrast
to narratives (Md = 0.22) and drama (Md = -0.13). Furthermore, narratives have a markedly higher
Dominance (Md = -0.19) in contrast to lyric (Md = -1.44) and drama (Md = -1.23). The differences
between the groups are significant relative to each dimension (p < .05; using the Kruskal-Wallis test,
since we compare more than two groups).
⁷ The notion of separability can be quantified as the performance of a classifier predicting the genre of a document given its
VAD values. We ran these experiments in a pilot study finding good separability (almost 90% accuracy in this case) but exclude
the details for brevity.
⁸ The interpretations we offer in this section are meant as an illustration of how our quantitative data could be utilized within
the (Digital) Humanities. We currently do not claim that these results can be taken for granted given our experimental data.
Another striking distribution is depicted in Figure 4 which displays the relative positioning of five
subclasses of academic texts, namely law, philosophy, mathematics, technology, and physics. Apparently,
we obtain a clear (almost linear) separation between philosophy and law, on the one hand, and
mathematics, physics and technology, on the other hand (thus empirically substantiating intuitions of
different academic cultures dividing the sciences from the humanities (Kagan, 2009) in emotional terms).
Also, the plot reveals more fine-grained features in line with common-sense intuitions about these study
fields, e.g., parts of the philosophical texts are indistinguishable from law texts, while others show
pronounced overlap with physics (possibly reflecting the impact of different subdisciplines, such as
philosophy of law and philosophy of science). Also, physics and technology are fairly well set apart from
each other, while mathematics seems to be equally similar to both. The qualitative fields display higher
Valence and Arousal (Md = 0.26 and 0.54, respectively) than the quantitative ones (Md = -0.70 and
-1.47). However, the sciences show higher Dominance than law and philosophy (Md = 1.12 as opposed
to 0.53; all differences significant: p < .05 using a Mann-Whitney U test). Extending our interpretation
concerning Figure 2, this may reflect the more technical nature of writings in the quantitative fields as
opposed to the language-centered disciplines.
4.3.2 Shifts in Emotion over Time
Class             Valence    Arousal      Dominance
Academic          -0.002*     0.000        0.003***
Belles lettres     0.001     -0.006***     0.001
All               -0.002*    -0.003***     0.004***
Table 2: β-coefficients of linear models predicting Valence, Arousal and Dominance (VAD),
respectively, given a year. Levels of significance: * p < .05; ** p < .01; *** p < .001.
We now turn to the question whether shifts in emotion can be traced in the texts of the DTA corpus over
time. Again, we considered, first, all texts of the corpus, second, texts of the major academic class and,
third, texts of the major class belles lettres. Due to data sparsity we did not take into account subclasses.
We found clear evidence for long-duration shifts in emotion values considering the different groups.
Performing linear regression, our data (quantified as the β-coefficient of linear regression models, i.e.,
the steepness of the regression line) suggest a particularly strong increase in Dominance concerning
academic texts (possibly reflecting the establishment of a more technical style in scientific writing) and
in the corpus as a whole, as well as a decrease of Arousal in belles lettres (possibly reflecting the shift
from highly emotional sentimentalism via romanticism to rather descriptive realism (Watanabe-O’Kelly,
1997)). We summarize our findings concerning long-duration shifts in Table 2. These figures might seem
rather small; recall, however, that the VAD values are normalized (given in SD) and that the documents
we consider span 160 years so that, e.g., Arousal in belles lettres decreased by almost one SD (0.96).
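The relation between a yearly β-coefficient and the total change over the study period can be made concrete with a small sketch. The data points below are synthetic and constructed to lie exactly on a line with slope -0.006 (the Arousal coefficient for belles lettres in Table 2); the function name `ols_slope` is hypothetical.

```python
def ols_slope(years, values):
    """Slope (β) of an ordinary least squares regression line of values on years."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    sxx = sum((x - mean_x) ** 2 for x in years)
    return sxy / sxx

# Synthetic standardized Arousal values lying on a line with slope -0.006 per year
years = [1741, 1781, 1821, 1861, 1900]
arousal = [0.48 - 0.006 * (year - 1741) for year in years]

beta = ols_slope(years, arousal)
total_change = beta * 160        # accumulated change over a span of 160 years
print(round(beta, 4), round(total_change, 2))
```

A slope of -0.006 SD per year therefore accumulates to 160 × (-0.006) = -0.96 SD over the whole period, which is the figure quoted above.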
5 Conclusion
In this paper, we introduced a novel methodology for measuring emotion in non-contemporary texts by
linking neural word embeddings derived from historical corpora, an adapted expansion algorithm for
affective lexicons, and a lexicon-based method for emotion analysis. To demonstrate the potential of
our approach for the Digital Humanities, we then conducted a study on emotional patterns within the
DTA, a high-quality collection of historical German texts, using the multidimensional VAD model of
emotion. This is the first application study of this kind, since prior studies on affect in historical texts
were conducted with lexicons that were both non-specific for historical texts and less informative in terms
of their affect representation scheme (Acerbi et al., 2013; Bentley et al., 2014).
We found evidence that different genres and subgenres of belles lettres and academic texts in the DTA
show contrasting patterns in their emotional characteristics. Moreover, we identified pronounced long-term
trends in textual emotions between 1741 and 1900. Both these observations can, though cautiously, be
linked to explanatory patterns as discussed in the humanities (thus granting face validity to our findings).
We are interested in transferring these results to other languages, as well as conducting more fine-grained
temporal modeling, i.e., using multiple, temporally more specific lexicons for tracking emotional
change. Future methodological work will focus on broadening the coverage of our gold standard as well as
on the quality of induction algorithms for affective lexicons of historical language. Our induced historical
word emotion lexicon, in a format compatible with JEMAS, and the gold standard are publicly available.⁹
⁹ https://github.com/JULIELab/HistEmo
Acknowledgements
This research was partly conducted within the Graduate School ‘The Romantic Model’ which is funded
by grant GRK 2041/1 from the Deutsche Forschungsgemeinschaft (DFG).
References
Alberto Acerbi, Vasileios Lampos, Philip Garnett, and R. Alexander Bentley. 2013. The expression of emotions
in 20th century books. PLoS ONE, 8(3):e59030.
Cecilia Ovesdotter Alm, Dan Roth, and Richard Sproat. 2005. Emotions from text: Machine learning for
text-based emotion prediction. In HLT-EMNLP 2005 — Proceedings of the Human Language Technology
Conference & 2005 Conference on Empirical Methods in Natural Language Processing. Vancouver, British Columbia,
Canada, 6-8 October 2005, pages 579–586.
R. Alexander Bentley, Alberto Acerbi, Paul Ormerod, and Vasileios Lampos. 2014. Books average previous
decade of economic misery. PLoS ONE, 9(1):e83147.
Yves Bestgen and Nadja Vincze. 2012. Checking and bootstrapping lexical norms by means of word similarity
indexes. Behavior Research Methods, 44(4):998–1006.
Yves Bestgen. 2008. Building affective lexicons from specific corpora for automatic sentiment analysis. In LREC
2008 — Proceedings of the 6th International Conference on Language Resources and Evaluation. Marrakech,
Morocco, 26 May - June 1, 2008, pages 496–500.
Margaret M. Bradley and Peter J. Lang. 1994. Measuring emotion: The self-assessment manikin and the semantic
differential. Journal of Behavior Therapy and Experimental Psychiatry, 25(1):49–59.
Margaret M. Bradley and Peter J. Lang. 1999. Affective norms for English words (ANEW): Stimuli, instruction
manual and affective ratings. Technical Report C-1, University of Florida, Gainesville, FL.
Sven Buechel and Udo Hahn. 2016. Emotion analysis as a regression problem: Dimensional models and their
implications on emotion representation and metrical evaluation. In ECAI 2016 — Proceedings of the 22nd
European Conference on Artificial Intelligence. Vol. 2: Long Papers. The Hague, The Netherlands, August 29 -
September 2, 2016, number 285 in Frontiers in Artificial Intelligence and Applications, pages 1114–1122.
Paul Cook and Suzanne Stevenson. 2010. Automatically identifying changes in the semantic orientation of words.
In LREC 2010 — Proceedings of the 7th International Conference on Language Resources and Evaluation. La
Valletta, Malta, May 17-23, 2010, pages 28–34.
Stanley Corngold. 1998. Complex Pleasure: Forms of Feeling in German Literature. Stanford University Press.
Scott C. Deerwester, Susan T. Dumais, George W. Furnas, Thomas K. Landauer, and Richard A. Harshman. 1990.
Indexing by latent semantic analysis. Journal of the American Society for Information Science, 41(6):391–407.
Paul Ekman. 1992. An argument for basic emotions. Cognition & Emotion, 6(3-4):169–200.
Alexander Geyken. 2013. Wege zu einem historischen Referenzkorpus des Deutschen: das Projekt Deutsches
Textarchiv. In Ingelore Hafemann, editor, Perspektiven einer corpusbasierten historischen Linguistik und
Philologie. Internationale Tagung des Akademienvorhabens “Altägyptisches Wörterbuch” an der
Berlin-Brandenburgischen Akademie der Wissenschaften. Berlin, Germany, December 12-13, 2011, pages 221–234.
William L. Hamilton, Jure Leskovec, and Daniel Jurafsky. 2016. Diachronic word embeddings reveal statistical
laws of semantic change. In ACL 2016 — Proceedings of the 54th Annual Meeting of the Association for
Computational Linguistics. Berlin, Germany, August 7-12, 2016, volume 1: Long Papers, pages 1489–1501.
Bryan Jurish. 2013. Canonicalizing the Deutsches Textarchiv. In Ingelore Hafemann, editor, Perspektiven
einer corpusbasierten historischen Linguistik und Philologie. Internationale Tagung des Akademienvorhabens
“Altägyptisches Wörterbuch” an der Berlin-Brandenburgischen Akademie der Wissenschaften. Berlin, Germany,
December 12-13, 2011, pages 235–244.
Jerome Kagan. 2009. The Three Cultures: Natural Sciences, Social Sciences, and the Humanities in the 21st
Century. Cambridge University Press.
Maximilian Köper and Sabine Schulte im Walde. 2016. Automatically generated affective norms of abstractness,
arousal, imageability and valence for 350,000 German lemmas. In LREC 2016 — Proceedings of the 10th
International Conference on Language Resources and Evaluation. Portorož, Slovenia, 23-28 May 2016, pages
2595–2598.
Alexander Koplenig. 2016. The impact of lacking metadata for the measurement of cultural and linguistic change
using the Google Ngram data sets: Reconstructing the composition of the German corpus in times of WWII.
Digital Scholarship in the Humanities, 32.
Omer Levy, Yoav Goldberg, and Ido Dagan. 2015. Improving distributional similarity with lessons learned from
word embeddings. Transactions of the Association for Computational Linguistics, 3:211–225.
Yuri Lin, Jean-Baptiste Michel, Erez Lieberman Aiden, Jon Orwant, William Brockman, and Slav Petrov. 2012.
Syntactic annotations for the Google Books Ngram corpus. In ACL 2012 — Proceedings of the 50th Annual
Meeting of the Association for Computational Linguistics. Jeju Island, Korea, July 10, 2012, volume System
Demonstrations, pages 169–174.
Bing Liu. 2015. Sentiment Analysis: Mining Opinions, Sentiments, and Emotions. Cambridge University Press.
Jean-Baptiste Michel, Yuan Kui Shen, Aviva Presser Aiden, Adrian Veres, Matthew K. Gray, The Google
Books Team, Joseph P. Pickett, Dale Hoiberg, Dan Clancy, Peter Norvig, Jon Orwant, Steven Pinker,
Martin A. Nowak, and Erez Lieberman Aiden. 2011. Quantitative analysis of culture using millions of digitized
books. Science, 331(6014):176–182.
Tomas Mikolov, Kai Chen, Gregory S. Corrado, and Jeffrey Dean. 2013. Efficient estimation of word
representations in vector space. In ICLR 2013 — Workshop Proceedings of the International Conference on Learning
Representations. Scottsdale, Arizona, USA, May 2-4, 2013.
Eitan Adam Pechenick, Christopher M. Danforth, and Peter Sheridan Dodds. 2015. Characterizing the
Google Books corpus: Strong limits to inferences of socio-cultural and linguistic evolution. PLoS ONE,
10(10):e0137041.
Sara Rosenthal, Preslav I. Nakov, Svetlana Kiritchenko, Saif M. Mohammad, Alan Ritter, and Veselin Stoyanov.
2015. SemEval-2015 Task 10: Sentiment analysis in Twitter. In SemEval-2015 — Proceedings of the 9th
Workshop on Semantic Evaluation @ NAACL-HLT 2015. Denver, Colorado, USA, June 4-5, 2015, pages 451–463.
James A. Russell and Albert Mehrabian. 1977. Evidence for a three-factor theory of emotions. Journal of
Research in Personality, 11(3):273–294.
David Sander and Klaus R. Scherer, editors. 2009. The Oxford Companion to Emotion and the Affective Sciences.
Oxford University Press.
David S. Schmidtke, Tobias Schröder, Arthur M. Jacobs, and Markus Conrad. 2014. ANGST: Affective norms
for German sentiment terms, derived from the affective norms for English words. Behavior Research Methods,
46(4):1108–1118.
Jacopo Staiano and Marco Guerini. 2014. DepecheMood: A lexicon for emotion analysis from crowd-annotated
news. In ACL 2014 — Proceedings of the 52nd Annual Meeting of the Association for Computational
Linguistics. Baltimore, Maryland, USA, June 22-27, 2014, volume 2: Short Papers, pages 427–433.
Carlo Strapparava. 2016. Emotions and NLP: Future directions. In WASSA 2016 — Proceedings of the 7th
Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis @ NAACL-HLT
2016. San Diego, California, USA, June 16, 2016, page 180.
Peter D. Turney and Michael L. Littman. 2003. Measuring praise and criticism: Inference of semantic orientation
from association. ACM Transactions on Information Systems, 21(4):315–346.
Jin Wang, Liang-Chih Yu, K. Robert Lai, and Xuejie Zhang. 2016. Dimensional sentiment analysis using a
regional CNN-LSTM model. In ACL 2016 — Proceedings of the 54th Annual Meeting of the Association for
Computational Linguistics. Berlin, Germany, August 7-12, 2016, volume 2: Short Papers, pages 225–230.
Amy Beth Warriner, Victor Kuperman, and Marc Brysbaert. 2013. Norms of valence, arousal, and dominance for
13,915 English lemmas. Behavior Research Methods, 45(4):1191–1207.
Helen Watanabe-O’Kelly, editor. 1997. The Cambridge History of German Literature. Cambridge Univ. Press.
Liang-Chih Yu, Lung-Hao Lee, Shuai Hao, Jin Wang, Yunchao He, Jun Hu, K. Robert Lai, and Xuejie Zhang. 2016.
Building Chinese affective resources in valence-arousal dimensions. In NAACL-HLT 2016 — Proceedings of
the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human
Language Technologies. San Diego, California, USA, June 12-17, 2016, pages 540–545.
... Frequency metrics that detect co-occurrences with sets of "good" or "bad" sentiment words have been widely used to infer the amelioration and pejoration of a word over time, especially for neologisms (Cook and Stevenson, 2010; Jatowt and Duh, 2014). Similarly, Buechel et al. (2016) use seed words in German and English to track the emotions captured in corpora. Additionally, the authors represent emotion in the Valence-Arousal-Dominance framework instead of a negative/positive connotation scheme. ...
... However, we could not observe any harmonization in how the error analysis for the feeling function is implemented. Some works adopt a binary orientation function (positive, negative), while others adopt multidimensional orientations (valence, arousal, dominance) (Buechel et al., 2016). Further study is still needed to determine which function (f) is more suitable for characterizing change. ...
Live languages continuously evolve to integrate the cultural change of human societies. This evolution manifests through neologisms (new words) or semantic changes of words (new meaning to existing words). Understanding the meaning of words is vital for interpreting texts coming from different cultures (regionalism or slang), domains (e.g., technical terms), or periods. In computer science, these words are relevant to computational linguistics algorithms such as translation, information retrieval, question answering, etc. Semantic changes can potentially impact the quality of the outcomes of these algorithms. Therefore, it is important to understand and characterize these changes formally. The study of this impact is a recent problem that has attracted the attention of the computational linguistics community. Several approaches propose methods to detect semantic changes with good precision, but more effort is needed to characterize how the meaning of words changes and to reason about how to reduce the impact of semantic change. This survey provides an understandable overview of existing approaches to the characterization of semantic changes and also formally defines three classes of characterizations: whether the meaning of a word becomes more general or narrow (change in dimension), whether the word is used in a more pejorative or positive/ameliorated sense (change in orientation), and whether there is a trend to use the word in, for instance, a metaphoric or metonymic context (change in relation). We summarize the main aspects of the selected publications in a table and discuss the needs and trends in the research activities on semantic change characterization.
... Both are capable of extracting features from multiple physiological signals, but only ECG feature extraction is considered here. AuBT is chosen as the first ECG feature extraction technique because it has been used widely in ERS [20], [23]-[27]. Meanwhile, TEAP is chosen for comparison because its developer claimed that the toolbox extracts fewer features, but with more meaningful content to be classified [25]. ...
... AuBT uses a GUI to simplify the process and to provide a faster way to analyse new data. The features that can be extracted from ECG signals are tabulated in Table 2 [20], [23]-[26]. ...
... For the sentiment analysis of Madame Bovary and La Regenta we used the semantic approach based on affective lexicons, also called sentiment dictionaries or simply lexicons. One of the main limitations of this work is undoubtedly the lack of lexicons designed specifically for languages other than English (in our case, Spanish and French) or that allow cross-linguistic comparisons (Balahur and Turchi, 2012; Schmidt and Burghardt, 2018), as well as of lexicons that take the diachronic dimension into account (Buechel et al., 2016). Given the need for further research on these aspects, the results should not be considered conclusive but merely illustrative of the application of a new model of study. ...
The alleged influence of Madame Bovary on La Regenta, surrounded from the outset by controversy and confrontation, has been the subject of numerous critical studies. The approach traditionally adopted has been qualitative and based on partial data that are not always objective. Moreover, the various hypotheses have at times rested on merely anecdotal impressions and, consequently, the results obtained have been inconsistent. The main objective of this work is to provide quantitative data that help answer this still-open question. To this end, we carry out a computational analysis of the stylistic patterns and the emotional dimension underlying both novels, using the R programming language. In addition to this primary objective, we also address, secondarily, the comparison of the original version of Madame Bovary with its Spanish translation, in order to test experimentally a new model for approaching translation equivalence. Although this approach, given its novelty, still has limitations, it may constitute a first step toward exploring new avenues of research into phenomena such as assimilation, imitation, intertextuality, or plagiarism in literary texts, as well as into equivalence in translation.
... Many efforts in NLP focus on (automatically) creating lexicons. Terms are assigned VAD scores based on their semantic similarity to other words, for which manual annotations are provided (Köper, Kim, and Klinger 2017; Buechel, Hellrich, and Hahn 2016). To date, lexicons are available for both English (Bradley and Lang 1999; Warriner, Kuperman, and Brysbaert 2013; Mohammad 2018) and other languages (e.g., Buechel, Rücker, and Hahn [2020] created lexicons for 91 languages, including Korean, Slovak, Icelandic, Hindi), and so are corpora annotated at the sentence or paragraph level with (at least a subset of) VAD information; among others are Preotiuc-Pietro et al. (2016), Buechel and Hahn (2017b), and Buechel and Hahn (2017a) for English, Yu et al. (2016) for Mandarin, and Mohammad et al. (2018) for Spanish and Arabic. ...
The most prominent tasks in emotion analysis are to assign emotions to texts and to understand how emotions manifest in language. An important observation for natural language processing is that emotions can be communicated implicitly by referring to events alone, appealing to an empathetic, intersubjective understanding of events, even without explicitly mentioning an emotion name. In psychology, the class of emotion theories known as appraisal theories aims at explaining the link between events and emotions. Appraisals can be formalized as variables that measure a cognitive evaluation by people living through an event that they consider relevant. They include the assessment if an event is novel, if the person considers themselves to be responsible, if it is in line with their own goals, and so forth. Such appraisals explain which emotions are developed based on an event, for example, that a novel situation can induce surprise or one with uncertain consequences could evoke fear. We analyze the suitability of appraisal theories for emotion analysis in text with the goal of understanding if appraisal concepts can reliably be reconstructed by annotators, if they can be predicted by text classifiers, and if appraisal concepts help to identify emotion categories. To achieve that, we compile a corpus by asking people to textually describe events that triggered particular emotions and to disclose their appraisals. Then, we ask readers to reconstruct emotions and appraisals from the text. This set-up allows us to measure if emotions and appraisals can be recovered purely from text and provides a human baseline to judge a model’s performance measures. Our comparison of text classification methods to human annotators shows that both can reliably detect emotions and appraisals with similar performance. Therefore, appraisals constitute an alternative computational emotion analysis paradigm and further improve the categorization of emotions in text with joint models.
... Buechel et al. [12] studied German historical texts from the period between 1741 and 1900 and found a decrease of arousal in belles lettres, which the authors attribute to a possible shift from highly emotional sentimentalism through romanticism to descriptive realism. The methodology employed in this study uses a three-dimensional VAD (valence-arousal-dominance) space of emotions, where words were assigned their VAD values based on the induction algorithm [13], and the emotion value of a document was then calculated as a weighted average of the emotion values of all of its words. ...
Written texts reflect the emotional state of the humans who create them. It is, however, not always obvious how to interpret the observed trends and patterns. Here we use statistical analysis to extract information about emotions, and focus on the valence-energy space to ask the questions: (1) Can we detect a temporal change in emotional characteristics of texts? (2) Are there measurable differences of these emotional characteristics among different groups of people? To determine trends in emotion through writing, a searchable online Corpus of Historical American English was used (400 million words from 1810–2000), as well as a collection of 180 contemporary posts grouped by gender, age, and occupation. Sentiment analysis tools were applied to measure levels of positivity/negativity and energy of writing. It was found that in written text, energetic words decreased in frequency and less energetic words increased, indicating a decrease of strong feelings and a rise of apathy in the texts over time. In present-day blog texts, three pairwise comparisons were performed: males vs females, older vs younger age groups, and individuals with a background in the arts vs those with a background in science. While no statistically significant differences in energy levels were detected, there was a clear separation in the valence of these groups, with females, younger people, and those with a background in the arts displaying the most negativity. Mathematical modeling was used to interpret the findings of the historical analysis. It was shown that a flattening of emotions, or a rise in apathy, may not necessarily be caused by a corresponding population trend, but could be a simple consequence of a bell-shaped curve in emotion distribution and imitation dynamics governing the production of written texts.
... Different efforts in NLP focus on (automatically) creating lexicons. Terms are assigned VAD scores based on their semantic similarity to other words, for which manual annotations are provided (Köper, Kim, and Klinger 2017; Buechel, Hellrich, and Hahn 2016). To date, lexicons are available for both English (Bradley and Lang 1999; Warriner, Kuperman, and Brysbaert 2013; Mohammad 2018b) and other languages (Köper and Im Walde 2016; Yu et al. 2016; Buechel, Rücker, and Hahn 2020), and so are corpora annotated at the paragraph level with VAD information (for English, see Preoţiuc-Pietro et al. (2016), Buechel and Hahn (2017b) and Buechel and Hahn (2017a), for Mandarin Yu et al. (2016)). ...
The most prominent tasks in emotion analysis are to assign emotions to texts and to understand how emotions manifest in language. An important observation for natural language processing is that emotions can be communicated implicitly by referring to events alone, appealing to an empathetic, intersubjective understanding of events, even without explicitly mentioning an emotion name. In psychology, the class of emotion theories known as appraisal theories aims at explaining the link between events and emotions. Appraisals can be formalized as variables that measure a cognitive evaluation by people living through an event that they consider relevant. They include the assessment if an event is novel, if the person considers themselves to be responsible, if it is in line with their own goals, and many others. Such appraisals explain which emotions are developed based on an event, e.g., that a novel situation can induce surprise or one with uncertain consequences could evoke fear. We analyze the suitability of appraisal theories for emotion analysis in text with the goal of understanding if appraisal concepts can reliably be reconstructed by annotators, if they can be predicted by text classifiers, and if appraisal concepts help to identify emotion categories. To achieve that, we compile a corpus by asking people to textually describe events that triggered particular emotions and to disclose their appraisals. Then, we ask readers to reconstruct emotions and appraisals from the text. This setup allows us to measure if emotions and appraisals can be recovered purely from text and provides a human baseline to judge a model's performance measures. Our comparison of text classification methods to human annotators shows that both can reliably detect emotions and appraisals with similar performance. We further show that appraisal concepts improve the categorization of emotions in text.
... (2018). Buechel et al. (2016) have developed a methodological framework to adapt existing affect lexicons to specific use cases. Besides dictionaries, emotion analysis relies on labeled corpora. ...
... Using contemporary norms to estimate the valence of words used decades ago is potentially problematic, as all words may have changed their meaning or sentiment over history. In practice, however, it has been shown that historical sentiment as inferred from averaging contemporary valence norms of semantic neighbors is similar to the sentiment judged by historical language experts (Buechel, Hellrich, & Hahn, 2016). ...
Despite increasing life expectancy and high levels of welfare, health care, and public safety in most post-industrial countries, the public discourse often revolves around perceived threats. Terrorism, global pandemics, and environmental catastrophes are just a few of the risks that dominate media coverage. Is this public discourse on risk disconnected from reality? To examine this issue, we analyzed the dynamics of the risk discourse in two natural language text corpora. Specifically, we tracked latent semantic patterns over a period of 150 years to address four questions: First, we examined how the frequency of the word risk has changed over historical time. Is the construct of risk playing an ever-increasing role in the public discourse, as the sociological notion of a ‘risk society’ suggests? Second, we investigated how the sentiments for the words co-occurring with risk have changed. Are the connotations of risk becoming increasingly ominous? Third, how has the meaning of risk changed relative to close associates such as danger and hazard? Is risk more subject to semantic change? Finally, we decompose the construct of risk into the specific topics with which it has been associated and track those topics over historical time. This brief history of the semantics of risk reveals new and surprising insights—a fourfold increase in frequency, increasingly negative sentiment, a semantic drift toward forecasting and prevention, and a shift away from war toward chronic disease—reflecting the conceptual evolution of risk in the archeological records of public discourse.
... Later, Strapparava and Valitutti (2004) made WordNet Affect available to target word classes and differences regarding their emotional connotation, Mohammad and Turney (2012) released the NRC dictionary with more than 14,000 words for a set of discrete emotion classes, and a valence-arousal-dominance dictionary was provided by . Buechel et al. (2016) and Buechel and Hahn (2017) (1985), Table 6, top part corresponds to those emotions we consider in our work here. ...
Automatic emotion categorization has been predominantly formulated as text classification in which textual units are assigned to an emotion from a predefined inventory, for instance following the fundamental emotion classes proposed by Paul Ekman (fear, joy, anger, disgust, sadness, surprise) or Robert Plutchik (adding trust, anticipation). This approach ignores existing psychological theories to some degree, which provide explanations regarding the perception of events (for instance, that somebody experiences fear when they discover a snake because of the appraisal as being an unpleasant and non-controllable situation), even without having access to explicit reports what an experiencer of an emotion is feeling (for instance expressing this with the words "I am afraid."). Automatic classification approaches therefore need to learn properties of events as latent variables (for instance that the uncertainty and effort associated with discovering the snake leads to fear). With this paper, we propose to make such interpretations of events explicit, following theories of cognitive appraisal of events and show their potential for emotion classification when being encoded in classification models. Our results show that high quality appraisal dimension assignments in event descriptions lead to an improvement in the classification of discrete emotion categories.
Abstract An important part of our everyday communication, besides the reporting and description of events and facts, is the expression of emotions, which is also a component of hate speech: for example, anger is expressed, which in turn can trigger fear, sadness, or perhaps surprise in those affected. In natural language processing, several concrete tasks that are part of emotion analysis in text have recently crystallized. These are, on the one hand, classification tasks (which emotion does a text express?) and, on the other hand, relational structure learning tasks (which words denote the person who feels an emotion, and which words point to the cause of the emotion?). In this chapter we give a brief overview of the field and then discuss in somewhat more detail how descriptions of emotions differ across domains and how event descriptions can be linked to emotions with the help of psychological theories. In particular, based on the emotion component process model, we analyze which components of emotions (subjective feeling, cognitive evaluation, bodily reaction, expression, motivation) authors draw on, and find that this distribution differs between social media and literature. In both domains, however, the cognitive component plays an important role in the interpretation of emotions. This shows that particular attention must be paid to event interpretation in order to uncover implicitly communicated emotions. This motivates us to analyze emotions with the help of appraisal theories, which explain the relationship between cognitive processes and emotions. For both concepts, the component model and the appraisal theories, we present text corpora and classification models.
This paper presents a collection of 350 000 German lemmatised words, rated on four psycholinguistic affective attributes. All ratings were obtained via a supervised learning algorithm that can automatically calculate a numerical rating of a word. We applied this algorithm to abstractness, arousal, imageability and valence. Comparison with human ratings reveals high correlation across all rating types.
Emotion analysis (EA) and sentiment analysis are closely related tasks differing in the psychological phenomenon they aim to catch. We address fine-grained models for EA which treat the computation of the emotional status of narrative documents as a regression rather than a classification problem, as performed by coarse-grained approaches. We introduce Ekman's Basic Emotions (BE) and Russell and Mehrabian's Valence-Arousal-Dominance (VAD) model—two major schemes of emotion representation following opposing lines of psychological research, i.e., categorical and dimensional models—and discuss problems when BEs are used in a regression approach. We present the first natural language system thoroughly evaluated for fine-grained emotion analysis using the VAD scheme. Although we only employ simple BOW features, we reach correlation values up until r = .65 with human annotations. Furthermore, we show that the prevailing evaluation methodology relying solely on Pearson's correlation coefficient r is deficient which leads us to the introduction of a complementary error-based metric. Due to the lack of comparable (VAD-based) systems, we, finally, introduce a novel method of mapping between VAD and BE emotion representations to create a reasonable basis for comparison. This enables us to evaluate VAD output against human BE judgments and, thus, allows for a more direct comparison with existing BE-based emotion analysis systems. Even with this, admittedly, error-prone transformation step our VAD-based system achieves state-of-the-art performance in three out of six emotion categories, out-performing all existing BE-based systems but one.
An increasing amount of research has recently focused on representing affective states as continuous numerical values on multiple dimensions, such as the valence-arousal (VA) space. Compared to the categorical approach that represents affective states as several classes (e.g., positive and negative), the dimensional approach can provide more fine-grained sentiment analysis. However, affective resources with valence-arousal ratings are still very rare, especially for the Chinese language. Therefore, this study builds 1) an affective lexicon called Chinese valence-arousal words (CVAW) containing 1,653 words, and 2) an affective corpus called Chinese valence-arousal text (CVAT) containing 2,009 sentences extracted from web texts. To improve the annotation quality, a corpus cleanup procedure is used to remove outlier ratings and improper texts. Experiments using CVAW words to predict the VA ratings of the CVAT corpus show results comparable to those obtained using English affective resources.
In 1959 C. P. Snow delivered his now-famous Rede Lecture, 'The Two Cultures,' a reflection on the academy based on the premise that intellectual life was divided into two cultures: the arts and humanities on one side and science on the other. Since then, a third culture, generally termed 'social science' and comprised of fields such as sociology, political science, economics, and psychology, has emerged. Jerome Kagan's book describes the assumptions, vocabulary, and contributions of each of these cultures and argues that the meanings of many of the concepts used by each culture are unique to it and do not apply to the others because the source of evidence for the term is special. The text summarizes the contributions of the social sciences and humanities to our understanding of human nature and questions the popular belief that biological processes are the main determinant of variation in human behavior.
We propose two novel model architectures for computing continuous vector representations of words from very large data sets. The quality of these representations is measured in a word similarity task, and the results are compared to the previously best performing techniques based on different types of neural networks. We observe large improvements in accuracy at much lower computational cost, i.e. it takes less than a day to learn high quality word vectors from a 1.6 billion words data set. Furthermore, we show that these vectors provide state-of-the-art performance on our test set for measuring syntactic and semantic word similarities.
Understanding how words change their meanings over time is key to models of language and cultural evolution, but historical data on meaning is scarce, making theories hard to develop and test. Word embeddings show promise as a diachronic tool, but have not been carefully evaluated. We develop a robust methodology for quantifying semantic change by evaluating word embeddings (PPMI, SVD, word2vec) against known historical changes. We then use this methodology to reveal statistical laws of semantic evolution. Using six historical corpora spanning four languages and two centuries, we propose two quantitative laws of semantic change: (i) the law of conformity---the rate of semantic change scales with an inverse power-law of word frequency; (ii) the law of innovation---independent of frequency, words that are more polysemous have higher rates of semantic change.
Recent trends suggest that neural-network-inspired word embedding models outperform traditional count-based distributional models on word similarity and analogy detection tasks. We reveal that much of the performance gains of word embeddings are due to certain system design choices and hyperparameter optimizations, rather than the embedding algorithms themselves. Furthermore, we show that these modifications can be transferred to traditional distributional models, yielding similar gains. In contrast to prior reports, we observe mostly local or insignificant performance differences between the methods, with no global advantage to any single approach over the others.