Neuro-symbolic Models for Sentiment Analysis
Jan Kocoń, Joanna Baran, Marcin Gruza, Arkadiusz Janz, Michał Kajstura,
Przemysław Kazienko, Wojciech Korczyński, Piotr Miłkowski, Maciej Piasecki,
and Joanna Szołomicka
Department of Artificial Intelligence
Wrocław University of Science and Technology, Wrocław, Poland
jan.kocon@pwr.edu.pl
Abstract. We propose and test multiple neuro-symbolic methods for sentiment analysis. They combine deep neural networks (transformers and recurrent neural networks) with external knowledge bases. We show that for simple models, adding information from knowledge bases significantly improves the quality of sentiment prediction in most cases. For medium-sized sets, we obtain significant improvements over state-of-the-art transformer-based models using our proposed methods: Tailored KEPLER and Token Extension. We show that the cases with the improvement belong to the hard-to-learn set.
Keywords: neuro-symbolic sentiment analysis · plWordNet · knowledge base · transformers · KEPLER · HerBERT · BiLSTM · PolEmo 2.0
1 Introduction
Sentiment analysis is an NLP task performed in industrial or marketing solutions. It aims to determine how customers (authors of textual opinions) react to given products or services. In the classical symbolic approach, a text is evaluated using external knowledge bases, e.g., sentiment dictionaries [3, 4]. Words from the text are linked to the positive, negative, or neutral polarity derived from such dictionaries. The final sentiment is an aggregation over all words. State-of-the-art sentiment analysis methods are mainly based on transformers. Such language models contain millions of parameters and therefore require large computational resources. Hence, simpler methods, e.g., BiLSTM [15, 19], are often used in practice. We refer to both of these approaches as our baselines.
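As a minimal sketch of the classical symbolic approach described above, the following Python function sums word polarities taken from a dictionary and maps the total to a label. The toy lexicon, the integer polarity values, and the sum-based aggregation are illustrative assumptions, not the resources or rules used in [3, 4].

```python
# Hypothetical toy lexicon; a real system would load a sentiment dictionary.
TOY_LEXICON = {"great": 1, "clean": 1, "rude": -1, "dirty": -1}


def lexicon_sentiment(tokens, lexicon=TOY_LEXICON):
    """Aggregate word polarities into a text-level sentiment label."""
    # Words missing from the lexicon are treated as neutral (polarity 0).
    score = sum(lexicon.get(token.lower(), 0) for token in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

Summation is only one possible aggregation; averaging or majority voting over polarised words would fit the same description.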
In this paper, we present and validate neuro-symbolic solutions to sentiment analysis that combine both approaches: deep neural networks and symbolic inference. These methods use vector representations of text from deep language models and external knowledge bases in the form of, e.g., lexicons (sentiment), knowledge graphs (WordNet), and lexico-syntactic patterns (sentiment modification rules). Our main contributions are: (1) the design or adaptation of multiple neuro-symbolic methods; (2) a comparison of our approaches against methods without knowledge bases; (3) evidence that for simpler models, the knowledge base significantly improves the prediction quality; (4) specific cases of medium-sized sets for which knowledge base information significantly improves the prediction quality for the current transformer-based SOTA models; (5) evidence that neuro-symbolic approaches improve reasoning mainly for hard-to-learn cases.
This work was funded by the National Science Centre, Poland, project no. 2021/41/B/ST6/04471 (PK) and 2019/33/B/HS2/02814 (MP); the Polish Ministry of Education and Science, CLARIN-PL; the European Regional Development Fund as a part of the 2014-2020 Smart Growth Operational Programme, CLARIN Common Language Resources and Technology Infrastructure, project number POIR.04.02.00-00C002/19; the statutory funds of the Department of Artificial Intelligence, Wrocław University of Science and Technology.
2 Related Work
Sentiment analysis (SA) is a standard classification task aiming to decide whether the presented text has a positive, negative or neutral polarity. Some works treat SA as a multi-class prediction problem when the data are based on a rating system (e.g. 5-star). In the past, standard machine learning methods such as decision trees, SVM, Naive Bayes or random forests were applied to SA. However, in recent years we have observed the growing popularity of deep-learning (DL) models, which have proved to be very successful.
Standard deep-learning approach. Different types of DL architectures have been exploited in sentiment classification, including CNN, LSTM, RNN, GRU, Bi-LSTM and their variations with an attention mechanism [11]. Most of them were trained in a supervised setting. However, despite the promising results achieved by these models, vulnerabilities have been observed, such as poor knowledge transfer in cross-domain sentiment analysis in online systems [2], mainly due to the lack of sufficient manually annotated datasets for all domains.
Neuro-symbolic approach. Many lexicon resources for various languages have been developed. Princeton WordNet (PWN) is a major one for English, but similar knowledge bases have been created for other languages too. Some contain emotive annotations for specific word meanings assigned by people (e.g. SentiWordNet). In addition, NLP tools have been created to analyse data in a manner similar to human understanding (POS: part-of-speech tagger, WSD: word sense disambiguation). Given the complexity of the SA task, which combines natural language processing, psychology, and cognitive science, using such external knowledge processed according to human logic could improve the results of standard DL methods. Moreover, it can lead to more explainable predictions. Some work has been done in this field. The authors of [8] incorporated the graph-based ontology ConceptNet into sentiment analysis, enriching the text semantics. Apart from a knowledge graph, [25] added a WSD process to the processing of social media posts. A context-aware sentiment attention mechanism acquiring the sentiment polarity of each word with its POS tag from SentiWordNet was studied in [13]. The pre-training process very rarely respects sentiment-related knowledge. If it does, the problem of proper representation of sentiment words and aspect-sentiment pairs needs to be solved. To address it, Sentiment Knowledge Enhanced Pre-training (SKEP) [24] has been proposed. It uses sentiment masking and constructs three sentiment knowledge prediction objectives to embed this information at the word and aspect level into a pre-trained representation.
3 Datasets
3.1 plWordNet & plWordNet Emo
plWordNet is a very large lexico-semantic network for Polish constructed on the basis of the corpus-based wordnet development method, according to which lexical units¹ (henceforth LUs) are the basic building blocks of the wordnet [7]. LUs of very similar meaning are grouped into synsets (sets of synonyms); each LU belongs to only one synset. The most recent version describes 295k LUs for 194k lemmas of four PoS (parts of speech) grouped into 228k synsets² [1].
Emotive annotation was performed at the level of LUs and LU use examples [27]. Context-independent emotive characterisation of an LU was obtained by comparing its authentic uses in text corpora. The main distinction is between neutrality and polarity of LUs. Polarised LUs are assigned the intensity of the sentiment polarisation, basic emotions and fundamental human values. The latter two help to determine the sentiment polarity and its intensity, expressed on a 5-grade scale: strong or weak, negative or positive, plus the ambiguous tag. Annotator decisions are supported by text examples that must be included in the annotations. For compatibility with other wordnet-based annotations, the eight basic emotions³ recognised by Plutchik [20] were used. One LU can be assigned more than one emotion and, as a result, complex emotions are represented by using the same eight-element set. The 12 fundamental human values⁴ postulated by Puzynina [21] link the emotive state of the speaker to the evaluative attitude. Each annotation was done by two annotators (a linguist and a psychologist) according to the 2+1 scheme.
3.2 PolEmo
The PolEmo 2.0 dataset [12, 15] is a benchmark dataset for sentiment analysis. It consists of more than 8,000 consumer reviews, containing more than 57,000 sentences. Texts come from four domains: hotels, medicine, products, and school. Each review was annotated with sentiment in a 2+1 scheme at the text level and the sentence level. In this work, only text-level examples were used. There are four sentiment classes: positive, negative, neutral, and ambivalent. The obtained Positive Specific Agreement (PSA) [9] was 90% at the text level and 87% at the sentence level. PolEmo 2.0⁵ is available under an MIT open license.
¹ Triples: lemma, Part of Speech (PoS) and sense identifier.
² http://plwordnet.pwr.edu.pl
³ Joy, fear, surprise, sadness, disgust, anger, trust and anticipation.
⁴ Utility, truth, knowledge, beauty, happiness, futility, harm, ignorance, error, ugliness
⁵ https://clarin-pl.eu/dspace/handle/11321/710
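For reference, positive specific agreement for one class and two annotators can be written as 2a / (2a + b + c), following [9], where a counts the items both annotators assigned to the class and b and c count the items only one of them did. The sketch below is illustrative only: the function name is hypothetical, and how the reported PSA was aggregated over the four classes and the 2+1 scheme is not specified in this section.

```python
def positive_specific_agreement(labels_a, labels_b, target):
    """PSA for one target class between two annotators: 2a / (2a + b + c)."""
    pairs = list(zip(labels_a, labels_b))
    both = sum(a == target and b == target for a, b in pairs)
    only_a = sum(a == target and b != target for a, b in pairs)
    only_b = sum(a != target and b == target for a, b in pairs)
    denominator = 2 * both + only_a + only_b
    # If neither annotator ever used the class, report 0.0 by convention.
    return 2 * both / denominator if denominator else 0.0
```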
3.3 Preprocessing
All texts from PolEmo were tokenized, lemmatized, and tagged using the CLARIN PoS tagger⁶. Word sense disambiguation [10] (WSD⁷) was performed to identify the appropriate LU for each token. Next, plWordNet Emo was used to annotate words with sentiment, basic emotions and fundamental human values (valuations). Additionally, we propagated sentiment and emotion annotations from the wordnet to words that originally did not have such annotations in plWordNet Emo. This required training a regressor based on the fastText model [6] using emotive dimensions from plWordNet Emo aggregated per lemma (emotions propagated). Data annotation statistics are presented in Tab. 1.
The example pipeline for combining text with a knowledge base is shown in Fig. 1. It tokenizes text and matches words with their correct meanings in Wordnet. Furthermore, information on sentiment and emotions from Wordnet annotations (WordnetEmo) is added to the text at the word sense level using the EMOCCL tool. The emotional Wordnet annotation is aggregated at the level of word lemmas and added to the text (lemma lexicon).
Fig. 1. Baseline approach (ML) vs. neuro-symbolic approach (neuro-symbolic ML). The blue colour on the diagram indicates the neuro-symbolic part of the method.
Table 1. Token annotation coverage in preprocessed PolEmo 2.0
Feature               Train    Dev      Test
Sentiment (all)       28.3%    28.7%    28.4%
Sentiment (pos/neg)    9.2%     9.4%     9.4%
Basic emotions         8.3%     8.5%     8.5%
Valuations             8.6%     8.8%     8.7%
Emotion propagated    99.9%    99.9%    99.9%
⁶ http://ws.clarin-pl.eu/tager
⁷ http://ws.clarin-pl.eu/wsd
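A coverage figure like those in Table 1 can be read as the share of tokens whose lemma has an entry in the lemma lexicon. The sketch below illustrates that reading; the function name and the dictionary-based lexicon are hypothetical, and the paper's exact counting procedure is not stated in this section.

```python
def annotation_coverage(token_lemmas, lemma_lexicon):
    """Fraction of tokens whose lemma has an entry in the lemma lexicon."""
    if not token_lemmas:
        return 0.0
    annotated = sum(1 for lemma in token_lemmas if lemma in lemma_lexicon)
    return annotated / len(token_lemmas)
```

For example, with the lemmas `["hotel", "być", "czysty", "i", "ładny"]` and a lexicon that contains only `czysty` and `ładny`, two of five tokens are covered.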
4 Neuro-symbolic Models
4.1 HB: HurtBERT Model
Fig. 2. HB: HurtBERT model.
HurtBERT [16] (Fig. 2) was proposed for the abusive language detection task. Apart from the standard transformer-based text representation, it incorporates knowledge from a lexicon [5]. Additional features are processed by a separate branch and then concatenated with the text representation before the classification layer. Lexical information can be utilized in two ways: (1) HB-enc: HB-encoding, using a simple frequency count for the lexicon categories; (2) HB-emb: HB-embedding, obtained with an LSTM network. The second method is more expressive, as it takes token order into account. As the number of categories in plWordNet differs from that used in the original paper, we modified the dimensionality of the sentiment embedding layer accordingly.
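The HB-enc variant can be sketched as a count vector over lexicon categories that is concatenated with the text representation. The category inventory and function names below are hypothetical, since the categories derived from plWordNet are not listed in the text, and the separate branch that processes the features before concatenation is omitted.

```python
# Hypothetical category inventory, for illustration only.
CATEGORIES = ["positive", "negative", "joy", "sadness"]


def hb_encoding(token_categories, categories=CATEGORIES):
    """Count how often each lexicon category occurs among a text's tokens."""
    index = {category: i for i, category in enumerate(categories)}
    counts = [0] * len(categories)
    for token_cats in token_categories:  # one list of categories per token
        for category in token_cats:
            if category in index:
                counts[index[category]] += 1
    return counts


def concat_features(text_vector, lexicon_vector):
    """Join the text representation and lexicon features before classification."""
    return list(text_vector) + list(lexicon_vector)
```

Because the counts ignore token order, this encoding cannot distinguish texts that use the same annotated words in a different sequence, which is the limitation the LSTM-based HB-emb variant addresses.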
4.2 TK: Tailored KEPLER Model
The Tailored KEPLER model (Fig. 3) is an adaptation of KEPLER [26] that incorporates information from a knowledge graph (KG) into a pretrained language model (PLM) like BERT during fine-tuning. This differs from the original KEPLER model, where the extra knowledge is used during the pretraining stage (unsupervised masked language modeling). Our KEPLER approach is tailored to a single task: it utilizes extra knowledge during fine-tuning. To harness knowledge from the KG, representations of its entities are obtained by encoding their text descriptions with the PLM. Thus, the PLM can be additionally trained with a Knowledge Embedding (KE) objective along with a task objective.
We used plWordNet as the KG, from which we extract relations between LUs and between synsets. The relation facts are described by a triplet (h, r, t), where h and t represent the head and the tail entities, and r is a relation type from the set R. After discarding some types of relations (e.g., hyperonymy is the inverse of hyponymy), 48 types of relations remained.
We get the embeddings for heads and tails by encoding the corresponding LU descriptions with the PLM. The relation types are encoded by a randomly initialized, learnable embedding table. As the KE loss, the loss from [22] is used. It adopts negative sampling [18] and tries to minimize the TransE distance for entities that are in the relation and to maximize it for negative sample triplets.
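A minimal sketch of such a KE objective is given below, using the negative-sampling form of the loss with uniformly weighted negatives and a TransE distance. The margin value `gamma`, the choice of the L2 norm, and the plain-list vector representation are assumptions made for illustration; the actual hyperparameters are not given in the text.

```python
import math


def transe_distance(h, r, t):
    """TransE distance ||h + r - t||_2 between the translated head and the tail."""
    return math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


def log_sigmoid(x):
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def ke_loss(positive, negatives, gamma=1.0):
    """Negative-sampling KE loss for one positive triplet of vectors (h, r, t).

    The positive triplet is pushed to a distance below the margin gamma,
    while each negative sample is pushed to a distance above it.
    """
    h, r, t = positive
    loss = -log_sigmoid(gamma - transe_distance(h, r, t))
    for h_neg, r_neg, t_neg in negatives:
        distance = transe_distance(h_neg, r_neg, t_neg)
        loss -= log_sigmoid(distance - gamma) / len(negatives)
    return loss
```

The loss is small when the positive triplet satisfies h + r ≈ t and the corrupted triplets do not, and grows when the roles are reversed.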
Fig. 3. Tailored KEPLER model. The same encoder is used to obtain embeddings for
KE loss and for the downstream task.
To fine-tune the pretrained model, we applied the multitask loss $L = L_{KE} + L_{NLP}$, where $L_{NLP}$ is the loss for a downstream NLP task. We used only those triplets whose LUs are present in the downstream task training set, and we clipped the number of steps in each epoch to the size of the downstream task training set.
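The triplet selection and clipping described above might look as follows. This is a sketch under two assumptions that the text leaves open: that a triplet is kept only when both its head and tail LUs occur in the training texts, and that clipping means using at most as many triplets per epoch as there are training examples. All function names are hypothetical.

```python
from itertools import islice


def select_triplets(triplets, train_lus):
    """Keep triplets whose head and tail LUs both occur in the training set."""
    return [(h, r, t) for h, r, t in triplets if h in train_lus and t in train_lus]


def clip_to_train_size(triplets, train_size):
    """Use at most train_size triplets in one epoch."""
    return list(islice(triplets, train_size))


def multitask_loss(ke_loss_value, nlp_loss_value):
    """Unweighted sum of the KE loss and the downstream task loss."""
    return ke_loss_value + nlp_loss_value
```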
4.3 TE: Token Extension Model
The benefits of additional knowledge bases are best seen in simple language models [17]. For this reason, we consider the fastText model for Polish [14] and a BiLSTM model [15] that operates on the per-token embeddings derived from it (Fig. 4). This approach allows the knowledge base to be used at the level of each token. We propose 3 variants: (1) baseline, which uses the token embedding only; (2) TE-original, where additional knowledge (as a vector) from the wordnet is concatenated to the token embedding; and (3) TE-propagated, which uses propagated data (Sec. 3.3) for all words in the text.
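The token-level concatenation used by the TE variants can be sketched as below. Filling unannotated tokens with a zero vector is an assumption made for illustration (the text does not say how missing annotations are represented in TE-original), and the function name is hypothetical.

```python
def extend_tokens(token_embeddings, knowledge_vectors, knowledge_dim):
    """Concatenate a knowledge vector to each token embedding.

    knowledge_vectors holds one vector per token, or None for tokens
    without an annotation; those receive a zero vector (an assumption).
    """
    zeros = [0.0] * knowledge_dim
    extended = []
    for embedding, knowledge in zip(token_embeddings, knowledge_vectors):
        extended.append(list(embedding) + list(knowledge if knowledge is not None else zeros))
    return extended
```

The extended vectors would then be fed to the BiLSTM in place of the plain token embeddings, so the input dimensionality grows by `knowledge_dim`.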
4.4 ST(P): Special Tokens (with Positioning) Model
In the transformer with Special Tokens (ST) model (Fig. 5), we added special BERT tokens corresponding to emotions and sentiments. They are placed after a word whose lemma is annotated with emotion or sentiment in plWordNet. It is a way to transfer emotive knowledge from plWordNet to the transformer. An example input is: She was still weeping [SAD], despite the happy [JOY] end of the movie. Since emotion tokens are marked as special tokens, they will not be broken down into word pieces by the tokenizer, and their embedding vectors will be initialized randomly. Since adding new tokens to the text breaks its sequentiality, we test an additional version of the model (STP: Special Tokens with Positioning) in which we adjust the emotion token position indexes so that they are equal to the position indexes of the lemma tokens they correspond to (e.g. Happy (idx=1) [JOY] (idx=1) and (idx=2) amazed (idx=3) [SURPRISED] (idx=3) girl (idx=4)). With this adjustment, the emotional tokens will have the same positional embeddings as their corresponding lemmas.
Fig. 4. TE: Token Extension model.
Fig. 5. ST: Special Tokens model.
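The difference between ST and STP position indexing can be illustrated at the word level. In this sketch, lowercasing stands in for lemmatization and positions start at 1 to match the example in the text; a real transformer input would also contain subword pieces and sentence-boundary tokens, which are omitted here. The function name is hypothetical.

```python
def add_special_tokens(words, lemma_emotions, share_positions=False):
    """Insert an emotion token after each annotated word.

    share_positions=False mimics ST (every token gets a fresh position);
    share_positions=True mimics STP (the emotion token reuses the position
    of the word it follows).
    """
    tokens, positions = [], []
    next_position = 1
    for word in words:
        tokens.append(word)
        positions.append(next_position)
        emotion = lemma_emotions.get(word.lower())  # stand-in for a lemma lookup
        if emotion is not None:
            tokens.append(f"[{emotion}]")
            if not share_positions:
                next_position += 1
            positions.append(next_position)
        next_position += 1
    return tokens, positions
```

With `share_positions=True`, the words "Happy and amazed girl" and the annotations JOY and SURPRISED reproduce the indexes 1, 1, 2, 3, 3, 4 from the example above.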
4.5 STN: Special Tokens from Numeric Data Model
The STN method is an extension of the ST method (same model as in Fig. 5) designed for cases when a lemma is annotated by many annotators. The lemma intensity of emotion $e$ can be expressed as the fraction $\alpha_e \in (0, 1)$ of annotations with emotion $e$. Since not all LUs are annotated, a regression model is used to propagate these values to other lemmas. A special token for emotion $e$ is put after a word if its $\alpha_e > T$. In another variant, we add all found special tokens (without replacement) at the end of the text. The threshold $T$ can be either the same for all emotions or an individual value $T_e$ assigned to each emotion $e$ as a quantile of all $\alpha_e$ values for lemmas in the train set. For the STN method, the special token embeddings for each emotion are initialized with the average of the embeddings of all subword tokens obtained after tokenization of the emotion name. The positional embedding adjustment proposed for the ST method is not applied for STN.
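The thresholding and initialization steps of STN can be sketched with three small helpers. The quantile uses linear interpolation between order statistics, which is one of several common conventions and an assumption here; all function names are hypothetical.

```python
def quantile_threshold(values, q):
    """Empirical q-quantile with linear interpolation between order statistics."""
    ordered = sorted(values)
    position = q * (len(ordered) - 1)
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (position - lower)


def emotions_above(alpha, thresholds):
    """Emotions whose intensity alpha_e exceeds the threshold T_e."""
    return [emotion for emotion, value in alpha.items() if value > thresholds[emotion]]


def mean_embedding(subword_vectors):
    """Average of subword embeddings, used to initialise a special token."""
    count = len(subword_vectors)
    return [sum(column) / count for column in zip(*subword_vectors)]
```

A single shared threshold T corresponds to passing the same value for every emotion in `thresholds`.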
4.6 SE: Sentiment Embeddings Model
Fig. 6. SE: Sentiment Embeddings Model.
Both HurtBERT-embedding and HurtBERT-encoding aggregate additional information at the text level, which can limit the interaction between the text and the features obtained from plWordNet. To incorporate token-level lexical annotations into a transformer, we add trainable sentiment embeddings to the hidden representations before the last transformer layer (Fig. 6). If a word consists of multiple BPE parts, we add the embedding to all of its subword tokens. The augmented representations are then passed to a classifier to compute the probability of each sentiment class. The classifier consists of a dense layer followed by a softmax activation function. During the pretraining phase of HerBERT, no additional lexical information is available, so adding the sentiment embeddings after the second-to-last layer of the transformer could corrupt the token representations. We consider two variants: (1) SE: the last transformer block’s weights are left unchanged, and (2) SE-reset: the last transformer block’s weights are randomly initialized. Random reinitialization of the last BERT layer is a common practice [28] and can make it easier for the model to utilize additional features.
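The subword handling can be illustrated with plain lists standing in for hidden states. In this sketch, `word_ids` maps each subword token to the index of the word it belongs to (None for tokens that belong to no word), and every subword of an annotated word receives the same label embedding. The names and the dictionary-based label lookup are hypothetical; a real implementation would use a trainable embedding layer inside the model.

```python
def add_sentiment_embeddings(hidden_states, word_ids, word_labels, label_embeddings):
    """Add a sentiment embedding to the hidden state of every annotated subword."""
    augmented = []
    for vector, word_id in zip(hidden_states, word_ids):
        label = word_labels.get(word_id) if word_id is not None else None
        if label is None:
            augmented.append(list(vector))  # unannotated token: left unchanged
        else:
            embedding = label_embeddings[label]
            augmented.append([v + e for v, e in zip(vector, embedding)])
    return augmented
```

The augmented states would then pass through the final transformer block, which is what lets the sentiment signal interact with the text representation at the token level.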
5 Experimental Setup and Results
For each experimental setup, we compare a baseline neural model with its neuro-symbolic extension. In each method (excluding TE), we used HerBERT as a SOTA baseline for sentiment analysis on the PolEmo 2.0 dataset. We test each method on selected undersampled training datasets of different sizes. Both baseline and neuro-symbolic models are trained using the same hyperparameters. Some of the methods are adapted from other papers, so the baselines are not identical across setups in terms of hyperparameters. For each configuration, the experiments are repeated 10 times.
5.1 TK: Tailored KEPLER Model
Fine-tuning is performed for 4 epochs with learning rate 5e-5, batch size 4 and weight decay 0.01. The maximum sequence length is 256 for PolEmo texts and 32 for the text representations of entities. Results are presented in Fig. 8. Statistically significant gains are obtained for the smaller training sets, which shows that the extra knowledge from the KG helps when the amount of data is limited.
For the case where the difference between the baseline and TK was significant, both models were compared using the cartography method [23]. It uses model confidence, variability, and correctness over epochs to find which texts are hard, easy or ambiguous to learn. The correctness specifies the fraction of epochs in which the true label is predicted. The confidence is the mean probability of the ground truth across epochs. The variability measures how indecisive the model is, i.e., how much that probability fluctuates across epochs. Fig. 7 shows the datamap for HerBERT. Colours of the points on the diagram indicate whether the instance is easier to learn for Tailored KEPLER than for the baseline (HerBERT). The diagram shows that adding extra knowledge improves correctness for far more cases than it worsens. Moreover, the affected examples belong only to the hard-to-learn and ambiguous classes.
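The three training-dynamics statistics can be computed per instance from the per-epoch outputs. The sketch below takes variability to be the population standard deviation of the gold-label probability across epochs, following the definition in [23]; the function name and input format are hypothetical.

```python
import statistics


def training_dynamics(gold_probs, predicted_labels, gold_label):
    """Return (confidence, variability, correctness) for one training instance.

    gold_probs: probability assigned to the gold label at each epoch.
    predicted_labels: label predicted at each epoch.
    """
    confidence = statistics.fmean(gold_probs)
    variability = statistics.pstdev(gold_probs)
    correct = sum(label == gold_label for label in predicted_labels)
    correctness = correct / len(predicted_labels)
    return confidence, variability, correctness
```

Low confidence with low variability marks an instance as hard-to-learn, high confidence with low variability as easy-to-learn, and high variability as ambiguous.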
Fig. 7. Datamap [23] for the baseline model in the Tailored KEPLER (TK) setup (x-axis: variability; y-axis: confidence; regions: ambiguous, easy-to-learn, hard-to-learn; relation: improve, no change, worsen). Green colour indicates cases for which the correctness $c_K$ of TK has a higher value than the correctness $c_H$ of the baseline. Gray examples have $|c_K - c_H|$ within 0.3 (small or no change). Red instances mean that $c_K < c_H$.
Fig. 8. Results for Tailored KEPLER model (legend: $L_{NLP}$; $L_{KE} + L_{NLP}$).
5.2 TE: Token Extension Model
The models were trained for 25 epochs. The model performing best on the validation set (maximum macro F1) was used for testing. The results of the experiments are presented in Fig. 9. The performance of models based on fastText embeddings increases with the size of the training set. On 5 of the 6 dataset sizes tested, the approach using additional data in the original (TE-orig) or propagated form (TE-prop) was better than the baseline. For training set sizes over 1,000, using propagated data proved to be the best of the tested approaches.
Fig. 9. Results for Token Extension model.
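The model-selection rule used in this subsection (keep the epoch with the highest validation macro F1) can be sketched as follows. Macro F1 is the unweighted mean of per-class F1 scores; the helper names are hypothetical, and assigning a score of 0 to a class with no true or predicted instances is a convention chosen for this sketch.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denominator = 2 * tp + fp + fn
        scores.append(2 * tp / denominator if denominator else 0.0)
    return sum(scores) / len(scores)


def best_epoch(validation_scores):
    """Index of the epoch with the highest validation score."""
    return max(range(len(validation_scores)), key=validation_scores.__getitem__)
```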
It is important to compare the running time of the TE model with that of
the example transformer-based (SE) model in Fig. 10. The performance of the
TE model (macro F1: 83%) is significantly worse by about 4 p.p. relative to
the SE model (macro F1: 87%). However, the inference time of the TE model
for the test set (3.6 s) is almost four times shorter than that of the SE model.
Fig. 10. Inference time of Sentiment Embeddings (transformer-based) and Token Extension (BiLSTM+fastText) neuro-symbolic models.
5.3 ST(P): Special Tokens (with Positioning) Model
The maximum tokenizer text length was set to 512, so that adding new emotional tokens does not require truncating the text. The batch size was set to 20. We used the Adam optimizer with the learning rate set to 2e-5 during training. The models were trained for 5 epochs, and the model with the smallest validation loss was checkpointed and tested. The results are presented in Fig. 11.
Fig. 11. Special Token Model (ST) and ST with positional embeddings (STP).
ST and STP models achieve worse results than the baseline for smaller training datasets (250 and 500 samples). For bigger training datasets, there are no significant differences between the models.
5.4 STN: Special Tokens from Numeric Data
We consider the following variants with in-text and at-end special tokens: (1) no propagated data, $T = 0.5$; (2) propagated data, $T = 0.6$; (3) propagated data, individual threshold $T_e$ equal to the 0.75 quantile. In each case, fine-tuning is performed for 4 epochs with learning rate 5e-5, batch size 16, weight decay 0.01, and maximum sequence length 512. Results are presented in Fig. 12.
Fig. 12. Results for Special Tokens from Numeric Data model.
The results do not show significant improvements for any STN variant. In the case of in-text special tokens, the results are usually worse. For at-end-of-text special tokens, performance is very similar to the baseline.
5.5 HB: HurtBERT Model & SE: Sentiment Embeddings Model
Fig. 13. Results for HurtBERT-encoding, HurtBERT-embeddings, Sentiment Embedding and Sentiment Embedding Reset models.
Models are fine-tuned for 30 epochs using the AdamW optimizer with learning rate 1e-5, a linear warmup schedule, batch size 32, and maximum sequence length 256; the best model is chosen according to the validation F-score. Results are presented in Fig. 13. In lower data regimes (250 and 500 samples), there may not be enough data to learn embeddings of sentiment features, hence the similar performance. For larger datasets, the additional information from a knowledge base is outweighed by the textual information. Our experiments do not show a significant improvement over the baseline, either for HurtBERT or for the proposed SE method. Texts in the PolEmo dataset are complex, and aggregating additional lexical features at the level of a whole text is not sufficient.
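For readers unfamiliar with the linear warmup schedule mentioned in the training setup, the sketch below shows one common form: the learning rate rises linearly from 0 to the base value and then decays linearly to 0. Only the base learning rate 1e-5 comes from the text; the number of warmup steps, the total number of steps, and the decay after warmup are assumptions made for illustration.

```python
def linear_warmup_lr(step, base_lr=1e-5, warmup_steps=100, total_steps=1000):
    """Learning rate at a given step under linear warmup and linear decay."""
    if step < warmup_steps:
        # Warmup phase: scale linearly from 0 up to base_lr.
        return base_lr * step / warmup_steps
    # Decay phase (assumed): scale linearly from base_lr down to 0.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / (total_steps - warmup_steps)
```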
6 Conclusions
We designed and adapted multiple neuro-symbolic methods. For most transformer-based neuro-symbolic models, the additional knowledge does not lead to improvement in most cases. For the smallest variants of datasets (training dataset: 250 texts), it can even make the training process more unstable and degrade the model quality (ST*, HB*, SE*). Adding special tokens inside the text is not beneficial for pretrained BERT models because it damages the natural structure of the text. This is not the case for tokens added at the end of the text, but still no performance gain is observed. This may be caused by the fact that the considered PolEmo dataset has high PSA, so the knowledge encompassed in the pretrained HerBERT model is sufficient to obtain very good results.
However, for small and medium-sized datasets, our Tailored KEPLER neuro-symbolic transformer-based model produced statistically significant gains. It also yielded better and more stable results. Analysis of these cases shows performance gains for examples belonging to the ambivalent sentiment class. We examined in which cases additional knowledge improved the quality of inference (Fig. 7). The vast majority of these cases were identified by the baseline model as hard-to-learn.
A key finding of the study is that the knowledge base significantly improves the quality of simple models such as Token Extension (Fig. 10). Compared to transformer-based models, we obtain an almost fourfold reduction in inference time, at the cost of a significant but relatively small decrease in quality (4 p.p.). For the TK model, the quality gain due to additional knowledge was significant for most cases. This shows that with very little computational cost, the inference quality can be significantly improved for such models.
References
1. plWordNet 4.5 (2021), http://hdl.handle.net/11321/834, CLARIN-PL
2. Al-Moslmi, T., Omar, N., Abdullah, S., Albared, M.: Approaches to cross-domain sentiment analysis: A systematic lit. review. IEEE access 5, 16173–16192 (2017)
3. Augustyniak, Ł., Kajdanowicz, T., Kazienko, P., Kulisiewicz, M., Tuligłowicz, W.: An approach to sentiment analysis of movie reviews: Lexicon based vs. classification. In: HAIS’14. pp. 168–178. Springer (2014)
4. Augustyniak, Ł., Kajdanowicz, T., Szymański, P., Tuligłowicz, W., Kazienko, P., Alhajj, R., Szymanski, B.: Simpler is better? lexicon-based ensemble sentiment classification beats supervised methods. In: ASONAM 2014. pp. 924–929 (2014)
5. Bassignana, E., Basile, V., Patti, V.: Hurtlex: A multilingual lexicon of words to hurt. In: CLiC-it 2018. vol. 2253, pp. 1–6. CEUR-WS (2018)
6. Bojanowski, P., Grave, E., Joulin, A., Mikolov, T.: Enriching word vectors with subword information (2017)
7. Dziob, A., Piasecki, M., Rudnicka, E.: plWordNet 4.1 - a linguistically motivated, corpus-based bilingual resource. In: The 10th Global Wordnet Conference. pp. 353–362. Global Wordnet Association (Jul 2019)
8. Ghosal, D., Hazarika, D., Roy, A., Majumder, N., Mihalcea, R., Poria, S.: Kingdom: Knowledge-guided domain adapt. for sentiment analysis. arXiv:2005.00791 (2020)
9. Hripcsak, G., Rothschild, A.: Agreement, the f-measure, and reliability in information retrieval. J. of the Amer.Med.Inform.Ass. (JAMIA) 12(3), 296–8 (2005)
10. Janz, A., Piasecki, M.: A weakly supervised word sense disambiguation for polish using rich lexical resources. Poznan Studies in Cont. Ling. 55(2), 339–365 (2019)
11. Joseph, J., Vineetha, S., Sobhana, N.: A survey on deep learning based sentiment analysis. Materials Today: Proceedings (2022)
12. Kanclerz, K., Miłkowski, P., Kocoń, J.: Cross-lingual deep neural transfer learning in sentiment analysis. Procedia Computer Science 176, 128–137 (2020)
13. Ke, P., Ji, H., Liu, S., Zhu, X., Huang, M.: Sentilare: Sentiment-aware language representation learning with linguistic knowledge. arXiv:1911.02493 (2020)
14. Kocoń, J., Gawor, M.: Evaluating KGR10 Polish word embeddings in the recognition of temporal expressions using BiLSTM-CRF. Schedae Informaticae 27 (2018)
15. Kocoń, J., Miłkowski, P., Zaśko-Zielińska, M.: Multi-level sentiment analysis of polemo 2.0: Extended corpus of multi-domain consumer reviews. In: CoNLL’19. pp. 980–991. ACL (Nov 2019)
16. Koufakou, A., Pamungkas, E.W., Basile, V., Patti, V.: HurtBERT: Incorporating lexical features with BERT for the detection of abusive language. In: The 4th Workshop on Online Abuse and Harms. pp. 34–43. ACL (Nov 2020)
17. Ma, Y., Peng, H., Cambria, E.: Targeted aspect-based sentiment analysis via embedding commonsense know. into an attentive lstm. In: AAAI’18. vol. 32 (2018)
18. Mikolov, T., Sutskever, I., Chen, K., Corrado, G., Dean, J.: Dist. representations of words and phrases and their compositionality. In: NIPS’13. pp. 3111–3119 (2013)
19. Miłkowski, P., Gruza, M., Kazienko, P., Szołomicka, J., Woźniak, S., Kocoń, J.: Multiemo: Language-agnostic sentiment analysis. In: ICCS’22. Springer (2022)
20. Plutchik, R.: EMOTION: A Psychoevolutionary Synthesis. Harper & Row (1980)
21. Puzynina, J.: Język wartości [The language of values]. Sci. Pub. PWN (1992)
22. Sun, Z., Deng, Z.H., Nie, J.Y., Tang, J.: Rotate: Knowledge graph embedding by relational rotation in complex space. In: Int.Conf. on Learning Repr. (ICLR) (2019)
23. Swayamdipta, S., Schwartz, R., Lourie, N., Wang, Y., Hajishirzi, H., Smith, N.A., Choi, Y.: Dataset cartography: Mapping and diagnosing datasets with training dynamics. In: EMNLP’20. pp. 9275–9293. ACL (Nov 2020)
24. Tian, H., Gao, C., Xiao, X., Liu, H., He, B., Wu, H., Wang, H., Wu, F.: Skep: Sentiment knowledge enhanced pre-training for sentiment analysis (2020)
25. Vizcarra, J., Kozaki, K., Torres Ruiz, M., Quintero, R.: Knowledge-based sentiment analysis and visualization on social networks. NGC 39(1), 199–229 (2021)
26. Wang, X., Gao, T., Zhu, Z., Liu, Z., Li, J.Z., Tang, J.: Kepler: A unified model for knowledge embedding and pre-trained language representation. Transactions of the Association for Computational Linguistics 9, 176–194 (2021)
27. Zaśko-Zielińska, M., Piasecki, M.: Towards emotive annotation in plWordNet 4.0. In: The 9th Global Wordnet Conf. pp. 153–162. Global WordNet Association (2018)
28. Zhang, T., Wu, F., Katiyar, A., Weinberger, K.Q., Artzi, Y.: Revisiting few-sample bert fine-tuning. arXiv:2006.05987 (2020)