‘We (don’t) know how you feel’ – a comparative study of automated vs. manual analysis of social media conversations

Abstract

The ever-growing volume of brand-related conversations on social media platforms has captivated the attention of academics and practitioners, as the analysis of those conversations promises to offer unparalleled insight into consumers’ emotions. This article takes a step back from the hype, and investigates the vulnerabilities related to the analysis of social media data concerning consumers’ sentiment. A review of the literature indicates that the form, focus, source and context of the communication may negatively impact on the analyst’s ability to identify sentiment polarity and emotional state. Likewise, the selection of analytical tool, the creation of codes, and the classification of the data, adversely affect the researcher’s ability to accurately assess the sentiment expressed in a social media conversation. Our study of Twitter conversations about coffee shows low levels of agreement between manual and automated analysis, which is of grave concern given the popularity of the latter in consumer research.
This article was downloaded by: [Ana Isabel Canhoto]
On: 19 June 2015, At: 01:44
Publisher: Routledge
Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered
office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK
Journal of Marketing Management
Publication details, including instructions for authors and
subscription information:
http://www.tandfonline.com/loi/rjmm20
Ana Isabel Canhoto (a) & Yuvraj Padmanabhan (b)
(a) Faculty of Business, Oxford Brookes University, UK
(b) Mindgraph, UK
Published online: 18 Jun 2015.
To cite this article: Ana Isabel Canhoto & Yuvraj Padmanabhan (2015) ‘We (don’t) know how you
feel’ – a comparative study of automated vs. manual analysis of social media conversations, Journal
of Marketing Management, 31:9-10, 1141-1157, DOI: 10.1080/0267257X.2015.1047466
To link to this article: http://dx.doi.org/10.1080/0267257X.2015.1047466
Keywords consumer behaviour; emotions; sentiment analysis; social media;
data analysis; CAQDAS
Introduction
Social media data have been heralded as revolutionary for the study of consumer
behaviour, by practitioners (e.g. Casteleyn, Mottart, & Rutten, 2009) and
social scientists (e.g. Baker, 2009) alike. Social media platforms allow for the
collection of data in real time and in a non-intrusive manner (Murthy, 2008),
and are more cost-effective than traditional approaches (Christiansen, 2011).
They are particularly promising in the study of feelings and emotions (Cooke
& Buckley, 2008). Accordingly, researchers have investigated various aspects of
social media data collection, such as the ethics of using such data (e.g. Nunan &
Domenico, 2013), or the impact of varying levels of use of social media
platforms by class, race and gender on research results and even the type of
work done (Murthy, 2008). However, there is very limited discussion in the
literature of the issues arising once the data have been collected. Such absence of
research represents a significant gap in the literature, given that how researchers
process and analyse online data is profoundly social with tremendous
sociological implications (Halford, Pope, & Weal, 2013, p. 180). That is, the
lack of published research specifically examining how social media data are
processed and analysed presents a significant gap in the understanding of the
value and limitations of using such data in consumer research and, specifically, in
the study of consumer sentiment.
The volume of data available and the pressure to process it quickly means that,
increasingly, data analysis is done without human intervention (Nunan & Domenico,
2013). Accordingly, there is now an abundance of commercial software tools that
mine textual data and produce reports of expressed opinions and sentiment (Sterne,
2010). These tools produce scores reflecting the emotions expressed in the segments
of text analysed (Cambria & Hussain, 2012), though it is difficult to assess how good
or limited the tools are, and what aspects of automated sentiment analysis are
particularly strong or weak, given that the companies behind such commercial
applications do not reveal their algorithms (Beer & Burrows, 2013). This limitation
violates one of the key principles of using software to analyse qualitative data, namely
that researchers need to verify the accuracy of those tools (Brown, Taylor, Baldy,
Edwards, & Oppenheimer, 1990).
The promotional literature of the providers of automated sentiment analysis tools
typically reports an accuracy rate between 60% and 85% (Carson, 2014). While these
percentages suggest a high level of accuracy (see Gwet, 2012), it is not clear how the
coefficients were calculated. Moreover, it is not possible to independently verify the
companies’ claims, as there is no information on the input for those calculations; for
instance, the sample used. This uncertainty is problematic for research, given that the
purpose for which a software product was developed may make it unsuitable for
application in another type of research (Rose, Spinks, & Canhoto, 2014). Given that
social media data are increasingly used as a source of insight into consumers’
emotions, it is imperative to investigate the issues emerging in the analysis of such
data, both concerning the type of research (i.e. emotions) and concerning the use of
software in the analysis of social media conversations. Accordingly, this study
investigates the following research question:
What are the vulnerabilities related to the analysis of social media data concerning
consumers’ sentiment towards a product?
The next section outlines the rationale, goals and processes of sentiment analysis,
before considering how the nature of social media conversations and the
characteristics of automated analysis tools may limit the researcher’s ability to
identify sentiment polarity and emotional state in social media content. The
subsequent section details the research design that saw 200 Twitter posts about
coffee being analysed manually and with various software products, to study
sentiment towards this drink and its consumption practices. The results revealed
low levels of inter-coder agreement, not just in terms of manual vs. automated
approaches, but also between the automated tools considered. These low levels of
agreement were particularly noticed for negative or neutral sentiments, for segments
of text with multiple foci, and where the expression of sentiment is made via
abbreviations and subtle elements, or results from the absence of the product in
question. The implications of these findings for research on consumer behaviour are
considered, and directions for further research presented.
Using social media data to study sentiment
The study of emotions is a topical subject within the marketing literature, given
their impact on consumer behaviour: for instance, being in a good mood makes
people more willing to take risks (Johnson & Tversky, 1983) and shortens the
decision-making process (Forgas, 1991). The role of emotions in consumer
behaviour (see Loewenstein & Lerner, 2003, for a detailed discussion) led to the
development of sentiment analysis, also known as opinion mining. Sentiment
analysis consists of a number of techniques and, increasingly, technical artefacts
to identify and analyse feelings. The goals of sentiment analysis are to identify
whether consumers are expressing emotions, as well as the nature of those emotions
and how strong those feelings are.
Sentiment data can be collected via experiments, as traditionally done in the field
of psychology, for instance. Experiments allow for the simultaneous assessment of
various dependent variables, but have several limitations in terms of mood induction
and manipulation, as well as the isolation of independent and dependent effects
(Cohen, Pham, & Andrade, 2008). An alternative approach is to conduct
interviews or surveys, which ask participants to reflect on previous emotionally
charged experiences, but this approach, too, has its drawbacks. One limitation is
that participants may be unwilling to invoke or revisit emotionally charged memories
(Cohen et al., 2008). The other restriction is that the quality of the insight depends
on the participants’ ability to verbalise their emotions (Cooke & Buckley, 2008).
Social media promise to overcome these limitations. As individuals became
‘lifecasters’ (Patterson, 2012) who share information about themselves, their
behaviours and relationships (Kietzmann, Hermkens, McCarthy, & Silvestre, 2011),
so social media platforms became popular vehicles to study consumers on a large
scale and in a natural setting (Kivran-Swaine, Brody, Diakopoulos, & Naaman,
2012). Moreover, as a significant share of social media conversations express
sentiment about products and brands (Jansen, Zhang, Sobel, & Chowdury, 2009),
these platforms have become very appealing as a source of information about
consumer sentiment for product and brand managers.
Once data, for instance online reviews, have been collected from the relevant
social media platforms, researchers can analyse those inputs looking for terms,
phrases or expressions that reflect sentiment. There are a number of specialist
software products available to mine documents using a range of keywords (Thet,
Na, & Khoo, 2010), such as ‘great’ for a positive emotion, or ‘revolting’ for a
negative one. The text segments collected are subsequently classified according to
their sentiment polarity, that is, whether the overall feeling expressed in the unit
of text selected is positive or negative (Thelwall, Buckley, & Paltoglou, 2011).
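The keyword-matching step described above can be illustrated with a minimal Python sketch. The two word lists, the tokenisation rule and the function name classify_polarity are assumptions made purely for illustration; they do not reproduce any of the tools cited in this article.

```python
import re

# Hypothetical mini-lexicons; real tools rely on much larger keyword lists.
POSITIVE = {"great", "good", "yummy", "love"}
NEGATIVE = {"revolting", "bad", "sucks", "worst"}

def classify_polarity(text: str) -> str:
    """Return 'positive', 'negative' or 'neutral' from simple keyword counts."""
    tokens = re.findall(r"[a-z']+", text.lower())
    pos = sum(token in POSITIVE for token in tokens)
    neg = sum(token in NEGATIVE for token in tokens)
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    return "neutral"  # no keywords found, or as many positive as negative

print(classify_polarity("This coffee is great"))
print(classify_polarity("The early shift sucks. Oh well at least my latte is yummy"))
```

A counting rule of this kind cannot tell which object each keyword refers to, so a segment containing one positive and one negative keyword is returned as neutral.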
In addition to studying sentiment polarity, researchers can analyse emotional
states. It is valuable to understand the specific type of emotion experienced by the
consumer because emotions are highly differentiated in their impact (Laros &
Steenkamp, 2005). For instance, unhappiness and anger are both negative
emotions. Yet, they have different consequences in terms of consumer behaviour, in
that the former lacks a focus, whereas the latter tends to be targeted and will lead to
context-specific responses (Bushman, Baumeister, & Phillips, 2001).
The study of sentiment polarity and emotional state may be increasingly popular,
but is not without its challenges. First, the expression of sentiment varies with
cultures and over time, both in terms of the language’s syntactic features and in terms
of style (Abbasi, Chen, & Salem, 2008). Moreover, a single segment of text may
express more than one sentiment, and refer to more than one object, creating
uncertainty regarding the prevalent sentiment. For instance, the author of a
product review may judge the product positively, but express dissatisfaction with
specific features (Liu, 2010). In this example, whether the review should be classified
as positive or negative depends on whether the focus of the analysis is the overall
impression or the specific features, respectively. Another challenging factor is that
sentiment may be expressed through subtle elements such as the use of exception or
conditional clauses (Kim & Hovy, 2006), or even the choice of words and their
placement (Davis & O’Flaherty, 2012). Lastly, sentiment about an object may not be
expressed directly but through comparisons, instead, which requires the analyst to
have domain knowledge to identify whether the comparative terms used reflect a
positive or a negative opinion about the product (Liu, 2010).
In addition to these inherent challenges of sentiment analysis, studying the
expression of emotions on social media may present its own additional difficulties,
as summarised in Table 1 and discussed next.
In terms of the syntactic and stylistic aspects of expressing sentiment, it should be
noted that social media users tend to apply certain colloquialisms and abbreviations
with multiple and ever-evolving meanings. For instance, LOL started by being an
acronym for ‘lots of love’, but now is also used as a replacement for ‘laughing out
loud’. In addition, users may also employ an increasing array of text symbols or
emoticons to support the communication of feelings and emotional statuses.
Another challenge is that social media messages such as status updates or
comments on a blog tend to be fairly short. This characteristic of social media
messages arises partly because of the features of particular platforms, such as the
limit of 140 characters for Twitter messages. However, this characteristic also
reflects the instant and informal nature of communication on social media.
Messages are frequently complemented, or even replaced, by non-textual
elements such as links to external sources of information, pictures, videos
and audio files (Kietzmann et al., 2011). The use of short messages and non-
textual elements may hinder the identification of multiple foci within one
segment of text.
It has also been noted that social media content that is critical of brands often
employs irony and sarcasm (e.g. Dahl, 2015). It is very difficult for analysts to detect
and classify sarcastic content in general, due to its nuances and multidimensional
nature (Vanden Bergh, Lee, Quilliam, & Hove, 2011), and the same applies for
sentiment analysis. The importance of contextual knowledge for the study of
online content has been emphasised by several authors, including Kozinets (2002)
who reflected on the complexity of studying meaning in social media conversations.
Yet, existing data analysis software has very limited ability to analyse data in context
(Cambria & Hussain, 2012).
Table 1 Challenges in sentiment analysis.
Related to... | General | Social media specific
Form | Syntax and style | Use of colloquialisms, abbreviations, symbols and emoticons
Focus | Multiple sentiments and objects | Short text segments and use of non-textual elements
Source | Subtlety | Use of irony and sarcasm
Context | Contextual knowledge | Complexity of social media
The large volume of social media data available, and the complexity of
monitoring conversations over multiple and very diverse platforms, have led
both managers (Davis & O’Flaherty, 2012) and researchers (Nunan &
Domenico, 2013) to turn to third-party providers of automated tracking and
analysis of social media data. This is not a problem in itself, given that specialist
software can help with the manipulation of big data sets and the generation of
codes in qualitative research (Lage & Godoy, 2008). Using software to analyse
textual data can also improve the credibility of a qualitative study, even if it does
not change the rigour of the analytical work done, or the outcome of the analysis
(Ryan, 2009). However, the characteristics of automated tools (for instance, the
algorithms used) reflect the purpose for which the software was developed and,
thus, analysts need to carefully assess the suitability of that software for their
projects (Rose et al., 2014). Researchers also need to be actively involved in the
creation of categories, and in deciding what data to retrieve and collate (Basit,
2003). Finally, they need to carefully verify the accuracy of the classification, as
content analysis software has limitations in terms of discerning nuances in
meaning, leading to the partial retrieval of information (Brown et al., 1990). In
the case of sentiment analysis software, it is very difficult for researchers to verify
any of these aspects (Beer & Burrows, 2013), putting researchers at risk of acting
on inaccurate data analysis outcomes.
In summary, the study of sentiment in social media conversations, while popular
and promising, may be negatively impacted by issues related to the communication of
sentiment, as well as issues related to the automated analysis of that sentiment. These
vulnerabilities may affect the researcher’s ability to identify and classify the text’s
sentiment polarity and emotional state, as depicted in Figure 1.
The following section describes how the issues depicted in Figure 1 were
investigated in an empirical setting.
Figure 1 Sources of vulnerability in the study of sentiment in social media
conversations.
Issues associated with:
Communication of sentiment: form, focus, source, context
Automated analysis of sentiment: selection of tool, creation of codes, classification of data
Negatively impact on accuracy of:
Sentiment polarity
Emotional state
The empirical study
To investigate the issues depicted in Figure 1, empirically, we set out to study the
expression of sentiment on social media following a qualitative content analysis
(QCA) approach. QCA is a systematic approach to the analysis of both verbal and
visual textual material in either paper or digital format, including online material
(Rose et al., 2014, p. 135). In QCA, the language or imagery used is the focus of the
research, rather than a resource, and so it is the words themselves, and how they are
used, that are analysed (Schreier, 2012).
As the topic of food and beverages is the one most widely discussed on social
media (Forsyth, 2011), this was chosen as the focus for data collection. The topic of
food is also extremely important in the social sciences literature, with the past
20 years having seen an explosion of work (p. 369), particularly within the sub-topics
of children, health and social aspects (Uprichard, 2013). Within the broad
topic of food, it was decided to focus on social media conversations about coffee as
this broadly popular beverage is charged with a wide range of cultural meanings
(Grinshpun, 2014). Moreover, coffee has been the subject of other netnographic
studies (e.g. Kozinets, 2002), which offered a starting point for the development of
coding schemes for the present study.
The specific online community selected for observation was Twitter. This is
because the Twitter platform is often used as a source of qualitative data by both
practitioners and academics (Williams, Terras, & Warwick, 2013). Tweets were
collected over a period of one month, using the search term ‘coffee’ and its
variants ‘latte’, ‘mocha’, ‘cappuccino’, ‘espresso’ and ‘Americano’, as well as the
terms ‘flavour’, ‘aroma’ and ‘caffeine’. Care was taken to include multiple users
and to exclude tweets by manufacturers and retailers. Of the corpus of remaining
tweets, 200 were selected randomly to test the framework depicted in Figure 1.
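The selection procedure can be sketched in Python as follows. This is an illustration only: the tweet records, the excluded account name and the fixed random seed are hypothetical, and the sketch makes no claim about how the search, the exclusion of manufacturers and retailers, or the random draw were actually carried out. It also does not attempt to enforce the inclusion of multiple users.

```python
import random

# Search terms as listed in the text, lower-cased for matching.
SEARCH_TERMS = {"coffee", "latte", "mocha", "cappuccino", "espresso",
                "americano", "flavour", "aroma", "caffeine"}

def matches_search(text: str) -> bool:
    """True if the tweet text contains any of the search terms."""
    lowered = text.lower()
    return any(term in lowered for term in SEARCH_TERMS)

def select_sample(tweets, excluded_accounts, size=200, seed=0):
    """Drop excluded accounts, keep matching tweets, then sample without replacement."""
    eligible = [t for t in tweets
                if t["user"] not in excluded_accounts and matches_search(t["text"])]
    rng = random.Random(seed)  # seeded only so that the sketch is repeatable
    return rng.sample(eligible, min(size, len(eligible)))

# Hypothetical records standing in for the collected corpus.
corpus = [
    {"user": "alice", "text": "Coffee and sunny skies"},
    {"user": "brand_account", "text": "Try our new latte"},
    {"user": "bob", "text": "Off to the gym"},
]
sample = select_sample(corpus, excluded_accounts={"brand_account"}, size=2)
print(sample)
```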
In the research methods literature, accuracy is assessed by the extent to which
different researchers agree on the classification of a particular data object, that is the
rate of inter-coder agreement (Gwet, 2012). In the case of an automated tool, this
will be the extent to which the classification produced by the software matches that
of human coders. Therefore, to investigate the accuracy of automated sentiment
analysis, we used more than one software product, and used the rate of inter-coder
agreement as a proxy for accuracy. Specifically, we checked the rate of agreement
between coders, between software products, and between coders and software
products.
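One simple way to compute a rate of agreement between two coders is the share of items that both assign to the same category. The Python sketch below assumes each coder's output is a list of labels in the same item order. The labels are made up for illustration and are not data from this study. Simple percent agreement of this kind does not correct for agreement expected by chance.

```python
from itertools import combinations

def percent_agreement(labels_a, labels_b):
    """Share of items (0 to 1) on which two coders assign the same label."""
    if len(labels_a) != len(labels_b):
        raise ValueError("Both coders must label the same items")
    if not labels_a:
        raise ValueError("No items to compare")
    matches = sum(a == b for a, b in zip(labels_a, labels_b))
    return matches / len(labels_a)

# Made-up labels for five items; not data from the study.
codings = {
    "manual": ["positive", "neutral", "negative", "positive", "positive"],
    "software_L": ["positive", "neutral", "positive", "neutral", "positive"],
    "software_T": ["negative", "neutral", "negative", "positive", "positive"],
}

# Agreement for every pair: coder vs. software, and software vs. software.
pairwise = {
    (a, b): percent_agreement(codings[a], codings[b])
    for a, b in combinations(codings, 2)
}
for pair, rate in pairwise.items():
    print(pair, f"{rate:.0%}")
```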
Data were analysed manually and with two popular automated
sentiment analysis tools. Software L was a commercial product offered by
the leading international provider of social media analytics, and uses natural
language processing and adaptive learning techniques.[1] Software T was a product
developed and commercialised by a leading international university and is based on
computational linguistics.[2] Coding was done with a scheme that reflected polarity of
emotion (positive vs. negative). In addition, as advised by Koppel and Schler (2006),
comments that related to the product (i.e. coffee and its synonyms, as previously
described) but that did not express an emotion, were not excluded from the sample;
instead, they were coded as neutral.
[1] Source: company website.
[2] Source: company website.
Subsequently, data were analysed by type of emotion, because different emotions
produce different behavioural consequences (Laros & Steenkamp, 2005). This was
done manually and with software L described above, as well as two academic
products using semantic analysis techniques (specifically, rule-based and the M-C-
based sentic computing; for more on sentic computing, see Cambria & Hussain,
2012). Coding was done using Plutchik’s (2001) wheel of emotions schema, which
identifies eight primary bipolar emotions.
Findings and discussion
This section presents the outcomes of the sentiment analysis conducted manually and
with the software products, before examining the reasons for the problems of
accuracy encountered in the exercise.
Agreement of classification of sentiment polarity and emotional state analysis
There was a high rate (89%) of inter-coder agreement between the two manual
coders. However, there were significant differences between the outcomes of the
manual vs. the automated approaches to sentiment analysis, as depicted in
Figure 2.
As Figure 2 shows, the three approaches (manual vs. software L vs. software T)
only delivered the same score in circa one-third of the cases (32%). In 7% of the cases
both software products produced the same outcome, but differed from the manual
analysis. In other words, the overall agreement rate between the two software
products was just under 40%. While it had been noted that automated tools are
blunt instruments to study sentiment (e.g. Cambria & Hussain, 2012), this was still a
surprising result, well below the rates typically reported in the commercial literature,
as discussed in Davis and O’Flaherty (2012), and even taking into account the
challenges presented by social media data, as per Figure 1.
Figure 2 Extent of inter-coder agreement in the analysis of sentiment polarity.
All approaches produce the same outcome: 32%
Manual analysis and Software T produce the same outcome: 28%
Manual analysis and Software L produce the same outcome: 24%
All approaches produce different outcomes: 11%
The software products produce the same outcome, which differs from manual analysis: 7%
In 11% of the cases, each approach delivered a different score. In around half the
cases the outcome of the manual analysis mirrored that of one of the software
products but not the other. Hence, it cannot be said that one of the software
products is clearly superior to the other in terms of accuracy (using inter-coder
agreement rates, as a proxy), even though the software had such different origins.
On the contrary, both products have similar rates of manual vs. automated coding
agreement.
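The pairwise agreement rates implied by Figure 2 can be recovered by adding the relevant shares, as in the short Python sketch below. The percentages are those reported in Figure 2, and the variable names are illustrative. The sketch assumes that the five categories in the figure are mutually exclusive. The reported shares add up to 102%, presumably because of rounding, so the derived rates are approximate.

```python
# Shares of tweets (in %) as reported in Figure 2.
all_agree = 32       # manual, software L and software T give the same outcome
manual_and_l = 24    # manual and software L agree, software T differs
manual_and_t = 28    # manual and software T agree, software L differs
software_only = 7    # software L and T agree, manual differs
all_differ = 11      # three different outcomes

# Pairwise agreement = cases where all agree + cases where only that pair agrees.
software_l_vs_t = all_agree + software_only
manual_vs_l = all_agree + manual_and_l
manual_vs_t = all_agree + manual_and_t

# Cases where manual analysis matches exactly one of the two tools.
manual_matches_one_tool = manual_and_l + manual_and_t

total = all_agree + manual_and_l + manual_and_t + software_only + all_differ

print(software_l_vs_t, manual_vs_l, manual_vs_t, manual_matches_one_tool, total)
```

On these figures, the two software products agree with each other in 39% of cases, while manual coding agrees with software L in 56% and with software T in 60% of cases.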
Overall, the manual analysis of tweets was most likely to result in a ‘positive’ score.
One of the software products was most likely to produce negative scores, while the
other had a higher proportion of neutral classifications.
The discrepancy between manual and automated analysis was also evident in the
analysis of emotional states, though this tended to vary for specific states, as
illustrated in Figures 3 and 4. Again, there wasn’t clear evidence of superiority of
one software over the others, contrary to what may be seen with other analytical
software as noted by Rose et al. (2014).
Agreement tended to occur around the positive emotion ‘joy’. In turn, differences
were particularly marked for the emotion ‘surprise’; this was evident both between
manual analysis vs. automated, and between the various automated software, as
illustrated by the quote provided in Table 2.
Investigation of the causes for disagreement in coding
Further analysis sought to probe the factors related to the communication of
sentiment on social media and the factors related to the automated analysis of
sentiment, and how they impact on the classification of tweets.
Figure 3 Extent of inter-coder agreement in the analysis of emotional state – manual
(pink, upper line) vs. software L (blue, lower line). (This figure is available in full
colour in the online version of the article.)
[Line chart. Vertical axis: Sum(Manual), Sum(Lymbix), scale 1 to 7. Horizontal axis: Row Number, 1 to 75.]
As exemplified by the entries in Table 3, there were instances of agreement between
manual coding and two software products for all types of messages: neutral, positive or
negative. However, focusing on those messages where all types of coders agreed, it is
interesting to see that they are most likely to reflect positive emotions, as exemplified by
these tweets: ‘Found a euro cent on my walk and have a great cup of coffee in hand.
Monday is already off to a good start’ (entry 18) and ‘Feeling much more alive this
morning now that I’ve had my coffee. Thank you #Nespresso’ (entry 28). Similarly,
emotions that were clearly positive, like ‘joy’, showed higher rates of agreement than
those that were neutral or negative (Table 2). By contrast, an example of a problematic
sentence is the entry ‘Think I need an IV of caffeine today. So tired, courtesy of my
beautiful angelic children. . .’ (entry 99). The expressions ‘courtesy’ and ‘angelic’ were
used sarcastically in this expression, and the software disagreed on how to code it (software
T deemed it negative, whereas software L deemed it positive). These observations extend
previous findings (e.g. Vanden Bergh et al., 2011) noting that it is difficult to detect and
accurately code irony and sarcasm. Specifically, these findings indicate that, as far as Twitter
is concerned, there are challenges associated with the expression of neutral and negative
sentiments in general, as discussed next, not just those of a sarcastic nature (though these,
too, presented challenges).
Figure 4 Extent of inter-coder agreement in the analysis of emotional state – manual
(solid line) vs. rule-based sentic (dashed line) vs. M-C-based sentic (dotted line).
[Line chart. Vertical axis: Sum(Manual), Sum(Sentic File Based), Sum(Sentic M/C based), scale 0 to 8. Horizontal axis: Row Number, 1 to 75.]
Table 2 Example of disparity in emotional state classification.
Entry | Manual | L | Rule | M-C
12. ‘This coffee shop needs to change there music up every once and a while. Or maybe I should go home’ | 4. Anger | 3. Surprise | 3. Surprise | 2. Joy
The very small number of tweets where the software products agreed with each other
but differed from the manual classification were text segments that were
very short, such as these examples: ‘In uni. I think without this cup of coffee I would hulk
out’ (entry 69) or ‘Cups and cups of coffee is what’s going to keep me up at work tonight’
(entry 36). Both of these extracts have fewer than 70 characters, and were deemed neutral by
the automated software but positive by the manual coders.
Table 3 Examples of messages where there was agreement between all coders.
Entry | Comment | Manual | Software L | Software T
Mommys making me a pot of coffee for the night ahead. #ohboy #thisisgonnaberough | ‘rough’ comment implies a negative sentiment, but it seems to apply to the night ahead, not the coffee. | 1. Neutral | 1. Neutral | 1. Neutral
I am doing several things over the weeks of Lent. Food, TV, Internet, Caffeine, Music, Sleep, Shopping for Non-Essentials | User is stating a fact and listing various items. No clear or implicit expression of emotion. | 1. Neutral | 1. Neutral | 1. Neutral
Coffee and sunny skies. . . Life is good. | The reference to sunny skies and the use of the term ‘good’ suggest a positive emotion. | 2. Positive | 2. Positive | 2. Positive
Doing some late night paper/laptop work . . . Hope to be done in the next few hours. lol . . . Yep, a hot cup of coffee sounds mighty good now!;o) | Use of expression ‘mighty good’ and smiley face suggest positive emotion. | 2. Positive | 2. Positive | 2. Positive
i think coffee make headache more worst -.- get a tea or juice. | Referring to negative side effect of coffee. Considers alternative products. | 3. Negative | 3. Negative | 3. Negative
Good morning! Hope everyone had a good night’s sleep! I will never drink anything with caffeine ever again! I was up sick half the night! | Negative side effect of drinking coffee. Expresses intention to avoid coffee in the future. | 3. Negative | 3. Negative | 3. Negative
1150 Journal of Marketing Management, Volume 31
In terms of the text segments that caused the software products to produce different
outcomes from each other, it was observed that problems tended to occur where
more than one sentiment was expressed, and more than one object was
mentioned, in a single segment of text, such as ‘Its so cold and guess whats on
my room. . . Air conditioning. Seriously well off to work I go. Might have to stop
and get some coffee’ (entry 51) or ‘The early shift sucks. Oh well at least my latte
is yummy:)’ (entry 19). In entry 19 we have two objects, namely the early shift
and the coffee drink; and two sentiments, a negative one towards the early shift
and a positive one towards the drink. It has been noted elsewhere (e.g. Liu, 2010)
that automated tools have difficulty in coding this type of message. This was
certainly the case with software T, which failed to adjust to the nuance of
multiple objects and emotions, and deemed this message to display a
negative sentiment.
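The difficulty with entry 19 can be illustrated with a minimal sketch. It does not reproduce software L or software T; the two-word lexicon, its weights, the list of target keywords and the clause-splitting rule are all hypothetical, chosen only to show how scoring a whole message can give a different polarity from scoring just the clause that mentions the product.

```python
import re

# Hypothetical toy lexicon and target list (assumptions for illustration only).
# The weights are arbitrary; they are set so that the stronger negative word
# dominates when the whole message is scored as a single unit.
LEXICON = {"sucks": -2, "yummy": 1}
TARGETS = {"coffee", "latte", "espresso"}


def tokens(text):
    """Lower-case the text and return its alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def score(text):
    """Sum the lexicon weights of every token in the text."""
    return sum(LEXICON.get(tok, 0) for tok in tokens(text))


def label(value):
    """Map a numeric score to a polarity label."""
    if value > 0:
        return "positive"
    if value < 0:
        return "negative"
    return "neutral"


def whole_message_label(text):
    """Score the message as one unit, ignoring which object is evaluated."""
    return label(score(text))


def target_label(text):
    """Score only the clauses that mention one of the target keywords."""
    clauses = re.split(r"[.!?;]+", text)
    relevant = [c for c in clauses if TARGETS & set(tokens(c))]
    return label(sum(score(c) for c in relevant))


tweet = "The early shift sucks. Oh well at least my latte is yummy:)"
print(whole_message_label(tweet))  # negative
print(target_label(tweet))         # positive
```

Even this target-aware variant depends on a hand-made keyword list and on punctuation marking clause boundaries, neither of which can be taken for granted in tweets.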
Differences also arose when the expression of sentiment was implied rather
than explicit; in other words, when the segment of text itself did not contain
any of the keywords associated with emotion, but was nonetheless rich in
meaning. For instance, this segment depicts coffee positively, as a reward, even
though that word (or a synonym) is never mentioned: ‘100 copies of [product
title] sold overnight means a definite Starbucks run this morning. Possibly coffee
out twice this week! Maybe even sushi!!’ (entry 46). In this specific example,
software T attributed a negative score to the segment, and software L a neutral
one. Correctly classifying this text segment requires an understanding of the
context of the conversation, namely that selling that particular number of copies
overnight should be considered a positive event. It also requires an
understanding of the cultural meanings attached to coffee. For instance, in the
United Kingdom, drinking coffee is traditionally seen as an energy booster;
however, having coffee out is deemed a treat (Forsyth, 2011). This particular
entry mentions having sushi and coffee out twice in a week, which the manual
analysts construed as being a very special treat. This level of contextual and
cultural understanding would be very difficult to programme into an algorithm.
Another factor that may cause problems in the classification of tweets is
subtlety, in particular where the negativity emerges from the absence of
coffee. In such cases, the polarity of the text segment is negative, but it actually
expresses a positive attitude towards coffee, as in this case: ‘how the heck am I
supposed to be able to sleep well without coffee in my system? fucking snow’
(entry 31).
There were also problems caused by syntax and style, namely around the use of
abbreviations and slang. One example is the tweet: ‘Having coffee with my grandma
before work right now. QT’ (entry 25). The abbreviation ‘QT’ is used here instead of
the phrase ‘Quality Time’ and expresses a positive emotion. Software T deemed it
positive, and software L neutral.
Finally, there were a small number of tweets that were picked up and classified by
the software because they contained the keyword ‘coffee’, but which were not
expressing any emotion towards the drink itself. For instance, the segment ‘This
coffee shop needs to change there music up every once and a while. Or maybe I
should go home’ (entry 12) expresses anger, which is a negative sentiment. However,
the consumer is referring to a place, not the coffee drink itself.
Canhoto and Padmanabhan Automated vs. manual analysis of social media conversations 1151
In summary, the software struggled to cope with some very short sentences,
which it tended to deem neutral, but not when these were clearly positive or
negative. The main issues arose where a negative sentiment was expressed but
this resulted from the absence of coffee and the entry was, thus, classified as
positive by the manual coders; where a negative or neutral sentiment was
expressed but this referred to a different object (e.g. shift work), not the
coffee, and there was a positive sentiment expressed towards the latter; and
where the positive sentiment was not explicitly expressed but rather implied
through cultural associations such as having coffee out, or through
abbreviations such as ‘QT’. These issues combined to make the overall
sentiment of the corpus of tweets more positive than initially considered.
Conversely, some sentences were changed into negative or neutral entries due
to the use of irony and sarcasm, or because they contained the word ‘coffee’ and
expressed a sentiment, but did not refer to the drink itself.
Conclusions
Emotions are key to both explaining and anticipating consumer behaviour, and sentiment
analysis offers marketers in academia and in industry a way of measuring and
summarising those emotions. Emotions displayed in social media conversations, in
particular, are very appealing for research, as these platforms offer many
opportunities to listen to the conversations in real time, with minimum disruption
for the individuals expressing those emotions and in a cost-effective way. Despite its
promise and popularity, the sentiment analysis of social media conversations is
neither a simple nor a straightforward process.
The research question in this study asked, ‘What are the vulnerabilities related
to the processing and analysis of social media data concerning consumers’
sentiment towards a product?’. The framework in Figure 1 pointed to two
categories of vulnerabilities, both of which were present in our analysis. Unlike
the collection and analysis of quantitative data (for instance, studying the
correlation between two variables), where there are well-established standards
for both processing and analysing data, the study of emotions is embedded in
nuances and subjectivity. There are many words that can be associated with any
given emotion and some words that can be associated with more than one emotion;
as our study showed, it is also possible to communicate emotion without
using emotionally charged words. These challenges are accentuated by the fact
that the segments of text available on social media are very short, rich in
abbreviations and slang, and often contain typos or grammatical errors.
In this study, not only were multiple types of software used, but also software
products from both a commercial and an academic origin were employed. There
were no marked differences in performance between the various products, indicating
that this is not a failure of one product or the other but, rather, a challenge presented
by the subject matter (emotions and sentiments) and by the channel, with its technical
limitations and very specific culture and netiquette.
The characteristics of the researcher and, in particular, his or her use
of social media platforms also influence data selection and analysis (Murthy, 2008).
For instance, in this case, the researchers used British English to formulate their
search queries (e.g. ‘flavour’), unconsciously leaving out the American spelling of the
word (namely, ‘flavor’). Likewise, due to their age, they may have failed to capture or
decode particular spellings or abbreviations and, indeed, sarcasm.
These vulnerabilities have a number of effects on the use of automated tools
to analyse sentiment in online conversations. The first effect was that the
problems with classification of tweets led to an inaccurate representation of
the overall sentiment towards coffee, both in terms of sentiment polarity and
in terms of emotional state. The second effect was that segments that should
have been excluded from the analysis because they did not relate to the topic
under analysis (coffee) were retained in the corpus of data, possibly
skewing the results. Given that so many commercial and, increasingly,
academic research projects rely on the automated analysis of sentiment data,
these findings raise concerns for the quality of those insights and subsequent
decisions.
These results are very concerning given the popularity of automated
sentiment analysis in consumer behaviour research. They are concerning for
academics, particularly novice users, who may be too reliant on these tools
to analyse large volumes of consumer data. One of the reasons why using
qualitative data analysis software may improve the credibility of a qualitative
study is that the software enables researchers to make visible their data coding
and data analysis processes (Rademaker, Grace, & Curda, 2012). This is not the
case with most automated sentiment analysis tools, given that the coding and
analysis process is performed by algorithms strongly guarded by the commercial
organisations that sell these applications (Beer & Burrows, 2013). It is even
more concerning for practitioners in search of speedy and inexpensive customer
insight and who are unlikely to assess the robustness of the automated tools, as
we did in this study.
The findings from this study have important implications for consumer behaviour
research in academia and in the industry. One likely impact of the low inter-coder
agreement rates observed in this study is that they may discourage researchers from
using sentiment analysis software in their work, which may negatively impact their
ability to use large data sets, or to see their work accepted and recognised.
Alternatively, it may discourage some researchers from using social media data
altogether in their research, which would be a great loss for the development of
the discipline of consumer behaviour.
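Inter-coder agreement of the kind referred to above can be quantified in several ways. The sketch below computes raw percentage agreement and Cohen's kappa, one common chance-corrected statistic, for two coders. It is offered as an illustration only: the two label lists are invented and are not the study's data, and kappa is assumed here rather than taken from the study's own method.

```python
from collections import Counter


def percent_agreement(coder_a, coder_b):
    """Share of items to which both coders assigned the same label."""
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return matches / len(coder_a)


def cohen_kappa(coder_a, coder_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(coder_a)
    observed = percent_agreement(coder_a, coder_b)
    counts_a, counts_b = Counter(coder_a), Counter(coder_b)
    categories = set(coder_a) | set(coder_b)
    # Chance agreement: probability that both coders pick the same category
    # if each labelled at random according to their own label frequencies.
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)
    if expected == 1:
        return 1.0  # both coders used a single identical category throughout
    return (observed - expected) / (1 - expected)


# Hypothetical labels for eight tweets (not taken from the study).
manual = ["pos", "pos", "neu", "neg", "pos", "neu", "neg", "pos"]
software = ["pos", "neu", "neu", "neg", "neg", "neu", "neu", "pos"]

print(percent_agreement(manual, software))  # 0.625
print(round(cohen_kappa(manual, software), 3))
```

Reporting a chance-corrected figure alongside raw agreement matters when one category, such as neutral, dominates the output of a tool, because raw agreement can then look higher than the coding quality warrants.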
To improve the classification of tweets, sentiment analysis needs to take into
consideration the social context within which the conversation takes place. For
instance, analysts need to look at tweets before or after the one being coded, or
consider wider patterns (e.g. more negative tweets on Mondays). Moreover,
analysts need to consider the cultural connotations of the object that they are
studying, including international variations: for instance, in Japan the
consumption of coffee is associated with the idea of foreignness (Grinshpun,
2014), whereas this is no longer the case in the United Kingdom (Forsyth,
2011). Additionally, it is important to keep developing dictionaries that reflect
the specific syntax and style used in social media conversations, or even software
solutions that, in the first stage of analysis, replace commonly used abbreviations
with their formal equivalent (for instance, replacing ‘BRB’ with ‘be right
back’). However, it must be recognised that, as language and communication
styles are constantly evolving, these dictionaries and tools will never
completely reflect the full variations and nuances in social media
communication. Moreover, they will struggle to capture sarcasm and highly
contextualised uses of language (for instance, teenagers using the term ‘sick’
to refer to a very good experience).
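The abbreviation-replacement step suggested above could look something like the following minimal sketch. The dictionary is a hypothetical two-entry example built from abbreviations mentioned in this article (‘BRB’ and ‘QT’); a usable tool would need a far larger, regularly updated list, and the function name is an assumption made for illustration.

```python
import re

# Hypothetical abbreviation dictionary (illustrative only).
ABBREVIATIONS = {
    "brb": "be right back",
    "qt": "quality time",
}

# Match any listed abbreviation as a whole word, regardless of letter case.
_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(key) for key in ABBREVIATIONS) + r")\b",
    re.IGNORECASE,
)


def expand_abbreviations(text):
    """Replace each known abbreviation with its formal equivalent."""
    return _PATTERN.sub(lambda m: ABBREVIATIONS[m.group(0).lower()], text)


tweet = "Having coffee with my grandma before work right now. QT"
print(expand_abbreviations(tweet))
# Having coffee with my grandma before work right now. quality time
```

A static lookup of this kind always applies the same expansion, so it cannot handle an abbreviation whose meaning shifts with context, which is precisely the limitation noted in the paragraph above.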
It needs to be emphasised that software is constantly being updated and
improved, and that some of the problems highlighted here might have been
addressed in versions of the software released after the time when this study was
conducted. For instance, there is more data from which the software can
learn, dictionaries can be improved and new techniques can be implemented.
This study does not aim to discourage researchers from using automated
sentiment analysis tools in general, or the ones that we mentioned here in
particular. Instead, our message is that researchers need to spend considerable
time familiarising themselves with the technical and pragmatic aspects of
communication in the social environment, and with the characteristics and
limitations of the software that they may use to analyse social media data.
Social media offers a window into consumers’ minds and holds much promise for
the development of consumer behaviour research. In particular, the analysis of social
media conversations offers many advantages over alternative methods to study
consumers’ emotions. However, as this research showed, the researcher’s ability to
accurately identify the sentiment expressed in a tweet or other similar short textual
data extract is limited by how emotions are verbalised, and by the contextual nature of
the communication of emotions. Moreover, while automated tools may be effective
at processing large volumes of data, the lack of sophistication and contextual
awareness of those tools, plus the biases introduced by the researchers themselves,
reduce such tools’ ability to accurately identify sentiment polarity or emotional state.
As with any other automated data analysis tool, researchers need to carefully assess
the suitability of sentiment analysis software for their projects, and understand its
limitations.
Disclosure statement
No potential conflict of interest was reported by the authors.
References
Abbasi, A., Chen, H., & Salem, A. (2008). Sentiment analysis in multiple languages: Feature
selection for opinion classification in web forums. ACM Transactions on Information
Systems, 26(3), 1–34. doi:10.1145/1361684.1361685
Baker, S. (2009, May 21). Learning, and profiting, from online friendships. Bloomberg
Businessweek Magazine.
Basit, T. N. (2003). Manual or electronic? The role of coding in qualitative data analysis.
Educational Research, 45(2), 143–154. doi:10.1080/0013188032000133548
Beer, D., & Burrows, R. (2013). Popular culture, digital archives and the new social life of data.
Theory, Culture & Society, 30(4), 47–71. doi:10.1177/0263276413476542
Brown, D., Taylor, C., Baldy, R., Edwards, G., & Oppenheimer, E. (1990). Computers and
QDA - can they help it? A report on a qualitative data analysis programme. The Sociological
Review, 38(1), 134–150. doi:10.1111/j.1467-954X.1990.tb00850.x
Bushman, B., Baumeister, R. F., & Phillips, C. M. (2001). Do people aggress to improve
their mood? Catharsis beliefs, affect regulation opportunity, and aggressive responding.
Journal of Personality and Social Psychology, 81(1), 17–32. doi:10.1037/0022-3514.81.1.17
Cambria, E., & Hussain, A. (2012). Sentic computing: Techniques, tools, and applications.
Dordrecht: Springer. ISBN 978-94-007-5070-8
Carson, E. (2014, June 18). Sentiment analysis: Understanding customers who don’t mean
what they say. TechRepublic.
Casteleyn, J., Mottart, A., & Rutten, K. (2009). How to use Facebook in your market research.
International Journal of Market Research, 51(4), 439–447. doi:10.2501/S1470785309200669
Christiansen, L. (2011). Personal privacy and internet marketing: An impossible conflict or a
marriage made in heaven? Business Horizons, 54(6), 509–514.
doi:10.1016/j.bushor.2011.06.002
Cohen, J. B., Pham, M. T., & Andrade, E. B. (2008). The nature and role of affect in consumer
behavior. In C. P. Haugtvedt, P. Herr, & F. Kardes (Eds.), Handbook of consumer
psychology (pp. 297–348). Mahwah, NJ: Lawrence Erlbaum.
Cooke, M., & Buckley, N. (2008). Web 2.0, Social networks and the future of market research.
International Journal of Market Research, 50(2), 267–292.
Dahl, S. (2015). Social media marketing - theories & applications. London: Sage.
Davis, J. J., & O’Flaherty, S. (2012). Assessing the accuracy of automated Twitter sentiment
coding. Academy of Marketing Studies Journal, 16(Suppl.), 35–50.
Forgas, J. P. (1991). Affective influences on partner choice: Role of mood in social decisions.
Journal of Personality and Social Psychology, 61(5), 708–720. doi:10.1037/0022-3514.61.5.708
Forsyth, J. (2011). Coffee - UK (pp. 1–69). London: Mintel.
Grinshpun, H. (2014). Deconstructing a global commodity: Coffee, culture, and
consumption in Japan. Journal of Consumer Culture, 14(3), 343–364.
doi:10.1177/1469540513488405
Gwet, K. L. (2012). Handbook of inter-rater reliability. Gaithersburg, MD: StatAxis Publishing
Company.
Halford, S., Pope, C., & Weal, M. (2013). Digital futures? Sociological challenges and
opportunities in the emergent semantic web. Sociology, 47(1), 173–189.
doi:10.1177/0038038512453798
Jansen, B. J., Zhang, M., Sobel, K., & Chowdury, A. (2009). Twitter power: Tweets as
electronic word of mouth. Journal of the American Society for Information Science and
Technology, 60(11), 2169–2188. doi:10.1002/asi.21149
Johnson, E. J., & Tversky, A. (1983). Affect, generalization, and the perception of risk. Journal
of Personality and Social Psychology, 45(1), 20–31. doi:10.1037/0022-3514.45.1.20
Kietzmann, J. H., Hermkens, K., McCarthy, I. P., & Silvestre, B. S. (2011). Social media? Get
serious! Understanding the functional building blocks of social media. Business Horizons,
54(3), 241–251. doi:10.1016/j.bushor.2011.01.005
Kim, S.-M., & Hovy, E. (2006, July). Automatic identification of pro and con reasons in online
reviews. Paper presented at the COLING/ACL, Sydney.
Kivran-Swaine, F., Brody, S., Diakopoulos, N., & Naaman, M. (2012, May). Of joy and
gender: Emotional expression in online social networks. In Companion proceedings of
ACM CSCW ’12 conference on Computer Supported Cooperative Work (pp. 139–142).
New York, NY: ACM.
Koppel, M., & Schler, J. (2006). The importance of neutral examples for learning sentiment.
Computational Intelligence, 22(2), 100–109. doi:10.1111/coin.2006.22.issue-2
Kozinets, R. V. (2002). The field behind the screen: Using netnography for marketing research
in online communities. Journal of Marketing Research, 39(1), 61–72.
doi:10.1509/jmkr.39.1.61.18935
Lage, M. C., & Godoy, A. S. (2008). Computer-aided qualitative data analysis: Emerging
questions. RAM. Revista De Administração Mackenzie, 9(4), 75–98.
doi:10.1590/S1678-69712008000400006
Laros, F. J. M., & Steenkamp, J.-B. E. M. (2005). Emotions in consumer behavior: A
hierarchical approach. Journal of Business Research, 58(10), 1437–1445.
doi:10.1016/j.jbusres.2003.09.013
Liu, B. (2010). Sentiment analysis and subjectivity. In N. Indurkhya & F. J. Damerau (Eds.),
Handbook of natural language processing (pp. 627–666). Boca Raton, FL: Taylor &
Francis.
Loewenstein, G., & Lerner, J. S. (2003). The role of affect in decision making. In R. J.
Davidson, K. R. Scherer, & H. H. Goldsmith (Eds.), Handbook of affective sciences (pp.
619–642). Oxford: Oxford University Press.
Murthy, D. (2008). Digital ethnography: An examination of the use of new technologies for
social research. Sociology, 42(5), 837–855. doi:10.1177/0038038508094565
Nunan, D., & Domenico, M. D. (2013). Market research and the ethics of big data.
International Journal of Market Research, 55(4), 505–520.
Patterson, A. (2012). Social-networkers of the world, unite and take over: A meta-introspective
perspective on the Facebook brand. Journal of Business Research, 65(4), 527–534.
doi:10.1016/j.jbusres.2011.02.032
Plutchik, R. (2001). The nature of emotions. American Scientist, 89(4), 344–350.
Rademaker, L., Grace, E., & Curda, S. (2012). Using computer-assisted qualitative data
analysis software (CAQDAS) to re-examine traditionally analyzed data: Expanding
our understanding of the data and of ourselves as scholars. Qualitative Report,
17(43), 1–11.
Rose, S., Spinks, N., & Canhoto, A. I. (2014). Management research - applying the principles.
London: Routledge.
Ryan, M. E. (2009). Making visible the coding process: Using qualitative data software in a
post-structural study. Issues in Educational Research, 19(2), 142–161.
Schreier, M. (2012). Qualitative content analysis in practice. London: Sage.
Sterne, J. (2010). Social media analytics: Effective tools for building, interpreting, and using
metrics. London: Wiley.
Thelwall, M., Buckley, K., & Paltoglou, G. (2011). Sentiment in Twitter events. Journal of the
American Society for Information Science and Technology, 62(2), 406–418.
doi:10.1002/asi.21462
Thet, T. T., Na, J.-C., & Khoo, C. S. G. (2010). Aspect-based sentiment analysis of movie
reviews on discussion boards. Journal of Information Science, 36(6), 823–848.
doi:10.1177/0165551510388123
Uprichard, E. (2013). Describing description (and keeping causality): The case of academic
articles on food and eating. Sociology, 47(2), 368–382.
doi:10.1177/0038038512441279
Vanden Bergh, B. G., Lee, M., Quilliam, E. T., & Hove, T. (2011). The multidimensional
nature and brand impact of user-generated ad parodies in social media. International
Journal of Advertising, 30(1), 103–131. doi:10.2501/IJA-30-1-103-131
Williams, S. A., Terras, M., & Warwick, C. (2013). What do people study when they study
Twitter? Classifying Twitter related academic papers. Journal of Documentation, 69(3),
384–410. doi:10.1108/JD-03-2012-0027
About the authors
Ana Isabel Canhoto is Senior Lecturer in Marketing at Oxford Brookes University and
Programme Lead of the MSc Marketing. She researches, writes and advises organisations on
how to identify and manage difficult customers, and terminate bad commercial relationships.
She is also interested in the use of social media to build customer profiles. Prior to joining
academia, she worked as a management consultant in the telecommunications industry and as a
portfolio manager at a leading media and entertainment company, among others.
Corresponding author: Ana Isabel Canhoto, Oxford Brookes University, Wheatley Campus,
Wheatley, Oxford OX33 1HX, England.
T: +44 (0)1865 485858
E: adomingos-canhoto@brookes.ac.uk
Yuvraj Padmanabhan is the managing director of Mindgraph, with a special expertise in social
media and sentiment analysis.
... Mechanized supposition investigation systems are regularly employed to arrange text-based records into predefined categories reflecting the extremity of conclusion referred to within the content. As of late, [14] have tried a relative study of computerized and manual social media discussions. Their findings demonstrate low understanding degrees in manual and mechanized investigations, which has "grave concern given the notoriety of the last in shopper explore" [14]. ...
... As of late, [14] have tried a relative study of computerized and manual social media discussions. Their findings demonstrate low understanding degrees in manual and mechanized investigations, which has "grave concern given the notoriety of the last in shopper explore" [14]. There are a number of reasons for the mechanized characterization of communicated slant within social media discussions. ...
... In addition, notion examination encounters problems in the utilization of natural language processing (NLP) on non-structured content, CGC, and an average of online life discussions. For instance, CGC content typically reflects the casual nature and moment of correspondence through web-based networking media [14]. The substance is normally a free-streaming content and easily goes in its word and sentence structure use [94],generally incorporates shortened forms, emoticons, emojis, incorrect spellings, and makes use of an SMS-like linguistic structure on a regular basis, which is not satisfactorily bolstered by current slant investigation strategies. ...
Preprint
Full-text available
The volume of discussions concerning brands within social media provides digital marketers with great opportunities for tracking and analyzing the feelings and views of consumers toward brands, products, influencers, services, and ad campaigns in CGC. The present study aims to assess and compare the performance of firms and celebrities (i.e., influencers that with the experience of being in an ad campaign of those companies) with the automated sentiment analysis that was employed for CGC at social media while exploring the feeling of the consumers toward them to observe which influencer (of two for each company) had a closer effect with the corresponding corporation on consumer minds. For this purpose, several consumer tweets from the pages of brands and influencers were utilized to make a comparison of machine learning and lexicon-based approaches to the sentiment analysis through the Naive algorithm (lexicon-based) and Naive Bayes algorithm (machine learning method) and obtain the desired results to assess the campaigns. The findings suggested that the approaches were dissimilar in terms of accuracy; the machine learning method yielded higher accuracy. Finally, the results showed which influencer was more appropriate according to their existence in previous campaigns and helped choose the right influencer in the future for our company and have a better, more appropriate, and more efficient ad campaign subsequently. It is required to conduct further studies on the accuracy improvement of the sentiment classification. This approach should be employed for other social media CGC types. The results revealed decision-making for which sentiment analysis methods are the best approaches for the analysis of social media. It was also found that companies should be aware of their consumers' sentiments and choose the right person every time they think of a campaign.
... Sentiment analysis research has become popular over the past two decades [40,61,62]; as more efficient sentiment classification models are devised [63] and studies have compared automated analysis of conversations on social media with manual approaches [64]. ...
... In other studies, Microsoft Azure has been found to yield better results when compared to other analyser tools such as Stanford NLP [64], IBM Watson Natural Language Understanding, OpinionFinder 2.0 and Sentistrength [70]. However, as Azure only identifies polarity, it is a less accurate method of measuring an individual's opinion towards a topic compared to other approaches such as VADER [71] and so part of this study compared the sentiment analysis approaches of Microsoft Azure and VADER. ...
Article
Full-text available
Vaccine hesitancy is an ongoing concern, presenting a major threat to global health. SARS-CoV-2 COVID-19 vaccinations are no exception as misinformation began to circulate on social media early in their development. Twitter’s Application Programming Interface (API) for Python was used to collect 137,781 tweets between 1 July 2021 and 21 July 2021 using 43 search terms relating to COVID-19 vaccines. Tweets were analysed for sentiment using Microsoft Azure (a machine learning approach) and the VADER sentiment analysis model (a lexicon-based approach), where the Natural Language Processing Toolkit (NLTK) assessed whether tweets represented positive, negative or neutral opinions. The majority of tweets were found to be negative in sentiment (53,899), followed by positive (53,071) and neutral (30,811). The negative tweets displayed a higher intensity of sentiment than positive tweets. A questionnaire was distributed and analysis found that individuals with full vaccination histories were less concerned about receiving and were more likely to accept the vaccine. Overall, we determined that this sentiment-based approach is useful to establish levels of vaccine hesitancy in the general public and, alongside the questionnaire, suggests strategies to combat specific concerns and misinformation.
... Nowadays, social media (SM) platforms have become popular among users as a source of valuable information regarding sales ( Lau et al., 2018 ;Jeong et al., 2019 ;Rathore and Ilavarasan, 2020 ), marketing ( Guerreiro and Moro, 2017 ;Moro et al., 2018 ;, hotel industry ( Calheiros et al., 2017 ;Geetha et al., 2017 ;Xu et al., 2017 ), hospitality ( Kim and Im, 2018 ), airline industry Tian et al., 2019 ;Punel and Ermagun, 2018 ;Martin-Domingo et al., 2019 ), entertainment , business ( Canhoto and Padmanabhan, 2015 ;Abrahams et al., 2015 ;Fan et al., 2013 ;Lee, 2018 ;He et al., 2019 ;Mingione et al., 2020 ;Greco and Polli, 2020 ;Liu, 2020 ), tourism Ainin et al., 2020 ), fraud detection , water business ( Pawsey et al., 2018 ) Before going to the actual purchase of products or services, customers or consumers utilize the information present on the social media platforms (e.g. Twitter and Facebook) to make purchase decisions ( Li et al., 2018 ). ...
... In stock performance, negative sentiments are more impactful as compared to pos-itive sentiments of customers or consumers ( Liu, 2020 ). Emotional text mining ( Canhoto and Padmanabhan, 2015 ) also applies to big data UGC to identify valuable customers who share similar brands on social media ( Mingione et al., 2020 ;Greco and Polli, 2020 ). In the business prospectus, customers' knowledge plays an essential role in the enhancement of the products and services of the organizations . ...
Article
Full-text available
The importance of text mining is increasing in services management as the access to big data is increasing across digital platforms enabling such services. This study adopts a systematic literature review on the application of text mining in services management. First, we analyzed the literature on which has used text mining methods like Sentiment Analysis, Topic Modeling, and Natural language Processing (NLP) in reputed business management journals. Further, we applied visualization tools for text mining and the topic association to understand the dominant themes and relationships. The analysis highlighted that social media analysis, market analysis, competitive intelligence are the most dominant themes while other themes like risk management and fake content detection are also explored. Further, based on the analysis, future research agenda in the field of text mining in services management has been indicated.
... There are often substantial gaps between automated and manual analyses (Canhoto and Padmanabhan 2015). Future research can improve the effectiveness of listening by pursuing three directions. ...
... In short, prior work on marketing has made moral social media storms a major advertising phenomenon, but the shades that occur in these moral events have not yet been studied in detail. Some research has shown that businesses are well-advised to seek more assertive responses, particularly where brand supporters can protect their brand [57]. ...
Article
Full-text available
In the past recent years, WhatsApp and WeChat have surprisingly fast growth. Facebook as well became the first social network to reach 1 billion active users every month. The presence of social media is an expectation for brands instead of an exception to the rule. Social events and shared information within your target market will help you understand developments in the industry. The opportunity to expose patterns in business in real-time is a potential business intelligence goldmine. The worldwide rate of social penetration reached 49% in 2020, with the highest penetration rates in East Asia and North America. Instagram enables users, through their standards of credibility, authenticity, and transparency, to develop themselves. Influencers from social media have a personal recognizable identity, also known as the "true brand" An influencer has tools and values that can motivate many other followers to increase their presence in the media. Even if these leads do not directly buy via social, awareness-raising can lead them to become full-time buyers. The overwhelming majority of users on Instagram are under the age of 30 according to recent Social Media demographics. Marketers face a dilemma: increased people want businesses to take a social stand, but 79% of CMOs fear that their capacity to attract consumers will be adversely affected. Businesses can mitigate negative emotions by providing positive information to popular social media users. Marketing managers will encourage consumers through tournaments and influencer programmers to engage in contact practices so customers can evangelize and encourage their loyalty to the organization through the creation and delivery of user-generated content.
... From a listening perspective, current work on identifying topics and trends in user-generated content is still crude. There are often substantial gaps between automated and manual analyses (Canhoto and Padmanabhan 2015). Future research can improve the effectiveness of listening by pursuing three directions. ...
Article
Over the past two decades, everyday users have become a prominent force in the advertising landscape. They actively participate in conversations with and about brands by creating, amplifying, and interacting with brand-related messages. These user activities generate large volumes of structured and unstructured data that advertisers can mine to understand consumer interests and preferences. In this article, we survey insights from the user-generated content literature through the computational advertising lens to offer a road map for future research. Specifically, we discuss three roles that users play—as creators, metavoicers, and propagators. For each role, we present key research areas that can benefit from a computational approach, identify the opportunities and challenges, and propose questions for future research. We also discuss the practical implications of applying computational methods to study users and user-generated content for advertisers.
Article
The authors develop a multimodal social listening analysis (MSLA) approach as a framework for managers to understand how meaning is constructed in social media posts using both text and other media. The research adds to AI and text analysis approaches by considering the whole meaning of a post rather than an analysis of subsets of information in text and other media. The use of MSLA is validated across the social media platforms of Facebook, Twitter and Instagram. The findings show that MSLA helps (i) reveal structures in what appear to be unstructured multimodal posts; (ii) identify all the sentiment items in a post; (iii) identify implicit meanings, such as irony, humour and sarcasm; and (iv) further identify emotions and judgements in multimodal communication. Importantly, this paper explains how decisions and opinions are made online and how marketing strategies can be tailored towards meanings derived from multimodal communication in social media.
Article
Background: The rapid adoption and sustained use of social media globally has provided researchers with access to unprecedented quantities of low-latency data at minimal costs. This may be of particular interest to nutrition research as food is frequently posted about and discussed on social media platforms. This scoping review investigates the ways in which social media is being used to understand population food consumption, attitudes, and behaviours. Methods: The peer-reviewed literature was searched from 2003 to 2021 using four electronic databases. Results: The review identified 71 eligible studies from 25 countries. Two thirds (n=47) were published within the last five years. The United States had the highest research output (31%, n=22) and Twitter was the most used platform (41%, n=29). A diverse range of dataset sizes were used, with some studies relying on manual techniques to collect and analyse data while others required the use of advanced software technology. Most studies were conducted by disciplines outside health with only two studies (3%) conducted by nutritionists. Conclusion: It appears the development of methodological and ethical frameworks as well as partnerships between experts in nutrition and information technology may be required to advance the field in nutrition research. Moving beyond traditional methods of dietary data collection may prove social media as a useful adjunct to inform recommended dietary practices and food policies.
Chapter
The relationship between young generations (Millennials and Gen Z), luxury, and food is a current and complex subject. Millennials and Gen Z are the first digital native generations to be very comfortable with technology devices and interested at an early stage in luxury food experiences. By exploring youth food culture and current luxury food experiences and practices, the authors identify three trends (digitalization, extended realities, and cause-related marketing) as key areas food brands and food actors (e.g., restaurants) should capitalize on to educate, facilitate, and promote the adoption of pleasurable, healthy, and sustainable food consumption. The authors provide an overview of these three new key trends, together with examples that attract Millennial and Gen Z consumers when they consider luxurious food consumption and experiences. This chapter contributes to the need to look at contexts of application (food) where sustainability and the digital transformation highlight the present and future for the promotion of luxury goods and experiences.
Chapter
Modern technology-rich environments provide a variety of tools with various types of capabilities that can support student success at the tertiary level. While university-supported learning platforms such as Moodle typically support this academic purpose, social networking sites such as Facebook can also be used within university studies to support student success.
Article
A growing body of consumer research studies emotions evoked by marketing stimuli, products and brands. Yet, there has been a wide divergence in the content and structure of emotions used in these studies. In this paper, we will show that the seemingly diverging research streams can be integrated in a hierarchical consumer emotions model. The superordinate level consists of the frequently encountered general dimensions positive and negative affect. The subordinate level consists of specific emotions, based on Richins' (Richins, Marsha L. Measuring Emotions in the Consumption Experience. J. Consum. Res. 24 (2) (1997) 127–146) Consumption Emotion Set (CES), and as an intermediate level, we propose four negative and four positive basic emotions. We successfully conducted a preliminary test of this second-order model, and compare the superordinate and basic level emotion means for different types of food. The results suggest that basic emotions provide more information about the feelings of the consumer over and above positive and negative affect. © 2004 Elsevier Inc. All rights reserved.
Book
Social media has quickly become part of the fabric of our daily lives, and as we have flocked to it, so have most companies and organisations from every sector and industry. It is now the place to attract and sustain our attention. But how is it a new marketing activity and how is it similar to previous practice and customer behaviour? Does it require new modes of thinking about human networks and communications or do the existing conceptual models still apply? This book offers a critical evaluation of the theoretical frameworks that can be used to explain and utilise social media, and applies them to fun real-life examples and case studies from a range of industries, companies and countries. These include Unilever, Snickers, American Express, Volkswagen and Amnesty International, and span campaigns run across different platforms in countries such as China, Canada, Sweden and Singapore. Readers are invited to think about the different types of social media users and explore topics such as brand loyalty, co-creation, marketing strategy, measurement, mobile platforms, privacy and ethics. As well as tracing the emergence and trends of Web 2.0 and what they mean for marketing, the author also considers the future for social media marketing. Discussion questions and further reading are provided throughout, and the book is accompanied by a companion website.
Article
As diverse members of a college of education evaluation committee one of our charges is to support faculty as we document and improve our teaching. Our committee asked faculty to respond to three qualitative questions, documenting ways in which interdepartmental and cross-department conversations are used to promote reflective thinking about our practice. Three of us investigated the use of CAQDAS to provide an additional level of analysis and how we learned more about ourselves as scholars through this collaboration. Our findings include recommendations regarding the use of CAQDAS to support collaborative efforts by diverse scholars. © 2012: Linnea L. Rademaker, Elizabeth J. Grace, Stephen K. Curda, and Nova Southeastern University.
Article
Market Research is often accused of failing to provide the insights sought by our clients, and in an increasingly complex society we are challenged to embrace a different model of thinking with different principles at its centre. We believe that a Web 2.0 research platform and a social network approach offers marketing research new tools to meet the challenges of the future. The paper identifies a number of trends that may well provide fertile ground for marketing researchers to develop new approaches. The open source movement will not only affect the way that we think but the very methodologies that we use. The emergence of Web 2.0 offers us an array of collaborative tools with which to develop new research approaches to explore the rapidly changing social and media environment. At the same time, the rapid growth of online social networks has fuelled the already rich research literature on the importance of studying humankind in 'tribes' or 'groups'. We argue that the combination of social computing tools and an understanding of social networks will allow us to build new types of research communities, in which respondents interact not only with the researchers but with the clients and most fertilely with each other. Moreover, as we examine these types of networks we will become increasingly better able to handle multiple sources of data, and be as comfortable with these new forms of user generated content as we are with the traditional data collection tools of the last fifty years. We believe that these social software tools and trends provide the blueprint for researchers to build new types of 'participatory panels' or 'research communities' and we describe our experiences in developing such a community.
Article
Since their entry to Japan in the latter half of the 19th century, coffee and coffee shops have been closely linked to the economic, political, and socio-cultural change undergone by the Japanese society. The cafés themselves have gone through numerous transformations in order to address the various social needs of their patrons. Today, coffee shops occupy a significant niche in the Japanese urban lifestyle. However, the cultural ‘baggage’ of coffee as a foreign commodity still plays a central role in generating its consumer appeal. Coffee is a global commodity whose value on the world market is surpassed only by oil. Moreover, due to its peculiar historical background, it became a beverage charged with a wide range of cultural meanings; tracing these meanings in different contexts can shed light on the way cultural commodities ‘behave’ in the globalized world. In order to examine the niche that coffee occupies in the Japanese consumption scene, I will analyze the manner in which representations of coffee are constructed and translated into a consumer experience. Through the case of coffee in Japan I will try to demonstrate the process of ‘movement of culture’, whereby the relevance of a foreign commodity in the local context is determined by the complex interplay between two culturally engineered binary entities of ‘global’ and ‘local’, ‘foreign’ and ‘native’.
Chapter
Textual information in the world can be broadly categorized into two main types: facts and opinions. Facts are objective expressions about entities, events, and their properties. Opinions are usually subjective expressions that describe people’s sentiments, appraisals, or feelings toward entities, events, and their properties. The concept of opinion is very broad. In this chapter, we only focus on opinion expressions that convey people’s positive or negative sentiments. Much of the existing research on textual information processing has been focused on the mining and retrieval of factual information, e.g., information retrieval (IR), Web search, text classification, text clustering, and many other text mining and natural language processing tasks. Little work had been done on the processing of opinions until only recently. Yet, opinions are so important that whenever we need to make a decision we want to hear others’ opinions. This is not only true for individuals but also true for organizations.
Article
Do people aggress to make themselves feel better? We adapted a procedure used by G. K. Manucia, D. J. Baumann, and R. B. Cialdini (1984), in which some participants are given a bogus mood-freezing pill that makes affect regulation efforts ineffective. In Study 1, people who had been induced to believe in the value of catharsis and venting anger responded more aggressively than did control participants to insulting criticism, but this aggression was eliminated by the mood-freezing pill. Study 2 showed similar results among people with high anger-out (i.e., expressing and venting anger) tendencies. Studies 3 and 4 provided questionnaire data consistent with these interpretations, and Study 5 replicated the findings of Studies 1 and 2 using measures more directly concerned with affect regulation. Taken together, these results suggest that many people may engage in aggression to regulate (improve) their own affective states.
Article
Social media have provided consumers with numerous outlets for disseminating their brand-related comments. Given the impact of these comments on brand image and brand success, marketers have begun to rely on third-party companies to automatically track, collect and analyze these comments with regard to their content and sentiment. There is, however, an absence of controlled research which objectively and systematically assesses the accuracy of these companies' automated sentiment coding. This research evaluated the automated sentiment coding accuracy and misclassification errors of six leading third-party companies for a broad range of comment types and forms. Overall, automated sentiment coding appears to have limited reliability and appears to be accurately accomplished only for very simple statements in which a keyword is used to convey its typical meaning. Statements without keywords or statements in which keyword meaning is reversed through negation or context are accurately coded at very low levels. Neutral statements appear to be problematic for some, but not all, companies. Implications of the research for the use of automated sentiment analysis for brand decision-making are presented.
Book
The title Cognitive Agent-based Computing reflects a unified framework combining two key modeling paradigms for developing cognition/understanding of a special type of systems namely the Complex Adaptive Systems (CAS).