
Twitter users change word usage according to conversation-partner social identity



Nadine Tamburrini (a), Marco Cinnirella (b), Vincent A. A. Jansen (a), John Bryden (a)
(a) School of Biological Sciences, Royal Holloway University of London, Egham TW20 0EX, UK
(b) Department of Psychology, Royal Holloway University of London, Egham TW20 0EX, UK
This paper investigates how people express social identity at a large scale on a social network. We looked at
communities of users on the Twitter web site, and tested two established social-psychology theories that are usually
tested at a local scale. We found evidence for Communication Accommodation Theory, where community members
vary their language characteristics depending on which community they are communicating with. We also found the
level of linguistic variation correlated with how isolated a community was: evidence that there is Convergence between
linked members. This demonstrates the power of methods which analyse subtle human behaviour on social networks.
Keywords: Twitter; Community structure; Social identity; Language accommodation; Linguistic convergence; Social
network analysis
Highlights
- We study social identity in communities found on the Twitter web site
- We find users adjust word usage according to the community of the interlocutor
- More isolated communities change word usage to a greater extent
- Large scale studies of Twitter can test hypotheses on social behaviour
1. Introduction
Social identity is that proportion of an individual’s self-
concept that derives from membership of a social group
(Tajfel and Turner, 1979). Group affiliation has functions
of enhancing cooperation (Boyd and Richerson, 2009) and
allowing individuals to define others through the group
they belong to, in the same way that the individual de-
fines him or herself through the identity of their own group
(Ashforth and Mael, 1989). Group members share be-
haviour and social norms. This shared behaviour in social
groups is thought to be generated through processes on
social networks such as convergence of behaviour due to
social relationships (Hormuth, 1990; Ethier and Deaux, 1994).
The way we use language is strongly associated with
our social identity (Scott, 2007). The convergence of be-
haviour, proposed by social identity theory, is often stud-
ied through the language used within social groups. This
demonstrates how language is more than just a means of
communication and sociolinguistic studies have shown that
varieties of a language can be strongly associated with so-
cial or cultural groups (Gumperz, 1958; Labov, 1966; Car-
roll, 2008; Bryden et al., 2013).
By using language as a proxy for social behaviour, stud-
ies have been able to understand how expression of social
identity is often strongly context dependent: people will
behave differently depending on which social identity has
the strongest salience in the current situation (Hogg and
Reid, 2006). Studies show how this often manifests in the
accommodation of language according to the social iden-
tity of the interlocutor (Giles, 1973; Gallois et al., 2005).
Individuals negotiate the social distance between them-
selves and the person with whom they are conversing, and
are therefore in control of its creation and maintenance
(Shepard et al., 2001). For example, Iwasaki and Horie
(2000) reported how Thai speakers would adjust their lin-
guistic registers when interacting with strangers. These
studies look at specific groups or social situations, but we
do not know whether this behaviour can be found at a large
scale across many groups where these groups are allowed
to freely interact with one another.
Online social networking platforms are providing us
with a large scale platform to study human behaviour.
With over 200 million monthly active users (Costolo, 2013),
the Twitter social network is particularly useful due to its
publicly accessible nature (Virk, 2011) and network size.
The analysis of large networks brings with it considerable
statistical power that allows for the detection of patterns
that in traditional, smaller scale network studies would be
undetectable. Twitter functions as a micro-blogging website,
working on the premise of users sharing their opinions
and thoughts in brief messages (maximum 140 characters),
which are referred to as “tweets”. An investigation into the
reasons why people post on the Twitter web site by Java
et al. (2007) found that about one eighth of posts were
conversational messages. This makes Twitter a prime
resource for public access to naturally occurring
communication (Danescu-Niculescu-Mizil et al., 2011) and an
excellent place to study the expression of social identity.
Preprint submitted to Social Networks August 15, 2014
The study of how identity affects our use of language
online is a growing field. There is evidence for communica-
tion accommodation between offline conversation partners
(Danescu-Niculescu-Mizil et al., 2011) showing that syn-
tax, pitch, gestures, word choice, length or form can differ
according to interlocutor. Evidence for linguistic convergence
online is mixed, with studies finding evidence both for
(e.g. Riordan et al., 2013) and against (e.g. Christopher-
son, 2011) the existence of convergence in online communi-
cation. The anonymity sometimes engendered in computer
mediated environments can act to enhance the significance
of social identity in contexts where a relevant shared group
membership is salient to users (Postmes et al., 2000).
Consequently, social identity can be heightened, which explains
why some group phenomena, such as polarisation of attitudes
and stereotyping, can seem enhanced in some online
environments (e.g. Postmes et al., 2001). This is evident
in the collective identity found amongst communities of
websites of environmental activists (Ackland and O’Neil,
2011). However, such studies of social identity in
computer-mediated communication are still in their relative infancy
and this research aims to contribute to the further devel-
opment of this field by looking to expressly link commu-
nication accommodation and convergence to social groups
that have formed on Twitter.
In order to identify online groups, we look to the study
of complex networks. In this field, the term communities is
used to denote parts of the network that are more strongly
linked within themselves than to the rest of the network, a
phenomenon that has been observed in many human social
networks (Porter et al., 2009). In this sense, communities
are an emergent property of network structure. Much work
has gone into developing methods to detect such groups
from topological analysis (Fortunato, 2010), and the ex-
tent to which this is possible has been termed modularity
(Newman, 2006). The communities found in this way are
usually associated with groups of friends or acquaintances,
or similarity in traits (Porter et al., 2009; Bryden et al.,
2011; Traud et al., 2012) and have also been shown to share
language features (Bryden et al., 2013). We hypothesise
that communities found in online networks will share so-
cial identity and consequently we expect to find that they
demonstrate communication accommodation and convergence.
In this study we focus on a specific aspect of behaviour
that is strongly associated with social identity, asking whether
individuals will shift their linguistic behaviour according
to which social group they are messaging. The data of
online communities that we used came from a previous
study of the Twitter web site (Bryden et al., 2013). We
tested for communication accommodation by looking to
see if users varied specific language characteristics accord-
ing to whether they had sent conversational messages to
members of the same community or to members from other
communities. We tested for convergence by looking to see
whether this level of language variation for a community
correlated with how strongly linked a community is within itself.
2. Methods
The data upon which we did our tests was a network
of 189,000 Twitter users. To identify users to download
we used a snowball-sample where, for each user sampled,
all their tweets which mentioned other users (using the
‘@’ symbol) were recorded and any new users referenced
added to a list of users from which the next user to be
sampled was picked. Starting from a random user, conversational
tweets, time-stamped between January 2007 and
November 2009 were sampled from the Twitter web site
during December 2009, yielding over 200 million messages.
The network was formed of bidirectional links, where both
nodes had sent at least one message to one another, and
weighted by the number of tweets sent between the two
users linked. We ignored messages that were copies of
other messages (so called retweets, which are identified by
a case-insensitive search for the text ‘RT’). In total the
network had 75 million messages (tweets) directed from
users of the network to one another.
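The network construction described above can be sketched in a few lines. This is a minimal illustration rather than the authors' code: the `(sender, receiver, text)` tuple format and the function name `build_network` are hypothetical, and matching 'RT' as a whole word is an assumption (the text only states that the search was case-insensitive).

```python
import re
from collections import Counter

# Assumption: a retweet is any message containing "RT" as a whole word,
# in any letter case. The paper only says the search was case-insensitive.
RETWEET = re.compile(r"\brt\b", re.IGNORECASE)


def build_network(messages):
    """Build a weighted network of reciprocated links.

    messages: iterable of (sender, receiver, text) tuples (hypothetical format).
    Returns a dict mapping frozenset({a, b}) to the number of tweets
    exchanged between a and b, for pairs that messaged each other.
    """
    directed = Counter()
    for sender, receiver, text in messages:
        if sender == receiver or RETWEET.search(text):
            continue  # skip self-mentions (an assumption) and retweets
        directed[(sender, receiver)] += 1

    network = {}
    for (a, b), n_ab in directed.items():
        n_ba = directed.get((b, a), 0)
        if n_ba > 0:  # bidirectional: both users sent at least one message
            network[frozenset((a, b))] = n_ab + n_ba
    return network
```

A one-way mention (for example a user who is addressed but never replies) produces no link, which is what restricts the network to conversational pairs.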
The network was partitioned into communities using a
modularity maximisation algorithm (Blondel et al., 2008)
and a partition of the network was found where 91% of the
tweets were sent by users to other users within the same
community. For each community, characteristic words were
generated that were used more commonly in that group
than the global average (see Supplementary Information
for characteristic words). These allowed us to identify
English speaking groups and also qualitatively summarise
shared characteristics of each group. For more informa-
tion on how characteristic words were generated, and an
argument that the network sampled was representative of
the complete Twitter network, see Bryden et al. (2013).
To investigate changes in language characteristics, we
divided messages into two collections: internal messages
that were sent to other members of the same group, and
external messages that were sent to members of different
communities. For each group, we made sure that both
collections were of the same size by discarding messages
at random from the larger collection. The difference in
word usage between the samples from the two classes was then measured.
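The split into internal and external collections, with the larger collection randomly reduced to the size of the smaller, could look roughly like the sketch below. The message format, the `community_of` mapping and the function name are hypothetical; skipping receivers with no community label is also an assumption.

```python
import random


def split_and_balance(messages, community_of, group, seed=0):
    """Return equally sized internal and external message samples for one group.

    messages: iterable of (sender, receiver, text) tuples (hypothetical format).
    community_of: dict mapping user -> community id (e.g. from a partition).
    group: the community id of the focal group.
    """
    internal, external = [], []
    for sender, receiver, text in messages:
        if community_of.get(sender) != group:
            continue  # only messages sent by members of the focal group
        target = community_of.get(receiver)
        if target is None:
            continue  # assumption: ignore receivers outside the partition
        (internal if target == group else external).append(text)

    # Discard messages at random from the larger collection.
    rng = random.Random(seed)
    size = min(len(internal), len(external))
    return rng.sample(internal, size), rng.sample(external, size)
```

Fixing the seed makes the random discard reproducible, which helps when the same samples feed several distance measures.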
To calculate differences in word usage between
the two samples we used text similarity measures. We used
two different text measures (Gomaa and Fahmy, 2013) to
confirm that the result was not an artefact generated by
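one of the measures. For a word $w$ we define the numbers of
usages of $w$ in the internal and external samples as $\lambda_i(w)$
and $\lambda_e(w)$ respectively. The first measure was the Euclidean
distance between relative word usage frequencies
for each collection, given by,
$$\sqrt{\sum_w \left( \frac{\lambda_i(w)}{\sum_v \lambda_i(v)} - \frac{\lambda_e(w)}{\sum_v \lambda_e(v)} \right)^2}. \qquad (1)$$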
The second measure was the quantitative version of the
Jaccard distance measure (Gallagher, 1999) which is one
minus the multiset intersection of the two samples divided
by the multiset union. This is given by,
$$1 - \frac{\sum_w \min\left[\lambda_i(w), \lambda_e(w)\right]}{\sum_v \max\left[\lambda_i(v), \lambda_e(v)\right]}. \qquad (2)$$
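As an illustration, the two distance measures can be computed from word counts as in the sketch below. The function names and the whitespace tokenisation are assumptions made for the example; the paper does not specify how messages were split into words.

```python
from collections import Counter
from math import sqrt


def word_counts(texts):
    """Count lower-cased, whitespace-separated tokens (tokenisation is assumed)."""
    return Counter(word for text in texts for word in text.lower().split())


def euclidean_distance(lam_i, lam_e):
    """Euclidean distance between relative word-usage frequencies.

    lam_i, lam_e: non-empty Counters of word usages in the two samples.
    """
    total_i = sum(lam_i.values())
    total_e = sum(lam_e.values())
    vocab = set(lam_i) | set(lam_e)
    return sqrt(sum((lam_i[w] / total_i - lam_e[w] / total_e) ** 2 for w in vocab))


def jaccard_distance(lam_i, lam_e):
    """One minus the multiset intersection divided by the multiset union."""
    vocab = set(lam_i) | set(lam_e)
    intersection = sum(min(lam_i[w], lam_e[w]) for w in vocab)
    union = sum(max(lam_i[w], lam_e[w]) for w in vocab)
    return 1 - intersection / union
```

Both functions rely on `Counter` returning zero for words that are absent from one sample, so a word used in only one collection contributes its full frequency to the distance.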
To look at other linguistic features that can be indica-
tive of changes in linguistic style (see, e.g., Bryden et al.,
2013; Wagner et al., 2013), we also calculated differences
between word-ending frequencies (using both Euclidean
and Jaccard distances) and apostrophe frequencies. Dif-
ferences between apostrophe frequencies were calculated
by calculating the frequency of apostrophes per word used
by each of the two collections and then calculating the
absolute difference between these two values.
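A rough sketch of the two style markers follows. The suffix length `n=2` used for word endings is purely an assumption (the text does not state how an ending was defined), the function names are hypothetical, and only the ASCII apostrophe is counted.

```python
from collections import Counter


def ending_counts(texts, n=2):
    """Count the last n characters of each token; n=2 is an assumed suffix length."""
    return Counter(
        word[-n:]
        for text in texts
        for word in text.lower().split()
        if len(word) >= n
    )


def apostrophes_per_word(texts):
    """Frequency of ASCII apostrophes per whitespace-separated word."""
    n_words = sum(len(text.split()) for text in texts)
    n_apostrophes = sum(text.count("'") for text in texts)
    return n_apostrophes / n_words if n_words else 0.0


def apostrophe_difference(internal, external):
    """Absolute difference between the two per-word apostrophe frequencies."""
    return abs(apostrophes_per_word(internal) - apostrophes_per_word(external))
```

The counts returned by `ending_counts` have the same shape as word counts, so the same Euclidean and Jaccard distance functions can be applied to them unchanged.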
3. Results
The partitioning of the sample network of Twitter users
yielded 414 groups, with 42 groups having more than 250
users. A variety of languages were found with different
groups using different languages. To eliminate the effects
of a user simply changing between different languages de-
pending on which group they were speaking to, we did the
study on the 24 groups (of a size greater than 250 users)
that used the English language which were selected in a
previous study (Bryden et al., 2013, and see methods).
With these English-speaking groups, we formed col-
lections of internal and external messages for each group,
and then measured the Euclidean distance in word usage
frequencies between the two collections. Since differences
of word-usage frequencies can arise because users within
a group communicate about one or a limited number of
subjects, we also measured distances of word-ending us-
age frequencies and apostrophe usage frequencies to look
at markers of linguistic style. We found a variety of dis-
tances between internal and external messages in all three
measurements (Figure 1).
There is a variety of distances between the internal and
external word usages in Figure 1. It is possible that these
differences in word usage could have happened by random
chance. To test this on a group-by-group basis we used
a bootstrap by resampling (with replacement) new ran-
dom pairs of collections of messages from the union of the
original internal and external collections used to generate
Figure 1. By calculating linguistic distances between the
newly sampled pairs of collections, we can test whether the
difference found between the original collections could have
arisen by chance. Repeating this procedure 1,000 times for each
group, we calculated the p-value: the proportion of resampled
pairs of collections for which the linguistic distance exceeded
that of the original internal and external collections. In
fact, using both the Euclidean and Jaccard measures, none
of the distances between the word, or word-ending, usages
of the resamples exceeded that of the original collections
($p < 0.001$). This showed that the users we studied do
indeed change their word and word-ending usage according
to whether they are messaging other members of the
group or not. For distance between internal and external
apostrophe usages, 17 of the 24 groups were significant.
Figure 1: A comparison of the 24 English speaking groups in the
Twitter network showing the extent of linguistic variation between
internal and external tweets. The bars show Euclidean distances on
a group-by-group basis between internal and external tweets for the
three measurements: word-usage frequencies (solid bars at the top
of each plot), word-ending frequencies (slashed bars in the middle)
and apostrophe usage (crossed bars at the bottom). For each
measurement, all groups were scaled so that the values ranged between
0.0 and 1.0. Each group has a short description and a group number.
The short description was generated by qualitatively inspecting
unusual words generated for each group (see Supplementary Information).
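The resampling test can be sketched as follows. This is an illustrative reading of the procedure, not the authors' implementation: the function name and the `distance` callback (any function that takes two lists of messages and returns a number) are assumptions.

```python
import random


def bootstrap_p_value(internal, external, distance, n_resamples=1000, seed=0):
    """Proportion of resampled pairs whose distance exceeds the observed one.

    internal, external: lists of messages (the original two collections).
    distance: function(list, list) -> float, e.g. a word-usage distance.
    """
    rng = random.Random(seed)
    observed = distance(internal, external)
    pooled = list(internal) + list(external)  # union of both collections

    exceed = 0
    for _ in range(n_resamples):
        # Draw two new collections from the pool, with replacement.
        sample_a = rng.choices(pooled, k=len(internal))
        sample_b = rng.choices(pooled, k=len(external))
        if distance(sample_a, sample_b) > observed:
            exceed += 1
    return exceed / n_resamples
```

When no resample exceeds the observed distance the function returns 0.0, which with 1,000 resamples is conventionally reported as p < 0.001 rather than as an exact zero.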
The difference between the language use of external
and internal messages raises a question as to how much
this change in language characteristics is due to the sender
of a message conforming to the language use of the re-
ceiver. An alternative scenario may be that external mes-
sages may have their own language patterns. We investi-
gated this by comparing, using both the Jaccard and the
Euclidean measures, the external messages to and from
a focal community against the internal messages of every
community. We found that the most similar community in
each case was the original focal community. This indicates
that the change in language characteristics is indeed due
to the sender of a message conforming to the language use
of the receiver.
The groups of Twitter users analysed in this work were
generated by partitioning the sampled network of Twitter
users such that the proportion of messages sent within the
groups was maximised: so called modularity maximisation
(Blondel et al., 2008; Newman and Girvan, 2004). This
generated closely interlinked groups that are relatively
isolated from the rest of the network. We assessed whether
there is any relationship between the level of isolation of
a group, measured as the proportion of messages sent by
that group to other members of the same group, and the
amount of linguistic variation between internal and
external messages. We found that the distances between
word and apostrophe usage correlated significantly with
the proportion of messages sent within the groups
(Figure 2). This indicates that the more a group was isolated
from the rest of the network, the more it showed linguistic
variation.
Figure 2: Linguistic variation between internal and external tweets
increases with the proportion of tweets sent within a group.
a) distance between word-usage (circles with regression line, two-tailed
$p = 5.6 \times 10^{-6}$), b) distance between word-endings (triangles and
regression line, two-tailed $p = 0.052$), c) distance between apostrophe
usage (crosses and regression line, two-tailed $p = 0.0074$).
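The isolation measure and its relationship with linguistic distance can be illustrated with the sketch below. The message format and function names are hypothetical, and the sketch computes only the correlation coefficient and least-squares line; the statistical test behind the reported two-tailed p-values is not specified in the text, so no p-value is computed here (a routine such as `scipy.stats.linregress` could supply one).

```python
from math import sqrt


def isolation(messages, community_of, group):
    """Proportion of a group's messages that are sent to its own members.

    messages: iterable of (sender, receiver, text) tuples (hypothetical format).
    community_of: dict mapping user -> community id.
    """
    within = [
        community_of.get(receiver) == group
        for sender, receiver, _ in messages
        if community_of.get(sender) == group
    ]
    return sum(within) / len(within)


def pearson_r(xs, ys):
    """Pearson correlation between isolation values xs and distances ys."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def regression_line(xs, ys):
    """Least-squares slope and intercept of ys against xs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return slope, mean_y - slope * mean_x
```

Each group contributes one point: its isolation on the horizontal axis and one of the three internal-versus-external distances on the vertical axis.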
We did not find a significant correlation for word-ending
variation against the proportion of internal tweets (Figure
2, panel b). A visual inspection of the figure reveals that
one of the groups is an outlier from the rest across all three
measurements of linguistic variation. This group (number
93) is made up of a network of people that organise on-
line parties called ‘pawpawties’ to raise money for animal
charities (Manning, 2009). It is intriguing that this group,
which largely exists on Twitter, has much stronger lan-
guage accommodation features compared to similar groups
which appear to have much stronger offline interaction.
When we remove this outlier from the regression, we find
that there is a significant correlation for word-ending vari-
ation against the proportion of internal tweets (two-tailed
p= 0.00080).
4. Discussion
Our work demonstrates how computational methods
can be used to study social processes on large-scale social
networks. Our study was done on an unrestricted large-
scale sample of Twitter where individuals interact freely
with one another. We used topological analysis to identify
social groups in the network and then demonstrated how
linguistic behaviour will change according to the group
membership of the interlocutors. This shows how sub-
tle trends in linguistic behaviour aggregate to form social
identity through communication accommodation and lin-
guistic convergence.
The work illustrates an important methodological tool
for studying social processes on large scale social networks.
Measurements of social behaviour, especially language fea-
tures, rarely appear to conform to a normal distribution
and are thus difficult to analyse with traditional statistical
methods. In this work we use a bootstrap method which,
through resampling our data, is independent of whichever
distribution the original measurements might come from.
The bootstrap is a simple, but powerful, tool for statisti-
cal analysis of subtle social processes at such a large scale
(Efron and Tibshirani, 1993).
Our study has found evidence of behaviour on the Twit-
ter social network that is consistent with theory on social
identity. The results show that people are aware, either
implicitly or explicitly, of the social identity of their inter-
locutor and change their language usage accordingly. This
demonstrates that interaction networks with limited com-
munication channels are still sophisticated enough to allow
their members to express social identities. We have also
found that the extent to which members change their lan-
guage characteristics depends on how isolated their group
is from the rest of the network. This shows that social con-
vergence between several individuals is strongly related to
the proportion of their total interaction that they spend
within the group.
This study is compatible with other studies of linguis-
tic variation within and between groups (e.g. Bell, 1984;
Gregory and Carroll, 1978), and the idea that communi-
ties may develop unique linguistic styles which can become
intertwined with, and markers of, their identity. Our find-
ing of linguistic differences between internal and external
tweets echoes sociolinguistic work on situational fluctua-
tions in linguistic registers (e.g. Iwasaki and Horie, 2000)
and supports a social identity perspective that views such
linguistic variation as part of the process of social categorisation.
An important difference with previous studies is the
scale at which this study took place. For instance, previ-
ous studies that have looked at convergence did not find
significance with sample sizes of 30 conversations (Christo-
pherson, 2011). Our approach surpasses the boundaries
of survey or interview, and laboratory or field based in-
vestigations, with millions of conversations being analysed
yielding significant statistical power. While the environ-
ment of Twitter is somewhat specific and does not relate
to many other on- and offline environments, the fact that
our results here were replicated for each community tested
indicates that our result is likely to be generalisable.
The differences in word usage between the internal and
external messages of each group may be due to each group
sharing interests in certain subjects. To go beyond sub-
ject areas, we also looked at word endings and apostrophe
usage. This is consistent with theory which shows that, as
groups become associated with particular communication
styles, members may reference those styles in their
communicative acts as a means of claiming or expressing the
identity in question (Rampton, 1995).
Our study was restricted to English language groups
because a large proportion of the groups in our sample of
Twitter used English. While there were groups that spoke
other languages in our data, we did not have the quantity
of data to adequately resolve sub-groups for non-English
speaking Twitter users. We would expect, with more data,
to be able to resolve sub-groups for non-English speaking
users, and thus be able to test the theory across many
different languages.
It is possible that the sampling algorithm that we originally
used to sample the Twitter messages may have
introduced some biases which would mean that our sample
is not representative of Twitter as a whole. The first issue is
that the sampling process used can have some bias toward Twitter users that
have had messages sent to them. To mitigate this, we made
sure that unsampled users were only placed once on the list
of users to be sampled, even if they have been messaged
by several previously sampled users. The second issue is
that there may be a bias toward certain communities - es-
pecially toward the community of the user first sampled.
We cover this in more detail in a previous paper (Bryden
et al., 2013) arguing that the sampler will move to random
communities relatively quickly. We found that our sam-
pling method detected a broad variety of communities and
this indicates the sample is likely to be representative of
the population.
Interesting future topics which are possible extensions
of our work include theory on out-groups, where theory
such as Communication Accommodation Theory and the
Social Identity Model of Deindividuation predict diver-
gence when interlocutors message certain external groups.
We didn’t find any evidence of this in our study as we
found that external messages for a particular group were
still closer to the internal messages of the group than any
other. Further investigations of how language character-
istics converge and/or diverge over time may shed some
light on this topic and be of interest in their own right.
Finally, we may also be able to improve an algorithm that
predicts the groups of individuals based on their language
patterns (Bryden et al., 2013), by comparing an individ-
ual’s language use against that of only the internal tweets
of the groups.
Even though the conversations we studied on Twitter
were made up of very short text messages which are publicly
posted, these results indicate that many complex
features of normal offline communication take place on-
line. While such behaviour may not be evident at a small
scale, the large quantities of data used in this study meant
that we were able to identify these subtle patterns. This
indicates that future studies on social identity, social be-
haviour and cooperation are likely to prove fruitful.
5. Acknowledgements
Thanks to Shaun Wright, Tim Harrison and the anony-
mous reviewers. This work was supported by the Eco-
nomic and Social Research Council (grant ES/L000113/1).
References
Ackland, R., O’Neil, M., 2011. Online collective identity: The case
of the environmental movement. Social Networks 33 (3), 177–190.
Ashforth, B. E., Mael, F., 1989. Social identity theory and the orga-
nization. Academy of Management Review 14 (1), 20–39.
Bell, A., 1984. Language style as audience design. Language in
society 13 (2), 145–204.
Blondel, V. D., Guillaume, J.-L., Lambiotte, R., Lefebvre, E., 2008.
Fast unfolding of communities in large networks. Journal of Sta-
tistical Mechanics: Theory and Experiment 2008 (10), P10008.
Boyd, R., Richerson, P. J., 2009. Culture and the evolution of hu-
man cooperation. Philosophical Transactions of the Royal Society
B: Biological Sciences 364 (1533), 3281–3288.
Bryden, J., Funk, S., Geard, N., Bullock, S., Jansen, V. A., 2011. Sta-
bility in flux: Community structure in dynamic networks. Journal
of The Royal Society Interface 8 (60), 1031–1040.
Bryden, J., Funk, S., Jansen, V. A., 2013. Word usage mirrors com-
munity structure in the online social network Twitter. EPJ Data
Science 2 (1), 1–9.
Carroll, K. S., 2008. Puerto Rican language use on
Centro Journal 20 (1), 96–111.
Christopherson, L., 2011. Can u help me plz?? Cyberlanguage
accommodation in virtual reference conversations. Proceedings
of the American Society for Information Science and Technology
48 (1), 1–9.
Costolo, R., 2013. Twitter, Inc.: Initial public offering.
Danescu-Niculescu-Mizil, C., Gamon, M., Dumais, S., 2011. Mark
my words!: Linguistic style accommodation in social media. In:
Proceedings of the 20th international conference on World wide
web. p. 745–754.
Efron, B., Tibshirani, R. J., Jan. 1993. An Introduction to the Boot-
strap. Chapman and Hall/CRC, New York.
Ethier, K. A., Deaux, K., 1994. Negotiating social identity when con-
texts change: Maintaining identification and responding to threat.
Journal of Personality and Social Psychology 67 (2), 243.
Fortunato, S., 2010. Community detection in graphs. Physics
Reports 486 (3), 75–174.
Gallagher, E. D., 1999. COMPAH documentation.
Gallois, C., Ogay, T., Giles, H., 2005. Communication accommo-
dation theory: A look back and a look ahead. In: Gudykunst,
W. B. (Ed.), Theorizing About Intercultural Communication.
Sage, Thousand Oaks, CA, p. 121–148.
Giles, H., 1973. Accent mobility: A model and some data. Anthro-
pological Linguistics 15 (2), 87–105.
Gomaa, W. H., Fahmy, A. A., 2013. A survey of text similarity ap-
proaches. International Journal of Computer Applications 86 (13),
Gregory, M., Carroll, S., 1978. Language and situation: Language
varieties and their social contexts. Routledge and Kegan Paul,
London, Henley and Boston.
Gumperz, J. J., 1958. Dialect differences and social stratification in
a North Indian village. American Anthropologist 60 (4), 668–682.
Hogg, M. A., Reid, S. A., 2006. Social identity, self-categorization,
and the communication of group norms. Communication Theory
16 (1), 7–30.
Hormuth, S. E., 1990. The ecology of the self: Relocation and
self-concept change. Cambridge University Press.
Iwasaki, S., Horie, P. I., 2000. Creating speech register in Thai
conversation. Language in Society 29 (04), 519–554.
Java, A., Song, X., Finin, T., Tseng, B., 2007. Why we twitter:
Understanding microblogging usage and communities. In: Pro-
ceedings of the 9th WebKDD and 1st SNA-KDD 2007 workshop
on Web mining and social network analysis. p. 56–65.
Labov, W., 1966. The linguistic variable as structural unit. Wash-
ington Linguistics Review 3, 4–22.
Manning, S., 2009. Animal lovers throw ‘pawpawties’ for charity.
Animal-lovers- throw-pawpawties- for-charity- 3285436.
Newman, M. E., 2006. Modularity and community structure in net-
works. Proceedings of the National Academy of Sciences 103 (23),
Newman, M. E., Girvan, M., 2004. Finding and evaluating commu-
nity structure in networks. Physical Review E 69 (2), 026113.
Porter, M. A., Onnela, J.-P., Mucha, P. J., 2009. Communities in
networks. Notices of the AMS 56 (9), 1082–1097.
Postmes, T., Spears, R., Lea, M., 2000. The formation of group
norms in computer-mediated communication. Human Communi-
cation Research 26 (3), 341–371.
Postmes, T., Spears, R., Sakhel, K., De Groot, D., 2001. Social
influence in computer-mediated communication: The effects of
anonymity on group behavior. Personality and Social Psychology
Bulletin 27 (10), 1243–1254.
Rampton, B., 1995. Crossing: Language and ethnicity among ado-
lescents. Longman, London.
Riordan, M. A., Markman, K. M., Stewart, C. O., 2013. Communication
accommodation in instant messaging: An examination of
temporal convergence. Journal of Language and Social Psychology
32 (1), 84–95.
Scott, C. R., 2007. Communication and social identity theory: Ex-
isting and potential connections in organizational identification
research. Communication Studies 58 (2), 123–138.
Shepard, C. A., Giles, H., Le Poire, B. A., 2001. Communication
accommodation theory. In: Robinson, W. P., Giles, H. (Eds.),
The new handbook of language and social psychology. John Wiley,
New York, p. 33–56.
Tajfel, H., Turner, J. C., 1979. An integrative theory of intergroup
conflict. In: Austin, W. G., Worchel, S. (Eds.), The social
psychology of intergroup relations. Brooks/Cole, Monterey CA,
pp. 33–47.
Traud, A. L., Mucha, P. J., Porter, M. A., 2012. Social structure
of Facebook networks. Physica A: Statistical Mechanics and its
Applications 391 (16), 4165–4180.
Virk, A., 2011. Twitter: The strength of weak ties. University of
Auckland Business Review 13 (1), 19–21.
UABusReview/2011_13_i01-5- twitter.pdf
Wagner, C., Asur, S., Hailpern, J., 2013. Religious politicians and
creative photographers: Automatic user categorization in Twitter.
In: ASE/IEEE International Conference on Social Computing