Twitter users change word usage according to conversation-partner social identity
Nadine Tamburrini (a), Marco Cinnirella (b), Vincent A. A. Jansen (a), John Bryden (a)
(a) School of Biological Sciences, Royal Holloway University of London, Egham TW20 0EX, UK
(b) Department of Psychology, Royal Holloway University of London, Egham TW20 0EX, UK
Abstract
This paper investigates how people express social identity at a large scale on a social network. We looked at
communities of users on the Twitter web site, and examined two established social-psychology theories that are usually
tested at a local scale. We found evidence for Communication Accommodation Theory: community members
vary their language characteristics depending on which community they are communicating with. We also found that the
level of linguistic variation correlated with how isolated a community was, which is evidence of convergence between
linked members. This demonstrates the power of methods that analyse subtle human behaviour on social networks.
Keywords: Twitter; Community structure; Social identity; Language accommodation; Linguistic convergence; Social
network analysis
Highlights
• We study social identity on communities found on the Twitter web site
• We find users adjust word usage according to the community of the interlocutor
• More isolated communities change word usage to a greater extent
• Large scale studies of Twitter can test hypotheses on social behaviour
1. Introduction
Social identity is that portion of an individual’s
self-concept that derives from membership of a social group
(Tajfel and Turner, 1979). Group affiliation has functions
of enhancing cooperation (Boyd and Richerson, 2009) and
allowing individuals to define others through the group
they belong to, in the same way that the individual de-
fines him or herself through the identity of their own group
(Ashforth and Mael, 1989). Group members share be-
haviour and social norms. This shared behaviour in social
groups is thought to be generated through processes on
social networks such as convergence of behaviour due to
social relationships (Hormuth, 1990; Ethier and Deaux,
1994).
The way we use language is strongly associated with
our social identity (Scott, 2007). The convergence of be-
haviour, proposed by social identity theory, is often stud-
ied through the language used within social groups. This
demonstrates how language is more than just a means of
communication and sociolinguistic studies have shown that
varieties of a language can be strongly associated with so-
cial or cultural groups (Gumperz, 1958; Labov, 1966; Car-
roll, 2008; Bryden et al., 2013).
By using language as a proxy for social behaviour, stud-
ies have been able to understand how expression of social
identity is often strongly context dependent: people will
behave differently depending on which social identity has
the strongest salience in the current situation (Hogg and
Reid, 2006). Studies show how this often manifests in the
accommodation of language according to the social iden-
tity of the interlocutor (Giles, 1973; Gallois et al., 2005).
Individuals negotiate the social distance between them-
selves and the person with whom they are conversing, and
are therefore in control of its creation and maintenance
(Shepard et al., 2001). For example, Iwasaki and Horie
(2000) reported how Thai speakers would adjust their lin-
guistic registers when interacting with strangers. These
studies look at specific groups or social situations, but we
do not know whether this behaviour can be found at a large
scale across many groups where these groups are allowed
to freely interact with one another.
Online social networking services are providing us
with a large-scale platform to study human behaviour.
With over 200 million monthly active users (Costolo, 2013),
the Twitter social network is particularly useful due to its
publicly accessible nature (Virk, 2011) and network size.
The analysis of large networks brings with it considerable
statistical power that allows for the detection of patterns
that in traditional, smaller scale network studies would be
undetectable. Twitter functions as a micro-blogging website,
working on the premise of users sharing their opinions
and thoughts in brief messages (maximum 140 characters),
which are referred to as “tweets”. An investigation by Java
et al. (2007) into the reasons why people post on the Twitter
web site found that about one eighth of posts were
conversational messages. This makes Twitter a prime resource
for public access to naturally occurring communication
(Danescu-Niculescu-Mizil et al., 2011) and an excellent
place to study the expression of social identity.
Preprint submitted to Social Networks, August 15, 2014
The study of how identity affects our use of language
online is a growing field. There is evidence for communica-
tion accommodation between offline conversation partners
(Danescu-Niculescu-Mizil et al., 2011) showing that syn-
tax, pitch, gestures, word choice, length or form can differ
according to interlocutor. Evidence for linguistic conver-
gence online is mixed with studies finding evidence both for
(e.g. Riordan et al., 2013) and against (e.g. Christopher-
son, 2011) the existence of convergence in online communi-
cation. The anonymity sometimes engendered in computer
mediated environments can act to enhance the significance
of social identity in contexts where a relevant shared group
membership is salient to users (Postmes et al., 2000).
Consequently, social identity can be heightened, which explains
why some group phenomena, such as polarisation of attitudes
and stereotyping, can seem enhanced in some online
environments (e.g. Postmes et al., 2001). Collective
identity of this kind has been observed amongst communities of
websites of environmental activists (Ackland and O’Neil,
2011). However, studies of social identity in computer-mediated
communication are still in their relative infancy,
and this research aims to contribute to the further development
of this field by expressly linking communication
accommodation and convergence to social groups
that have formed on Twitter.
In order to identify online groups, we look to the study
of complex networks. In this field, the term communities is
used to denote parts of the network that are more strongly
linked within themselves than to the rest of the network, a
phenomenon that has been observed in many human social
networks (Porter et al., 2009). In this sense, communities
are an emergent property of network structure. Much work
has gone into developing methods to detect such groups
from topological analysis (Fortunato, 2010), and the ex-
tent to which this is possible has been termed modularity
(Newman, 2006). The communities found in this way are
usually associated with groups of friends or acquaintances,
or similarity in traits (Porter et al., 2009; Bryden et al.,
2011; Traud et al., 2012) and have also been shown to share
language features (Bryden et al., 2013). We hypothesise
that communities found in online networks will share so-
cial identity and consequently we expect to find that they
demonstrate communication accommodation and conver-
gence.
In this study we focus on a specific aspect of behaviour
that is strongly associated with social identity, asking whether
individuals will shift their linguistic behaviour according
to which social group they are messaging. The data of
online communities that we used came from a previous
study of the Twitter web site (Bryden et al., 2013). We
tested for communication accommodation by looking to
see if users varied specific language characteristics accord-
ing to whether they had sent conversational messages to
members of the same community or to members from other
communities. We tested for convergence by looking to see
whether this level of language variation for a community
correlated with how strongly linked a community is within
itself.
2. Methods
The data upon which we did our tests was a network
of 189,000 Twitter users. To identify users to download
we used a snowball-sample where, for each user sampled,
all their tweets which mentioned other users (using the
‘@’ symbol) were recorded and any new users referenced
added to a list of users from which the next user to be
sampled was picked. Starting from a random user, conversational
tweets, time-stamped between January 2007 and
November 2009, were sampled from the Twitter web site
during December 2009, yielding over 200 million messages.
The network was formed of bidirectional links, where both
nodes had sent at least one message to one another, and
weighted by the number of tweets sent between the two
users linked. We ignored messages that were copies of
other messages (so called retweets, which are identified by
a case-insensitive search for the text ‘RT’). In total the
network had 75 million messages (tweets) directed from
users of the network to one another.
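The network construction described above can be sketched as follows. This is a minimal illustration rather than the code used in the study: the function name, the (sender, receiver, text) tuple layout and the word-boundary form of the retweet search are assumptions made for the example.

```python
import re
from collections import Counter

# Case-insensitive search for 'RT'. Matching on word boundaries is an
# assumption of this sketch; the text only specifies a case-insensitive search.
RETWEET = re.compile(r"\bRT\b", re.IGNORECASE)


def build_reciprocal_network(messages):
    """Return {(user_a, user_b): weight} for pairs who messaged each other.

    `messages` is an iterable of (sender, receiver, text) tuples (a
    hypothetical layout). The weight is the number of non-retweet messages
    sent between the two users in either direction.
    """
    sent = Counter()
    for sender, receiver, text in messages:
        if RETWEET.search(text):
            continue  # ignore copies of other messages
        sent[(sender, receiver)] += 1

    edges = {}
    for (a, b), n_ab in sent.items():
        n_ba = sent.get((b, a), 0)
        # Keep a link only when both users have messaged one another;
        # a < b stores each pair once and drops self-mentions.
        if n_ba > 0 and a < b:
            edges[(a, b)] = n_ab + n_ba
    return edges
```

In this sketch a pair appears once, keyed in sorted order, so the result can be passed directly to an undirected community-detection routine.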
The network was partitioned into communities using a
modularity maximisation algorithm (Blondel et al., 2008)
and a partition of the network was found where 91% of the
tweets were sent by users to other users within the same
community. For each community, characteristic words were
generated that were used more commonly in that group
than the global average (see Supplementary Information
for characteristic words). These allowed us to identify
English speaking groups and also qualitatively summarise
shared characteristics of each group. For more informa-
tion on how characteristic words were generated, and an
argument that the network sampled was representative of
the complete Twitter network, see Bryden et al. (2013).
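Given a partition, the proportion of messages that stay within communities can be computed as in the sketch below. It assumes directed message counts keyed by (sender, receiver) and a dictionary mapping each user to a community label; both structures and the function name are hypothetical. The partition itself would come from a separate modularity maximisation step, which is not reproduced here.

```python
from collections import defaultdict


def within_group_proportions(sent, community):
    """Proportion of messages sent to members of the sender's own community.

    `sent` maps (sender, receiver) to a message count and `community` maps
    each user to a community label (both hypothetical structures).
    Returns (per_group, overall).
    """
    internal = defaultdict(int)
    total = defaultdict(int)
    for (sender, receiver), n in sent.items():
        group = community[sender]
        total[group] += n
        if community[receiver] == group:
            internal[group] += n
    per_group = {g: internal[g] / total[g] for g in total}
    overall = sum(internal.values()) / sum(total.values())
    return per_group, overall
```

The overall value corresponds to the kind of figure quoted above (91% of tweets within communities), and the per-group values give a simple measure of how isolated each community is.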
To investigate changes in language characteristics, we
divided messages into two collections: internal messages
that were sent to other members of the same group, and
external messages that were sent to members of different
communities. For each group, we made sure that both
collections were of the same size by discarding messages
at random from the larger collection. The difference in
word usage between the samples from the two classes was
calculated.
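The construction of equally sized internal and external collections might look like the following sketch. It assumes that both collections contain messages sent by members of the focal group, and the names and the (sender, receiver, text) layout are illustrative only.

```python
import random


def balanced_collections(messages, community, group, seed=0):
    """Split a group's messages into equally sized internal/external samples.

    `messages` is a list of (sender, receiver, text) tuples and `community`
    maps users to community labels (hypothetical structures). Messages are
    discarded at random from the larger collection.
    """
    internal, external = [], []
    for sender, receiver, text in messages:
        if community[sender] != group:
            continue  # only messages sent by members of the focal group
        if community[receiver] == group:
            internal.append(text)
        else:
            external.append(text)

    rng = random.Random(seed)
    size = min(len(internal), len(external))
    # Sampling without replacement down to the smaller size is equivalent
    # to discarding messages at random from the larger collection.
    return rng.sample(internal, size), rng.sample(external, size)
```

Fixing the seed makes the discarded subset reproducible, which is convenient when the same collections are reused across several distance measures.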
To calculate differences between word usage between
the two samples we used text similarity measures. We used
two different text measures (Gomaa and Fahmy, 2013) to
confirm that the result was not an artefact generated by
one of the measures. For a word $w$ we define the numbers of
usages of $w$ in the internal and external samples as $\lambda_i(w)$
and $\lambda_e(w)$ respectively. The first measure was the Euclidean
distance between relative word usage frequencies
for each collection, given by,
$$\left[ \sum_w \left( \frac{\lambda_i(w)}{\sum_v \lambda_i(v)} - \frac{\lambda_e(w)}{\sum_v \lambda_e(v)} \right)^2 \right]^{1/2}. \tag{1}$$
The second measure was the quantitative version of the
Jaccard distance measure (Gallagher, 1999) which is one
minus the multiset intersection of the two samples divided
by the multiset union. This is given by,
$$1 - \frac{\sum_w \min\left[\lambda_i(w), \lambda_e(w)\right]}{\sum_v \max\left[\lambda_i(v), \lambda_e(v)\right]}. \tag{2}$$
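Both measures are straightforward to compute from word counts. The sketch below is one possible implementation of equations (1) and (2); splitting lower-cased text on whitespace is an assumption of the example, as the tokenisation is not specified here.

```python
from collections import Counter
from math import sqrt


def word_counts(messages):
    """Count word usages; lower-casing and whitespace splitting are assumed."""
    return Counter(w for m in messages for w in m.lower().split())


def euclidean_distance(counts_i, counts_e):
    """Equation (1): distance between relative word usage frequencies."""
    total_i, total_e = sum(counts_i.values()), sum(counts_e.values())
    words = set(counts_i) | set(counts_e)
    return sqrt(sum((counts_i[w] / total_i - counts_e[w] / total_e) ** 2
                    for w in words))


def jaccard_distance(counts_i, counts_e):
    """Equation (2): one minus multiset intersection over multiset union."""
    words = set(counts_i) | set(counts_e)
    intersection = sum(min(counts_i[w], counts_e[w]) for w in words)
    union = sum(max(counts_i[w], counts_e[w]) for w in words)
    return 1 - intersection / union
```

A `Counter` returns zero for words it has not seen, so words that occur in only one sample contribute to both sums without special handling.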
To look at other linguistic features that can be indica-
tive of changes in linguistic style (see, e.g., Bryden et al.,
2013; Wagner et al., 2013), we also calculated differences
between word-ending frequencies (using both Euclidean
and Jaccard distances) and apostrophe frequencies. Dif-
ferences between apostrophe frequencies were calculated
by calculating the frequency of apostrophes per word used
by each of the two collections and then calculating the
absolute difference between these two values.
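The two style markers can be extracted along the following lines. The ending length `n` and the treatment of both straight and typographic apostrophes are assumptions of this sketch; neither detail is stated in the text. The ending counts can then be compared with the same Euclidean and Jaccard distances used for whole words.

```python
from collections import Counter


def ending_counts(messages, n=2):
    """Count the last `n` characters of each word (n is a hypothetical choice)."""
    return Counter(w[-n:] for m in messages for w in m.lower().split()
                   if len(w) >= n)


def apostrophes_per_word(messages):
    """Frequency of apostrophes per word across a collection of messages."""
    words = [w for m in messages for w in m.split()]
    apostrophes = sum(w.count("'") + w.count("’") for w in words)
    return apostrophes / len(words)


def apostrophe_distance(internal, external):
    """Absolute difference between the two apostrophe frequencies."""
    return abs(apostrophes_per_word(internal) - apostrophes_per_word(external))
```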
3. Results
The partitioning of the sample network of Twitter users
yielded 414 groups, with 42 groups having more than 250
users. A variety of languages were found with different
groups using different languages. To eliminate the effects
of a user simply changing between different languages de-
pending on which group they were speaking to, we did the
study on the 24 groups (of a size greater than 250 users)
that used the English language which were selected in a
previous study (Bryden et al., 2013, and see methods).
With these English-speaking groups, we formed col-
lections of internal and external messages for each group,
and then measured the Euclidean distance in word usage
frequencies between the two collections. Since differences
of word-usage frequencies can arise because users within
a group communicate about one or a limited number of
subjects, we also measured distances of word-ending us-
age frequencies and apostrophe usage frequencies to look
at markers of linguistic style. We found a variety of dis-
tances between internal and external messages in all three
measurements (Figure 1).
Figure 1: A comparison of the 24 English speaking groups in the
Twitter network showing the extent of linguistic variation between
internal and external tweets. The bars show Euclidean distances on
a group-by-group basis between internal and external tweets for the
three measurements: word-usage frequencies (solid bars at the top
of each plot), word-ending frequencies (slashed bars in the middle)
and apostrophe usage (crossed bars at the bottom). For each measurement,
all groups were scaled so that the values ranged between
0.0 and 1.0. Each group has a short description and a group number.
The short description was generated by qualitatively inspecting
unusual words generated for each group (see Supplementary Information).
The distances between internal and external word usage in
Figure 1 vary between groups. It is possible that these
differences in word usage could have happened by random
chance. To test this on a group-by-group basis we used
a bootstrap, resampling (with replacement) new random
pairs of collections of messages from the union of the
original internal and external collections used to generate
Figure 1. By calculating linguistic distances between the
newly sampled pairs of collections, we can test whether the
difference found between the original collections happened
by chance. Repeating this procedure 1,000 times for each
group, we calculated the p-value: the proportion of resampled
pairs of collections for which the linguistic distance exceeded
that of the original internal and external collections. In
fact, using both the Euclidean and Jaccard measures, none
of the distances between the word, or word-ending, usages
of the resamples exceeded that of the original collections
(p < 0.001). This showed that the users we studied do
indeed change their word and word-ending usage according
to whether they are messaging other members of the
group or not. For the distance between internal and external
apostrophe usages, 17 of the 24 groups were significant
(p < 0.05).
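The resampling test can be sketched as below. This is an illustration under stated assumptions, not the original analysis code: the function names are hypothetical, whole messages are taken as the unit of resampling, and a compact word-usage distance is defined inline so that the example is self-contained.

```python
import random
from collections import Counter
from math import sqrt


def word_usage_distance(msgs_a, msgs_b):
    """Euclidean distance between relative word frequencies of two collections."""
    ca = Counter(w for m in msgs_a for w in m.lower().split())
    cb = Counter(w for m in msgs_b for w in m.lower().split())
    ta, tb = sum(ca.values()), sum(cb.values())
    return sqrt(sum((ca[w] / ta - cb[w] / tb) ** 2 for w in set(ca) | set(cb)))


def bootstrap_p_value(internal, external, distance=word_usage_distance,
                      n_resamples=1000, seed=0):
    """Proportion of resampled pairs whose distance exceeds the observed one."""
    rng = random.Random(seed)
    observed = distance(internal, external)
    pooled = internal + external  # union of the two original collections
    exceed = 0
    for _ in range(n_resamples):
        # Draw two new collections, with replacement, from the pooled messages.
        a = rng.choices(pooled, k=len(internal))
        b = rng.choices(pooled, k=len(external))
        if distance(a, b) > observed:
            exceed += 1
    return exceed / n_resamples
```

With 1,000 resamples, a returned value of 0.0 means that no resampled pair exceeded the observed distance, which is reported as p < 0.001.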
The difference between the language use of external
and internal messages raises a question as to how much
this change in language characteristics is due to the sender
of a message conforming to the language use of the re-
ceiver. An alternative scenario may be that external mes-
sages may have their own language patterns. We investi-
gated this by comparing, using both the Jaccard and the
Euclidean measures, the external messages to and from
a focal community against the internal messages of every
community. We found that the most similar community in
each case was the original focal community. This indicates
that the change in language characteristics is indeed due
to the sender of a message conforming to the language use
of the receiver.
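The comparison of a focal community's external messages against the internal messages of every community amounts to a nearest-neighbour search over word-count profiles. A minimal sketch using the Jaccard measure follows; the function names and the dictionary of per-community counts are assumptions of the example.

```python
from collections import Counter


def jaccard_distance(counts_a, counts_b):
    """One minus multiset intersection over multiset union of word counts."""
    words = set(counts_a) | set(counts_b)
    intersection = sum(min(counts_a[w], counts_b[w]) for w in words)
    union = sum(max(counts_a[w], counts_b[w]) for w in words)
    return 1 - intersection / union


def most_similar_community(external_counts, internal_by_group):
    """Return the community whose internal word usage is closest.

    `external_counts` is a Counter for the focal community's external
    messages; `internal_by_group` maps each community label to a Counter
    of its internal messages (a hypothetical structure).
    """
    return min(internal_by_group,
               key=lambda g: jaccard_distance(external_counts,
                                              internal_by_group[g]))
```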
The groups of Twitter users analysed in this work were
generated by partitioning the sampled network of Twitter
users such that the proportion of messages sent within the
groups was maximised: so called modularity maximisation
(Blondel et al., 2008; Newman and Girvan, 2004). This
generated closely interlinked groups that are relatively isolated
from the rest of the network. We assessed whether
there is any relationship between the level of isolation of
a group, measured as the proportion of messages sent by
that group to other members of the same group, and the
amount of linguistic variation between internal and external
messages. We found that the distances between
word and apostrophe usage correlated significantly with
the proportion of messages sent within the groups (Figure 2).
This indicates that the more a group was isolated
from the rest of the network, the more it showed linguistic
convergence.
Figure 2: Linguistic variation between internal and external tweets
increases with the proportion of tweets sent within a group. a) distance
between word-usage (circles with regression line, two-tailed
$p = 5.6 \times 10^{-6}$), b) distance between word-endings (triangles and
regression line, two-tailed p = 0.052), c) distance between apostrophe
usage (crosses and regression line, two-tailed p = 0.0074).
We did not find a significant correlation for word-ending
variation against the proportion of internal tweets (Figure
2, panel b). A visual inspection of the figure reveals that
one of the groups is an outlier from the rest across all three
measurements of linguistic variation. This group (number
93) is made up of a network of people that organise on-
line parties called ‘pawpawties’ to raise money for animal
charities (Manning, 2009). It is intriguing that this group,
which largely exists on Twitter, has much stronger lan-
guage accommodation features compared to similar groups
which appear to have much stronger offline interaction.
When we remove this outlier from the regression, we find
that there is a significant correlation for word-ending
variation against the proportion of internal tweets (two-tailed
p = 0.00080).
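The relationship between group isolation and linguistic distance can be examined with a simple correlation test. The sketch below computes a Pearson correlation and a two-tailed p-value by permutation; the permutation test is a stand-in chosen for this example, since the text does not state how the regression p-values were obtained, and all names are hypothetical.

```python
import random
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def permutation_p_value(xs, ys, n_perm=2000, seed=0):
    """Two-tailed p-value: share of shuffles with |r| at least as large."""
    rng = random.Random(seed)
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any pairing between xs and ys
        if abs(pearson_r(xs, shuffled)) >= observed:
            hits += 1
    return hits / n_perm
```

Here `xs` would hold the proportion of messages sent within each group and `ys` the corresponding distance between internal and external messages. Dropping a group from both lists before calling the functions repeats the test without that group.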
4. Discussion
Our work demonstrates how computational methods
can be used to study social processes on large-scale social
networks. Our study was done on an unrestricted large-
scale sample of Twitter where individuals interact freely
with one another. We used topological analysis to identify
social groups in the network and then demonstrated how
linguistic behaviour will change according to the group
membership of the interlocutors. This shows how sub-
tle trends in linguistic behaviour aggregate to form social
identity through communication accommodation and lin-
guistic convergence.
The work illustrates an important methodological tool
for studying social processes on large scale social networks.
Measurements of social behaviour, especially language fea-
tures, rarely appear to conform to a normal distribution
and are thus difficult to analyse with traditional statistical
methods. In this work we use a bootstrap method which,
through resampling our data, is independent of whichever
distribution the original measurements might come from.
The bootstrap is a simple, but powerful, tool for statisti-
cal analysis of subtle social processes at such a large scale
(Efron and Tibshirani, 1993).
Our study has found evidence of behaviour on the Twit-
ter social network that is consistent with theory on social
identity. The results show that people are aware, either
implicitly or explicitly, of the social identity of their inter-
locutor and change their language usage accordingly. This
demonstrates that interaction networks with limited com-
munication channels are still sophisticated enough to allow
their members to express social identities. We have also
found that the extent to which members change their lan-
guage characteristics depends on how isolated their group
is from the rest of the network. This shows that social con-
vergence between several individuals is strongly related to
the proportion of their total interaction that they spend
within the group.
This study is compatible with other studies of linguis-
tic variation within and between groups (e.g. Bell, 1984;
Gregory and Carroll, 1978), and the idea that communi-
ties may develop unique linguistic styles which can become
intertwined with, and markers of, their identity. Our find-
ing of linguistic differences between internal and external
tweets echoes sociolinguistic work on situational fluctua-
tions in linguistic registers (e.g. Iwasaki and Horie, 2000)
and supports a social identity perspective that views such
linguistic variation as part of the process of social cate-
gorisation.
An important difference with previous studies is the
scale at which this study took place. For instance, previ-
ous studies that have looked at convergence did not find
significance with sample sizes of 30 conversations (Christo-
pherson, 2011). Our approach surpasses the boundaries
of survey or interview, and laboratory or field based in-
vestigations, with millions of conversations being analysed
yielding considerable statistical power. While the environment
of Twitter is somewhat specific and does not relate
to many other on- and offline environments, the fact that
our results here were replicated for each community tested
indicates that our result is likely to be generalisable.
The differences in word usage between the internal and
external messages of each group may be due to each group
sharing interests in certain subjects. To go beyond sub-
ject areas, we also looked at word endings and apostrophe
usage. This is consistent with theory which shows that, as
groups become associated with particular communication
styles, members may reference those styles in their communicative
acts as a means of claiming or expressing the
identity in question (Rampton, 1995).
Our study was restricted to English language groups
because a large proportion of the groups in our sample of
Twitter used English. While there were groups that spoke
other languages in our data, we did not have the quantity
of data to adequately resolve sub-groups for non-English
speaking Twitter users. We would expect, with more data,
to be able to resolve sub-groups for non-English speaking
users, and thus be able to test the theory across many
different languages.
It is possible that the sampling algorithm that we originally
used to sample the Twitter messages may have
introduced some biases which would mean that our sample
is not representative of Twitter as a whole. First, the sampling
process used can have some bias toward Twitter users that
have had messages sent to them. To mitigate this, we made
sure that unsampled users were only placed once on the list
of users to be sampled, even if they had been messaged
by several previously sampled users. The second issue is
that there may be a bias toward certain communities, especially
toward the community of the user first sampled.
We cover this in more detail in a previous paper (Bryden
et al., 2013) arguing that the sampler will move to random
communities relatively quickly. We found that our sam-
pling method detected a broad variety of communities and
this indicates the sample is likely to be representative of
the population.
Interesting future topics which are possible extensions
of our work include theory on out-groups, where theory
such as Communication Accommodation Theory and the
Social Identity Model of Deindividuation predict diver-
gence when interlocutors message certain external groups.
We did not find any evidence of this in our study as we
found that external messages for a particular group were
still closer to the internal messages of the group than any
other. Further investigations of how language character-
istics converge and/or diverge over time may shed some
light on this topic and be of interest in their own right.
Finally, we may also be able to improve an algorithm that
predicts the groups of individuals based on their language
patterns (Bryden et al., 2013), by comparing an individ-
ual’s language use against that of only the internal tweets
of the groups.
Even though the conversations we studied on Twitter
were made up of very short text messages which are publicly
posted, these results indicate that many complex
features of normal offline communication take place on-
line. While such behaviour may not be evident at a small
scale, the large quantities of data used in this study meant
that we were able to identify these subtle patterns. This
indicates that future studies on social identity, social be-
haviour and cooperation are likely to prove fruitful.
5. Acknowledgements
Thanks to Shaun Wright, Tim Harrison and the anony-
mous reviewers. This work was supported by the Eco-
nomic and Social Research Council (grant ES/L000113/1).
References
Ackland, R., O’Neil, M., 2011. Online collective identity: The case
of the environmental movement. Social Networks 33 (3), 177–190.
URL http://www.sciencedirect.com/science/article/pii/S0378873311000153
Ashforth, B. E., Mael, F., 1989. Social identity theory and the orga-
nization. Academy of Management Review 14 (1), 20–39.
URL http://amr.aom.org/content/14/1/20.short
Bell, A., 1984. Language style as audience design. Language in
society 13 (2), 145–204.
URL http://journals.cambridge.org/production/action/cjoGetFulltext?fulltextid=2990992
Blondel, V. D., Guillaume, J.-L., Lambiotte, R., Lefebvre, E., 2008.
Fast unfolding of communities in large networks. Journal of Sta-
tistical Mechanics: Theory and Experiment 2008 (10), P10008.
Boyd, R., Richerson, P. J., 2009. Culture and the evolution of hu-
man cooperation. Philosophical Transactions of the Royal Society
B: Biological Sciences 364 (1533), 3281–3288.
URL http://rstb.royalsocietypublishing.org/content/364/1533/3281.short
Bryden, J., Funk, S., Geard, N., Bullock, S., Jansen, V. A., 2011. Sta-
bility in flux: Community structure in dynamic networks. Journal
of The Royal Society Interface 8 (60), 1031–1040.
URL http://rsif.royalsocietypublishing.org/content/8/60/1031.short
Bryden, J., Funk, S., Jansen, V. A., 2013. Word usage mirrors com-
munity structure in the online social network Twitter. EPJ Data
Science 2 (1), 1–9.
URL http://link.springer.com/article/10.1140/epjds15
Carroll, K. S., 2008. Puerto Rican language use on MySpace.com.
Centro Journal 20 (1), 96–111.
URL http://www.redalyc.org/articulo.oa?id=37720110
Christopherson, L., 2011. Can u help me plz?? Cyberlanguage
accommodation in virtual reference conversations. Proceedings
of the American Society for Information Science and Technology
48 (1), 1–9.
URL http://onlinelibrary.wiley.com/doi/10.1002/meet.2011.14504801080/full
Costolo, R., 2013. Twitter, Inc.: Initial public offering.
URL http://www.sec.gov/Archives/edgar/data/1418091/000119312513424260/d564001ds1a.htm
Danescu-Niculescu-Mizil, C., Gamon, M., Dumais, S., 2011. Mark
my words!: Linguistic style accommodation in social media. In:
Proceedings of the 20th international conference on World wide
web. p. 745–754.
URL http://dl.acm.org/citation.cfm?id=1963509
Efron, B., Tibshirani, R. J., 1993. An Introduction to the Bootstrap.
Chapman and Hall/CRC, New York.
Ethier, K. A., Deaux, K., 1994. Negotiating social identity when con-
texts change: Maintaining identification and responding to threat.
Journal of Personality and Social Psychology 67 (2), 243.
URL http://psycnet.apa.org/journals/psp/67/2/243/
Fortunato, S., 2010. Community detection in graphs. Physics
Reports 486 (3), 75–174.
URL http://www.sciencedirect.com/science/article/pii/S0370157309002841
Gallagher, E. D., 1999. COMPAH documentation.
URL http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.9.1334&rep=rep1&type=pdf
Gallois, C., Ogay, T., Giles, H., 2005. Communication accommo-
dation theory: A look back and a look ahead. In: Gudykunst,
W. B. (Ed.), Theorizing About Intercultural Communication.
Sage, Thousand Oaks, CA, p. 121–148.
URL http://espace.library.uq.edu.au/view/UQ:72030
Giles, H., 1973. Accent mobility: A model and some data. Anthro-
pological Linguistics 15 (2), 87–105.
URL http://www.jstor.org/stable/10.2307/30029508
Gomaa, W. H., Fahmy, A. A., 2013. A survey of text similarity ap-
proaches. International Journal of Computer Applications 86 (13),
13–18.
Gregory, M., Carroll, S., 1978. Language and situation: Language
varieties and their social contexts. Routledge and Kegan Paul,
London, Henley and Boston.
URL http://www.getcited.org/pub/101782218
Gumperz, J. J., 1958. Dialect differences and social stratification in
a North Indian village. American Anthropologist 60 (4), 668–682.
URL http://onlinelibrary.wiley.com/doi/10.1525/aa.1958.60.4.02a00050/abstract
Hogg, M. A., Reid, S. A., 2006. Social identity, self-categorization,
and the communication of group norms. Communication Theory
16 (1), 7–30.
URL http://onlinelibrary.wiley.com/doi/10.1111/j.1468-2885.2006.00003.x/full
Hormuth, S. E., 1990. The ecology of the self: Relocation and
self-concept change. Cambridge University Press.
URL http://books.google.co.uk/books?hl=en&lr=&id=k9cn1y9iimMC&oi=fnd&pg=PR15&dq=hormuth+1990&ots=UWAaS2HDOn&sig=0RErtS-V_S9Umqk_pYJ25yBnraA
Iwasaki, S., Horie, P. I., 2000. Creating speech register in Thai
conversation. Language in Society 29 (04), 519–554.
URL http://journals.cambridge.org/abstract_S0047404500004024
Java, A., Song, X., Finin, T., Tseng, B., 2007. Why we twitter:
Understanding microblogging usage and communities. In: Pro-
ceedings of the 9th WebKDD and 1st SNA-KDD 2007 workshop
on Web mining and social network analysis. p. 56–65.
URL http://dl.acm.org/citation.cfm?id=1348556
Labov, W., 1966. The linguistic variable as structural unit. Wash-
ington Linguistics Review 3, 4–22.
Manning, S., 2009. Animal lovers throw ‘pawpawties’ for charity.
SFGate.
URL http://www.sfgate.com/bayarea/article/Animal-lovers-throw-pawpawties-for-charity-3285436.php#src=fb
Newman, M. E., 2006. Modularity and community structure in net-
works. Proceedings of the National Academy of Sciences 103 (23),
8577–8582.
URL http://www.pnas.org/content/103/23/8577.short
Newman, M. E., Girvan, M., 2004. Finding and evaluating commu-
nity structure in networks. Physical Review E 69 (2), 026113.
URL http://pre.aps.org/abstract/PRE/v69/i2/e026113
Porter, M. A., Onnela, J.-P., Mucha, P. J., 2009. Communities in
networks. Notices of the AMS 56 (9), 1082–1097.
URL http://www.ams.org/notices/200909/rtx090901082p.pdf
Postmes, T., Spears, R., Lea, M., 2000. The formation of group
norms in computer-mediated communication. Human Communi-
cation Research 26 (3), 341–371.
URL http://onlinelibrary.wiley.com/doi/10.1111/j.1468-2958.2000.tb00761.x/abstract
Postmes, T., Spears, R., Sakhel, K., De Groot, D., 2001. Social
influence in computer-mediated communication: The effects of
anonymity on group behavior. Personality and Social Psychology
Bulletin 27 (10), 1243–1254.
URL http://psp.sagepub.com/content/27/10/1243.short
Rampton, B., 1995. Crossing: Language and ethnicity among ado-
lescents. Longman, London.
URL http://www.getcited.org/pub/103194568
Riordan, M. A., Markman, K. M., Stewart, C. O., 2013. Communication
accommodation in instant messaging: An examination of
temporal convergence. Journal of Language and Social Psychology
32 (1), 84–95.
URL http://jls.sagepub.com/content/32/1/84.short
Scott, C. R., 2007. Communication and social identity theory: Ex-
isting and potential connections in organizational identification
research. Communication Studies 58 (2), 123–138.
URL http://www.tandfonline.com/doi/abs/10.1080/10510970701341063
Shepard, C. A., Giles, H., Le Poire, B. A., 2001. Communication
accommodation theory. In: Robinson, W. P., Giles, H. (Eds.),
The new handbook of language and social psychology. John Wiley,
New York, p. 33–56.
Tajfel, H., Turner, J. C., 1979. An integrative theory of intergroup
conflict. In: Austin, W. G., Worchel, S. (Eds.), The social
psychology of intergroup relations. Brooks/Cole, Monterey CA,
pp. 33–47.
URL http://dtserv2.compsy.uni-jena.de/ss2009/sozpsy_uj/86956663/content.nsf/Pages/58BD3B477ED06679C125759B003B9C0F/$FILE/Tajfel%20Turner%201979.pdf
Traud, A. L., Mucha, P. J., Porter, M. A., 2012. Social structure
of Facebook networks. Physica A: Statistical Mechanics and its
Applications 391 (16), 4165–4180.
URL http://www.sciencedirect.com/science/article/pii/S0378437111009186
Virk, A., 2011. Twitter: The strength of weak ties. University of
Auckland Business Review 13 (1), 19–21.
URL http://www.thebookshelf.auckland.ac.nz/docs/UABusReview/2011_13_i01-5-twitter.pdf
Wagner, C., Asur, S., Hailpern, J., 2013. Religious politicians and
creative photographers: Automatic user categorization in Twitter.
In: ASE/IEEE International Conference on Social Computing
(SocialCom2013).
URL http://claudiawagner.info/publications/socialcom_userClassification_short_new.pdf