Chat Speed OP : Practices of
Coherence in Massive Twitch Chat
Colin Ford
Department of Anthropology
University of California, Irvine
cford1@uci.edu
a.m. tsaasan
Department of Informatics
University of California, Irvine
tsaasan@uci.edu
Dan Gardner
Department of Informatics
University of California, Irvine
dlgardne@uci.edu
Bonnie Nardi
Department of Informatics
University of California, Irvine
nardi@uci.edu
Leah Elaine Horgan
Department of Informatics
University of California, Irvine
horganl@uci.edu
Jordan Rickman
Department of Informatics
University of California, Irvine
jrickman@uci.edu
Calvin Liu
Department of Informatics
University of California, Irvine
calvinl1@uci.edu
Copyright is held by the owner/author(s).
CHI’17 Extended Abstracts, May 06-11, 2017, Denver, CO, USA. ACM 978-1-4503-4656-6/17/05.
http://dx.doi.org/10.1145/3027063.3052765
Abstract
Twitch.tv, a streaming platform known for video game content,
has grown tremendously since its inception in 2011. We ex-
amine communication practices in Twitch chats for the popu-
lar game Hearthstone, comparing massive chats with at least
10,000 concurrent viewers and small chats with fewer than 2,000
concurrent viewers. Due to the large scale and fast pace of mas-
sive chats, communication patterns no longer follow models de-
veloped in previous studies of computer-mediated communica-
tion. Rather than what other studies have described as commu-
nication breakdowns and information overload, participants in
massive chats communicate in what we call “crowdspeak.”
Author Keywords
Twitch; chat; computer-mediated communication; CMC
ACM Classification Keywords
H.1.2 Human Factors
Introduction
Launched in 2011, Twitch.tv is an online video-hosting platform
where gamers live-stream their gameplay. The word twitch con-
notes “short,” “sudden,” even “convulsive,” a fitting term for the
rapid and seemingly incomprehensible discourse found in the
massive Twitch chats for top games like League of Legends and
Hearthstone that regularly draw more than 10,000 concurrent
viewers [21]. Individual messages often have only a few seconds
of screen time before rushing out of view. Massive chats thus
present unique circumstances for collective communication, as
the number of concurrent viewers exceeds, by thousands, that of
most Internet Relay Chat (IRC) channels or other chat spaces.
alt.chi: Augmented bodies and interactions
CHI 2017, May 6–11, 2017, Denver, CO, USA
This work is licensed under a Creative Commons
Attribution-NonCommercial-ShareAlike International 4.0 License.
Building on Hamilton et al.’s distinction between small and mas-
sive Twitch chats [5], we compared the two, finding that partic-
ipants in massive chats developed a distinctive form of com-
munication supporting large-scale interaction, or what we call
“crowdspeak.” Crowdspeak may appear chaotic, meaningless,
or cryptic. However, we discovered “practices of coherence” that
make massive chats legible, meaningful, and compelling to par-
ticipants. By coherence, we simply mean that the chat makes
sense to participants and is not experienced as a breakdown,
overload, or other difficulty.
Figure 1: Massive chat (TrumpSC’s
stream); 13 messages / 4 seconds.
Chat has been a topic of academic research for decades, with
early studies dating back to the 1970s [10]. These studies largely
center on conversational modes of speech, where dialogue
takes place in interspersed threads through which participants
address specific others, either as individuals or groups [4, 5,
6, 7, 17, 26, 28]. A key concern in this research has been doc-
umenting the ways coherence is sustained or dissolved. For
example, Greenfield and Subrahmanyam found that teen chat
participants maintained conversational coherence by creating
small discussion groups within larger chats, using visual cues,
and making use of abbreviations. In particular, coherence was
achieved by “establishing who is participating in a particular con-
versation and establishing what constitutes a relevant response”
[4]. Though the technology used by Twitch is nearly identical to
such systems, the massive chats we studied look quite different
from what is conventionally considered “coherent”: massive
chats are filled with non-sequiturs, verbatim repetition, varia-
tions on prior messages, blocks of text copied from elsewhere,
and tiny messages that include only a few words or emotes. As
conventional conversational coherence “breaks down” [5], par-
ticipants achieve a different kind of coherence that prioritizes
crowd-based reaction and interaction over interpersonal conver-
sation.
We studied Twitch channels streaming the popular Blizzard En-
tertainment game, Hearthstone: Heroes of Warcraft. Twitch’s
Hearthstone section features a variety of prominent stream-
ers such as Reynad, Kripparian, and TrumpSC, whose events
attract upwards of 10,000 concurrent viewers. The game also
supports small streamers such as Ryzen, Alliestrasza, and Za-
laeHS, whose audiences range from around 100–2,000 viewers.
Hearthstone thus presents an ideal opportunity to study differ-
ences in communication patterns by viewers of the same game,
making it possible to examine whether and how communication
differs by chat size.
To assess communication patterns, we observed and recorded
small and massive chats between April and August 2016. We
hypothesized that: 1) messages in massive chats would be
shorter in length (reducing the time needed for input); 2) mas-
sive chats would contain less original content (making it easier
to grasp meaning); and, 3) massive chats would contain fewer
unique “voices,” i.e., perspectives or stances. To test these hy-
potheses, we analyzed 50-message segments of text from five
small chats (<2,000 viewers) and five massive chats (>10,000
viewers). We measured message length, amount of original con-
tent, and number of unique voices in each segment. We used
linguist John Sinclair’s theory of lexical items to define metrics
for original content. Noting that words often hang together in
meaningful combinations, Sinclair argued that the most impor-
tant units of linguistic analysis are not words, but lexical items.
Sinclair defined lexical items as “units of meaning” that may be
words, but are often word pairs or groups, such as phrases like
“the naked eye” or “diamond in the rough” [19]. We quantified
original content by counting unique lexical items in each 50-
message segment. We found that massive chats featured fewer
unique lexical items, i.e., less original content, which helped mitigate
their rapid pace.
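As a rough illustration of counting lexical items rather than words, the sketch below groups known multi-word phrases into single items using a greedy longest match, then counts the distinct items in a segment. The phrase list, the function names, and the automated matching itself are assumptions made for illustration only; the segments in this study were coded by hand.

```python
# Hypothetical phrase list for illustration; the study's lexical items
# were identified by hand coding, not by a fixed dictionary.
KNOWN_PHRASES = {
    ("the", "naked", "eye"),
    ("diamond", "in", "the", "rough"),
}
MAX_PHRASE_LEN = max(len(phrase) for phrase in KNOWN_PHRASES)


def lexical_items(message):
    """Split a message into lexical items using greedy longest match."""
    tokens = message.lower().split()
    items = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate phrase first, then shorter ones.
        for size in range(min(MAX_PHRASE_LEN, len(tokens) - i), 1, -1):
            candidate = tuple(tokens[i:i + size])
            if candidate in KNOWN_PHRASES:
                items.append(" ".join(candidate))
                i += size
                break
        else:
            # No known multi-word phrase starts here: the word stands alone.
            items.append(tokens[i])
            i += 1
    return items


def unique_item_count(messages):
    """Count distinct lexical items across a segment of messages."""
    seen = set()
    for message in messages:
        seen.update(lexical_items(message))
    return len(seen)
```

Under these assumptions, "visible to the naked eye" yields three lexical items from five words, which is the kind of gap between word counts and lexical-item counts that the metric is meant to capture.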
In massive and small chats we measured message length by
counting the number of lexical items per message. We found
that massive chat messages contained fewer lexical items on
average, requiring less time for participants to input a message.
We measured original chat content by counting the number of
unique lexical items per 50-message segment. We found that
massive chat segments contained less original content on aver-
age. Although we did not quantify how often rhetorical elements
such as emotes and copypastas were repeated, through our col-
lective 300 hours of qualitative work, and based on one author’s
extensive long-term participation in Twitch chat communities,
we observed that there is a great deal of repetition and reuse of
lexical items in massive Twitch chats. This repetition of familiar
elements recalls the practice of bricolage, a concept from Lévi-
Strauss’s work [11]. Bricolage indicates opportunistic use and
remixing of elements from a fairly small repertoire. Bricolage
occurs in massive chats in the use and reuse of a limited set of
lexical items. Some may be small variants on prior elements.
Lexical items in the Twitch set include words, phrases, emotes,
commands, and copypastas. Emotes are small digital icons, of-
ten a face or character. Copypastas are blocks of text repeated
by participants through the “copy” and “paste” commands. They
are frequently found in Twitch chat, but may originate in other
sites or forums.
Figure 2: Small chat (ZalaeHS’
stream); 16 messages / 111 seconds.
We drew from Trausan-Matu and Rebedea’s work [22] to specify
the voices in a segment of text. Based on Bakhtin’s work,
Trausan-Matu and Rebedea argued that voices are not equiva-
lent to individual participants, but represent shared viewpoints or
stances [23]. Their work highlights how several individuals may
join into a single voice, representing a common perspective or
approach. Or, the inverse may occur, where the same individual
adopts multiple voices, switching positions and roles as conver-
sation unfolds. We define a shared voice as a communicational
position that multiple participants adopt by adhering to a con-
sistent viewpoint, syntax, or style of speech. Shared voices can
be seen in chat when, for example, several participants repeat
the same or similar emotes or phrases (Figure 1). We calculated
the total number of voices per 50-message segment, finding that
massive and small chats exhibited a comparable total number of
voices, despite the fact that massive chats had nearly double the
number of individual participants. This consolidation of voices in
massive chats supports communication at scale.
The communication practices we observed entail a shift away
from individual, conversational speech towards collectivized
crowdspeak which maintains coherence by reducing the total
volume of meaningful content participants produce and process.
The crowdspeak we observed did not attempt to build sequential
threads of conversation in the manner of small scale chat dis-
cussed by Greenfield and Subrahmanyam [4] and others [5, 6,
7, 17, 26, 28]. Massive Twitch chats instead supported a playful
form of participation more akin to chanting, clapping, or doing
“the wave” in a large sports arena, where participation is en-
hanced by a crowd that not only watches, but speaks.
Large-scale text-based communication practices have implica-
tions for a still-growing internet. Global events such as political
inaugurations, debates, and the Olympics are now routinely live-
streamed. Due in part to the success of Twitch, many websites
such as YouTube Live, Facebook Live, and Periscope offer real-
time, concurrent chat to viewers alongside a stream’s video feed.
These technologies encourage crowdspeak as a form of active
participation, potentially altering the way in which participation in
major world events is viewed and experienced online.
Background
Twitch has increased in size and scope year by year, growing
from 35 million unique monthly viewers in 2013 to 100 million in
2014. Twitch supports nine billion+ chat messages annually [24,
25]. As a gaming-specific spinoff of the live-streaming platform
Justin.tv, Twitch had eclipsed its parent platform by 2014 when it
was acquired by Amazon.com [18].
Figure 3: Hearthstone: Heroes of Warcraft; streamer TrumpSC;
gameplay left; TrumpSC’s webcam feed top right; chat rightmost.
That same year, game developer Blizzard Entertainment re-
leased Hearthstone, an online collectible card game set in their
fantasy Warcraft universe. System-matched players play each
other by casting spells and summoning creatures via cards with
numerical values representing tactical information such as health
and damage. Blizzard distributed closed beta keys to well-known
Twitch streamers who built interest in the game by streaming
themselves playing Hearthstone live before most people could
play it at all [2]. By the time Hearthstone had been released to
the general public, the game had already developed a dedicated
audience on Twitch, with a few streamers devoting themselves to
streaming Hearthstone full time. In 2016, Hearthstone was one
of the top three games on Twitch by viewership and appears to
be maintaining a top spot in 2017.
We refer to people who broadcast their gameplay on Twitch as
streamers. Those who watch streaming events are viewers.
Participants contribute to chat. Each streamer has a channel
that goes online whenever s/he starts streaming. Viewers tune
into channels to watch live stream events and can follow their
favorite channels to be notified when the streamer is online.
Streamers broadcast from a webcam alongside direct feeds of
gameplay, allowing viewers to watch the game and the streamer
simultaneously. Text-based chat occurs in the chat window be-
side the video display on the webpage. Most streamers split
their attention between chat and playing the game, continuously
responding to questions, reactions, ideas, or demands issuing
from the chat (Figure 1). This setup allows chat participants to
type and converse with one another in real time as the event
unfolds.
By default, Twitch stream events are filtered by game, enabling
viewers to select a game of interest and then pick from a panel
of potential streamers to watch. Stream events are sorted by live
viewership counts, which are displayed prominently below the
stream title, permitting viewers to select a stream event based
on its current popularity. Top professional streamers can earn six
figure annual incomes [9] sourced from viewer donations, prod-
uct sponsorships, advertising, and optional viewer subscriptions
[3].
Related Work
As far back as 1978, scholars discussed how computer-mediated
communication (CMC) affects human communication. Kochen
[10] observed CMC as a “new linguistic entity with its own vo-
cabulary, syntax, and pragmatics.” In 1987, Rice and Love [16]
predicted that CMC could “change the psychology and sociology
of the communication process itself.” In a computer conferencing
study of CompuServe, they found that CMC “does not assume
the importance of direct paths between [individual] users”—
presciently anticipating the significance of the crowd in some
forms of CMC (vs dyads or small groups). However, most re-
search has, to date, focused on the smaller scale spaces. For
example, Schiano [17] found that in LambdaMOO, a popular
MUD, “small, private, even exclusive social interactions were the
rule, not the exception” with a good deal of interaction occurring
“in the presence of one simultaneously active companion.” Many
studies continue to assume that the small group is optimal [5, 7,
14].
Figure 4: Massive chat (Reckful’s
stream); Patterns of communication
may initially appear chaotic.
Herring [6] discussed “interactional coherence in CMC,” arguing
that, “It is possible for CMC to be simultaneously incoherent and
enjoyable.” Greenfield et al. [4] picked up the theme of coher-
ence, finding that coherence was achieved in the teen chatroom
they studied as participants clustered in dyads or small groups:
“Many participants [in the large heterogeneous chat]...grouped
themselves in dyads or smaller groups, with each group main-
taining its own conversational thread.” Weisz et al. [27] asked
whether integrating text chat with video “enhanced or harmed”
participant experience, concluding that “socializing around media
is perhaps just as important as the media itself, and supporting
social interactions during media consumption can significantly
affect, and we hope enhance, the viewing experience.” Studies
such as these, and many continuing into the present, assume
that “relationships” and participant self-expression and identity
are critical for successful chat.
Werry [28] observed that within multi-stranded IRC conver-
sations, language is “heavily abbreviated” with “syntactically-
reduced forms, the use of acronyms and symbols, [and] the clip-
ping of words.” Varnhagen et al. [26] reported that instant mes-
saging participants similarly developed “short cuts for express-
ing words, phrases, and emotions.” Jones et al. [7] found that
as chat size grew, the number of messages posted per partici-
pant declined, eventually reaching an asymptotic level at which
the number of posters “remain[ed] constant.” They reported a
limit of about 600 messages per 20-minute interval: viewers
can “absorb [up to] 30 messages per minute.” They explained
these findings as “constraints resulting from information over-
load.” After the number of users (including both those who post
messages and those who do not) exceeds about 220, “the com-
munity loses viability altogether” because people will not have
their posts read and therefore will not post [7].
Jones et al. [7] noted that “while IRC is an old technology, it is
still used by millions of people around the globe on a daily ba-
sis” and is highly relevant to research. We concur, and find that
contemporary studies of Twitch.tv are few but growing, contribut-
ing important understandings of changing patterns of commu-
nication. In a quantitative study of video game live streaming
on Twitch, Kaytoue et al. [8] observed that most viewer traffic
goes to a small number of Twitch streamers: “The top 10% [of]
streamers concentrates 95% of all views, showing that audience
attention is grabbed by a very small set of streamers.” Hamilton
et al. distinguished between small and massive Twitch chats,
designating massive chats as those with more than 1,000 view-
ers. Hamilton et al. argued that increasing viewer size threatens
a breakdown of “meaningful interaction,” due to “huge, com-
pletely unreadable chats” [5]. At the same time, they acknowl-
edged that massive chats are “compelling to some.” Pan et al.
[14] developed TwitchViz to help players and researchers ex-
amine chat behaviors, consistent with Jones et al.’s recommen-
dation in 2008 that visual tools will be helpful. Given the rising
numbers of viewers, “users are now often overloaded with in-
formation...mak[ing] it challenging for streamers to maintain an
understanding of their own communities” [14]. Deng et al. noted
the importance of Twitch.tv to industry, observing that 41 new
games were pre-released on Twitch as promotion. They ex-
pect that “games will increasingly be designed with Twitch-like
broadcast in mind” [2]. Building on Cheung and Huang’s frame-
work of spectator experiences, Smith et al. studied the YouTube
Let’s Play community observing that “the viewer her/himself [per-
forms]...a very active and engaging role as part of the audience”
[1, 20]. This study is not about Twitch.tv, but the authors com-
mented that some Let’s Players also use Twitch.tv to stream,
arguing for the agentic quality of audience participation in both
venues.
Methods
We used both qualitative and quantitative methods. One au-
thor has extensive experience playing Hearthstone, has been
a long term participant on Twitch.tv, and provided contextual in-
formation about the game and chat practices. Authors without
previous experience with Hearthstone learned to play the game
during the initial weeks of the study.
Figure 5: Terms and definitions.
We conducted 300 collective hours of observation watching
Twitch chats and participating by playing Hearthstone between
April and August 2016. We identified five channels whose stream-
ing events commonly drew massive viewer counts and five
channels whose streaming events commonly drew small viewer
counts. As our primary intent was to understand massive chats,
small channels were included to provide a comparative point of
reference. From May 18 to August 19, 2016, we collected two
50-message segments from each channel, for a total of twenty
50-message segments. 50-message segments were large
enough to observe patterns, but not so large as to be intractable
for the necessary hand coding. We analyzed the segments in
researcher pairs or triads to avoid skewing the results toward the
potential biases of a single coder. Streaming events lasted from
three to seven hours. Often as participants joined chat, they
greeted the streamer and others, regardless of whether they
continued to participate. To allow the chat to stabilize to those
actively participating, we collected 50-message segments 90
minutes into an event.
Within our 50-message segments, we used four primary metrics
to measure chat: 1. scroll rate; 2. message length; 3. chat con-
tent; 4. voices. Scroll rate was measured in lines of chat per time
elapsed, as lines/second. Message length was measured in lexical
items per message. Chat content was measured in number
of unique lexical items per 50-message segment. Unique lexical
items were counted as they first appeared in each segment
(see rows 2 and 3, Figure 6). Voices were measured in total
voices per 50-message segment. We also used three secondary
metrics: 1. word count per message; 2. unique word count per
50-message segment; 3. participant count per 50-message segment.
These secondary metrics helped interpret the primary
metrics, as will be described in Findings (see Figure 5).
Figure 6: Example Coding Activity; 6-message segment.
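The four primary metrics and the participant count can be sketched as a single pass over a hand-coded segment. This is a minimal illustration, not the procedure used in the study: the `CodedMessage` record, its field names, and the choice to measure elapsed time from the first to the last message are all assumptions, and the lexical items and voice labels are taken as already assigned by human coders.

```python
from dataclasses import dataclass


@dataclass
class CodedMessage:
    """One hand-coded chat line (hypothetical record layout)."""
    seconds: float     # time offset of the message within the stream
    participant: str   # username of the sender
    items: list        # lexical items assigned by the coders, in order
    voice: str         # voice label assigned by the coders


def segment_metrics(segment):
    """Compute the per-segment metrics for a list of CodedMessage records."""
    # Assumption: elapsed time runs from the first to the last message.
    elapsed = segment[-1].seconds - segment[0].seconds
    unique_items = set()
    for msg in segment:
        unique_items.update(msg.items)
    return {
        # Scroll rate: lines of chat per second of elapsed time.
        "scroll_rate": len(segment) / elapsed if elapsed > 0 else float("nan"),
        # Message length: mean lexical items per message.
        "items_per_message": sum(len(m.items) for m in segment) / len(segment),
        # Chat content: unique lexical items in the segment.
        "unique_items": len(unique_items),
        # Voices: distinct voice labels in the segment.
        "voices": len({m.voice for m in segment}),
        # Secondary metric: distinct participants in the segment.
        "participants": len({m.participant for m in segment}),
    }
```

For a toy segment of four invented messages spanning two seconds, in which two participants repeat the same emote under the same voice label, the function reports fewer unique items and fewer voices than messages, which is the pattern the metrics are designed to expose.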
Findings
Our study revealed that massive chat participants styled their
communication around three interrelated practices of coherence:
shorthanding, bricolage, and voice-taking. Shorthanding is the
contraction of text into a smaller space. Bricolage is the recom-
bining of elements from a small repertoire. Voice-taking is the
adoption of shared viewpoints, perspectives, or mannerisms.
Shorthanding, bricolage, and voice-taking were intertwined; we
delineate these three practices in our analysis in order to high-
light how each produced coherence in massive chats. We be-
lieve that these three practices help explain why massive chats
consistently attract large audiences and do not seem to produce
troubling communication breakdowns, instead constituting a dif-
ferent kind of communication at scale, i.e., crowdspeak.
Figure 7: Twitch emotes and their
meanings.
In massive chats, fast scroll-rates and high participant counts
created conditions where an individual message enjoyed only
a small amount of screen time, and an individual participant’s
contributions formed only a small part of the entire chat con-
tent. Massive chats had an average of 47 participants per 50-
message segment, compared to 25 participants in small chat
segments (rounding for simplicity). Massive chats flowed by at a
considerably swifter pace than small chats, with 1.74 lines/second
in massive chats compared to 0.27 lines/second in small chats.
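To put these scroll rates on the same footing as the figure of 30 messages per minute quoted from Jones et al. [7] in Related Work, the rates can be converted to lines per minute. The sketch below does only that arithmetic. It assumes that one line of chat corresponds to one message, which the text does not state, and the function names are illustrative.

```python
# Mean scroll rates reported in the text above, in lines per second.
MEAN_SCROLL_RATE = {"massive": 1.74, "small": 0.27}

# Absorption figure quoted from Jones et al. [7] in Related Work.
ABSORB_LIMIT_PER_MINUTE = 30


def lines_per_minute(lines_per_second):
    """Convert a scroll rate from lines/second to lines/minute."""
    return lines_per_second * 60


def share_of_limit(lines_per_second):
    """Express a scroll rate as a multiple of the quoted per-minute limit."""
    return lines_per_minute(lines_per_second) / ABSORB_LIMIT_PER_MINUTE


for size, rate in MEAN_SCROLL_RATE.items():
    print(f"{size}: {lines_per_minute(rate):.1f} lines/minute, "
          f"{share_of_limit(rate):.2f}x the quoted limit")
```

On these numbers, massive chats run at roughly 104 lines per minute, about three and a half times the quoted limit, while small chats run at roughly 16 lines per minute, about half of it.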
Shorthanding: Our first hypothesis, that massive chat mes-
sages would be shorter, was supported. Messages in massive
chats contained fewer lexical items. Massive chat segments
contained an average of 3.0 lexical items per message while
small chat segments contained an average of 5.5 lexical items
per message. Counting lexical items allowed us to differentiate
between messages that may have had more or less semantic
content; for example, messages containing a single emote ver-
sus messages with text extending to the end of the line. Lexical
items allowed us to analyze pictorial messages where meaning-
ful units were defined by images rather than words.
Figure 8: Massive chat (nl_Kripp’s
stream); YOLOGG
Metrics such as number of words and line length (which we also
counted) seem like they would be good indicators of message
length. But they proved inadequate for our analysis as they
did not accurately reflect the structure of utterances in massive
chats. For example, many messages were image-based, includ-
ing ASCII art and emotes, which may not typically be considered
words. There was no difference in average word count per mes-
sage; both massive and small chats averaged six words per
message (rounded). Line count per message also was impre-
cise; it did not distinguish between messages that filled part of a
line versus all of a line. In an environment where messages con-
taining a single word or emote can be as meaningful as longer
phrases that might fill an entire line, we found lexical items to be
a more useful metric.
Shorthanding occurred in the form of acronyms, abbreviations,
emotes, and single word commands. These forms reduced the
number of lexical items per message. For example, “!uptime”
was a frequently used command answered by a chatbot that
replied with the time the current stream event had been live. This
command allowed participants to circumvent a lengthier, back-
and-forth dialogue, reducing the number of lexical items needed
to query and convey the event’s uptime. Likewise, the command
“!deck” anticipated and mitigated a dialogue that would have
required more lexical items in order to procure information re-
garding commonly used card decks.
Other forms of shorthand worked because they relied on insider
knowledge and references that allowed brief but vivid utterances.
For instance, “Yogg-Saron, Hope’s End,” a popular card used by
many Hearthstone streamers, casts a series of random spells
at random targets, resulting in an infamous ability to either win
or lose the game in spectacular fashion. The card, often simply
called "Yogg," has become associated with the phrase YOLO
(You Only Live Once), with some streamers shouting “YOLO” in
anticipation of the card backfiring. This reference has been fur-
ther developed into the acronym “YOLOGG” (Figure 8), which
succinctly combines YOLO and Yogg-Saron, invoking the spec-
tacle and randomness of Yogg-Saron without the need to explain
it. These evolving layers of meaning provide complex shorthand
references that may be specific to the game in question, to the
streamer, to Twitch, or even to current world events. For exam-
ple, during the time of our observations there were joking refer-
ences to Harambe, a gorilla at the Cincinnati Zoo who was killed
by a zoo worker when a 3-year-old boy climbed into the animal’s
enclosure. Comments such as “RIP HARAMBE BibleThump” ap-
peared in chats after the playing of Hearthstone cards that fea-
tured a monkey or gorilla. While shorthand is not unique to mas-
sive chats [26, 28], its heavy use made for shorter messages in
rapidly moving chat, allowing colorful participant responses to
gameplay, Twitch, or broader cultural events.
Bricolage: Our second hypothesis, that massive chats would
contain less original content, was supported. We found that
massive chats included less unique content per 50-message
segment. There was an average of 85 unique lexical items per
segment in massive chats, compared to 169 in small chats
(rounded). Our qualitative observations indicate that frequent
repetition of lexical items in massive chats explains this striking
halving of original content.
We characterize the practices at work in this reduction of unique
content as “rhetorical” bricolage. Lévi-Strauss identifies brico-
lage as the practice of recombining a small set of resources-
at-hand (such as known characters, tropes, and images) to
construct collective narratives [11]. Chat participants practiced
bricolage by recombining emotes, stock phrases (e.g., the Twitch
phrase “top deck” or the popular phrase “drop it like it’s hot”), and
copypastas.
Figure 9: Massive chat (nl_Kripp’s
stream); Copypasta about copypasta.
Emotes were common lexical items in Twitch bricolage. Partic-
ipants’ reactions to the streamer’s gameplay often consisted of
single emotes or emote-word combinations. Different emotes
were associated with different reactions. Rapid repetition of the
same emotes often occurred, such that streamers could quickly
glean a sense of chat sentiment and reply accordingly. For in-
stance, a string of PogChamp messages indicated amazement
at an impressive play or situation, whereas LULs constituted
laughter at the streamer’s mistakes or bad luck. Participants’ use
of a shared emote lexicon led to a great deal of repetition (and
thus less original content) in massive chats. We did not observe
such heavy use of emotes in small chats.
Copypastas are exemplars of bricolage in their referencing,
reusing, and/or remixing of previous elements of chat content.
They tended to follow a formula, but a flexible one, in which
participants could combine multiple copypastas or add custom
elements. Each copypasta was either repeated verbatim or con-
structed with small, often playful variations on a previous copy-
pasta. For instance, in Figure 9, several participants repeated
a copypasta that commented on the experience of propagat-
ing copypastas, parodying participants “instinctively” copying
and pasting in massive streams, with an ironic nod to the “pasta
that conveys no information nor is particularly witty or funny.”
Copypastas were remixed to suit specific contexts. A common
copypasta was “ONE MORE (LUL) AND I’M OUT.” During a Krip-
parian event, several variations on this copypasta occurred:
“ONE MORE OUT AND I’M (LUL)”
“ONE MORE (LUL) AND YOGG IS DONE”
The content of individual copypastas became less important
than the patterns of serial messages, producing coherence
through reduction of original content.
Voice-taking: Our final hypothesis, that massive chats would
contain fewer unique voices, was not supported. Total voice
counts were comparable across the two chat sizes. However, when
we analyzed voices as a ratio of total participants to voices,
there was a clear asymmetry, with an average of 29 unique
voices to 47 participants (rounded) in massive chats, and an
average of 26 unique voices to 25 participants in small chats.
With nearly twice the number of participants in massive vs small
chats, the similarity in voice count is notable.
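The asymmetry can be made concrete by dividing the rounded averages reported above. The snippet below is only that arithmetic; the function name is illustrative, and because the inputs are rounded segment averages the result is approximate.

```python
def voices_per_participant(voices, participants):
    """Average number of distinct voices per participant in a segment."""
    return voices / participants


# Rounded averages reported in the text above.
massive = voices_per_participant(voices=29, participants=47)
small = voices_per_participant(voices=26, participants=25)

print(f"massive: {massive:.2f} voices per participant")
print(f"small:   {small:.2f} voices per participant")
```

This works out to about 0.62 voices per participant in massive chats and about 1.04 in small chats, so on average a massive chat participant shares a voice with others, while a small chat participant has roughly one voice of their own.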
While participants in both chat sizes at times used their own
voice in their messages, the lower ratio of voices to participants
showed that massive chat participants more often adopted a
voice from a collective repertoire. For example, SMOrc (pro-
nounced "S M orc") is an emote associated with a common
voice. When participants adopted SMOrc’s voice, they tended to
use a certain syntax and style, e.g., typing in all caps with simple
sentence structure. Not only does SMOrc shun the use of gram-
mar and lowercase letters, he also shuns advanced game tactics
and tactful expression. As a voice, SMOrc is recognized and
shared across multiple Hearthstone channels, an in-joke among
participants. SMOrc’s entire “philosophy” can be summed up
in his quote “MATH HARD,” referring to the basic mechanics of
Hearthstone where every card has numbers associated with its
ability to perform offensive or defensive functions. The use of
SMOrc and similar emotes requires, and signifies, membership
in the Twitch Hearthstone community. Speaking as SMOrc ob-
scures the participant’s voice behind a pugnacious Orc’s shouts,
producing coherence only because participants and viewers are
in the know about gameplay, the streamer, and the semantics of
the emote.
Figure 10: Massive chat (Zetalot’s
Stream): SMOrc advocates attacking
the opponent directly in the "FACE,"
rather than calculating out the
gameplay consequences.
Some emotes produced voices of a more general nature. For
example, LUL did not impose a specific grammar or subject
matter but was always laughing in response to the misfortune
of another, often the streamer. FeelsBadMan consistently expressed
pity for oneself or another. These emotes qualified as
voices as they adhered to a consistent viewpoint, rather than
consistent syntax. Not all emotes qualified as voices, however.
BibleThump (sadness), FailFish (disappointment), and ResidentSleeper
(boredom), for example, often simply served as a sort of
punctuation to lend a tone to a message (much like an exclama-
tion point).
Some voices did not include emotes, instead maintaining co-
herence through shared syntax or viewpoint. One example is
“BUILD THE WALL” messages in TrumpSC’s streams, refer-
encing then-Republican presidential candidate Donald Trump.
The resonance of this voice was unique to TrumpSC’s streams
because of the shared name with Donald Trump (TrumpSC’s
name has no actual relation to Donald Trump, referring instead
to trump suits in card games). These messages, and variations
like “BRING THAT WALL ONLINE,” provided comical commen-
tary that occurred when TrumpSC created defensive lines with
his creatures during a game. Other messages like “Make priest
great again” were a reaction to TrumpSC playing a Priest card-
deck, in light of a common opinion among players that it was the
worst class in the game. These voices allowed a shared, mock-
political voice to emerge, styled around the distinctive syntax and
phrasings of Donald Trump. This example showed a clear con-
vergence of mannerisms, speech, and tone among some partici-
pants in massive chats, where shared voices were adopted from
both Twitch-specific events and mainstream culture.
Discussion
Although we have drawn extensively from Hamilton et al.’s
work, one of our key conclusions differs in that we suggest that
massive chats can be examined as successful communication
spaces in their own right rather than as failing communities.
Hamilton et al. argued that massive chats “destroy the potential
for communities to form through participation” [5]. But the very
popularity of these streams, with their huge viewership numbers,
troubles this characterization. We argue that analyzing massive
Twitch chats by taking smaller chats as a benchmark is pre-
cisely what obscures the coherence of massive chats. Rather,
we found that massive chat participants deployed a consistent
set of practices that allowed communication to continue at scale.
We focused on understanding texts that may appear incompre-
hensible to the average reader but are, in fact, the productions of
a rich insider culture that draws from gaming, and well beyond,
to many internet and popular culture sources. We observed that
crowdspeak relied on tacit references, in-jokes, and acquired flu-
ency. Crowdspeak was made possible by a vibrant community
of chat insiders familiar with a specific, outwardly-obscure set
of symbols, commands, and modes of speech. The presence
of insiders mirrors Hamilton et al.’s finding that small Twitch
chat communities involve a small set of “regulars” who struc-
ture the conversation [5]. In massive chats, these model partic-
ipants appear as a collective, recognized through adoption of
shared references, lexicons, and speech patterns. Other stud-
ies have noted the importance of regular, known contributors to
small chat communities [15], a notion based on Oldenburg and
Brissett’s original work on “third places” [12, 13]. Our observa-
tions suggest that in massive chats, regulars may not always be
“known,” but their contributions are no less vital.
Previous researchers have found or assumed limits to mean-
ingful participation in chat contexts with a high number of par-
ticipants. For example, Pan et al. [14] developed the TwitchViz
tool specifically to cope with a perceived information overload
in massive Twitch chats. Jones et al. [7] described limits of IRC
chat. However, our results suggest that chat contexts can spawn
practices and content that call into question a ceiling at which
breakdown or overload must always occur. In their treatment
of third places, Oldenburg and Brissett [13] described the im-
portance of a “conversational style” where everyone seems to
speak “just the right amount.” This “right amount” is essential
to the sociability of the spaces [13]. We suggest that a Twitch-
specific “right amount” of speech was occurring in the chats
we studied, and that different chat contexts can have their own
proper measure of rate and level of participation.
Our approach of deploying quantitative analysis of practices
alongside close ethnographic readings of texts may be useful in
analyzing crowdspeak as massive chats become more common.
By using lexical items and voices as primary metrics of analysis,
rather than words and participants, we were able to provide in-
sight into modes of collective communication that may not rely
on conversational norms of turn-taking, repair, or topical con-
sistency. Rather, we observed that massive Twitch chats had
their own alternative communicative patterns and practices. We
believe such alternatives are exactly what some of the earliest
scholars of digital communication pointed to when they noted
that computer-mediated communication had emerged as a "new
linguistic entity with its own vocabulary, syntax, and pragmat-
ics" [10], and that CMC should "not assume the importance of
direct paths between [individual] users" [16]. We note that par-
ticipation in massive Twitch chats is less about individual identity
and self-expression than it is about entering and engaging with a
crowd, a topic worthy of continuing research. The crowd has his-
torically attracted the notice of psychologists, sociologists, and
philosophers, including Freud, Durkheim, Sartre, Kierkegaard,
and Canetti, whose theories and ideas can inform future work.
Our own future research, for example, might explore potential
connections between crowdspeak and the myths and rituals of
crowds.
Conclusion
Due to the rapid growth of live-streaming platforms like Twitch,
YouTube Live, and Facebook Live, chat sizes have exploded. As
Kaytoue et al. remind us, most viewer traffic on Twitch goes to
a few streamers, concentrating 95% of all views into a few mas-
sive channels [8]. Our research suggests that crowdspeak may
provide an engaging and coherent communicative form in these
growing online environments. Future research might examine
the extent to which the morphology of crowdspeak may be af-
fected by factors such as platform, primary language, streamer,
and topical focus. The crowd is on the rise, and researchers
should be poised to attend to emergent forms of large-scale en-
gagement and discourse, and their potential contributions to
public life and digital media.
END NOW
Acknowledgements
Many thanks to Evan Conaway, our alt.chi reviewers, and the
greater Twitch community.
References
1. Gifford Cheung and Jeff Huang. 2011. Starcraft from
the Stands: Understanding the Game Spectator. In Pro-
ceedings of the SIGCHI Conference on Human Factors in
Computing Systems (CHI ’11), 763-772.
DOI:https://doi.org/10.1145/1978942.1979053.
2. Jie Deng, Felix Cuadrado, Gareth Tyson, and Steve
Uhlig. 2015. Behind the Game: Exploring the Twitch
streaming platform. In 2015 International Workshop on
Network and System Support for Games (NetGames),
IEEE Press, Piscataway, NJ, Article No. 8.
DOI:http://dx.doi.org/10.1109/NetGames.2015.7382994.
3. Jay Egger. 2015. How exactly do Twitch streamers make
a living? Destiny breaks it down. Retrieved January 11,
2017 from Esports dailydot.com/esports/twitch-streaming-
money-careers-destiny/.
4. Patricia M. Greenfield and Kaveri Subrahmanyam. 2003.
Online discourse in a teen chatroom: New codes and new
modes of coherence in a visual medium. Journal of Ap-
plied Developmental Psychology 25, 6 (Dec 2003), 713-
738. DOI:http://dx.doi.org/10.1016/j.appdev.2003.09.005.
5. William A. Hamilton, Oliver Garretson, and Andruid Kerne.
2014. Streaming on twitch: fostering participatory commu-
nities of play within live mixed media. In Proceedings of
the SIGCHI Conference on Human Factors in Computing
Systems (CHI ’14), 1315-1324. DOI: https://doi.org/10.1145/
2556288.2557048.
6. Susan Herring. 1999. Interactional Coherence in CMC.
Journal of Computer-Mediated Communication 4, 4 (Jun
1999), 0. DOI:10.1111/j.1083-6101.1999.tb00106.x.
7. Quentin Jones, Mihai Moldovan, Daphne Raban, and
Brian Butler. 2008. Empirical evidence of information
overload constraining chat channel community interac-
tions. In Proceedings of the 2008 ACM Conference on
Computer Supported Cooperative Work (CSCW ’08),
323-332. DOI:https://doi.org/10.1145/1460563.1460616.
8. Mehdi Kaytoue, Arlei Silva, Loic Cerf, Wagner Meira
Jr, and Chedy Raissi. 2012. Watch me playing, I am a
professional: a First Study on Video Game Live Stream-
ing. In Proceedings of the 21st International Conference
on World Wide Web (WWW ’12 Companion). ACM Press,
New York, NY, 1181-1188.
DOI: https://doi.org/10.1145/2187980.2188259.
9. Cameron Keng. 2014. Online Streaming And Professional
Gaming Is A 300,000[USD] Career Choice. Retrieved
January 4, 2016 from http://www.forbes.com/sites/cameronkeng/
2014/04/21/online-streaming-professional-gaming-is-a-
300000-career-choice/#99e735921d99.
10. Manfred Kochen. 1978. Long-term implications of elec-
tronic information exchanges for information science. Bul-
letin of the American Society for Information Science 4, 1
(Jun 1978). 22-23.
11. Claude Lévi-Strauss. 1962. The Science of the Concrete.
In The Savage Mind, University of Chicago Press, Chicago,
1-22.
12. Ray Oldenburg. 1999. The great good place: Cafes, cof-
fee shops, bookstores, bars, hair salons, and other hang-
outs at the heart of a community. Da Capo Press.
13. Ray Oldenburg and Dennis Brissett. 1982. The third
place. Qualitative Sociology 5, 4 (1982), 265-284.
14. Rui Pan, Lyn Bartram, and Carman Neustaedter. 2016.
TwitchViz: A Visualization Tool for Twitch Chatrooms. In
Proceedings of the 2016 CHI Conference Extended Ab-
stracts on Human Factors in Computing Systems (CHI EA
’16), 1959-1965. DOI:https://doi.org/10.1145/
2851581.2892427.
15. John Paolillo. 1999. The Virtual Speech Community: So-
cial Network and Language Variation on IRC. Journal of
Computer-Mediated Communication 4, 4 (Jun 1999), 0.
DOI:10.1111/j.1083-6101.1999.tb00109.x.
16. Ronald E. Rice and Gail Love. 1987. Electronic Emotion:
Socioemotional Content in a Computer-Mediated Com-
munication Network. Communication Research 14, 1 (Feb
1987), 85-100. DOI:10.1177/009365087014001005.
17. Diane J. Schiano. 1999. Lessons from LambdaMOO:
A Social, Text-Based Virtual Environment. Presence:
Teleoperators and Virtual Environments 8, 2: 127-139.
DOI:https://doi.org/10.1162/105474699566125.
18. Alyson Shontell. 2014. Twitch CEO: Here’s Why We Sold
to Amazon For 970[USD] Million. Retrieved January 11,
2017 from businessinsider.com/twitch-ceo-heres-why-we-
sold-to-amazon-for-970-million-2014-8.
19. John Sinclair. 1996. The search for units of meaning.
Textus 9, 1, 75-106.
20. Thomas Smith, Marianna Obrist, and Peter Wright. 2013.
Live-streaming changes the (video) game. In Proceedings
of the 11th European conference on Interactive TV and
video (EuroITV ’13). ACM Press, New York, NY, 131-138.
DOI:https://doi.org/10.1145/2465958.2465971.
21. Paul Tassi. 2013. Talking Livestreams, eSports and the
Future of Entertainment with Twitch. Forbes. Retrieved
January 9, 2017 from http://www.forbes.com/sites/insertcoin/
2013/02/05/talking-livestreams-esports-and-the-future-of-
entertainment-with-twitch-tv/#5befe51219b3.
22. Stefan Trausan-Matu and Traian Rebedea. 2009. Poly-
phonic inter-animation of voices in VMT. In Studying vir-
tual math teams, Gerry Stahl (Ed.), Springer US, 451-473.
23. Stefan Trausan-Matu and Traian Rebedea. 2010. A poly-
phonic model and system for inter-animation analysis in
chat conversations with multiple participants. In Inter-
national Conference on Intelligent Text Processing and
Computational Linguistics, (March 2010), Springer Berlin
Heidelberg, 354-363.
24. Twitch.tv. 2013. Twitch Retrospective. Retrieved January
11, 2017 from twitch.tv/year/2013.
25. Twitch.tv. 2014. Twitch Retrospective. Retrieved January
11, 2017 from twitch.tv/year/2014.
26. Connie K. Varnhagen, G. Peggy McFall, Nicole Pugh, Lisa
Routledge, Heather Sumida-MacDonald, and Trudy E.
Kwong. 2009. lol: new language and spelling in instant
messaging. Reading and Writing 23, 6 (Jul 2010), 719-
733. DOI:10.1007/s11145-009-9181-y
27. Justin D. Weisz, Sara Kiesler, Hui Zhang, Yuqing Ren,
Robert E. Kraut, and Joseph A. Konstan. 2007. Watch-
ing Together: Integrating Text Chat with Video. In Pro-
ceedings of the SIGCHI Conference on Human Factors in
Computing Systems (CHI ’07). ACM Press, New York, NY,
877-886. DOI:https://doi.org/10.1145/1240624.1240756.
28. Christopher C. Werry. 1996. Linguistic and Interactional
Features of Internet Relay Chat. In Computer Mediated
Communication: Linguistic, Social and Cross-Cultural
Perspectives, Susan C. Herring (Ed.). John Benjamins
Publishing, Amsterdam, 47-64.
Overall, I enjoyed reading this paper. I like the topic
and the thoughtfulness in the analytic approach. I
find the themes well-explained and a good descriptor
of the communicative styles. However, I note that
the analysis & discussion largely ignore the role of
the streamer.
The Elephant in the Room
My biggest question is to ask what the role of the
live-streamer is in maintaining the coherence of this
massive chat. I think that we need to acknowledge
this before making strong claims about how different
chat styles are able to maintain coherence.
My intuition is that it is the ongoing video live-stream
that anchors the fast chat. This is the loudest “voice”
in the room and it differentiates this type of chat
from classic IRC channels. In those situations,
breakdown occurs when message rates cross a
readable threshold. Twitch does not break down – or,
perhaps, it recovers from repeated breakdowns
because the main event, the video stream, is
constantly bringing the conversation back into focus.
This means that the Twitch.tv chat is more
comparable to event-based communication rather
than a generic chat channel (IRC, etc.). I am
thinking of Reddit live-streams, web-forum threads
and Twitter hashtags that are attached to a live event
such as a natural disaster, a sports event or a
political protest.
Methodologically, I am concerned with the absence of
analysis of the role of the streamer. I am sure that
the streamer represents a “voice” in almost every
segment. This makes the count of voices in the
existing paper difficult to depend on. If the streamer
mutters, “MATH HARD”, he can kick off a flood of
responses in the chat. This would be very different
than an emergence of “MATH HARD” that is not
explicitly dependent on the live stream.
I look forward to future research that addresses the
following questions.
What communicative mechanisms are at work when
the main spectacle (streamer, disaster, event, etc…)
provokes different communicative patterns? I am
speculating that an utterance from the streamer, a
foul in the basketball game, or news report about a
crisis will draw disparate voices together to keep a
high-volume conversation from breaking down. Can
these mechanisms be identified and categorized?
Who else keeps a massive chat from breaking down?
Perhaps there is someone other than the streamer
or the crowd. The authors describe the actions of a
crowd where each voice has equal “volume”; I’ve
identified the streamer who has the loudest voice in
the room. Are there others? Leaders? Trolls?
Celebrities? How does power affect massive chats?
What is the role of the technology in massive chats?
Twitch has a built-in limitation of no more than 20
messages per half-minute. Also, the chat stream can
be paused by a reader. These affordances and
restrictions shape the communicative patterns that
emerge and they deserve further investigation.
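To make the rate restriction concrete, here is a minimal sliding-window sketch in Python of a cap of 20 messages per 30 seconds, the figures quoted above. This is not Twitch's implementation, which is not described in the paper or in this commentary. The class name `SlidingWindowLimiter`, the choice of a sliding rather than fixed window, and the idea that the cap applies to one sender at a time are all assumptions for illustration.

```python
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `max_messages` within any `window_seconds` span.

    Hypothetical sketch: one instance stands for one sender.
    """

    def __init__(self, max_messages=20, window_seconds=30.0):
        self.max_messages = max_messages
        self.window_seconds = window_seconds
        self._sent = deque()  # timestamps of accepted messages, oldest first

    def allow(self, now):
        """Return True and record the message if it fits within the cap."""
        # Forget accepted messages that have aged out of the window.
        while self._sent and now - self._sent[0] >= self.window_seconds:
            self._sent.popleft()
        if len(self._sent) < self.max_messages:
            self._sent.append(now)
            return True
        return False
```

A sender posting once per second would have the first 20 messages accepted and the next ones rejected until the oldest accepted message is 30 seconds old. A cap of this kind bounds how much any single participant can contribute to a burst, which is one way the technology could shape the patterns that emerge.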
Thanks to the authors for a thought-provoking article.
Commentary
For alt.chi paper
Chat Speed OP: Practices of
Coherence in Massive Twitch
Chat
Gifford Cheung
University of Washington
Box 359442
4333 Brooklyn Ave NE
giffordc@uw.edu
Ford et al. challenge the view that chat in Twitch is a
disorganized jumble of interactions resulting from
information overload. The authors make a compelling
argument that the behaviors of Twitch viewers in
both massive and smaller channels have parallels to
concepts from computer-mediated communication
and computational linguistics. I was particularly
drawn to the paper’s use of bricolage to illustrate the
phenomena of copypasta on Twitch. The images
peppered throughout this paper serve to make it an
enjoyably engrossing romp (the ending is especially
clever).
The findings are a great example of the power of
mixed-methods. In this commentary, I would like to
highlight a few methodological concerns that the
authors could perhaps clarify. First, it is unclear how
channels were chosen from the massive or small
viewer pool. I am also not entirely convinced that two
50-message segments per channel are “large enough
to observe patterns, but not so large as to be
intractable for...hand coding.” Why not use time as a
sampling procedure? For instance, why not analyze
one hour’s worth of messages? Since Twitch messages in
channels are bursty, the use of 50-message
segments may inadvertently introduce certain biases
(e.g., only capturing bursts or short segments).
Twenty 50-message segments (comparable to less
than 1000 lines of text from an interview) from 300
hours of observations seems like a small amount of
data for presumably a large team of coders. I am
sympathetic to the methodology used but greater
detail and argument on the appropriateness and rigor
of the data collection would strengthen this paper.
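The difference between the two sampling procedures can be sketched in Python. The example below uses invented timestamps, not data from the paper, and the function names `count_segments` and `time_windows` are hypothetical. It shows that a fixed-count segment covers very different amounts of time depending on how bursty the chat is, while a fixed time window holds very different numbers of messages.

```python
def count_segments(timestamps, size=50):
    """Cut messages into consecutive segments of `size` messages.

    Returns the number of seconds each complete segment spans.
    """
    spans = []
    for start in range(0, len(timestamps) - size + 1, size):
        chunk = timestamps[start:start + size]
        spans.append(chunk[-1] - chunk[0])
    return spans


def time_windows(timestamps, window_seconds=60.0):
    """Count the messages in each consecutive window of fixed duration."""
    if not timestamps:
        return []
    origin = timestamps[0]
    counts = {}
    for t in timestamps:
        index = int((t - origin) // window_seconds)
        counts[index] = counts.get(index, 0) + 1
    # Include empty windows so quiet stretches stay visible.
    return [counts.get(i, 0) for i in range(max(counts) + 1)]


# Invented example: a burst of one message per second,
# followed by a lull of one message every ten seconds.
burst = list(range(50))
lull = [100 + 10 * i for i in range(50)]
stamps = burst + lull

print(count_segments(stamps))  # seconds covered by each 50-message segment
print(time_windows(stamps))    # messages per one-minute window
```

In this invented case the two 50-message segments span 49 and 490 seconds, while the one-minute windows hold anywhere from 2 to 50 messages. Neither procedure is neutral with respect to burstiness, so the choice deserves explicit justification.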
The findings show the care with which the authors
handled the role of emotes and shorthanding in
Twitch. Certainly, I think lexical items are a sensible
approach to analysis. I wonder, though, can we really
equate one lexical item to one emote? To me, emotes
convey a *lot*, not just one “meaningful unit.”
Emotes succinctly capture a whole range of in-jokes
and cultural norms (at the Twitch-wide, game-specific,
and even channel-specific levels). Future work might
examine how to measure the “meaning” of emotes
and shorthands with respect to other lexical items.
It was a treat to read the sensitive analysis of the
Twitch messages (e.g., TrumpSC and YOLO). While I
found the hypotheses to be intuitive, I would prefer
them to be better grounded in the literature. Why are
the authors *comparing* popular channels with less-
popular channels? Does a difference in length,
uniqueness, and voices between these channels
mean that these factors attract viewers? Why limit
ourselves to these particular linguistic features? We
know little about viewer behavior on Twitch; we may
speculate that upon viewing popular channels,
viewers are frustrated by “information overload” and
will then seek less-popular channels. Perhaps we are
comparing apples and oranges, two completely
different sets of viewers who want different things
out of Twitch. Lastly, the study does not consider the
large role bot “viewers” play in both participating and
moderating content. I surmise the hypotheses are
seeking to address the interaction of massive
channels with information overload and crowdspeak,
but this backdrop needs to be made more explicit.
Despite my quibbles, this alt.CHI paper is an
excellent recasting of viewer interactions as novel
and admirable accomplishments on Twitch.
Commentary
For alt.chi paper
Chat Speed OP : Practices of
Coherence in Massive Twitch
Chat
Norman Makoto Su
School of Informatics and Computing
Indiana University
919 E. 10th Street
Bloomington, IN 47408
normsu@indiana.edu