Do Larger Audiences Generate Greater Revenues Under Pay
What You Want? Evidence from a Live Streaming Platform
Shijie Lu (a), Dai Yao (b), Xingyu Chen (c, *), Rajdeep Grewal (d)
(a) University of Houston, Houston, Texas 77004; (b) National University of Singapore, Singapore 119245; (c) Shenzhen University, Shenzhen, Guangdong Province 518060, China; (d) University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599
*Corresponding author
Contact: slu@bauer.uh.edu, https://orcid.org/0000-0002-4180-6022 (SL); dai.yao@nus.edu.sg, https://orcid.org/0000-0002-0852-0700 (DY); celine@szu.edu.cn, https://orcid.org/0000-0002-3813-0680 (XC); grewalr@unc.edu, https://orcid.org/0000-0003-4467-2717 (RG)
Received: January 6, 2020
Revised: July 21, 2020; November 16, 2020
Accepted: December 27, 2020
Published Online in Articles in Advance:
August 19, 2021
https://doi.org/10.1287/mksc.2021.1292
Copyright: © 2021 INFORMS
Abstract. As live streaming of events gains traction, pay what you want (PWYW) pricing
strategies are emerging as critical monetization tools. We assess the viability of PWYW by ex-
amining the relationship between popularity (i.e., audience size) of a live streaming event and
the revenue it generates under a PWYW scheme. On the one hand, increasing audience size
may enhance voluntary payment/tips if social image concerns are important because larger
audiences amplify the utility pertaining to social image. On the other hand, increasing audi-
ence size may reduce tips if gaining the broadcaster's reciprocal acts motivates tipping be-
cause larger audiences are associated with fiercer competition for reciprocity. To examine
these trade-offs in the relationship between audience size and revenue under PWYW, we ma-
nipulate audience size by exogenously adding synthetic viewers in live streaming shows on a
platform in China. The results reveal a mostly positive relationship between audience size
and average tip per viewer, which suggests that social image concerns dominate seeking reci-
procity. In support of herding, adding synthetic viewers also increases the number of real
viewers. Social image concerns and herding together explain the finding that adding one ad-
ditional viewer improves the tipping revenue per minute by approximately 0.01 yuan (1% of
the mean level). Further, famous female broadcasters who use recognition-related words fre-
quently during the event benefit the most from an increase in audience size. Overall, the re-
sults indicate that revenues under PWYW do not scale linearly and support the relevance of
social image concerns in driving individual payment decisions under PWYW.
History: K. Sudhir served as the senior editor and Juanjuan Zhang served as associate editor for this article.
Funding: This research is supported by the National Natural Science Foundation of China [Grant
71872115], the Natural Science Foundation of Guangdong Province, China [Grant 2020A1515011201],
and National University of Singapore Research [Grant R-316-000-104-133].
Supplemental Material: The data and online appendix are available at https://doi.org/10.1287/mksc.2021.1292.
Keywords: live streaming; pay what you want; social media; user-generated content; tipping; field experiment; video analysis
1. Introduction
Peer-to-peer live streaming (live streaming hereinafter)
of events, such as sports (e.g., video games; The Econo-
mist 2014), hobbies (e.g., bird watching; Knowledge@
Wharton 2015), and political commentaries (e.g., satiri-
cal videos; Qin 2016), is the next big revolution in de-
mocratizing the production and broadcast of videos.
Not surprisingly, live streaming is getting attention
from major players: the market has witnessed the
launch of Facebook Live and YouTube Live Streaming
Channels, the acquisition of Twitch by Amazon.com,
and the emergence of Twitter's Periscope. According
to Facebook, live videos are 10 times more likely
to generate comments than recorded videos and are
viewed three times longer than regular videos (Media-
kix 2018). The growth of live streaming in China is
phenomenal and outstrips other countries (D'Urbino
2017); the celebrity of live streaming stars parallels
that of movie stars (China Daily 2016). The size of the
live streaming market in China reached $4.4 billion in
2018, a 32% increase over 2017, and the size of the
global live streaming market was $7.4 billion in 2018, a
47% increase over the previous year (Deloitte 2018).
Two monetizing strategies are popular with user-
generated content (UGC), which includes live stream-
ing. The indirect model involves advertising and
product placements, and the direct model involves
charging the viewers. The nascent stages of live
streaming and the transient nature of live streaming
viewers have resulted in firms testing a pay what you
want (PWYW) pricing strategy (e.g., Kim et al. 2009,
Gneezy et al. 2012). Under PWYW, viewers can access
live video content free of charge and pay/tip broad-
casters in real time by sending voluntary payments in
the form of virtual cash and virtual gifts. The revenue
collected from viewers is later split between the live
streaming platform and broadcasters. Prominent live
streaming platforms, such as YouTube Live and
Twitch, fully embrace the PWYW strategy with Face-
book Live adopting this pricing strategy for live
streaming of video games (Roettgers 2018).
The prominence of the PWYW revenue model has led
to astronomical growth of the live streaming industry
in both the United States and China over the past
few years. According to Streamlabs and Goldman
Sachs research, live streaming broadcasters within the
United States received $129 million in the form of dis-
cretionary tips in 2017 (Hays 2018). The same report predicts
that the "tipping market" will reach $372 million in
2022, suggesting a compound annual growth rate of
23.6% from 2017 to 2022. For the live streaming mar-
ket in China, the user base has increased from 310 mil-
lion in 2016 to 504 million in 2019, suggesting that ap-
proximately two out of five Chinese watched live
streaming shows in 2019 (iiMedia Research 2020).
With the booming live streaming industry, a number
of Chinese live streaming service providers (e.g., Bili-
bili, DouYu, Huya, Momo) have become billion-dollar
companies that are publicly traded in the United
States.
Despite the wide adoption of PWYW in the live
streaming industry, the viability and efficacy of this
revenue model remain unclear. In this research, we as-
sess the scalability of PWYW by addressing the fol-
lowing question: do larger audiences generate greater
revenues in live streaming with PWYW schemes? As
tipping revenue in a live streaming session depends
on the number of viewers and tip amount per viewer,
we first ask how popularity information (i.e., audience
size) affects viewer participation. In settings in which
product quality is unknown to consumers a priori,
popularity can serve as a signal of quality to draw in
new viewers (e.g., Tucker and Zhang 2011, Zhang and
Liu 2012). Because of the self-reinforcement of popu-
larity information, we expect herding to occur in the
context of live streaming, suggesting a positive effect
of audience size on viewer participation.
Further, we examine whether larger audiences en-
hance or reduce the average tip amount per viewer.
We posit that the direction of this effect depends on a
central trade-off. On the one hand, theory of signaling
and status seeking (e.g., Lampel and Bhalla 2007,
Gneezy et al. 2012, Toubia and Stephen 2013) and the
public nature of viewers' payment in live streaming
suggest that a viewer's utility of tipping should in-
crease as audience size increases because of an up-
ward bump in the viewer's social image, defined as an
individual's social status and prestige perceived by
others (Lampel and Bhalla 2007). As such, a represen-
tative viewer's tip amount should increase as audi-
ence size grows. On the other hand, if seeking reciproci-
ty is the primary motive for tipping, as a session
becomes more crowded, the chance of a viewer gain-
ing reciprocal acts from the broadcaster in the form of
social interactions should reduce, which should lead
to a lower tip per viewer because of the more intense
(perceived) competition for reciprocity (referred to as
the N-effect in social psychology; Garcia and Tor 2009).
The negative relationship between audience size and
tip amount per viewer might also occur because of
free-riding: as group size grows, the individual contri-
bution declines (e.g., Olson 1965, Andreoni 1988).
Thus, the net effect of audience size on tip amount per
viewer is a priori unclear. It could be either positive,
negative, or null, depending on the relative strengths
of multiple underlying forces (e.g., social image, reci-
procity, free-riding).
Two features of the live streaming industry make it
suitable for us to study the scalability of the PWYW
revenue model. First, both the payment decisions and
audience size are public information, which enable
status-signaling and herding by viewers. These behav-
ioral mechanisms are likely absent for products and
services that do not disclose consumers' payment to
others (e.g., shows on Netflix). Second, a live stream-
ing event is hosted over the internet and, therefore,
does not have a capacity constraint on the number of
consumers who can simultaneously use the service.
This is different from off-line events (e.g., sports
games, concerts) for which the number of consumers
is restricted by the physical space.
To understand how audience size affects revenues
under PWYW, we conduct a field experiment on a
large live streaming platform in China. In the experi-
ment, we first randomize the displayed audience size
(based on treatment and control condition allocation)
across broadcasters. Then, using the allocation of
broadcasters to one of the three conditions (two treat-
ment and one control), we randomize the audience
size within broadcaster, within session (as most
broadcasters have multiple sessions of around an
hour each), and at every minute of a session (after the
first 10 minutes as we discuss subsequently), for
which deviation of displayed audience size from the
actual audience size depends on a random draw. Spe-
cifically, for each of the two treatment conditions, at
every minute, we draw a random number of synthetic
viewers to add from a distribution with a mean of two
or four, respectively.
Empirical identication of the treatment effect of
adding synthetic viewers proceeds in three steps.
First, we begin with a mean comparison by exploiting
variation at the broadcaster level across treatment
groups. The results from the mean comparison
suggest that adding synthetic viewers indeed signifi-
cantly improves tipping revenues per minute; howev-
er, such a benefit from audience augmentation is sub-
ject to diminishing returns as the difference is not
statistically significant when increasing the average
number of synthetic viewers per minute from two to
four.
Second, though intuitive and convenient, the mean
comparison at the broadcaster level is subject to ag-
gregation bias because the treatment strength varies
across sessions and within sessions over time. To
tackle this problem, we resort to the slope comparison
by testing whether the slope of tipping revenues
against time differs across treatment conditions be-
cause the average treatment strength increases over
time from our manipulation (i.e., we add an average
of two or four viewers every minute based on treat-
ment condition). The slope analysis yields qualitative-
ly similar results with increased statistical significance
of identified effects.
Third, the panel structure of the data allows us to
precisely account for the time-varying treatment
strength randomized at the minute level to obtain an
unbiased estimate of the treatment effect. We also con-
trol for unobserved session heterogeneity with session
fixed effects (which alleviates the need for broadcast-
er-level fixed effects) to identify the effect of audience
size based on within-session variation across time.
Several findings emerge from the linear panel regres-
sions. Increasing the audience size by one unit im-
proves the tipping revenue per minute by approxi-
mately 0.01 yuan, which is 1% of the mean level. Such
a positive effect on tipping revenue is subject to di-
minishing returns, and the effect turns negative if the
number of displayed viewers exceeds 567 (the 96.6th per-
centile). By breaking down the revenue into the number
of real viewers times tip amount per viewer, we find
that two forces drive the mostly positive relationship
between average treatment strength and tipping reve-
nue: a positive treatment effect on viewer size and a
mostly positive treatment effect on tip per viewer. The
former effect on viewer size is consistent with herding,
and the latter effect on tip per
viewer suggests the dominance of social image con-
cerns over seeking reciprocity in driving tipping
decisions.
We explore the potential heterogeneity in the effect
of increasing audience size in subsequent analyses.
We find that female and more famous broadcasters
tend to enjoy greater benefits from larger audiences
than male and less famous broadcasters. Such an im-
provement in tipping revenue comes from stronger
effects on both audience size and tip per viewer.
These results are generally consistent with the data
pattern predicted by herding resulting from social
norm (Croson and Shang 2008, Simonsohn and Ariely
2008) rather than observational learning (Banerjee
1992, Bikhchandani et al. 1992). Furthermore, the theo-
ry of intrasexual competition from evolutionary psy-
chology predicts that status-signaling motivation
among male customers is more likely to appear in the
presence of a female rather than a male broadcaster
(Sundie et al. 2011). Given that the majority of the audi-
ence on our platform is male (66%), our finding of a stronger
effect on tipping to female than male broadcasters
suggests the relevance of social image concerns in live
streaming. We further provide corroborating evidence
of social image–related utility in tipping by examining
the moderating effects of broadcaster performances. If
social image concerns are indeed important, we expect
the broadcaster's tendency to express recognition of
individual viewers to strengthen the positive effect of
audience size on tipping. We employ state-of-the-art
speech recognition techniques to measure the broad-
caster's use of recognition-related words during a live
streaming event and find the positive moderating ef-
fect predicted by the model with social image–related
utility.
With this research, we aim to make several impor-
tant contributions. Substantively, we find an overall
positive and concave causal relationship between au-
dience size and tipping revenue, suggesting that the
revenue under PWYW is less scalable than that under
other monetization tools (e.g., advertising, product
placement) that sell viewers' eyeballs or clicks, which
increase linearly with the audience size. To our
knowledge, this research is among the first few field
data–based as opposed to survey-based studies in the
live streaming literature. With our finding of a posi-
tive effect of audience size on average tip amount per
viewer, this research also contributes to the PWYW lit-
erature by confirming the relevance of social image–
related utility in commercial contexts. Although re-
search shows the relevance of social image motive in
charity contexts (Harbaugh 1998, Ariely et al. 2009,
DellaVigna et al. 2012), no evidence exists of social im-
age concerns or social pressure in PWYW employed
in commercial contexts. Our data from live streaming
provide new evidence of the importance of social im-
age concerns in driving individual payment decisions
under PWYW.
2. Related Literature
As live streaming is an emerging form of UGC, the lit-
erature on UGC is of relevance. Previous research
shows the importance of social influence in driving in-
dividual contribution on various types of UGC plat-
forms. For example, Toubia and Stephen (2013) con-
duct a field experiment by adding synthetic followers
on Twitter and find that social image–related utility
(utility associated with the perception of others) is
generally more important than intrinsic utility (direct
utility from posting content) in motivating users to
contribute content; thus, users' content contribution is
likely to increase with the augmentation of followers.
Zhang and Zhu (2011) find social benefit in content
generation by examining users' posting and editing
efforts on Wikipedia. Shriver et al. (2013) advance the
literature by showing the reverse causal effect of con-
tent generation on user engagement and social ties in
the context of an online windsurfing community. Live
streaming differs from conventional UGC in its core
business model. Almost all UGC platforms studied
(e.g., online forums, review sites, social network)
monetize online trafc through advertisements. By
contrast, most live streaming platforms rely on view-
ers' voluntary payments/tips to generate revenues.
Our study differs from previous research on UGC by
exploring how revenues scale with audience size when
the UGC platform monetizes traffic through PWYW.
As voluntary payments/tips facilitate the pricing of
live streaming, our study closely relates to the litera-
ture on tipping and PWYW. Tipping is a common and
important component of service marketing (for a re-
view, see Azar 2007); for example, annual tipping in
the U.S. food industry amounts to $46.6 billion (Azar
2011). There is increasing interest from both industry
and academia in exploring whether tipping, in lieu of
a fixed price, can serve as an alternative business
model to generate revenues (e.g., Natter and Kauf-
mann 2015). The English rock band Radiohead is a fa-
mous application of this idea. Radiohead announced
in 2007 that it would allow fans to set their own price,
if anything, for downloading its seventh album In
Rainbows. The live streaming industry is also adopting
and experimenting with the viability of this PWYW
business model.
Such PWYW pricing schemes, in which consumers
decide the payment amount (including zero), are be-
ginning to receive scholarly scrutiny (e.g., Schmidt
et al. 2015, Jung et al. 2016, Chen et al. 2017). Kim et al.
(2009) find that PWYW transactions evoke concerns
about reciprocity and fairness. Gneezy et al. (2012)
identify the importance of self-identity and self-image
in PWYW and show that self-image concerns drive
people to pay more but also make them less likely to
buy. Research also shows that the strategy of shared
social responsibility, a modified version of PWYW
with a fixed percentage of consumers' payment going
to support a charity cause, is more profitable than a
fixed price (Gneezy et al. 2010) and insensitive to the
percentage of payment allocated to the charity (Jung
et al. 2017). Our study differs from previous studies of
PWYW in that the tip by consumers is publicly ob-
servable in live streaming instead of anonymous as in
off-line settings (e.g., theme parks, restaurants) in
which previous PWYW experiments typically oc-
curred. The public nature of payment in PWYW in the
live streaming industry suggests the importance of so-
cial image, which drives an individual's desire to im-
prove the social status and prestige perceived by
others (Lampel and Bhalla 2007). We contribute to the
PWYW research by providing suggestive evidence of
the existence of social image–related utility in driving
consumers' payment in noncharity contexts. This
novel finding of social image–related utility under
PWYW in the live streaming industry is by and large
consistent with the prominence of social image in mo-
tivating user contributions in front of others on online
platforms (Toubia and Stephen 2013).
The total revenue in live streaming depends not
only on the individual tip amount but also on the number
of viewers. Thus, we draw from previous herding lit-
erature to theorize the impact of popularity informa-
tion on viewer participation (e.g., Banerjee 1992, Bikh-
chandani et al. 1992). Before entering a live stream,
viewers have only limited information about its quali-
ty and, thus, may rely on other participants' choices to ratio-
nalize their own decisions, resulting in herding behavior,
which predicts a positive effect of popularity informa-
tion on viewer participation. Previous literature fur-
ther suggests that herding occurs for at least two rea-
sons: (1) social norm, in which a potential viewer
refers to existing viewers' choices as a descriptive so-
cial norm and then passively mimics their decisions
(e.g., Croson and Shang 2008, Simonsohn and Ariely
2008), and (2) observational learning, in which a po-
tential viewer infers higher quality from a session
with a larger number of existing viewers and, thus, in-
creases the likelihood of participation (e.g., Cai et al.
2009, Zhang 2010). In this study, we find data patterns
that are more consistent with herding driven by social
norm rather than observational learning.
Finally, our research builds on the growing litera-
ture on live streaming. Extant research is largely com-
puter science focused (e.g., Pires and Simon 2015, He
et al. 2016) with only a handful of studies examining
the reasons behind the production and popularity of
live streaming. With a survey of Twitch viewers,
Sjöblom and Hamari (2017) identify tension release,
interpersonal bonding, and entertainment as three key
motivators to watch live streaming of video games.
Lee et al. (2019) survey live streaming viewers in
China and find that interaction and content appear to
be the two most important reasons that motivate
viewers to tip. For interactions, they further find that
viewers are motivated by both reciprocity, that is,
viewers expect broadcasters to engage in social inter-
actions with tippers in return for tips received, and so-
cial image, that is, viewers tip to grab attention from
the crowd and enjoy standing out from the others.
Another survey from an industrial report of Chinese
live streaming platforms also highlights the impor-
tance of social image concerns in viewers' tipping (ii-
Media Research 2019). According to this survey, re-
ceiving social recognition is among the top five
reasons why viewers tip in live streaming. The other
four reasons include content uniqueness, content rele-
vance, broadcaster charisma, and social norm. For
broadcastersbehaviors, Tang et al. (2016)interview20
frequent broadcasters on Meerkat and Periscope, two
popular live streaming apps in the United States, and
find that most broadcasters use live streaming to build
their personal brand. In contrast with previous survey-
based studies, we execute a randomized field experi-
ment rather than relying on self-reports to understand tipping be-
haviors manifested in individuals' real actions.
3. Field Setting and Experiment Design
3.1. Background
We conduct our study in collaboration with one popu-
lar peer-to-peer live streaming platform in China,
which prefers to remain anonymous. This platform
started in 2005 as an online community in which users
could post jokes and share entertaining stories. As in-
ternet users substantially shifted their time from desk-
top to mobile devices, the platform launched a mobile
app in 2012 that provided a service similar to its web-
site. In early 2016, the platform expanded its product
line by offering a new live streaming service in its ex-
isting mobile app. By the time of our study in August
2016, this live streaming app had more than 600,000
monthly active viewers and more than 40,000 regis-
tered broadcasters. In general, the broadcasters of this
live streaming app are young, with 55% aged 18–24,
40% aged 25–34, and 5% aged 35–44 years. Approxi-
mately 80% of the broadcasters are female, and 66% of
viewers are male.
The live streaming app used in our study is typical
of its counterparts in the United States. When opening
the app, a viewer can see thumbnails of live sessions
and the number of people currently watching each ses-
sion. At the time of our study, recency of starting time
determined session ordering. Viewers are free to join
any session hosted by a broadcaster. During each
session, viewers can interact with the broadcaster and
the audience in three ways: using tips, chats, and likes.
Figure 1 shows an example of a viewer interface during
a live streaming session. Here, viewers can send tips in
the form of virtual gifts purchased through the app.
These virtual gifts appear on the screen for about one
to five seconds, depending on the gift value. The prices
of these gifts range from 0.1 to 1,000 yuan.
The plat-
form pays a fixed proportion of the revenue from those
virtual gifts to the broadcaster. The types of broadcast-
ing content are reality shows of broadcasters, who
usually chat with the audience about trendy topics and
occasionally perform (e.g., singing, playing music). On-
line Appendix A provides details on the broadcasting
genres. An important feature of this live streaming app
is its real-time updates to the number of viewers in
each session, and this information is public. There is no
capacity constraint on the maximum number of view-
ers in a session, which allows us to conduct experi-
ments by manipulating the audience size.
3.2. Experiment Design
We designed and implemented a randomized con-
trolled field experiment to estimate the causal effect of
audience size on tipping revenues. We collaborated
with the focal live streaming app to add exogenous
variation in the number of viewers during broadcast-
ing sessions. Before our experiment, the platform pro-
vided us with live streaming data from a random sam-
ple of 165 active broadcasters during the period from
July 8 to August 8, 2016. For each session, we observed
the session length; the number of tips, chats, and likes;
and the number of chatters, which represents the num-
ber of viewers who submitted chats at least once dur-
ing a session.
We randomly assigned those broadcast-
ers into three groups. Although, initially, each group
had 55 broadcasters, a few of them did not broadcast
during our experiment from August 11 to September
12, 2016. This lack of broadcasting left the first group
with 48 broadcasters, the second group with 51, and
the third group with 54. We set live streaming sessions
by the first group as the control group and treated
those by the second and third groups, which we call
treatment groups 1 and 2 (T1 and T2) hereinafter.
To check whether our randomization worked as in-
tended, we compared the mean of key pretreatment
metrics at the level of broadcasters across the three
Figure 1. (Color online) A Snapshot of a Live Streaming Session
Notes. The interface elements labeled in the screenshot are tips in the form of virtual gifts, likes, chats, and the number of current viewers (including both real and synthetic viewers).
groups. These metrics included the average value of
tips, the average number of likes and chats, and the
average number of chatters per session. We also com-
pared the number of sessions, session length, and pro-
portion of female broadcasters across the three
groups. As Table 1 shows, the p-values of the F-test in-
dicate no significant evidence to reject the null hy-
pothesis that the means of the pretreatment metrics by
broadcasters in each group are the same, thus con-
firming the success of our randomization procedure.
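To make this balance check concrete, the following Python sketch runs the kind of one-way ANOVA F-test reported in Table 1. It is illustrative only: the DataFrame name `pre`, its columns, and the use of scipy are assumptions, not the authors' actual code or data layout.

```python
# Minimal sketch of a Table 1-style balance check: a one-way ANOVA F-test of
# each pretreatment metric across the three experimental groups. Assumes a
# pandas DataFrame `pre` with one row per broadcaster and columns 'group'
# (C/T1/T2) plus the metrics listed below (hypothetical names).
import pandas as pd
from scipy import stats

def balance_check(pre, metrics):
    """Return, per metric, the p-value of the F-test that group means are equal."""
    pvals = {}
    for m in metrics:
        samples = [g[m].dropna().values for _, g in pre.groupby("group")]
        _, p = stats.f_oneway(*samples)
        pvals[m] = p
    return pd.Series(pvals, name="p_value_F_test")

# Example call:
# balance_check(pre, ["tips", "chats", "likes", "chatters",
#                     "length", "n_sessions", "female"])
```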
For each session by a broadcaster in T1, we asked
the platform to add an average of two synthetic view-
ers at the end of each minute after the 10th minute.
We treat sessions in T2 similarly to those in T1 but
double the strength of the treatment by adding an av-
erage of four synthetic viewers at each minute after
the 10th minute. The platform added ⌊x⌋ synthetic
viewers at the end of each minute, where x is drawn from a
normal distribution and ⌊·⌋ is the floor function.
When ⌊x⌋ is negative, up to |⌊x⌋| synthetic
viewers leave the session. We did not add synthetic
viewers at the beginning of a session to avoid poten-
tial suspicion from the broadcaster and real viewers.
The platform also used only dormant accounts (those
generated by real users who were no longer active)
as synthetic viewers in this study. Thus, the platform
did not create any new accounts for this study, which
could be troublesome if a savvy viewer noticed any
unauthentic profile information such as a relatively re-
cent registration date. The synthetic viewers we added
did not interact with any party during a session. Fig-
ure 2 shows the distribution of the number of added
synthetic viewers at each minute in T1 (M = 2.08) and
T2 (M = 4.12), respectively. This manipulation allows
us to have exogenous data variation at the level of mi-
nutes though we conduct initial randomization at the
level of broadcasters.
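The Python snippet below simulates this manipulation under stated assumptions. The treatment means (two or four added viewers per minute after the 10th minute) come from the text; the standard deviation of the draw, the session length, and the random seed are illustrative assumptions.

```python
# Illustrative simulation of the audience-size manipulation: from the 11th
# minute on, draw x ~ Normal(mu, sigma), add floor(x) synthetic viewers, and
# let up to |floor(x)| of them leave when the draw is negative.
import numpy as np

def simulate_synthetic_viewers(n_minutes=60, mu=2.0, sigma=2.0, seed=0):
    """Return the cumulative number of synthetic viewers at minutes 1..n_minutes."""
    rng = np.random.default_rng(seed)
    n_syn, current = [], 0
    for t in range(1, n_minutes + 1):
        if t > 10:                              # the first 10 minutes are untreated
            delta = int(np.floor(rng.normal(mu, sigma)))
            current = max(current + delta, 0)   # negative draws remove viewers
        n_syn.append(current)
    return n_syn

# T1-like path (mu = 2) and T2-like path (mu = 4):
print(simulate_synthetic_viewers(mu=2.0)[:20])
print(simulate_synthetic_viewers(mu=4.0)[:20])
```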
Recall that the goal of our research is to investigate
the relationship between audience size and tipping
revenue, conditional on everything else, including en-
gagement activities (i.e., tips, chat, likes). We, there-
fore, did not allow synthetic viewers to engage in a
live stream so that we can have a relatively clean
setting to identify the effect of audience size alone
rather than a combined effect of audience size and en-
gagement activities. It is also technically challenging
to generate synthetic viewers who can meaningfully
send tips, chats, and likes as a regular viewer. Thus,
we manipulated only the audience size but not any
engagement activity in this experiment. This design is
similar to Toubia and Stephen (2013) in which the au-
thors manipulated only the number of followers rath-
er than their activities (e.g., replying, sharing, men-
tioning) on Twitter when studying the effect of
followers on content creation.
3.3. Data Description
Our data include 2,222 sessions by 153 broadcasters
during the experiment and 2,226 sessions by these
broadcasters before the experiment. We have 813,
660, and 749 sessions in C, T1, and T2, respectively,
during the experiment. The number of sessions in C,
T1, and T2 are 794, 678, and 754, respectively, before
Table 1. Means of Pretreatment Variables Across Groups

                                     C        T1       T2      p-value of F-test
Tips per session (yuan)             24.7     26.5     24.4     0.973
Number of chats per session        193.2    166.9    183.7     0.756
Number of likes per session      3,723.8  4,409.2  4,275.6     0.931
Number of chatters per session      21.7     19.7     22.4     0.883
Session length (minutes)            51.0     46.3     46.4     0.622
Number of sessions                  16.5     13.3     14.0     0.274
Female                               0.771    0.863    0.815   0.502
Number of broadcasters              48       51       54
Figure 2. (Color online) Histogram of Number of Added
Synthetic Viewers
the experiment. In contrast with session-level data col-
lected before the experiment, we have minute-level
observations during the experiment. In particular, we
observe the number of likes and chats and the value
of tips sent by real viewers during each minute t of a
session k. We also observe the number of real and syn-
thetic viewers at minute t. For broadcasters, we ob-
serve session length and infer broadcaster gender on
the basis of saved video recordings of all streams. In
Section 5.2, we describe how we generate four perfor-
mance-related metrics by analyzing video content
(i.e., Recognition, FaceTime, HandTime, and Emotion).
We present the summary statistics of key variables for
viewers and broadcasters in Tables 2 and 3, respec-
tively. On average, each session lasts approximately
50 minutes, and a broadcaster in C, T1, and T2 re-
ceives 0.61, 1.03, and 1.15 yuan of tips per minute,
respectively.
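As a minimal sketch of how the rate variables summarized in Table 2 can be constructed from the minute-level panel (TipRate = Tip/N^real and N^disp = N^real + N^syn), the code below assumes a pandas DataFrame `df` with hypothetical column names; it is not the authors' code.

```python
# Construct TipRate and N^disp from minute-level data and summarize by group,
# mirroring the layout of Table 2. Column names ('group', 'tip', 'n_real',
# 'n_syn') are assumptions about the data layout.
import pandas as pd

def summarize(df: pd.DataFrame) -> pd.DataFrame:
    df = df.copy()
    df["tip_rate"] = df["tip"] / df["n_real"]    # average tip per real viewer
    df["n_disp"] = df["n_real"] + df["n_syn"]    # displayed audience size
    cols = ["tip", "tip_rate", "n_real", "n_syn", "n_disp"]
    return df.groupby("group")[cols].agg(["mean", "std", "min", "max"])
```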
Table 2. Summary Statistics of Viewer-Related Variables Across Groups

Variable     Definition                                          Group   Mean    SD      Min   Max
Tip_kt       Total tips received at minute t of session k,       C       61.2    480     0     30,000
             cents                                               T1      103     609     0     16,810
                                                                 T2      115     822     0     30,000
TipRate_kt   Average tips per real viewer at minute t of         C       5.20    29.3    0     1,680
             session k, cents                                    T1      6.92    33.8    0     3,011
                                                                 T2      5.86    29.8    0     3,033
ChatRate_kt  Average number of chats per real viewer at          C       0.572   0.468   0     21.8
             minute t of session k                               T1      0.515   0.410   0     25.5
                                                                 T2      0.529   0.508   0     35.5
LikeRate_kt  Average number of likes per real viewer at          C       4.36    7.54    0     290
             minute t of session k                               T1      4.86    7.94    0     254
                                                                 T2      3.24    6.39    0     204
N^real_kt    Number of real viewers at minute t of session k     C       9.96    10.1    1     200
                                                                 T1      11.9    12.2    1     207
                                                                 T2      15.3    18.6    1     492
N^syn_kt     Number of synthetic viewers at minute t of          C       0       0       0     0
             session k                                           T1      72.4    67.7    0     694
                                                                 T2      139     137     0     2,558
N^disp_kt    Number of displayed viewers at minute t of          C       9.96    10.1    1     200
             session k                                           T1      84.3    70.9    0     710
                                                                 T2      155     142     1     2,558

Notes. TipRate = Tip/N^real; N^disp = N^real + N^syn. 1 cent = 0.01 yuan.
Table 3. Summary Statistics of Broadcaster-Related Variables Across Groups

Variable        Definition                                         Group   Mean    SD      Min     Max
Recognition_kt  Count of four recognition-related words            C       6.09    5.98    0       49
                ("welcome," "hello," "thank you," and "many        T1      5.60    5.73    0       48
                thanks") during a minute                           T2      6.03    6.23    0       47
FaceTime_kt     Duration of the broadcaster's face appearing       C       0.644   0.376   0       1
                in front of the camera during a minute             T1      0.805   0.302   0       1
                                                                   T2      0.720   0.347   0       1
HandTime_kt     Duration of the broadcaster's hands appearing      C       0.281   0.311   0       1
                in front of the camera during a minute             T1      0.415   0.355   0       1
                                                                   T2      0.324   0.319   0       1
Emotion_kt      Emotion score of the broadcaster during a          C       0.095   0.183   −1      1
                minute: −1 means pure negativity, and 1 means      T1      0.102   0.150   −0.983  1
                pure positivity                                    T2      0.121   0.185   −0.933  1
Length_k        Length of session k, minutes                       C       50.3    52.4    1       345
                                                                   T1      52.6    53.0    1       363
                                                                   T2      54.6    59.1    1       687
Female_i        Gender of broadcaster i                            C       0.771   0.425   0       1
                (1 = female, 0 = male)                             T1      0.862   0.238   0       1
                                                                   T2      0.815   0.392   0       1
Table 4 reports the correlation matrix of key varia-
bles in our data. We find a positive correlation between
the tip amount and the audience size measured by the
number of displayed viewers (r = 0.059, p < 0.01). This
positive association seems to be driven by the positive
correlation between audience size and the number of
real viewers (r = 0.326, p < 0.01) and the positive corre-
lation between audience size and the average tip per
viewer measured by the TipRate (r = 0.016, p < 0.01).
We show how the distributions of the three reve-
nue-related variables (i.e., Tip, TipRate, and N^real) vary
by the number of synthetic viewers in Figure 3. We
use a heat map to visualize the distribution; the darker
areas represent the higher density. The scatterplots of
our raw data reveal inverted U-shaped associations
between N^syn and Tip and between N^syn and TipRate
and a mostly positive and concave relationship be-
tween N^real and N^syn when the number of synthetic
viewers is not too large (N^syn ≤ 1,200). Figure 3 also
depicts outliers in which the number of added syn-
thetic viewers is extremely large (N^syn > 1,200) be-
cause of a few unusually long sessions.
4. Main Analysis
4.1. Mean Comparison
We describe the raw treatment effect by comparing the
means of revenue-related variables across C, T1, and
T2. Specifically, we examine whether the value of tips
per minute, the session length, and the streaming fre-
quency, all aggregated at the broadcaster level, vary
by treatment conditions. We decompose the value of
tips into the number of real viewers (N^real) and average
tip per real viewer (TipRate) to explore the potential
mechanism through which audience size may affect
tipping revenue. As previously theorized, the popular-
ity information may affect tipping revenues through
the impact on viewer participation because of herding
and through the impact on individual tipping because
of the potential coexistence of viewers' motivations of
improving social image and seeking reciprocity.
We report the results of the broadcaster-level mean
comparison in Table 5. We note that the value of tips
per minute (i.e., Tip) increases significantly when we
add synthetic viewers. However, the value of tips is
not significantly different between the T1 and T2 con-
ditions in which treatment strength doubles from T1
to T2. This result suggests that the positive effect of
treatment on Tip is nonlinear and subject to diminish-
ing returns. By decomposing Tip into N^real times Tip-
Rate, we show that two factors drive the positive treat-
ment effect on tipping revenue: a positive treatment
effect on viewer participation and a positive treatment
effect on TipRate. With regard to the length and fre-
quency of streams, we find no significant difference
between the control and treated broadcasters. We
speculate that the null effect of audience size manipu-
lation on session length might depend on broadcast-
ers' schedules. Most broadcasters on our live stream-
ing platform are using the streaming services either as
a hobby or a part-time job. They typically have other
duties (e.g., full-time worker/student) that may affect
their flexibility of broadcasting. We also observe that
broadcasters often announce how much time they will
broadcast at the beginning of a session. The lack of
flexibility and predetermined broadcasting schedules
might explain the similar session length between the
control and treated conditions.
Despite the convenience, the treatment effect re-
vealed from the aggregate-level mean comparison suf-
fers from at least two sources of bias. The first results
from our experimental manipulation in which the first
10 minutes in sessions of T1 and T2 were actually un-
treated. Given this nontreatment for the first 10 minutes,
the aggregation based on the whole sample can result in
an underestimate of the treatment effect. The second
source of bias arises from the varying strength of treat-
ment over time. We provide an illustrative example
here to show the potential bias in treatment effect in-
ferred from the mean comparison at the aggregate
broadcaster level. For simplicity, we assume treatment
strength is linear in time and each broadcaster only
streams once. We describe the outcome variable y_it of
broadcaster i at time t as follows:

  y_it(Treat_i = 1) = α + f(t) + ξ_i + ε_it,   (1)
  y_it(Treat_i = 0) = α + ξ_i + ε_it,          (2)

where f(t) is the time-varying treatment effect and ξ_i
denotes broadcaster heterogeneity with mean zero.
Table 4. Correlation Matrix

           Tip       TipRate   ChatRate  LikeRate  N^real    N^syn     N^disp
Tip        1
TipRate    0.733**   1
ChatRate   0.054**   0.007*    1
LikeRate   0.000     0.023**   0.111**   1
N^real     0.171**   0.040**   0.259**   0.055**   1
N^syn      0.037**   0.011**   0.007*    0.011**   0.198**   1
N^disp     0.059**   0.016**   0.042**   0.019**   0.326**   0.991**   1

**p < 0.01; *p < 0.05.
Based on Equation (1), the mean of y_it for broadcaster i from the treatment groups is

  ȳ_i = α + Σ_{t=1}^{T_i} f(t)/T_i + ξ_i + Σ_{t=1}^{T_i} ε_it/T_i,

where T_i represents the number of treated periods (session length). As we do not find a
treatment effect on session length, we assume that T_i is independent and identically
distributed, and therefore, we have

  E_i[ȳ_i] = α + E_{T_i}[ Σ_{t=1}^{T_i} f(t)/T_i ].   (3)
Figure 3. (Color online) Relationship Between the Number of Synthetic Viewers and Revenue Metrics
Notes. Each line represents the data fitted by a cubic spline, and the surrounding areas represent the 95% confidence interval. The gap on the Y-axis in Figure 3(a) is caused by the discrete nature of the tip amount: the smallest tip is 10 cents (1 cent = 0.01 yuan), which causes the gap between log(10) ≈ 2.30 and 0 on the Y-axis.
Consider a typical between-subjects design with multiple trials in which the treatment
strength is constant (i.e., f(t) = β) across trials (i.e., minutes). In this scenario, we have

  E[ȳ_i | Treat_i = 1] − E[ȳ_i | Treat_i = 0] = E_{T_i}[ Σ_{t=1}^{T_i} f(t)/T_i ] = β,

suggesting that the mean comparison estimate works. Nevertheless, the number of
synthetic viewers added in our experiment accumulates over time, and therefore, the
treatment strength is not identical across trials. If the treatment effect is perfectly linear
in strength (i.e., f(t) = βt), we have

  E[ȳ_i | Treat_i = 1] − E[ȳ_i | Treat_i = 0] = E_{T_i}[ Σ_{t=1}^{T_i} f(t)/T_i ]
    = E_{T_i}[(T_i + 1)/2]·β ∝ β,

suggesting that the broadcaster-level mean comparison estimate is proportionally
unbiased. However, if the treatment effect is nonlinear (e.g., f(t) = βt + γt²), we have

  E[ȳ_i | Treat_i = 1] − E[ȳ_i | Treat_i = 0]
    = E_{T_i}[ (T_i + 1)/2·β + (T_i + 1)(2T_i + 1)/6·γ ]
    = E_{T_i}[(T_i + 1)/2]·β + E_{T_i}[(T_i + 1)(2T_i + 1)/6]·γ,

suggesting that the mean comparison estimate is no longer proportionally unbiased, and
the extent of the bias is determined by the convexity of the treatment effect and the
distribution of session length.
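The short simulation below illustrates this argument numerically: with a linear treatment effect the broadcaster-level mean difference scales with β, whereas a convex effect adds a term that depends on the distribution of session lengths. The parameter values, the session-length distribution, and the code itself are illustrative assumptions, not the authors' analysis.

```python
# Illustration of the aggregation-bias argument: average the per-minute
# treatment effect f(t) over random session lengths T_i and compare the
# implied broadcaster-level mean difference for linear vs. convex f(t).
import numpy as np

rng = np.random.default_rng(1)

def mean_diff(f, n_broadcasters=5000):
    """Average treated-minus-control broadcaster mean; control has no f(t) term,
    and alpha and xi_i difference out in expectation."""
    lengths = rng.integers(20, 120, size=n_broadcasters)       # session lengths T_i
    per_broadcaster = [np.mean([f(t) for t in range(1, T + 1)]) for T in lengths]
    return float(np.mean(per_broadcaster))

beta, gamma = 0.01, 0.0005
print(mean_diff(lambda t: beta * t))                   # approx beta * E[(T+1)/2]
print(mean_diff(lambda t: beta * t + gamma * t ** 2))  # adds gamma * E[(T+1)(2T+1)/6]
```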
To alleviate the first source of bias resulting from the
nontreatment for the first 10 minutes, we rerun the
mean comparison by excluding observations associated
with the first 10 minutes from the two treatment
groups. As the top panel of Table 6 shows, the means
of variables of interest in the treatment groups all
move in the expected direction (e.g., Tip in T1 and
T2 both move up compared with the values in Table 5).
We can reject the hypothesis that Tip in C is no smaller
than that in T1 and T2 at the 5% level. In addition,
TipRate significantly increases from C to T1 (p = 0.018)
and then decreases from T1 to T2 (p = 0.034), suggest-
ing an inverted U-shaped relationship. Furthermore,
the differences in the number of real viewers between
C and T1 and between T1 and T2 are statistically signifi-
cant at the 10% level, suggesting an overall positive
treatment effect on N^real. We also report the results of
the mean comparisons in the bottom panel of Table 6
when excluding the first 10 minutes of data from both
control and treatment groups. We find qualitatively
similar patterns. To tackle the second source of bias re-
sulting from varying treatment strength, we need to ac-
count for the variation in treatment strength over time
and across sessions as we discuss next.
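For concreteness, a one-sided test of the kind reported in Tables 5 and 6 could be run as sketched below. The use of Welch's unequal-variance t-test and the column names are assumptions; the tables only state that one-sided t-tests were used.

```python
# Sketch of the broadcaster-level mean comparison: a one-sided t-test of a
# metric (e.g., tips per minute) in a treatment group against the control.
# Assumes a DataFrame `bc` with one row per broadcaster and columns
# 'group' and the metric of interest (hypothetical names).
from scipy import stats

def one_sided_pvalue(bc, metric, low_group="C", high_group="T1"):
    """p-value for H1: mean(metric) in high_group > mean(metric) in low_group."""
    x = bc.loc[bc["group"] == high_group, metric]
    y = bc.loc[bc["group"] == low_group, metric]
    res = stats.ttest_ind(x, y, equal_var=False, alternative="greater")
    return res.pvalue
```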
4.2. Slope Comparison
One intuitive approach to exploiting the variation in
treatment strength over time is to compare the slope
of variables of interest against time across treatment con-
ditions. For example, if the addition of synthetic
Table 5. Broadcaster-Level Mean Comparison Using the Whole Sample

                          Mean                        p-value
                     C       T1      T2      C vs. T1   T1 vs. T2   C vs. T2
Tip, cents          49.6    103.7    91.1    0.011*     0.341       0.052+
TipRate, cents       4.51     6.57    4.87   0.049*     0.098+      0.743
N^real               9.70    11.1    14.7    0.151      0.116       0.043*
Length, minutes     48.8     47.3    44.1    0.425      0.293       0.275
Number of sessions  17.7     13.8    14.8    0.108      0.358       0.175

Notes. p-value of one-sided t-test is reported. 1 cent = 0.01 yuan.
**p < 0.01; *p < 0.05; +p < 0.10.
Table 6. Broadcaster-Level Mean Comparison Excluding Observations with t ≤ 10

                          Mean                        p-value
                     C       T1      T2      C vs. T1   T1 vs. T2   C vs. T2
Excluding observations with t ≤ 10 for T1 and T2
Tip (cent)          49.6    128.7    95.4    0.005**    0.176       0.032*
TipRate (cent)       4.51     7.24    4.78   0.018*     0.034*      0.399
N^real               9.70    12.8    16.4    0.062+     0.096+      0.010*
Excluding observations with t ≤ 10 for T1, T2, and C
Tip (cent)          57.6    128.7    95.4    0.012*     0.176       0.069+
TipRate (cent)       4.93     7.24    4.78   0.042*     0.034*      0.558
N^real               9.99    12.8    16.4    0.094+     0.096+      0.016*

Notes. p-value of one-sided t-test is reported. 1 cent = 0.01 yuan.
**p < 0.01; *p < 0.05; +p < 0.10.
viewers indeed increases the number of real viewers,
we should expect the slope of N^real to be higher for T2
than T1 than C because the number of synthetic view-
ers on average increases faster over time in T2 than T1
than C. In addition, given that the addition of synthet-
ic viewers proceeded after the 10th minute, we should
not expect a significant difference in N^real across C, T1,
and T2 during the first 10 minutes.
The dynamics of N^real over time confirm our expect-
ations. As Figure 4(a) shows, the number of real view-
ers gradually increases during the first 10 minutes, but
there is no substantial difference across the three
groups. The p-values of the F-test that N^real is the same
across the three groups at t = 1, ..., 10 have a mean of
0.559 and a minimum of 0.184. In addition, when
t ≤ 10, we are unable to reject the hypothesis that the
slope in group C is the same as the slope in T1 and T2
(p = 0.571). These tests provide additional support for
the success of our field experiment manipulations. By
comparing the slope of N^real across groups when
10 < t ≤ 120, we find that the slope of T1 is steeper
than C (p < 0.001) and the slope of T2 is steeper than
T1 (p < 0.001), which is consistent with the positive
treatment effect on N^real. As Figure 3 shows, the data
with N^syn > 1,200 result from a few unusually long ses-
sions, and t = 120 is approximately associated with the
threshold of N^syn = 1,200 (i.e., the maximum of N^syn
when t = 120 is 1,153). Thus, we use the data truncated
at 120 minutes in subsequent analyses to prevent dis-
proportionate influence from outliers.
We further plot the dynamics in Tip in Figure 5 and
TipRate in Figure 6. Again, we are not able to reject the
hypothesis that either Tip or TipRate is the same across
C, T1, and T2 at t = 1, ..., 10 at the 5% level with the
exception of TipRate at t = 2 (p = 0.026). By comparing
the slopes during the nontreated period (t ≤ 10), we
cannot reject the hypothesis that the slope of Tip is the
same across groups (p = 0.670), nor can we reject the
same-slope hypothesis for TipRate (p = 0.571). During
the treated period (10 < t ≤ 120), we find that the slope
of Tip in the control group is generally downward
while the slope of Tip is either flat or slightly upward
in two treatment groups, suggesting a positive treat-
ment effect on Tip. Although the slopes for both T1 (p
= 0.028) and T2 (p = 0.012) are significantly steeper
than C, there is no significant difference in the slopes
between T1 and T2 (p = 0.552). This data pattern again
suggests that the treatment effect on tip amount is
nonlinear and subject to diminishing returns. For the
slope of TipRate shown in Figure 6(b), the downward
slope for T1 is significantly less salient than C (p <
0.001) and significantly more salient than T2 (p =
0.043), and the slope for T2 is not significantly differ-
ent from C (p = 0.110), which implies an inverted U-
shaped relationship. Overall, the results from our
slope comparisons provide additional support for the
success of our manipulation in the experiment and
confirm similar treatment effects on Tip, N^real, and Tip-
Rate found in previous mean comparisons with in-
creased statistical power.
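One way to formalize such a slope comparison is to regress the minute-level outcome on time interacted with group dummies over the treated window and test the interaction terms. The sketch below is an assumed implementation (including the choice of clustered standard errors and column names), not the authors' exact procedure.

```python
# Slope comparison via interaction regression on the treated window
# (10 < t <= 120): the coefficients on minute:C(group)[T.T1] and
# minute:C(group)[T.T2] give the slope differences relative to group C.
# Assumes a DataFrame `df` with columns 'group', 'session', 'minute',
# and the outcome (hypothetical names).
import statsmodels.formula.api as smf

def slope_comparison(df, outcome="n_real"):
    sub = df[(df["minute"] > 10) & (df["minute"] <= 120)]
    model = smf.ols(f"{outcome} ~ minute * C(group)", data=sub)
    res = model.fit(cov_type="cluster", cov_kwds={"groups": sub["session"]})
    return res.summary()
```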
Figure 4. (Color online) Dynamics of Number of Real Viewers Across Groups
Note. In Figures 4–6, the minute-level mean estimates and error bars with 95% confidence interval are shown in the left figure, and the minute-
level mean estimates and a linear fit line with 95% confidence interval are shown in the right figure.
4.3. Regression Analysis
A unique design of our field experiment is the ran-
domization of synthetic viewers over time and across
sessions in addition to the rst level of randomization
across broadcasters. This manipulation not only gives
rise to the differential growth rates of synthetic
viewers across groups (i.e., N^syn grows faster in T2
than T1 than C), but it also allows us to observe the
exact treatment strength (i.e., N^syn) at each minute.
This additional level of randomization enables us to
exploit the richer temporal variation to identify the ef-
fect of audience size on tipping revenues.
Figure 5. (Color online) Dynamics of Tip Amount Across Groups
Figure 6. (Color online) Dynamics of TipRate Across Groups
To estimate the causal impact of audience size, we em-
ploy a linear panel regression with session fixed effects.
Compared with the mean comparison estimate, we are
able to explicitly account for the variation in treatment
strength in the regression, and therefore, the estimate is
no longer subject to aggregation bias. An additional ben-
efit of the regression analysis is that the coefficient esti-
mate can provide more managerially relevant insights
(e.g., the impact of increasing audience size by X on out-
come Y), whereas the treatment effects inferred from be-
tween-group comparisons can only provide manipula-
tion-specific insights (e.g., the impact of adding an
average of two or four synthetic viewers per minute on
outcome Y), which might not be of interest to other firms.
Let k denote a live streaming session and t denote a
minute. The previous results regarding the effect of au-
dience size on tipping revenue suggest a nonlinear re-
lationship. To capture this potential nonlinear effect, we
model the tip amount denoted by y_kt as a function of
both the audience size and its squared term as follows:

  y_kt = β1·N^disp_kt + β2·(N^disp_kt)² + β3·t + β4·Tenure_kt + η_k + ε_kt,   (4)
where N^disp_kt is the number of displayed viewers in ses-
sion k at minute t; η_k represents the session fixed effects,
which account for any unobserved session-specific fac-
tors (e.g., quality of a session) that may affect viewers'
tipping decisions; and ε_kt is the idiosyncratic error term.
We account for two different effects of timing in
Equation (4). A broadcaster's level of engagement
might change over time and, therefore, affect viewers'
tipping behavior. Similarly, a viewer's motive to tip
might also vary by the viewer's stay in a session. We
account for the former by including the time since the
beginning of a session denoted by t and account for
the latter by including the average length of stay for
real viewers in a session k at minute t denoted by
Tenure_kt in Equation (4). As our data are at the aggre-
gate rather than the individual viewer level, we are
unable to precisely measure how long each viewer re-
mains in a session. Nevertheless, we create a proxy for
the average length of stay for real viewers by making
the following two assumptions:
Assumption 1. If N^real_kt ≥ N^real_{k,t−1}, all existing real viewers do
not exit and (N^real_kt − N^real_{k,t−1}) new real viewers arrive at
time t. Thus, Tenure_kt = (Tenure_{k,t−1} × N^real_{k,t−1} + N^real_{k,t−1}) / N^real_kt
= (Tenure_{k,t−1} + 1) × N^real_{k,t−1} / N^real_kt.

Assumption 2. If N^real_kt < N^real_{k,t−1}, (N^real_{k,t−1} − N^real_kt) existing
real viewers exit and no new real viewers arrive at time t.
Thus, Tenure_kt = Tenure_{k,t−1} + 1.
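A short Python sketch of the Tenure proxy implied by Assumptions 1 and 2 follows; initializing tenure at zero in the first minute and the variable names are assumptions made for illustration.

```python
# Tenure proxy for one session: update the average length of stay of real
# viewers from the minute-to-minute change in the number of real viewers.
def tenure_path(n_real):
    """n_real: list of real-viewer counts at minutes 1..T of one session."""
    tenure = [0.0]                                   # assumed tenure at the first minute
    for t in range(1, len(n_real)):
        prev_n, cur_n = n_real[t - 1], n_real[t]
        if cur_n >= prev_n:
            # Assumption 1: no exits; newcomers enter with tenure zero
            tenure.append((tenure[t - 1] + 1) * prev_n / cur_n)
        else:
            # Assumption 2: only exits; remaining viewers age by one minute
            tenure.append(tenure[t - 1] + 1)
    return tenure

print(tenure_path([5, 5, 7, 6, 6]))   # e.g., [0.0, 1.0, 1.43, 2.43, 3.43]
```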
Estimating Equation (4) directly might lead to biased
estimates because N^disp_kt could be correlated with the
error term ε_kt if there are unobserved factors (e.g., am-
bience) that affect both viewers' participation and tip-
ping decisions. To address this potential endogeneity
issue, we use N^syn_kt and its squared term as instrument
variables (IVs) for N^disp_kt and its squared term. These are
valid IVs because N^syn_kt is correlated with N^disp_kt by defi-
nition but not with the error terms as a computer algo-
rithm exogenously generates N^syn_kt. Note that the audi-
ence size correlates with the time trend as a result of the
nature of our experiment design (corr(N^syn_kt, t) = 0.571).
However, N^syn_kt does not increase linearly over time be-
cause we added a random number of synthetic viewers
rather than a fixed number to both T1 and T2. To em-
ploy the IV estimation, we focus on a subsample of our
data set in which the IVs are well defined. In particular,
we use observations from T1 and T2 when 10 < t ≤ 120
for model estimation because synthetic viewers were
only added to treatment groups after the first 10 mi-
nutes. We also truncate at 120 minutes to mitigate the
potential estimation bias caused by outliers.
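A minimal sketch of this type of IV estimation using the linearmodels package is shown below. Absorbing the session fixed effects by within-session demeaning, the clustered standard errors, and all column names are assumptions rather than the authors' exact implementation.

```python
# 2SLS estimation of Equation (4) on the treated subsample: session fixed
# effects are absorbed by within-session demeaning, and N^disp and its square
# are instrumented with N^syn and its square.
import pandas as pd
from linearmodels.iv import IV2SLS

def estimate_eq4(df: pd.DataFrame):
    sub = df[(df["minute"] > 10) & (df["minute"] <= 120)].copy()
    sub["n_disp_sq"] = sub["n_disp"] ** 2
    sub["n_syn_sq"] = sub["n_syn"] ** 2
    cols = ["tip", "n_disp", "n_disp_sq", "n_syn", "n_syn_sq", "minute", "tenure"]
    # absorb session fixed effects by demeaning every variable within session
    dm = sub[cols] - sub.groupby("session")[cols].transform("mean")
    res = IV2SLS(dependent=dm["tip"],
                 exog=dm[["minute", "tenure"]],
                 endog=dm[["n_disp", "n_disp_sq"]],
                 instruments=dm[["n_syn", "n_syn_sq"]]).fit(
                     cov_type="clustered", clusters=sub["session"])
    return res
```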
We report estimation results of Equation (4) in Table 7, in which Model 1 includes the linear effect only and Model 2 includes potentially nonlinear effects. As we manipulated only the audience size in the experiment, we interpret the estimation results as the effect of audience size alone rather than a combined effect of the audience size and engagement activities of viewers. The significant and positive coefficients of N^{disp} in both models confirm the positive treatment effect of audience size on tipping revenues that we found previously. On average, increasing the audience size by one unit improves the tipping revenue per minute by approximately 0.01 yuan, which is 1% of the mean level. Such a positive effect of audience size on tipping revenue is subject to diminishing returns, as indicated by the negative coefficient of the squared term (-0.00126, p < 0.01). The coefficient estimates from Model 2 suggest that the effect of audience size is mostly positive in our data range, as it turns negative only when N^{disp} exceeds 1.43/(2 × 0.00126) ≈ 567, which is the 96.6th percentile in our data.
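For completeness, the turning point quoted above follows from setting the marginal effect implied by the quadratic specification in Equation (4) to zero, using the Model 2 coefficients reported in Table 7:

$$\frac{\partial y_{kt}}{\partial N^{disp}_{kt}} = \beta_1 + 2\beta_2 N^{disp}_{kt} = 0 \;\Rightarrow\; N^{*} = -\frac{\beta_1}{2\beta_2} = \frac{1.43}{2 \times 0.00126} \approx 567.$$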
We next examine the drivers of the mostly positive effect of audience size on tipping revenue by breaking down the net impact on Tip into the impact on viewer participation (N^{real}) and the impact on the average tip amount per viewer (TipRate).
4.3.1. Does Adding Synthetic Viewers Draw More Real Viewers? We estimate the model of the number of real viewers as follows:
$$N^{real}_{kt} = \beta_1 N^{real}_{k,t-1} + \beta_2 N^{syn}_{kt} + \beta_3 \left(N^{syn}_{kt}\right)^2 + \beta_4 t + \beta_5 Tenure_{kt} + \eta_k + \epsilon_{kt}, \qquad (5)$$
where we include the number of real viewers from the last period to capture the inertia in viewers' participation decisions. We also use N^{syn} rather than N^{disp} as the independent variable because N^{disp} = N^{real} + N^{syn}, suggesting that N^{disp}_{kt} perfectly correlates with ε_{kt} and, therefore, cannot serve as an independent variable.
As Table 7 shows, adding synthetic viewers draws more real viewers to live streaming sessions.[8] This is supported by the positive and significant coefficients of N^{syn} (0.021, p < 0.01), which suggest the existence of herding in a live streaming context. Adding 50 synthetic viewers results in approximately one additional real viewer. Although we find diminishing returns of the effect, the relatively small coefficient of (N^{syn})^2 suggests that the effect of audience size on viewer participation is mostly positive in our data range. That is, the effect is positive until N^{syn} exceeds 0.028/(2 × 1.64e-5) ≈ 854, which is the 99.2nd percentile in our data.
4.3.2. Does a Larger Audience Encourage or Discourage Average Tip Amount per Viewer? As audience size increases, a viewer tends to obtain greater social image-related utility from tipping and, thus, might increase the tip amount. Nevertheless, a larger audience may also suggest greater competition for the broadcaster's reciprocal acts, which results in a lower marginal utility of tipping and further reduces the willingness to tip. To empirically test the direction of the effect, we estimate the model of TipRate using Equation (4) with y_{kt} = TipRate_{kt}. We report the results from the IV estimation in Table 7 under the column DV = TipRate. The coefficient estimates reveal an overall positive relationship between the audience size and TipRate, suggesting the dominance of social image-related utility over seeking reciprocity in driving viewers' tipping when N^{disp} is not too large. When N^{disp} exceeds a certain threshold (0.022/[2 × 2.32e-5] ≈ 474, which is the 94.2nd percentile), the relationship reverses, perhaps because of the dominance of the negative competition effect.
5. Model Extensions and Robustness Checks
We first extend our main analysis by further exploring the heterogeneity in the treatment effect of increasing audience size. We then conduct a series of robustness checks to show that our findings are not sensitive to alternative model specifications and assumptions.
5.1. The Moderating Effect of Broadcaster Characteristics
5.1.1. Does the Effect of Audience Size Differ by Broadcaster Gender? Previous research suggests that herding occurs if a viewer refers to existing viewers' choices as a descriptive social norm and, thus, passively mimics their decisions to follow well-attended sessions (Croson and Shang 2008, Simonsohn and Ariely 2008). Our pretreatment data show that live streaming sessions by female broadcasters are better attended than sessions by male broadcasters, as indicated by the significant difference (8.00, p < 0.01) in the number of chatters per session between female and male broadcasters.
Table 7. Regression Results of Models of Tip, TipRate, and Number of Real Viewers
Columns: DV = Tip (Model 1 | Model 2) | DV = TipRate (Model 1 | Model 2) | DV = Nreal (Model 1 | Model 2)
Ndisp: 0.920** (0.215) | 1.43** (0.296) | 0.012* (0.005) | 0.022** (0.008) | — | —
(Ndisp)^2: — | -0.001** (3.34e-4) | — | -2.32e-5* (1.09e-5) | — | —
Lag Nreal: — | — | — | — | 0.757** (0.016) | 0.754** (0.016)
Nsyn: — | — | — | — | 0.021** (0.004) | 0.028** (0.003)
(Nsyn)^2: — | — | — | — | — | -1.64e-5* (6.83e-6)
t: 2.57** (0.770) | 3.11** (0.819) | 0.057** (0.020) | 0.067** (0.022) | 0.040** (0.008) | 0.045** (0.007)
Tenure: 4.97** (1.70) | 5.40** (1.82) | 0.103 (0.061) | 0.111 (0.064) | 0.238** (0.029) | 0.246** (0.028)
Session fixed effects: Yes | Yes | Yes | Yes | Yes | Yes
Endogeneity correction: Yes | Yes | Yes | Yes | — | —
Number of observations: 53,893 | 53,893 | 53,893 | 53,893 | 52,833 | 52,833
Notes. Robust standard errors clustered by broadcasters in parentheses. Model 1 considers the linear effect only; Model 2 includes both N and N^2.
**p < 0.01; *p < 0.05.
The theory of social norm suggests that a viewer's choice of a mainstream product is more justifiable than the choice of a niche product, and thus, other viewers are more likely to follow this viewer to adopt the mainstream rather than the niche product. Following this reasoning, as female broadcasters are the mainstream on our platform, we expect the effect of popularity information on viewer participation to be stronger in sessions by female broadcasters than male broadcasters.
Nevertheless, previous research also indicates that consumers often engage in observational learning to draw quality inference from the observation of others' decisions (e.g., Banerjee 1992, Zhang and Liu 2012). For example, Tucker and Zhang (2011) show that niche products with narrow appeal benefit more from popularity information than broad-appeal products in e-commerce because consumers infer higher quality from a narrow-appeal product than an equally popular broad-appeal product. Given the majority of male viewers and female broadcasters, we can classify sessions by male broadcasters as niche products. Thus, if observational learning drives herding in our data, male broadcasters benefit more than female broadcasters from the increase in popularity in drawing viewers.
Broadcaster gender may also moderate an individual's motivation for tipping. Theories from evolutionary psychology suggest that men are likely to display financial resources as a tactic to signal status and gain attention from female mates, which results in a higher level of conspicuous consumption by men than women (Buss 1988, Griskevicius et al. 2007, Kenrick and Griskevicius 2013). In our context of live streaming, the theory of intrasexual competition predicts that such status-signaling motivation among male customers is more likely to appear in the presence of a female rather than a male broadcaster (Sundie et al. 2011). Given the male-dominant audience on our platform, we expect the utility related to social status and prestige to be greater in sessions by female than male broadcasters, which predicts a stronger effect of audience size on average tip amount (a form of conspicuous consumption) in female broadcasters' sessions.
To test the moderating effect of broadcaster gender, we divide the sample into female versus male broadcasters and reestimate Equations (4) and (5). Table 8 reports the results. Although the main effects of audience size on Tip are significant and positive for both female and male broadcasters, the coefficient of N^{disp} is smaller for male broadcasters. Next, we compare the effect of audience size on N^{real} and TipRate between female and male broadcasters, respectively. We find that the coefficient of N^{syn} is larger in the model of N^{real} for female broadcasters (0.032, p < 0.01) than for male broadcasters (0.016, p < 0.01). This result provides suggestive evidence that social norm rather than observational learning drives the herding in viewers' participation in live streaming. Regarding the regression results of TipRate, we find a significant effect of N^{disp} only for female broadcasters (0.025, p < 0.01), whereas the effect is not statistically significant for male broadcasters (0.011, p > 0.05). This result implies that the motivation of signaling social status only exists in female broadcasters' sessions, which is consistent with the theory of intrasexual competition. We show in Online Appendix C that the differential effects of audience size on Tip, TipRate, and N^{real} between female and male broadcasters still hold when we test the moderating effect using interaction terms between audience size and a gender indicator variable.
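For readers who prefer an explicit specification, the interaction test reported in Online Appendix C can be written, under our assumption that it simply augments Equation (4), along the lines of

$$y_{kt} = \beta_1 N^{disp}_{kt} + \beta_2 \left(N^{disp}_{kt}\right)^2 + \gamma_1 N^{disp}_{kt} \times Female_k + \gamma_2 \left(N^{disp}_{kt}\right)^2 \times Female_k + \beta_3 t + \beta_4 Tenure_{kt} + \eta_k + \epsilon_{kt},$$

where Female_k indicates sessions hosted by female broadcasters and its main effect is absorbed by the session fixed effects; the exact set of controls used in the appendix may differ.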
Table 8. Regression Results on Subsamples (Female vs. Male Broadcasters)
Columns: Female broadcasters only (DV = Tip | DV = TipRate | DV = Nreal) | Male broadcasters only (DV = Tip | DV = TipRate | DV = Nreal)
Ndisp: 1.55** (0.328) | 0.025** (0.009) | — | 1.01* (0.450) | 0.011 (0.019) | —
(Ndisp)^2: 0.001** (3.50e-4) | 2.19e-5* (9.60e-6) | — | 0.001 (0.001) | 4.07e-5 (3.36e-5) | —
Lag Nreal: — | — | 0.741** (0.017) | — | — | 0.706** (0.017)
Nsyn: — | — | 0.032** (0.003) | — | — | 0.016** (0.004)
(Nsyn)^2: — | — | 1.55e-5* (6.43e-6) | — | — | 1.96e-5* (7.72e-6)
t: 3.20** (0.962) | 0.074** (0.025) | 0.048** (0.008) | 2.35* (1.10) | 0.028 (0.042) | 0.026** (0.005)
Tenure: 4.81** (2.29) | 0.064 (0.076) | 0.284** (0.035) | 5.71 (2.94) | 0.196 (0.116) | 0.154** (0.044)
Session fixed effects: Yes | Yes | Yes | Yes | Yes | Yes
Endogeneity correction: Yes | Yes | — | Yes | Yes | —
Number of observations: 38,613 | 38,613 | 37,944 | 15,280 | 15,280 | 14,889
Note. Robust standard errors clustered by broadcasters in parentheses.
**p < 0.01; *p < 0.05.
5.1.2. Does the Effect of Audience Size Vary by Broadcaster Fame? Viewers on the live streaming platform do not know all broadcasters equally. From a viewer's perspective, tipping a star broadcaster rather than a nonstar broadcaster can improve the viewer's social image because of the perceived affiliation with a more prestigious group in front of other viewers, holding everything else equal. Thus, we might expect a greater effect of audience size on tipping revenue in sessions by more famous broadcasters.
We operationalize the fame of each broadcaster by the dollar value of all tips received during the one-month pretreatment period because the platform publicizes the rank of broadcasters on the basis of their monthly earnings, with the top broadcaster earning the most. The variable Fame_i, defined as the monthly tip amount, has a mean of 403.1 yuan and a maximum of 6,600 yuan in our sample, suggesting that none of the broadcasters are superstars in the conventional sense. Nevertheless, we still have a large variation in the monthly earnings of broadcasters (SD = 809.7), which enables us to identify the moderating effect of fame.
We add interaction terms N × Fame_i and N^2 × Fame_i to Equations (4) and (5) and report the regression results in Table 9. The coefficient of N^{disp} × Fame is significant and positive in the regression of Tip, suggesting a stronger effect of audience size in sessions by more famous broadcasters. The highest-earning broadcaster benefits approximately 180% (4.03e-4 × [6,600 - 403]/1.40 ≈ 1.78) more than an average broadcaster from the increase in audience size. The coefficient of N^{disp} × Fame in the regression of TipRate is positive but statistically insignificant, which does not support the enhanced social image from tipping to top-ranked broadcasters. The positive moderation of Fame in the effect of audience size on Tip seems to be driven by the stronger effect on viewer participation, as suggested by the positive and significant coefficient of N^{syn} × Fame in the regression of N^{real}. As sessions by more famous broadcasters exhibit more viewers, this result is consistent with our previous finding that social norm rather than observational learning drives herding on our platform.
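The 180% figure can be reproduced from the Table 9 estimates by comparing the marginal effect of audience size for the highest-earning broadcaster with that for an average (mean-centered) broadcaster:

$$\frac{\hat{\gamma}_{N \times Fame} \times \left(Fame_{\max} - \overline{Fame}\right)}{\hat{\beta}_{N^{disp}}} = \frac{4.03 \times 10^{-4} \times (6{,}600 - 403)}{1.40} \approx 1.78.$$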
5.2. The Moderating Effect of Broadcaster Performances
We further investigate the potential moderating effect of broadcaster performances. As time-variant performance metrics are potentially endogenous to the treatment, the results derived from this model extension are better interpreted as exploratory rather than conclusive. Our main analysis reveals a positive causal impact of audience size on viewers' tipping when the audience size is not too large, which is consistent with the existence of social image-related utility in tipping. In a live streaming session, a broadcaster's recognition of a viewer's action would likely garner attention from other viewers, thus enhancing the social image-related utility of the focal viewer. For example, a broadcaster may welcome a viewer when this viewer joins a session or express gratitude to the viewer for the tip received. In line with this logic, a viewer is more likely to tip in front of other viewers in anticipation of a broadcaster's recognition.[9] We, therefore, expect a broadcaster's recognition tendency to positively moderate the effect of audience size on viewers' tipping.
To test this hypothesis, we operationalize a broadcaster's recognition tendency by the frequency of saying words, as suggested by the platform, related to greeting and gratitude toward a specific viewer during a session: "welcome," "hello," "thank you," and "many thanks" (in Chinese). We create Recognition_{kt}, the count of these four words spoken by the broadcaster in session k during minute t, by employing the state-of-the-art speech recognition algorithms detailed in Online Appendix B.
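As an illustration of how such a minute-level keyword count can be computed from a speech transcript, consider the sketch below. The specific Chinese keyword strings and the transcript format are our assumptions; the actual speech recognition pipeline is described in Online Appendix B of the paper.

```python
from collections import defaultdict

# Assumed Chinese renderings of "welcome," "hello," "thank you," and "many thanks."
KEYWORDS = ["欢迎", "你好", "谢谢", "多谢"]

def recognition_counts(transcript):
    """transcript: iterable of (minute, text) pairs transcribed for one session."""
    counts = defaultdict(int)
    for minute, text in transcript:
        counts[minute] += sum(text.count(word) for word in KEYWORDS)
    return dict(counts)

# Toy example: two transcribed minutes of broadcaster speech.
print(recognition_counts([(12, "欢迎新朋友, 谢谢大哥的礼物"), (13, "你好你好")]))  # {12: 2, 13: 2}
```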
We also create three other metrics (i.e., FaceTime_{kt}, HandTime_{kt}, and Emotion_{kt}) from the video content as proxies of the intensity and quality of the broadcaster's performance (for definitions and summary statistics, see Table 3). Specifically, we measure the intensity of performance according to the percentage of frames in which the broadcaster's face and hands appear in front of the screen within a minute.
Table 9. The Moderating Effect of Broadcaster Fame
Columns: DV = Tip | DV = TipRate | DV = Nreal
Ndisp: 1.40** (0.293) | 0.021** (0.008) | —
(Ndisp)^2: 0.001** (2.73e-4) | 2.23e-5* (1.04e-5) | —
Ndisp × Fame: 4.03e-4** (1.17e-4) | 3.87e-6 (2.66e-6) | —
(Ndisp)^2 × Fame: 2.73e-7 (1.85e-7) | 2.68e-9 (6.80e-9) | —
Lag Nreal: — | — | 0.709** (0.024)
Nsyn: — | — | 0.032** (0.003)
(Nsyn)^2: — | — | 1.70e-5** (5.58e-6)
Nsyn × Fame: — | — | 7.03e-6** (1.75e-6)
(Nsyn)^2 × Fame: — | — | 1.89e-9 (7.19e-9)
t: 3.01** (0.863) | 0.066** (0.022) | 0.054** (0.008)
Tenure: 5.19** (1.76) | 0.109 (0.065) | 0.275** (0.033)
Session fixed effects: Yes | Yes | Yes
Endogeneity correction: Yes | Yes | —
Number of observations: 53,893 | 53,893 | 52,833
Notes. Robust standard errors clustered by broadcasters in parentheses. Ndisp, Nsyn, and Fame are mean-centered.
**p < 0.01; *p < 0.05.
Intuitively, the longer a broadcaster disappears from the screen, the lower the performance intensity perceived by viewers. We assume that the quality of a session is associated with a broadcaster's emotion, which we infer from facial expressions using Microsoft Azure Emotion. This program detects and analyzes facial expressions using deep convolutional neural networks. The output of these algorithms is a vector of scores associated with eight types of positive and negative emotions (see Online Appendix B). We aggregate these scores to create an emotion score, Emotion_{kt}, to measure the extent of positivity in a broadcaster's facial expression within a minute.
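As a sketch of the aggregation step, one simple way to collapse per-frame emotion vectors into a minute-level positivity score is shown below. The eight emotion labels follow the Azure Emotion output; treating happiness as positive and the five negative emotions as negative (ignoring neutral and surprise) is our assumption, as the paper's exact formula is given in its Online Appendix B.

```python
import numpy as np

POSITIVE = ["happiness"]
NEGATIVE = ["anger", "contempt", "disgust", "fear", "sadness"]

def emotion_score(frames):
    """frames: list of dicts mapping emotion label -> score for the frames in one minute."""
    if not frames:
        return 0.0
    per_frame = [sum(f.get(e, 0.0) for e in POSITIVE) -
                 sum(f.get(e, 0.0) for e in NEGATIVE) for f in frames]
    return float(np.mean(per_frame))

# Toy example: two frames, mostly happy with a trace of sadness.
print(emotion_score([{"happiness": 0.9, "sadness": 0.05}, {"happiness": 0.7, "neutral": 0.3}]))  # 0.775
```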
To evaluate the moderating effects, we add interaction terms N × Recognition_{kt} and N^2 × Recognition_{kt} to Equations (4) and (5). We also include Recognition_{kt} as an independent variable because of the variation of Recognition_{kt} over time and across sessions, which session fixed effects do not capture. We present estimation results in Table 10. The results show that the positive impact of audience size on Tip is stronger in a session in which the broadcaster is more likely to express recognition. Broadcasters whose frequency of using recognition-related words is one standard deviation above the population mean enjoy 10.0% (6 × 0.023/1.38 ≈ 0.100) more revenue from the increase in audience size. The stronger effect on Tip results from the moderating effect of Recognition on TipRate, as indicated by the positive and significant coefficient of N^{disp} × Recognition_{kt} in the model of TipRate. These findings again provide evidence for the existence of social image-related utility in tipping.
Further analyses provide additional evidence in support of our proposed mechanism through the social image-related utility. We examine whether broadcasters' behaviors (i.e., FaceTime_{kt}, HandTime_{kt}, Emotion_{kt}) that are unlikely to be related to the signaling value of a viewer's tipping also moderate this positive effect. Note that these types of broadcaster behaviors differ from Recognition_{kt}, as a broadcaster's body language and facial expressions are often made to the general audience rather than to a particular viewer. Therefore, we do not expect FaceTime_{kt}, HandTime_{kt}, or Emotion_{kt} to moderate the impact of N^{disp}_{kt} on the social image-related utility of tipping; thus, we propose a falsification test. To conduct the proposed falsification test, we reestimate the model of Tip in a similar way as we did for the test for Recognition. The results in Table 11 confirm that none of these three performance metrics moderates the effect of audience size on tipping revenue. Aside from broadcaster performances, we explore a few other potential moderators in Online Appendix C and find that past tips and the number of real viewers moderate the effect of audience size on tipping, but viewer engagement does not moderate this effect.
5.3. Robustness Checks
We conduct a series of analyses to ensure the robustness of our main findings in Online Appendix D. We first show that the positive effect of audience size on tipping revenue still holds after controlling for potential confounding factors, including the proportion of synthetic viewers, the proportion of new viewers, and the potential spillover effects of other concurrent streams.
We then consider an alternative model specification without session fixed effects but with the time-of-day and social norm effects. We still find a mostly positive effect of audience size on each of the three outcome variables (Tip, TipRate, N^{real}). In addition, total tips tend to be higher in sessions by female and famous broadcasters. Viewers tend to tip more in talk shows than other genres. Regarding the timing of broadcasting, sessions that start in the early morning (8 a.m.) and late afternoon (4 p.m.) receive significantly fewer tips than sessions that start at midnight.
We exclude all short sessions that last no more than 10 minutes in our regressions because of the lack of variation in the number of synthetic viewers during the first 10 minutes.
Table 10. The Moderating Effect of Broadcasters' Recognition Tendency
Columns: DV = Tip | DV = TipRate | DV = Nreal
Ndisp: 1.38** (0.295) | 0.021** (0.008) | —
(Ndisp)^2: 0.001** (2.90e-4) | 2.35e-5* (1.16e-5) | —
Ndisp × Recognition: 0.023* (0.010) | 7.99e-4* (3.22e-4) | —
(Ndisp)^2 × Recognition: 1.22e-5 (3.38e-5) | 4.61e-7 (8.22e-7) | —
Lag Nreal: — | — | 0.726** (0.021)
Nsyn: — | — | 0.031** (0.004)
(Nsyn)^2: — | — | 2.11e-5** (6.38e-6)
Nsyn × Recognition: — | — | 1.44e-4 (1.51e-4)
(Nsyn)^2 × Recognition: — | — | 1.23e-6** (3.22e-7)
Recognition: 0.069 (1.04) | 0.021 (0.026) | 3.52e-4 (0.005)
t: 2.98** (0.814) | 0.065** (0.022) | 0.051** (0.009)
Tenure: 4.67** (1.64) | 0.084 (0.059) | 0.275** (0.034)
Session fixed effects: Yes | Yes | Yes
Endogeneity correction: Yes | Yes | —
Number of observations: 53,893 | 53,893 | 52,833
Notes. Robust standard errors clustered by broadcasters in parentheses. Ndisp, Nsyn, and Recognition are mean-centered.
**p < 0.01; *p < 0.05.
In total, 482 out of 2,222 sessions are short sessions, and they contribute 0.47% of total revenues. The exclusion of short sessions might give rise to a sample selection problem if the social benefits of tipping systematically differ between short and long sessions. We provide evidence that our findings do not seem to be subject to the sample selection bias.
We not only observe data across the three treatment groups but also have data before and after the experiment, which allows us to check the robustness of findings using a difference-in-differences (DID) analysis. We present corroborating evidence of the positive and nonlinear treatment effect on tipping revenue from the estimation of a DID model.
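The DID specification itself is reported in Online Appendix D; a generic two-period form of such a comparison, under our assumption that treatment status is defined at the broadcaster level, would be

$$y_{bt} = \delta_0 + \delta_1 Treat_b + \delta_2 Post_t + \delta_3\, Treat_b \times Post_t + \epsilon_{bt},$$

where b indexes broadcasters, Treat_b marks broadcasters whose sessions received synthetic viewers (T1 or T2), Post_t marks the experimental period, and δ_3 captures the DID estimate of the treatment effect; the paper's actual model may include additional controls and fixed effects.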
To further test the proposed behavioral mechanism through social image, we investigate the effects of audience size on chats and likes as a falsification test. We find a negative main effect of audience size on ChatRate and a null effect on LikeRate, which are consistent with the absence of social image concerns in these nonmonetary engagement activities. One possible explanation for the negative main effect on ChatRate is the substitution between tipping and chatting. When audience size grows, the increasing social image-related utility in tipping may drive viewers to switch from chatting to tipping, which explains the downward trend in ChatRate against audience size. Notably, as nonmonetary engagement activities, such as chats and likes, are often used as key health indicators by online communities and platforms, the decrease in chats against audience size could be a potential downside of the PWYW model.
We also empirically rule out an alternative mechanism in which the positive effect of audience size on tipping operates through changes in service quality manifested in broadcaster behavior. Specifically, none of the four lagged broadcaster performance metrics has a statistically significant relationship with either Tip or TipRate, suggesting that broadcaster behavior does not mediate the positive main effect of audience size on tipping. For broadcaster performance, we find that a larger audience size generally increases the duration of a broadcaster's face appearing in front of the camera (i.e., FaceTime). This finding suggests that broadcasters tend to increase their broadcasting effort when having a larger audience. Although we did not find a significant relationship between FaceTime and viewers' immediate tips, an increased frequency of showing faces to viewers might improve the viewer-broadcaster relationship in the long run and, therefore, lead to a greater scalability of PWYW. Given the relatively short span of our data, we are unable to test this hypothesis in this study. We, therefore, leave these interesting questions pertaining to the long-run benefits of a large audience to future research.
Finally, we conduct a descriptive analysis by examining the correlations between audience size and key outcome variables using data from the control group alone, in which all viewers are real. Consistent with our main findings from the experiment, we observe a mostly positive relationship between audience size (N^{disp}) and total tips (Tip) and a mostly positive relationship between audience size and average tip per real viewer (TipRate). These observations suggest that not only having more synthetic viewers, but also having more regular real viewers, may improve tipping revenue.
Table 11. The Moderating Effect of Additional Performance Metrics of Broadcasters
Columns (DV = Tip): M = FaceTime | M = HandTime | M = Emotion
Ndisp: 1.40** (0.282) | 1.43** (0.287) | 1.42** (0.301)
(Ndisp)^2: 0.001** (3.20e-4) | 0.001** (2.90e-4) | 0.001** (3.51e-4)
Ndisp × M: 0.377 (0.268) | 0.084 (0.326) | 0.517 (0.461)
(Ndisp)^2 × M: 2.20e-4 (4.22e-4) | 2.10e-4 (7.75e-4) | 0.001 (0.001)
M: 19.4 (19.9) | 7.75 (19.0) | 8.44 (35.4)
t: 3.11** (0.862) | 3.12** (0.844) | 3.10** (0.823)
Tenure: 5.44** (1.91) | 5.43** (1.82) | 5.43** (1.81)
Session fixed effects: Yes | Yes | Yes
Endogeneity correction: Yes | Yes | Yes
Number of observations: 53,893 | 53,893 | 53,893
Notes. Robust standard errors clustered by broadcasters in parentheses. Ndisp, FaceTime, HandTime, and Emotion are mean-centered.
**p < 0.01; *p < 0.05.
6. Discussion and Conclusion
Fueled by the advances in mobile technology and in-
ternet penetration, the live streaming industry has
grown to be a billion-dollar market and has become a
viable source of income for thousands of people
across the globe. As major players in the social media
industry embrace live streaming, the question regard-
ing the scalability of the PWYW-based revenue model
(i.e., whether larger audiences generate greater reve-
nues) remains unanswered.
We find a mostly positive causal effect of audience size on tipping revenues using data from a field experiment run on a live streaming platform that uses PWYW. However, the effect of audience size is concave, and it can even turn negative when audience size is relatively large. By decomposing the net effect on revenue into the effect on viewer size and the effect on tip per viewer, we find that the positive effect on total revenue results from a positive effect of popularity information on viewer size and a mostly positive effect on individual payment. The former finding is consistent with theories related to herding; the latter finding suggests the dominance of the social image motive over seeking reciprocity in driving a viewer's tipping in live streaming. We also find that famous female broadcasters who use recognition-related words frequently benefit the most from an increase in audiences. In terms of mechanisms, we provide suggestive evidence for the importance of social image-related utility in PWYW through moderation analyses and falsification tests. We also scrutinize the possibility that audience size can improve the performance quality of broadcasters and, thus, have an indirect effect on tipping. The lack of a statistically significant relationship between broadcasters' recent performances and tipping suggests the absence of such an indirect effect.
Our research adds to the PWYW literature by providing the first assessment of the scalability of revenues generated by PWYW, which sheds light on the viability of PWYW as a key monetization tool for firms, especially for social media firms that rely on UGC. In addition, we complement previous PWYW studies by showing that an important motive for consumers to pay under PWYW is to improve their social images, at least in the live streaming context. We also contribute to the burgeoning live streaming literature by presenting one of the first few studies to use field rather than survey data.
Our findings provide several important implications for marketing practitioners. Because of the concave relationship between audience size and tipping revenue, platforms might consider splitting viewers into multiple chat rooms to alleviate their declining marginal benefit of tipping, especially for extremely populous sessions. The validity of this recommendation hinges on the assumption that either the tipping rate does not depend on the broadcaster's performance quality or the broadcaster's performance quality does not depend on the audience size. When this assumption does not hold, splitting viewers into smaller groups might lower the broadcaster's performance quality, which, in turn, lowers the tipping amount. Although we find support for this assumption in our empirical data, we caution firms that they should scrutinize this assumption before following our recommendation of splitting viewers.
A caveat of using PWYW in live streaming is that it does not scale linearly with viewership as other monetization methods, such as advertising and subscription, do. Thus, a live streaming firm might consider diversifying its monetization methods to accommodate advertising and/or subscriptions when it reaches a certain stage. Although we used synthetic viewers as a part of the research design, we do not recommend that platforms add synthetic viewers to live streaming sessions to increase revenues because of the unethical nature of this practice. Instead, given the importance of social image-related utility in PWYW, live streaming platforms could offer additional status-seeking devices, such as badges, avatars, or ranking systems, to further improve revenues. We also caution firms that intentionally manipulating audience size may backfire because revenue will actually decrease rather than increase when the number of displayed viewers is relatively large.
We focus on the context of live streaming in this research. However, the question pertaining to scalability extends to other important formats of virtual communication, such as massive open online courses (MOOC). As consumers are increasingly migrating from traditional off-line communication to the "new normal" of online/virtual communication, it is imperative for firms to understand and quantify the marginal benefit of expanding the audience given that the marginal cost of holding a larger audience is negligible in the virtual space. We take the first step in this research to examine the scalability of PWYW as a business model for live streaming. Future research could extend our research design to explore the scalability of other forms of virtual communication. For example, it will be interesting to study how the overall learning effectiveness of a MOOC might vary with the number of enrolled students.
Our study has limitations, which suggest avenues for future research. One limitation of this study is the relatively small sample size; thus, exploration with a larger sample should help establish robustness. In addition, we do not observe individual-level data and, therefore, are unable to assess the potential heterogeneity in treatment effects across viewers. For example, viewers who tipped a lot in the past might gain more
benefit from broadcasters' attention than those marginal viewers who rarely tipped. Should individual-level data be available, researchers could extend our analysis to investigate heterogeneous treatment effects across viewers. We started the treatment after the first 10 minutes, which might give rise to the sample selection problem. Although we find no strong evidence of sample selection in our empirical context, future research could improve our experiment design to avoid the sample selection a priori. Finally, we manipulated only the audience size but not any engagement activity (e.g., tips, chats, likes) in this experiment. An investigation of the separate effects of different types of engagement activities on revenue could be a fruitful direction for future research.
Acknowledgments
The authors thank an anonymous live streaming platform in China for implementing the study and providing the data. The authors are also grateful to the editor, associate editor, and two anonymous reviewers for their constructive guidance. The authors also thank the participants at the 2017 Marketing Dynamics Conference; the 2017 Marketing Science Conference; the 2017 NYU-Temple Conference on Digital, Mobile Marketing, and Social Media Analytics; the 2018 China India Insights Conference; the 2018 NUS-Tsinghua Workshop on Digital Economy; the 2018 UTD-FORMS Conference; Hong Kong University, Peking University, Fudan University, Xiamen University, University of Texas at Austin, and University of North Carolina at Chapel Hill for their valuable feedback.
Endnotes
1. Lin et al. (2021) also use the field data to study drivers of tipping. Different from our focus on audience size, they focus on the interactions between broadcasters' and viewers' emotions.
2. According to Lin and Lu (2017), the percentages of female broadcasters and male viewers from our platform are close to the industry average (73% female broadcasters and 75% male viewers).
3. The exchange rate between Chinese yuan and U.S. dollars is around 6.6 yuan per dollar as of August 1, 2016.
4. The platform did not provide us the number of real viewers during the pretreatment period because it was unable to retrieve this variable from the database for technical reasons.
5. Mainly because of the company's data policy, the sample size collected in this research is more moderate than those in recent field experiments in marketing (e.g., Sudhir et al. 2016, Dubé et al. 2017). Nevertheless, we observe granular data at the minute level within sessions.
6. We also test a linear-log model in which N^{disp}_{kt} is log-transformed. The smaller mean absolute error of the simple linear model with squared term (100.30) than that of the linear-log model (101.53) supports our model specification.
7. The variable Tenure has a mean of 9.18 and a standard deviation of 5.34 under these two assumptions.
8. Because of the inclusion of N^{real}_{k,t-1}, we also estimated Equation (5) using the Arellano-Bond estimator (Arellano and Bond 1991) and found statistically equivalent results.
9. A broadcaster with a high tendency to recognize viewers does not necessarily interact with a viewer more frequently because the recognition tendency can be driven by a broadcaster's politeness rather than reciprocity.
References
Andreoni J (1988) Privately provided public goods in a large economy: The limits of altruism. J. Public Econom. 35(1):57–73.
Arellano M, Bond S (1991) Some tests of specification for panel data: Monte Carlo evidence and an application to employment equations. Rev. Econom. Stud. 58(2):277–297.
Ariely D, Bracha A, Meier S (2009) Doing good or doing well? Image motivation and monetary incentives in behaving prosocially. Amer. Econom. Rev. 99(1):544–555.
Azar OH (2007) The social norm of tipping: A review. J. Appl. Soc. Psych. 37(2):380–402.
Azar OH (2011) Business strategy and the social norm of tipping. J. Econom. Psych. 32(3):515–525.
Banerjee AV (1992) A simple model of herd behavior. Quart. J. Econom. 107(3):797–817.
Bikhchandani S, Hirshleifer D, Welch I (1992) A theory of fads, fashion, custom, and cultural change as informational cascades. J. Political Econom. 100(5):992–1026.
Buss DM (1988) The evolution of human intrasexual competition: Tactics of mate attraction. J. Personality Soc. Psych. 54(4):616–618.
Cai H, Chen Y, Fang H (2009) Observational learning: Evidence from a randomized natural field experiment. Amer. Econom. Rev. 99(3):864–882.
Chen Y, Koenigsberg O, Zhang ZJ (2017) Pay-as-you-wish pricing. Marketing Sci. 36(5):780–791.
China Daily (2016) China's internet celebrity economy bigger than cinema. Accessed May 25, 2021, http://www.chinadaily.com.cn/china/2016-09/17/content_26812402.htm.
Croson R, Shang JY (2008) The impact of downward social information on contribution decisions. Experiment. Econom. 11(3):221–233.
DellaVigna S, List JA, Malmendier U (2012) Testing for altruism and social pressure in charitable giving. Quart. J. Econom. 127(1):1–56.
Deloitte (2018) Technology, media and telecommunications predictions 2018. Accessed May 25, 2021, https://www2.deloitte.com/cn/en/pages/technology-media-and-telecommunications/articles/tmt-predictions-2018.html.
Dubé J-P, Luo X, Fang Z (2017) Self-signaling and prosocial behavior: A cause marketing mobile field experiment. Marketing Sci. 36(2):161–186.
D'Urbino L (2017) China's new craze for live-streaming. A new way of bringing colour to dreary lives. The Economist Online (February 9), https://www.economist.com/special-report/2017/02/09/chinas-new-craze-for-live-streaming.
Garcia SM, Tor A (2009) The N-effect: More competitors, less competition. Psych. Sci. 20(7):871–877.
Gneezy A, Gneezy U, Nelson LD, Brown A (2010) Shared social responsibility: A field experiment in pay-what-you-want pricing and charitable giving. Sci. 329(5989):325–327.
Gneezy A, Gneezy U, Riener G, Nelson LD (2012) Pay-what-you-want, identity, and self-signaling in markets. Proc. Natl. Acad. Sci. USA 109(19):7236–7240.
Griskevicius V, Tybur JM, Sundie JM, Cialdini RB, Miller GF, Kenrick DT (2007) Blatant benevolence and conspicuous consumption: When romantic motives elicit strategic costly signals. J. Personality Soc. Psych. 93(1):85–102.
Harbaugh WT (1998) The prestige motive for making charitable transfers. Amer. Econom. Rev. 88(2):277–282.
Hays S (2018) Esports: The "tipping market" – $129,000,000 in tips to streamers paid in 2017. Accessed May 25, 2021, https://medium.com/hackernoon/esports-the-tipping-market-129-000-000-in-tips-to-streamers-paid-in-2017-2cd7248ee623.
He Q, Liu J, Wang C, Li B (2016) Coping with heterogeneous video contributors and viewers in crowdsourced live streaming: A cloud-based approach. IEEE Trans. Multimedia 18(5):916–928.
iiMedia Research (2019) 2018-2019 China online live broadcasting industry research report. Accessed May 25, 2021, https://www.iimedia.cn/c400/63478.html.
iiMedia Research (2020) 2019-2020 China online live streaming market research report. Accessed May 25, 2021, https://www.iimedia.cn/c400/69017.html.
Jung MH, Perfecto H, Nelson LD (2016) Anchoring in payment: Evaluating a judgmental heuristic in field experimental settings. J. Marketing Res. 53(3):354–368.
Jung MH, Nelson LD, Gneezy A, Gneezy U (2017) Signaling virtue: Charitable behavior under consumer elective pricing. Marketing Sci. 36(2):187–194.
Kenrick DT, Griskevicius V (2013) The Rational Animal: How Evolution Made Us Smarter Than We Think (Basic Books, New York).
Kim JY, Natter M, Spann M (2009) Pay what you want: A new participative pricing mechanism. J. Marketing 73(1):44–58.
Knowledge@Wharton (2015) Live – and lucrative? Why video streaming supremacy matters. Accessed May 25, 2021, https://knowledge.wharton.upenn.edu/article/live-and-lucrative-why-video-streaming-supremacy-matters/.
Lampel J, Bhalla A (2007) The role of status seeking in online communities: Giving the gift of experience. J. Comput. Mediated Comm. 12(2):434–455.
Lee YC, Yen CH, Wang D, Fu WT (2019) Understanding how digital gifting influences social interaction on live streams. Proc. 21st Internat. Conf. Human Comput. Interaction Mobile Devices Services (ACM, Taipei, Taiwan), 1–10.
Lin J, Lu Z (2017) The rise and proliferation of live-streaming in China: Insights and lessons. Internat. Conf. Human Comput. Interaction (Springer, Cham, Switzerland), 632–637.
Lin Y, Yao D, Chen X (2021) Happiness begets money: Emotion and engagement in live streaming. J. Marketing Res. 58(3):417–438.
Mediakix (2018) The top 16 Facebook live statistics you should know. Accessed May 25, 2021, https://mediakix.com/blog/facebook-live-statistics-video-streaming-to-know/#gs.vR8kVHA.
Natter M, Kaufmann K (2015) Voluntary market payments: Underlying motives, success drivers and success potentials. J. Behav. Experiment. Econom. 57:149–157.
Olson M (1965) Logic of Collective Action: Public Goods and the Theory of Groups (Harvard University Press, Cambridge, MA).
Pires K, Simon G (2015) YouTube live and Twitch: A tour of user-generated live streaming systems. Proc. 6th ACM Multimedia Systems Conf. (ACM, Portland, OR), 225–230.
Qin A (2016) China's viral idol: Papi Jiang, a girl next door with attitude. The New York Times Online (August 24), https://www.nytimes.com/2016/08/25/arts/international/chinas-viral-idol-papi-jiang-a-girl-next-door-with-attitude.html.
Roettgers J (2018) Facebook adds ability to tip live streamers to mobile apps. Variety Online (April 16), https://variety.com/2018/digital/news/facebook-tipping-in-app-purchases-1202754904/.
Schmidt KM, Spann M, Zeithammer R (2015) Pay what you want as a marketing strategy in monopolistic and competitive markets. Management Sci. 61(6):1217–1236.
Shriver SK, Nair HS, Hofstetter R (2013) Social ties and user-generated content: Evidence from an online social network. Management Sci. 59(6):1425–1443.
Simonsohn U, Ariely D (2008) When rational sellers face nonrational buyers: Evidence from herding on eBay. Management Sci. 54(9):1624–1637.
Sjöblom M, Hamari J (2017) Why do people watch others play video games? An empirical study on the motivations of Twitch users. Comput. Human Behav. 75(10):985–996.
Sudhir K, Roy S, Cherian M (2016) Do sympathy biases induce charitable giving? The effects of advertising content. Marketing Sci. 35(6):849–869.
Sundie JM, Kenrick DT, Griskevicius V, Tybur JM, Vohs KD, Beal DJ (2011) Peacocks, Porsches, and Thorstein Veblen: Conspicuous consumption as a sexual signaling system. J. Personality Soc. Psych. 100(4):664–680.
Tang JC, Venolia G, Inkpen KM (2016) Meerkat and Periscope: I stream, you stream, apps stream for live streams. Proc. 2016 CHI Conf. Human Factors Comput. Systems (ACM, San Jose, CA), 4770–4780.
The Economist (2014) An itch to Twitch. Accessed May 25, 2021, https://www.economist.com/blogs/schumpeter/2014/05/live-video-streaming.
Toubia O, Stephen AT (2013) Intrinsic vs. image-related utility in social media: Why do people contribute content to Twitter? Marketing Sci. 32(3):368–392.
Tucker C, Zhang J (2011) How does popularity information affect choice? A field experiment. Management Sci. 57(5):828–842.
Zhang J (2010) The sound of silence: Observational learning in the US kidney market. Marketing Sci. 29(2):315–335.
Zhang J, Liu P (2012) Rational herding in microloan markets. Management Sci. 58(5):892–912.
Zhang X, Zhu F (2011) Group size and incentives to contribute: A natural experiment at Chinese Wikipedia. Amer. Econom. Rev. 101(4):1601–1615.
Lu et al.: Audience Size and Live Streaming Revenues Under Pay What You Want
984 Marketing Science, 2021, vol. 40, no. 5, pp. 964984, © 2021 INFORMS