ONLINE ATTENTION FLOWS 1
*Equal contributors.
This is the authors’ prepublication copy. The final version is available here:
Wu, A.X., Taneja, H., & Webster, J. (2020). Going with the flow: Nudging audiences online. New
Media & Society. DOI: 10.1177/1461444820941183
Going with the Flow: Nudging Attention Online
Angela Xiao Wu*
New York University, USA
Harsh Taneja*
University of Illinois at Urbana-Champaign, USA
James G Webster
Northwestern University, USA
*Equal contributors.
Abstract
Theories explaining the impacts of online media often swing between the actions
of empowered individuals and the distribution structures put in place by powerful
corporations. To explicate how these factors interact, we adapt the concept of
audience flow to highlight the temporal dimension of web use and demonstrate how
digital architectures subtly nudge masses of people into online attention flows. We
identify sequential usage patterns through a network analysis of passively measured
clickstreams, combined with data on website ownership and website architectures.
Our sample, based on a panel of 1 million users, includes 1761 websites that reached at
least 1% of Internet users in the United States. Our findings reveal previously unseen
patterns of online audience formation, which have implications for studying media
effects and understanding institutional power on the Internet.
Media use can be conceptualized as a flow in which people encounter a sequence
of offerings. This way of thinking has provided academics and industry professionals
with a powerful framework for understanding and controlling audience formation. But
with the advent of on-demand digital media, the notion that people’s media use could be
directed by media organizations fell into disfavor. In its place arose a rhetoric of user
empowerment in which people controlled the time and substance of their media
encounters. The vision of newly liberated users has, in turn, contributed to the appeal of
theories that see the actions of purposeful individuals as the key to understanding online
behavior. We argue that this simple notion of empowerment is an illusion. On the web,
media architectures still shape the flow of public attention. This happens in subtle ways
that nudge users in particular directions. It often takes advantage of habitual behaviors
and is generally difficult for the users themselves to see or understand.
We demonstrate these patterns of sequential internet use by analyzing passively
measured traffic data from a million internet users on all 1761 popular websites that
reached at least one percent of Internet users in the US. In doing so, we adapt the concept
of flow, advanced by researchers of television and radio including Raymond Williams, to
digital media studies and develop a new way to identify online attention flows through a
network analysis of clickstream data. Drawing also on website-level data about corporate
ownership and digital architectures, our research reveals previously unseen structures
which have implications for understanding institutional power on the internet.
How Audiences Form Online
Most theories of media use are tilted in favor of the individual agent (Webster,
2014). Economists and sociologists have long assumed that genre preferences
determined people’s media choices (Owen and Wildman, 1992; Peterson, 1992). Uses
and gratifications, a dominant paradigm in communication, maintains its commitments to
individual gratifications sought as determinants of use (Sundar and Limperos, 2013).
Likewise, research on selective exposure (Stroud, 2010), mood management (Zillman,
2000), and the role of habits (LaRose, 2010) relies almost exclusively on psychologically
based theories as antecedents of media choice. These assumptions about media audiences
manifest in many subfields of mass communication research. In political communication,
for example, the selective exposure thesis posits that with increased media options,
people are empowered to choose attitude-consistent information and avoid counter-
attitudinal content. Thus, for over a decade, many academics have expected patterns of
attention to coalesce into partisan echo chambers (Jamieson and Cappella, 2009; Stroud,
2010). Political communication is not unique. A kind of consumer sovereignty reigns
supreme in many theories of media choice.
Political economists have been skeptical of such sovereignty, arguing that
consumer choice, and even functioning democracy, were constricted by concentrated
media ownership (Downing, 2011). But even analysts sensitive to questions of
institutional power were seduced by the newly liberating potential of the internet and its
“nearly free” digital affordances (e.g., Benkler, 2006). Only recently have political
economists brought the analysis of distribution to bear on thinking through online
audience formation. That research argues that internet distribution constitutes a costly and
powerful structure that corporations deploy to “underpin our communicative encounters”
(Murdock, 2018: 365; Sandvig, 2015).
In The Internet Trap, Matthew Hindman (2018) provides a detailed account of
how a handful of tech giants have succeeded in dominating internet traffic. These
companies invest massively in hardware and software (i.e., server farms and code) to
overtake their lesser competitors in delivery speed and reliability. Based on large-scale
field experiments, they are able to optimize user experience by tinkering with their
websites. Their unmatched marketing resources in the form of paid search and promoted
social media feeds contribute to their dominance in online placement. Platform studies
have added insights to the political economy of the internet by examining the interface
and entwined functionalities of websites and apps. For example, studies show how
platforms such as Facebook and Google manage to trap users through intricate
coordination of design and partnerships (e.g., Helmond, 2015; van Dijck et al., 2018).
This focus on the materiality of distribution to think about internet use effectively brings
in more critical purchase than the common wisdom that on the web “the big gets bigger”
because of “network effects,” a purported mechanism wherein existing users always
bring in more users through their social connections.
In sum, most of the early social science on the uses and effects of online media
assumed an empowered, active audience able to choose whatever media they want when
they want it. In contrast, political economists assumed that people are being acted upon
by powerful, often unseen distribution infrastructures. In light of this binary, an emerging
literature began to theorize that exposure resulted from individuals interacting with
institutional structures (Webster, 2014). For example, building on a line of work that
stressed hybrid organizational and technological logics as integral to patterns of political
communication online (e.g., Chadwick, 2013), Thorson and Wells (2016) laid out the
varied influences exerted by actors such as legacy media institutions, social networks, and
computer algorithms over media exposure and effects today.
This paper extends this growing literature in two ways. First, we develop an
analytical framework on audience formation that accounts for both individual user agency
and structural power. We do this by invoking the concept of flow from television studies,
adapting it to online media, and synthesizing it with existing conceptual efforts. Second,
by employing novel datasets and analytical methods, our study yields systematic
empirical findings that provide some clarity on what people encounter online and how
digital distribution actually affects it. The scope of our analysis and conceptualization,
importantly, addresses a wide realm of online activities beyond content consumption.
Flows in Linear Media
Television and radio are experienced as a flow of encounters arranged by a
programmer. In the aggregate, these individual flows are seen by institutions as audiences
flowing from one program to the next. In this study, “flow” refers to sequential behaviors
enacted by groups of people. It is a macro-level phenomenon concerned with how public
attention is directed to media content, often planned by media institutions.
This way of understanding media use has been popular since the dawn of
broadcasting. Linear television delivered a fixed schedule of offerings in a stream that
encouraged audiences to flow from one program to the next. From the earliest days of
TV, programmers and marketers realized that there were stable patterns of audience flow,
called lead-in or inheritance effects, which could help them to build and manage
viewership (Eastman and Ferguson, 2013; Webster, 2006). Empiricists noted these
regularities and posited laws about the “duplication of viewing” between pairs of
programs (Cooper, 1996; Goodhardt et al., 1987). Cultural theorists also recognized that a
“planned flow, is…. the defining characteristic of broadcasting, simultaneously as a
technology and as a cultural form” (Williams, 1975: 86).
Stable patterns of viewing, however, do not mean that audiences are passive dolts.
Inheritance effects were typically the strongest when programs scheduled back to back
were of the same genre (Webster, 2006). Audiences actively expressed their program
preferences, but they did so within the temporal constraints imposed by the fixed
schedules of linear television. By the end of the twentieth century, however, everything
seemed to change (Balnaves et al., 2011). With the advent of increased competition from
cable and satellites, and ultimately the explosion of non-linear media on the internet, the
concept of a planned flow lost much of its appeal.
In 2008, The New York Times Magazine cited industry experts who claimed that a
planned flow was a thing of the past. “Everyone’s composing their own flow. And once
you start becoming the composer of your own flow, you can’t go back” (quoted in
Webster, 2014: 65). The so-called death of flow reflected the familiar belief that online
media empowered users. Media could no longer be pushed at people. Instead, people
pulled what they wanted when they wanted it. As Pariser and Helsper (2011: 67) noted,
“internet enthusiasts were excited about the shift from push to pull for reasons that are
now pretty obvious…pull media put users in control. The problem is that pull is actually
a lot of work.” Indeed, a growing body of evidence suggests that the rumors of flow’s
demise are, in the words of Mark Twain, “greatly exaggerated.” And there is reason to
believe that the internet allows institutions to build and manage audiences by structuring
patterns of flow. Institutions may shape people’s media encounters by “nudging” them in
particular directions, often without their awareness.
Flows in Online Media
Since the internet’s inception, analysts have seen it as “anti-television,” because it
provides distribution and transmission technologies that enable individual freedom of
choice, lateral connections, and user-generated content. These features seemed to break
media monopolies (Sandvig, 2015). Unlike television viewers who were encouraged to
follow a preordained sequence, web users were free to explore an endless “cyberspace.”
Although some critics argued that there was never any “space in cyberspace” (Manovich,
2001: 253), the concept is largely taken for granted in today’s popular imaginations
(Mansell, 2012). The idea of this open, largely unstructured, space has encouraged an
inaccurate image of the sovereign user composing her own journeys through a vast online
landscape.
In contrast, seeing the internet as a temporal medium is productive both for
enabling closer inquiries into actual web usage and for rethinking the much-touted user
agency being enacted in the metaphorical “cyberspace.” Media theorists have noted that,
both phenomenologically speaking and in terms of the underlying technical operations,
user experiences online are temporally rather than spatially organized (Chun, 2006;
McPherson, 2006). McPherson (2006: 243) uses “volitional mobility” to describe the
sensation of web surfing, during which the user “mov[es] from link to link with a certain
illusion of volition.” In this sense, the internet “emulat[es] the televisual event,” and the
web user’s experiential sequences can be understood as flows.
Just as television audiences flow from program to program, internet users visit
websites sequentially. Each day millions of people land on NYTimes.com from
Facebook, and numerous individuals begin with Yahoo.com, their default homepage,
followed by reading pages from Yahoo News. Each such sequence constitutes a flow,
which empirically is the volume of people whose attention moves successively across the
particular web outlets involved. To adapt to the context of web use, we use the term
“attention flow” instead of the “audience flow” that television scholars employ.
Importantly, invoking the concept of the flow to study web use redirects scholarly
attention toward the role of institutional arrangements in shaping the sequencing of
attention (Chun, 2006: 49). For example, remarking on the “increasing popularity of
‘portal’ sites” such as MSN and AOL in the early days of the web, McPherson (2006:
248) cautions against the ascendance of “a Web architecture which works to constrict the
surfer’s movement, effectively detouring users along particular paths or containing them
within particular sites.”
Nudges and Online Choice Architectures
What sets online attention flows apart from their televisual predecessors,
however, is the user’s heightened feeling of choice. It is useful to think of the shaping of
online flows as “nudges”—that is, “liberty-preserving approaches that steer people in
particular directions, but that also allow them to go their own way” (Sunstein, 2014:
358).¹
The literature of behavioral economics has found that “choice architectures”, or the
characteristics of the environment in which options are presented, powerfully influence
people’s decision-making. For example, people tend to go with whatever is set as the
default option. Importantly, nudges are not manipulations imposed on user choice, but
part of the context inevitably present when people choose (Thaler and Sunstein, 2009).
We might then think about “online choice architectures” that maintain users’
sense of freedom while subtly nudging their attention. Within platforms, nudges might
emulate linear media by engineering a site’s “stickiness.” For example, YouTube and
Netflix have auto-play features and produce behaviors much like television inheritance
effects (Sandvig, 2015). But the repertoire of choice architectures online is much larger.
Many nudges span outlets. These include various embeddings of hyperlinks, pathways
generated based on real-time data (e.g., social media customization, content
recommendation, and search results) or on some design decisions (e.g., a gateway to an
online form), software default settings (e.g., default homepages and search engines), and
technical properties largely imperceptible to the users (e.g., low browsing speed). In
theory, a user is being nudged if she behaves differently in one set of choice architectures
versus in another.
In an attempt to bridge traditional media use and effects research and political
economy of the internet, our research program on online audience formation seeks to
identify “institutional nudges,” by which we mean nudges performed by institutions
through well-thought-out, coordinated website designs. These nudges thus manifest
in observable patterns of web use that could be systematically explained at scale by
institutional intentions.
¹ Institutions also rely on digital architectures closed by design (i.e., “walled gardens”) to
control and restrict user flows on the web, but conceptually the notion of nudges is less
relevant in those settings.
Notably, the notion of nudge is more expansive than that of curation, a key phrase
employed in the burgeoning scholarship on content platforms. As Thorson and Wells
(2016) write, to curate is to select, filter, organize, annotate, and frame content, which
becomes increasingly essential for people to “efficiently” cope with the information
abundance of our age. The notion of nudge that we employ here is broader; it is not
limited to content consumption, but extends to other activities people conduct online,
such as banking and checking maps.
Three Approaches to Analyzing Flows
How can we study temporal, unfolding experiences of web users in a way that
reveals the presence of online choice architectures? Broadly speaking, we can identify
three different approaches that researchers have adopted to analyzing flows.
The first focuses on texts, wherein flows are revealed by analyzing the temporal
sequence of content delivered by one curating actor (e.g., a channel, news outlet, or social
media platform). This was the approach taken by Raymond Williams (1975) in his classic
work on television. He demonstrated his method by describing the flow of content on a
channel in evermore granular detail, proceeding from programs and narratives to
sequences of sounds and images. Williams and others using this approach (e.g., Caputi,
1991) infer how these “flow texts” influence users. This is a micro-level form of analysis
often grounded in cultural theory. More recent work theorizing “curated flows” (Thorson
and Wells, 2016) seems in keeping with this approach to flows, while featuring the
influences of social networks and algorithmic renderings on the user’s exposure to flow
texts. All are “user-centric” in that the individual user’s experience of flow is the center
of attention. In a world of anywhere-anytime media, it is increasingly difficult to claim
which flow texts are idiosyncratic or common, and in turn, extrapolate these micro-level
flows to larger social phenomena.
The second approach is concerned with “flow culture” (Flichy, 1980; Bolaño,
2015), and pitched at a higher level of analysis. It focuses on media organizations, which
produce and distribute culture and the markets within which they operate. Analyzing
these institutional arrangements, researchers have demonstrated how the flow of culture
is controlled and made inferences as to its ideological effects (e.g., Kreiss, 2016). Often
grounded in the political economy of media, such work can be thought of as a “media-
centric” approach since the experience of individuals is of less interest than the work of
media organizations and markets. As is the case with user-centric approaches, it is
increasingly difficult for media-centric work to make the leap from institutional
arrangements to a clear picture of how choice architectures affect the distribution of
public attention. Given that platforms now control the flow of audience to and from other
media, a static look at organizational affiliations tends to miss the power dynamics at play
in digital media.
The third approach focuses on the extent to which audiences flow through
sequences of content. This is in keeping with the work of empiricists we noted above who
have documented law-like patterns of audience behavior. Academic work of this sort has
been done in marketing and communications. Typically analyzing passively observed
media use data in order to model macro-level patterns of exposure, it can be thought of as
an “audience-centric” approach to analyzing flows (e.g., Webster and Ksiazek, 2012). It
is sometimes done for applied purposes (e.g., Rust and Eechambadi, 1989), with few
overt theoretical aspirations other than understanding the forces that shape mass behavior.
The strength of applying an audience-centric approach to nonlinear media is that it can
document the actual consumption of online culture. To date, this literature has
concentrated on analyzing snapshots of which outlets share audiences. Fortunately, as we
elucidate next, that limitation is not inherent in audience-centric analysis.
Sequential Attention Flows
Just as studies of linear media identified how audiences flowed from one program
to the next, it is possible to capture the dynamics of attention flows across online
offerings. Understanding how these flows shape public attention can offer new insights
into the origins and likely effects of online media consumption. Consider, for instance, two
partisan outlets, MSNBC and Fox News. Studies show high audience duplication between
their websites (Gentzkow and Shapiro, 2011). Different pathways may lead people to
consume ideologically divergent outlets. Some people may land on the websites via their
friends’ recommendations on social media or top-ranked search engine results, while
others, hoping to “check out the other side,” may reach these outlets by entering their URLs.
Discerning these processes requires a temporal dimension that analysis of audience
duplication lacks (Möller et al., 2019).
Using clickstream data, we propose a “sequential” audience-centric approach to
overcome this methodological limitation. Clickstream data capture user activity as people
move from browsing one webpage to another (Wu and Ackland, 2014). These attention
flows, as we have conceptualized them, are invariably shaped by the online choice architectures
within which they occur, and our research program aims to identify certain forms of
nudges that align with institutional intent. We operationalize the scale of such intents
through shared website ownership and traits of website architectures. In this vein, we
pose the following research questions:
RQ1: What patterns of attention flows emerge from analyzing people’s sequential
browsing activity?
RQ2: In what ways are these patterns of attention flows related to online choice
architectures, especially ones that internet corporations use for audience building?
Method
To address these questions, we obtained passively measured national-level
clickstream traffic data, together with ownership information and “website category” of
the web outlets from comScore. We sampled all websites that had at least 1% (2.6
million) of all web users in the US visiting in October 2015, resulting in a sample of 1761
web outlets. To determine this sample, we first downloaded the entire list of 33,000 web
entities measured by comScore in October 2015. ComScore nests web entities
hierarchically based on ownership. Consider Google for example. ComScore aggregates a
collection of all Google-owned entities as “Google Sites” (level 1), which can be
disaggregated into Google.com and Youtube.com (both at level 2). Google.com can be
further expanded to obtain traffic separately for “Google Search,” “Gmail,” and “Google
News” (level 3). Based on our framework, Google Search represents a distinct online
choice architecture compared to Google News. Thus, to obtain high granularity required
to test our conceptual framework, we disaggregated web entities up to the third level and
selected all level 3 outlets that met our threshold of 1% unique users. We report this
entire sample in a supplemental appendix (Appendix 3).
In addition to website ownership data, we conducted at different stages website-
level observations both to inform our quantitative analysis of web use data and to
interpret its results in light of our conceptual framework. More detail on this qualitative
component is provided below as we delineate our analytical steps.
We extracted the clickstream data for each web outlet. Reported for a specific time
period, these reflect the number of total unique users that landed on an outlet immediately
after visiting other outlets during this period. In clickstream data, if a user switches from
outlet a to b and then from b to c, outlet a is a traffic source for b, and b a traffic source
for c. An aggregation of all these clickstreams in a media ecosystem would enable the
researcher to infer, through network analysis, attention flows at scale. Since our sample
was both large and expansive, we had the complete list of the sources of incoming traffic
for most websites. Utilizing these clickstreams from all websites, we created a matrix of
websites where each cell indicates the user volume flowing from one site to another. In
effect, unlike undirected audience duplication networks (Webster and Ksiazek, 2012),
this can be analyzed as a directed network of websites, with edges being user volumes
flowing from one website to another.
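
As an illustrative sketch only, the construction of such a directed network can be mimicked on toy data. The per-user visit sequences below are hypothetical (the comScore data described above are already aggregated by outlet), and the function name `build_flow_network` is our own label, not part of any vendor tool. The sketch counts, for each ordered pair of outlets, how many unique users moved directly from the first to the second.

```python
from collections import defaultdict


def build_flow_network(sessions):
    """Return {(source, target): number of unique users who moved source -> target}."""
    users_per_edge = defaultdict(set)
    for user_id, visits in sessions.items():
        # Pair each visit with the visit that immediately follows it.
        for source, target in zip(visits, visits[1:]):
            if source != target:  # skip consecutive page views within one outlet
                users_per_edge[(source, target)].add(user_id)
    return {edge: len(users) for edge, users in users_per_edge.items()}


# Hypothetical toy panel: user id -> ordered list of outlets visited.
toy_sessions = {
    "u1": ["a", "b", "c"],
    "u2": ["a", "b"],
    "u3": ["b", "c", "b", "c"],
}
flows = build_flow_network(toy_sessions)
```

Here outlet a is a traffic source for b (two users) and b is a traffic source for c (two users). A user who repeats the same transition, as u3 does, is counted once, which matches the unique-user framing above.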
For this network, we first estimated each node’s weighted out-degrees and in-
degrees. The weighted out-degree in a clickstream network represents the total volume of
user traffic that flows out of this site to all other sites in the network. The network was
highly centralized, with highly skewed distributions of both in-degrees and out-degrees,
the latter being especially concentrated (Gini coefficient = 0.64). Thus, a handful of
outlets with exceedingly large out-degree scores serve as the sources of traffic flows to
large portions of the web (see Appendix 3). The graph therefore has an inherent
core-periphery structure, obviating the need to filter insignificant ties. The high
out-degree outlets are the most popular email providers, portals, search engines, and
social networks. Based on their
large out-degree scores, we infer that user visits to most other websites originate from
these sites.
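
To make these measures concrete, the following minimal sketch computes weighted in-degrees and out-degrees from an edge dictionary and a Gini coefficient over the resulting scores. The edge volumes and site names are invented for illustration, and the rank-weighted Gini formula is one standard formulation; the text above does not specify which estimator was used.

```python
from collections import defaultdict


def weighted_degrees(flows):
    """Sum outgoing and incoming user volume for every node in the network."""
    out_deg, in_deg = defaultdict(float), defaultdict(float)
    for (source, target), volume in flows.items():
        out_deg[source] += volume
        in_deg[target] += volume
        # Touch both dictionaries so every node appears in each, even with zero volume.
        out_deg[target] += 0.0
        in_deg[source] += 0.0
    return dict(out_deg), dict(in_deg)


def gini(values):
    """Gini coefficient: 0 for perfectly equal values, (n - 1) / n for total concentration."""
    xs = sorted(values)
    n, total = len(xs), sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # G = 2 * sum(rank_i * x_i) / (n * sum(x)) - (n + 1) / n, ranks ascending from 1.
    ranked = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * ranked / (n * total) - (n + 1) / n


# Hypothetical toy network: one hub sends most of the traffic.
toy_flows = {("hub", "x"): 8, ("hub", "y"): 6, ("x", "hub"): 2, ("y", "x"): 1}
out_deg, in_deg = weighted_degrees(toy_flows)
out_degree_gini = gini(out_deg.values())
```

A higher value of `out_degree_gini` means that outgoing traffic is concentrated in fewer sites, which is the sense in which the out-degree distribution is described as especially concentrated.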
Results
Patterns of Attention Flows: Clickstream Constellations
To address RQ1, we implemented a cluster analysis using a modularity-based
method for detecting communities with dense interconnections (Blondel et al., 2008). In
this context, such a community/cluster represents a socially shared browsing sequence,
with several website pairs accessed in succession by users. We refer to these clusters as
“constellations.” Other than being mathematically acceptable, a cluster solution coheres
with our theoretical expectations if, for most constellations, the constituent sites either
exhibit some content similarity (a proxy for user preference) or point to some mechanism
explained by online choice architectures. The solution we arrived at has 14 clusters, with
a modularity score of 0.25. Since the modularity of a randomly generated network is zero,
this score indicates a good fit. Figure 1 shows the 14 constellations of our solution. For
further analysis, we left out the three smallest clusters since they had a small number of
highly specialized websites.
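
For readers unfamiliar with modularity, the sketch below scores a given partition of a weighted directed network. It does not perform the community search itself, which the cited method handles through iterative optimization. Using the directed variant of modularity is an assumption on our part, as are the toy edge weights and community labels.

```python
from collections import defaultdict


def directed_modularity(flows, community):
    """Modularity Q of a partition for a weighted directed network.

    Q = (1/m) * sum_ij [A_ij - k_out_i * k_in_j / m] * delta(c_i, c_j)
    """
    m = sum(flows.values())
    out_deg, in_deg = defaultdict(float), defaultdict(float)
    for (source, target), weight in flows.items():
        out_deg[source] += weight
        in_deg[target] += weight
    # Observed flow volume that stays inside a community.
    within = sum(w for (s, t), w in flows.items() if community[s] == community[t])
    # Expected within-community volume if flows were rewired at random
    # while preserving each node's in-degree and out-degree.
    out_by_comm, in_by_comm = defaultdict(float), defaultdict(float)
    for node, label in community.items():
        out_by_comm[label] += out_deg[node]
        in_by_comm[label] += in_deg[node]
    expected = sum(out_by_comm[c] * in_by_comm[c] for c in out_by_comm) / m
    return (within - expected) / m


# Hypothetical toy network: two tight pairs joined by one weak tie.
toy_flows = {("a", "b"): 5, ("b", "a"): 5, ("c", "d"): 5, ("d", "c"): 5, ("b", "c"): 1}
two_groups = {"a": 0, "b": 0, "c": 1, "d": 1}
one_group = {"a": 0, "b": 0, "c": 0, "d": 0}
q_two = directed_modularity(toy_flows, two_groups)
q_one = directed_modularity(toy_flows, one_group)
```

Placing every site in a single community yields a score of zero, whereas the two-group partition of the toy network scores well above zero because most volume stays within each pair.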
For each constellation/cluster, based on the weighted-outdegree scores of all its
constituent websites, we calculated a Gini coefficient (see Table 1). This indicates the
extent to which a browsing sequence is “anchored” by a handful of sites that serve as
beginning and returning points, preceding visits to most others. Given usual browsing
patterns, people are more likely to move from anchor sites to the subsequent site through
built-in pathways such as hyperlinks than by entering the URLs in their browser. If a small
number of sites anchor the browsing sequence, the Gini coefficient tends to be high. The
power of anchors is also reflected in the proportion of their cumulative outdegrees in the
constellation’s total outdegrees (Table 1).
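
The anchor share can be illustrated with a short sketch. The number of sites treated as anchors (`top_k`) is a hypothetical parameter, since the text does not fix a cutoff, and the site names and out-degree values are invented.

```python
from collections import defaultdict


def anchor_share(out_degree, constellation, top_k=1):
    """Share of each constellation's total out-degree held by its top_k sites."""
    by_group = defaultdict(list)
    for site, degree in out_degree.items():
        by_group[constellation[site]].append(degree)
    shares = {}
    for group, degrees in by_group.items():
        total = sum(degrees)
        top = sum(sorted(degrees, reverse=True)[:top_k])
        shares[group] = top / total if total else 0.0
    return shares


# Hypothetical toy data: one strongly anchored group and one evenly split group.
toy_out_degree = {"search": 90, "shop1": 6, "shop2": 4, "blogA": 10, "blogB": 10}
toy_constellation = {"search": "g", "shop1": "g", "shop2": "g", "blogA": "b", "blogB": "b"}
shares = anchor_share(toy_out_degree, toy_constellation, top_k=1)
```

In this toy case the single top site accounts for nine tenths of the outgoing traffic in the first group but only half in the second, so only the first would be described as strongly anchored.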
Next, to qualitatively examine each constellation’s composition, we first ranked its
websites by weighted outdegrees. Then beginning with the anchor(s), we noted functions
and brands of all the outlets that account for 80% of the constellations’ total outdegrees.
If a constellation had fewer than 20 outlets meeting this criterion, we went further down the
list to examine at least 20 outlets per constellation. In doing so we examined 328 outlets.
We also investigated the existence of historical partnerships between anchors that the
ownership data do not directly provide. For each constellation, we combine quantitative
results with qualitative examination to discern its likely mechanism.
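
The selection rule described above (rank by weighted outdegree, take outlets until 80% of the total is covered, and examine at least 20) can be sketched as follows. The function name and toy values are hypothetical, and the tie-breaking order among equal outdegrees is an assumption.

```python
def outlets_to_examine(out_degree, coverage=0.80, minimum=20):
    """Top-ranked outlets covering `coverage` of total out-degree, but at least `minimum`."""
    ranked = sorted(out_degree, key=out_degree.get, reverse=True)
    total = sum(out_degree.values())
    selected, running = [], 0.0
    for site in ranked:
        # Stop only once both the coverage target and the minimum count are met.
        if running >= coverage * total and len(selected) >= minimum:
            break
        selected.append(site)
        running += out_degree[site]
    return selected


# Hypothetical toy constellation; the minimum is lowered so the rule is visible.
toy_out_degree = {"a": 70, "b": 15, "c": 10, "d": 5}
covering = outlets_to_examine(toy_out_degree, coverage=0.80, minimum=2)
padded = outlets_to_examine(toy_out_degree, coverage=0.80, minimum=3)
```

With a minimum of two, the first two outlets already cover 85% of the total and the selection stops there. Raising the minimum to three adds the next-ranked outlet even though the coverage target has been met.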
[Figure 1 and Table 1 about here]
Google Complex. This cluster has the largest number of sites. It mainly contains
utilitarian websites such as those of retailers (including eBay and Amazon) along with
websites of service providers in domains such as finance, shopping, telecom, and travel, as
well as government. What these sites have in common is that users are unlikely to
encounter them as part of random browsing, as they visit them to accomplish a specific
purpose. With the highest outdegree, Google Search anchors most of these visits. Other
prominent anchors in this cluster include YouTube and Gmail; their high out-degrees
suggest that users return to them intermittently during typical browsing sessions.
Social Network Complex. This cluster has major online social networks such as
Facebook, Twitter, and LinkedIn. Alongside them are a large number of sites providing
content on current affairs, politics, entertainment, and specialized topics such as sport.
These sites are run by legacy media organizations (e.g., Washington Post) and digital
companies alike (e.g., Buzzfeed). “Socially-driven” outlets such as GoFundMe, Spotify,
Fitbit, and Legacy.com constitute the third type of sites in this cluster. Finally, sites of
various banks and mobile/ISPs are also present here, which may be because when people
go to these long-term services, they need not resort to search engines. Instead, they access
them directly by entering the URL or clicking bookmarks, most likely during their social
media time.
Yahoo Homepages. A number of Yahoo’s flagship online properties (e.g., Yahoo
Homepage, Yahoo News, Yahoo Sports, and Yahoo Mail) anchor this somewhat smaller
constellation. Also present are many outlets specialized in business news (Bloomberg and
Business Insider), political news commentaries (e.g., Atlantic, Daily Beast, TheHill,
Politico, Slate, Vox.com), and general “soft” news (e.g., Huffpost and USAToday).
Classmates.com, which helps search for past high school friends, and online services for
investments and mortgages are other notable inclusions in this constellation.
Yahoo Search. Many utilitarian websites that originated in the 1990s and serve a
specific purpose, such as Ask.com, eHow, MapQuest, and WebMD, constitute this cluster
along with the anchor, Yahoo Search. This cluster probably captures the browsing
routines of a segment of older web users who continue to rely on Yahoo Search.
Interestingly, also included here are multiple smaller search engines, which, when
examined, all turned out to be powered by Yahoo Search but with different frontends.
Porn Constellation. Adult sites primarily constitute this cluster. Among them
Pornhub.com has the highest out-degree and thus serves as a gateway to other adult
websites. Pornhub’s homepage links to several of these (e.g., Youporn and Xtube), which
are part of the “Pornhub Network.” Several other adult sites that are part of this cluster
appear to utilize adult advertising networks to feature their content on the network of
Pornhub sites.
Bing / Microsoft. With the highest Gini, this constellation consists of several
content portals and web services aligned with or owned by Microsoft, including
AccuWeather, Office, and Windows. Microsoft’s search engine Bing is the anchor.
Relatedly, AccuWeather is Bing’s default weather widget.
User Data Solicitors. The websites clustering here attempt to extract personal data
from users for commercial purposes. Through adverts or pop-ups on other websites, these
sites lure users to a variety of online questionnaires with the promise of product discounts
and other rewards. Bundled together are also websites run by market research companies,
who are in charge of survey design and data analytics of solicited responses.
Job Search. Indeed.com, Oracle, and other popular portals for job applications and
human resource management constitute this constellation, along with websites of
companies that manage databases for job portals.
AOL Homepages. Anchored by “AOL Homepages,” this constellation contains a
few AOL websites (e.g., AOL Email) along with other sites that are part of the AOL
brand, including USMagazine, EverydayHealth, and ZergNet.
Travel. This cluster has airline and hotel websites, along with third-party sites
specialized in booking travel packages, flights, and hotels. Anchors include
TripAdvisor, Expedia, and Priceline.
Citibank-Retailers. K-Mart, Macy’s, Old Navy, Sears and other retailers constitute
this cluster along with Citibank, which manages the credit card programs for all these stores.
Attention Flows, Audience Building, and Online Choice Architectures
Shared Ownerships by Constellation. To address RQ2, we first associated shared
ownership of web outlets with their shared constellation membership and found a clear
pattern. Of the 1761 web outlets, over a third (653, 37%) were controlled by multi-site
owners. For example, Google.com and Youtube.com shared an owner (“Google Sites”),
whereas Hilton.com did not share ownership with any other sites. Notably, 54% of the
5468 website-pairs that shared owners also belonged to the same cluster; by comparison
only 20% of the 3 million other pairs (that did not share owners) clustered together. Table
1 further shows that for the majority of the clusters, most web outlets owned by a
company clustered together. Importantly, the anchoring websites that we identified earlier
tend to be the flagship sites of major internet companies, and the corresponding clusters
often involved other websites owned by the same companies. As discussed above,
anchors in high-Gini constellations are more influential in shaping the browsing
sequences within each constellation.
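The ownership comparison reported above (the share of same-owner pairs that fall in the same cluster versus the share of all other pairs) can be sketched as follows. This is an illustration under stated assumptions: the site names, owners, and constellation labels are hypothetical, and the function name is ours rather than anything taken from the study.

```python
from itertools import combinations

# Hypothetical site records: site -> (owner, constellation).
# An owner of None marks a site with no sibling properties.
sites = {
    "a.example": ("OwnerX", 1),
    "b.example": ("OwnerX", 1),
    "c.example": ("OwnerX", 2),
    "d.example": ("OwnerY", 2),
    "e.example": ("OwnerY", 2),
    "f.example": (None, 1),
}

def co_clustering_rates(records):
    """Share of same-owner pairs, and of all other pairs, that share a constellation."""
    counts = {True: [0, 0], False: [0, 0]}  # same_owner -> [co-clustered, total]
    for s, t in combinations(sorted(records), 2):
        owner_s, cluster_s = records[s]
        owner_t, cluster_t = records[t]
        same_owner = owner_s is not None and owner_s == owner_t
        counts[same_owner][1] += 1
        counts[same_owner][0] += cluster_s == cluster_t
    return {k: hit / total for k, (hit, total) in counts.items() if total}

rates = co_clustering_rates(sites)
print(f"same owner: {rates[True]:.2f}, other pairs: {rates[False]:.2f}")
```

The sketch counts unordered pairs. Counting ordered pairs instead would double both the numerator and the denominator of each share and leave the proportions unchanged.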
Our qualitative investigation also revealed the impact of partnerships on online
audience building. The presence of AOL-owned sites in the two Yahoo constellations, in
particular, can be explained by the two companies’ prevailing content partnership when
our data were collected. Using audience duplication data, we conducted a separate
analysis (detailed in Appendix 1) on how different browsing sequences were enacted by
the same or different audiences. The results confirmed our observation.
Website Types by Constellation. By examining the types of websites that constitute
each clickstream constellation, we can shed further light on how various choice
architectures are deployed to nudge people online. To infer these forms of institutional
nudging, we combined comScore’s “website categories” into the following five types:
search, social media, media content, software/hardware, and e-commerce/service (See
Appendix 2 for details on this recoding). As Figure 2 shows, these website types were
significantly associated with constellation membership (χ² = 444.11, df = 52, p < .001).
We treat these data as website-level proxies for online choice architectures. A more
definitive reckoning of the uses and effects of nudging architectures would require
measurement at the level of individual webpages (e.g., Möller et al., 2019) or even code.
[Figure 2 about Here]
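For readers who wish to see how a test of this kind is computed, the following is a minimal sketch of a Pearson chi-squared test of independence on a constellation-by-type count table. The toy counts are invented, and the function is an illustration rather than the procedure used to produce the reported statistic. It does confirm that a table of 14 constellations by 5 website types carries (14 - 1) × (5 - 1) = 52 degrees of freedom, matching the reported df.

```python
def chi_squared(table):
    """Pearson chi-squared statistic and degrees of freedom for an r x c count table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(col_totals) - 1)
    return statistic, dof

# Toy 3 x 2 table (constellations x website types); counts are invented.
toy = [[30, 10], [10, 30], [20, 20]]
stat, dof = chi_squared(toy)
print(round(stat, 2), dof)

# Degrees of freedom for a 14 x 5 table, the shape implied by the text.
_, dof_full = chi_squared([[1] * 5 for _ in range(14)])
print(dof_full)
```

Obtaining a p-value additionally requires the chi-squared distribution function, which a statistics library would supply; it is omitted here to keep the sketch dependency-free.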
Nudging Attention Flows on the Web
In this section, we organize the mechanisms that nudge online attention flows
across websites into four distinct forms, each associated with particular website
architectures. We arrived at these associations based on the preceding quantitative and
qualitative investigations of constellation distribution and anchors, as well as their
relations to shared ownership and website types. In Table 2, we arrange these forms of
nudging by their visibility to the user, an arrangement inspired by critical algorithm
studies, which argue that the power of digital architectures derives largely from the lack
of user awareness. While this typology of institutional nudges emerges from our empirical study
(see Table 2 for their associated website types and the clickstream constellations whose
formations appear to be driven by the corresponding nudges), we present this typology
also as a heuristic provocation for future empirical research, which we elaborate on in the
conclusion.
[Table 2 about here]
The first form of nudge occurs through visible features on the web page. This
includes ranked recommendations on search engines and content curation on social
media. As Figure 2 indicates, different kinds of websites populate the constellations
anchored by social media and those anchored by search engines. The latter are largely
“e-commerce/service” websites for everyday services including banking, health
consultations, maps and retail. In contrast, “media content” websites, where
information is an end in itself, providing news, commentaries, and entertainment content,
constitute the former. In other words, search anchors users towards utilitarian, functional
goods and social media towards symbolic, experiential goods. The utilitarian goods serve
the needs of people fulfilling their mundane tasks for which social sharing is likely
incidental. The symbolic goods seem more amenable to social sharing. Based on our
results, it is reasonable to infer that social networks, including Facebook, tend to
precede people’s visits to news and information sites.
Although web users may not comprehend the algorithmic processes involved,
they recognize that search engines and social media direct them to other parts of the web.
In comparison, a somewhat less visible form of nudge occurs through “hypertexts”
which, integrated within the page content, tend to link to websites with shared economic
interests. The high volumes of attention flowing through “media content” websites suggest
that this arrangement is used for audience building. For example, nudging through
hypertexts may explain why the Yahoo Homepages Constellation contains content offerings
such as Yahoo News and Yahoo Finance. These less visible nudges also direct attention to
unaffiliated websites with underlying capital connections to the brand, such as ATT.net
(Yahoo provides a customized portal homepage to AT&T’s internet subscribers).
Another example is the “all-media content” Porn Constellation, where a closed maze
formed by MindGeek-owned niche adult sites appears to confine users’ sequential flows.
Our analysis suggests that nudging through hypertexts operates no less powerfully
than platform curation. For example, we found Yahoo to remain enormously central,
despite the fact that Silicon Valley has predicted its demise for over a decade. The online
choice architectures which shepherd Yahoo users into an ecosystem of Yahoo and partner
properties play a part in retaining its relevance.²
A step further along the visibility spectrum, nudges occur through browser
software configurations on the user end, such as default toolbars and homepages. For
example, many “anchoring” portals serve as default homepages on certain browsers.
Furthermore, the fact that major search engines anchor distinct constellations may be due
to bundled browser installations. Microsoft Windows browsers (e.g., Internet Explorer and
Edge, which default to MSN as a homepage) have Microsoft Bing as the default engine for
their search bar; Safari, the default browser for Apple Macs with “livepages.apple.com”
as the default homepage, uses Google Search as the default. These ownership and
partnership connections are evident in our findings about traffic patterns. In short,
corporate software arrangements, also aimed at growing user bases, nudge people to
incorporate some of the most popular websites (oftentimes the anchoring sites for the
browsing sequences we observed) into their web use routines.
² This might ring a bell with readers familiar with hyperlink analysis. Although it is
tempting to think that hyperlinks between sites could explain most of these observed
attention flows, recent studies have found hyperlink analysis largely uncorrelated with
user behavior modelled as clickstreams (Wu & Ackland, 2014). Consistent with
existing findings, we do not believe hyperlinks between these sites would explain much
variance in attention flows. Furthermore, as we alluded to in an earlier section, choice
architectures on the contemporary web are far more complicated, elusive, and relatively
invisible compared to the relatively static hyperlinks typically captured by web scrapers.
This conception of online choice architectures, therefore, should not be reduced to (static)
connections between websites.
Many users may lack the initiative or technical know-how to confront and alter
default software configurations and, in turn, the nudges they carry. A more opaque form
of nudge, however, takes place at the “back-end”, which usually reflects partnerships
between institutions. For example, the Citibank-Retailers constellation formed because all
the retailer sites use Citibank as their payment gateway. Likewise, the Job Search
Constellation knit together because various job portals manage their data through common vendors
such as Salesforce or Oracle.
In summary, while the use of a search engine requires some premeditation and
intentionality, other anchoring sites, ranging from social media and web portals to
contracted technical platforms on the back end, seem to direct traffic in ways that require
little or no conscious planning or choosing on the part of the user. The corresponding
online choice architectures are increasingly difficult for people to discern.
Institutions routinely manage online flows with integrative hypertexts, browser
configurations, and back-end bundling, all of which are essentially invisible.
Conclusion
There is a growing consensus that cyberspace—a universe of digital resources that
empower people to achieve their own ends—is more complicated than visionaries
imagined. Not unlike the linear media of radio and television, the new purveyors of
online media strive to manage people’s time and attention to suit themselves. To
investigate what those new patterns of public attention look like, we used the concept of
flow. Specifically, we investigated how the structural features of the internet have
affected the flow of public attention across and among websites. In doing so, we
reclaimed an analytical framework that conceives of media use as a temporal flow and
developed new ways of analyzing clickstream data. As a result, we discovered that
managing the flow of online audiences goes beyond overt recommendations and website
curation to include less visible mechanisms that nudge people. We found these relatively
invisible forms of nudges to serve underlying institutional interests in audience building.
Specifically, our study informs research on media effects, the nature of audience
fragmentation, and the political economy of communication. We consider each of these
in turn.
To begin, conceptualizing people’s online media use in this way can enrich our
understanding of media exposure and its consequences. As we have noted, the empirical
research of audience behavior, grounded in selective exposure, motivated reasoning, and
uses & gratifications, has historically presumed an easy congruence of individual
attitudes and message characteristics. Following recent efforts at conceptualization that
incorporate institutional, technological, and social factors (Thorson and Wells, 2016), our
empirical research program on attention flows provides analytical perspectives on
environmental factors that shape what people see or hear. Supplementing existing work
that combines individual behavioral logs with self-reported characteristics (e.g., Möller
et al., 2019), we call for more empirical research on structural features beyond individual
awareness and articulation. If people are subtly nudged in directions not dictated by their
predispositions, over time it might be possible to cultivate attitudes and behaviors that
would not otherwise exist. Similar complications arise in consideration of most media
effects research.
Understanding how attention flows across outlets can also shed light on the
formative processes underlying patterns of audience fragmentation (Peters and Schrøder,
2018), especially as they pertain to the debate on audience polarization. Drawing into
question the extent of partisan polarization, recent studies of audience duplication
between outlets have shown a good deal of cross-cutting exposure (Webster & Ksiazek,
2012; Taneja, Wu & Edgerly, 2018; Gentzkow & Shapiro, 2011). Yet these barely
scratch the surface of online attention flows and cannot explain how the same people
might, for example, come across ideologically opposite websites such as Fox News and
MSNBC.
Analysis of clickstream data as directed networks can identify what online choice
architectures may have led the same user to such cross-cutting exposure. For instance, if
these users were largely sent by search engines, it would suggest their conscious choice to
check out “both sides.” Yet as our findings show, social media play a bigger role than
search engines in directing people to media content. It is thus possible that social media
have some power to promote “mainstreaming” rather than creating “echo chambers.” In
this vein, empirical analyses about pathways to content exposure might inform research
on confirmation bias. Also, based on what we find, cross-cutting exposure could be
channeled by nudges among economically entangled websites, which motivates the need
to bring commercial factors into theorizing partisan audience fragmentation. Finally, we
would like to emphasize the potential of examining attention flows between news
websites and sites embodying choice architectures of various mechanisms. This would
reveal formative processes behind fragmentation that remain hidden in studies focusing
on audiences of news sites alone.
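The kind of pathway analysis proposed above can be sketched as follows: for users who reached both of two ideologically opposed outlets, tally the type of site visited immediately beforehand. Everything in the sketch is an assumption made for illustration, including the session format, the site names, and the type labels (which borrow the website types used earlier in the paper).

```python
from collections import Counter

# Hypothetical clickstream: per user, an ordered list of (site, site_type) visits.
sessions = {
    "u1": [("social.example", "social media"), ("news-left.example", "media content"),
           ("social.example", "social media"), ("news-right.example", "media content")],
    "u2": [("search.example", "search"), ("news-left.example", "media content"),
           ("search.example", "search"), ("news-right.example", "media content")],
    "u3": [("social.example", "social media"), ("news-left.example", "media content")],
}
targets = {"news-left.example", "news-right.example"}

def referrer_types(visits_by_user, target_sites):
    """Count the type of site visited just before each target, for users who reached all targets."""
    tally = Counter()
    for visits in visits_by_user.values():
        if not target_sites <= {site for site, _ in visits}:
            continue  # this user did not reach every target site
        for (_, prev_type), (site, _) in zip(visits, visits[1:]):
            if site in target_sites:
                tally[prev_type] += 1
    return tally

print(referrer_types(sessions, targets))
```

A tally dominated by search would be consistent with deliberate checking of "both sides," whereas a tally dominated by social media would point toward incidental, curated exposure.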
Furthermore, our work raises fundamental questions about the scope and nature of
power relations on the internet. Analyzing flows enables a fuller investigation of the role
of institutional arrangements in shaping the online attention economy, thereby addressing
the gap in the political economy literature of how audiences form in relation to the
affordances of distribution. This is no small matter. Hindman (2018: 166-7) argues that
individual choices online are potentially influenced by “[h]undreds of features on a site,”
and that “tiny effects multiply with every user visit,” compounding exponentially to
produce the macro-level structure of the web. Our results provide systematic evidence
that online choice architectures do, in fact, nudge users. It seems plausible that these “tiny
effects” will accumulate over time forming larger structures that channel flows of public
attention. Subtle as nudges are, however, users will probably not notice. By calling
attention to the economic resources and arrangements entailed in the construction and
maintenance of these architectures, our study highlights the advantageous position large
corporations have in competing in the online marketplace.
A research program on attention flows provides political economists, as well as
those doing platform studies, with new tools to investigate their claims. For example,
analysts argue that a handful of platforms have gained disproportionate power through
calibrating interfaces for nested-applications and forming strategic partnerships for
locking-in users (e.g., van Dijck et al., 2018). They further argue that these
mega-platforms are taking on “infrastructural properties”; that is, they function as shared,
indispensable services like public utilities (e.g., Nieborg and Helmond, 2019). The focus
on the “infrastructurization” of platforms does stimulate invaluable policy and ethical
discussions concerning tech companies and their products (Plantin et al., 2018). However,
while examining the business and technical dimensions of the platform illustrates its
potential mechanisms for user growth and governance, these methods are invariably
platform-centric and unable to address actual user engagement with and beyond the
platform (for a review on the platform-centric episteme, see Wu and Taneja, 2020).
In contrast, investigating sequential user behavior across the web re-centers the
analysis and may demonstrate more forcefully a platform’s “infrastructural
characteristics” and its position in the broader media ecosystem. For instance, our study
shows that the power of singular sites such as Facebook to “platformize the web” (e.g.,
Helmond 2015) lacks support from the macro patterns of online attention flows. Instead,
the political economy of the web in action is better illustrated through a complex cascade
of pervasively distributed choice architectures that latently nudge attention across the
web. Along the same line, our study showcases how data-extraction technologies (Turow
and Couldry, 2018) operate in tandem with structures of digital distribution joined
through ownership and partnerships. This perspective complements the existing
scholarship that documents how these technologies are employed by mega-platforms to
harvest data.
As a first step in this line of inquiry, our empirical study has limitations that future
work may overcome. To begin with, as we discussed, our data are at the website level.
One could gain greater resolution on the nudging mechanisms by analyzing more
granular, webpage-level traffic data and digital architectural traits. Our dataset also does
not isolate the analytics trackers that commercial actors employ to direct and monitor
online user behavior. Incorporating this would significantly advance inquiries into the
extractive nature of online architectures. Likewise, to provide more comprehensive
empirical analysis of the political economy of online audience building, future research
may incorporate rigorous data on partnership, in addition to ownership, as a covariate.
Finally, our study does not account for the mobile environment, toward which more
platform studies now gravitate. A similar approach, we suggest, can be adapted to
examine the sequential usage of mobile apps in relation to institutional factors.
Digital media are colonizing every part of modern life. Although invaluable, they
are not benign. The institutions that control them are serving their own agendas. Barring a
massive regulatory intervention, the online choice architectures that shape how we use
these resources will evolve and operate largely out of sight. These mechanisms subtly
direct the flow of public attention to information, entertainment, and commerce. They are
invisible arbiters in the marketplace of ideas and beyond. And as such, they can mitigate
or exacerbate a variety of media effects, some of which go to the very heart of
participatory democracies. Academics are uniquely positioned to reveal these structures
and how they intervene in social life, but our theories and methods must better reflect our
new realities. Only then can we fully realize the benefits of digital media.
References
Balnaves M, O'Regan T and Goldsmith B (2011) Rating the Audience: The Business of
Media. New York: Bloomsbury.
Benkler Y (2006) The wealth of networks. New Haven: Yale University Press.
Blondel VD, Guillaume JL, Lambiotte R and Lefebvre E (2008) Fast unfolding of
communities in large networks. Journal of Statistical Mechanics (10): P10008.
Bolaño C (2015) The Culture Industry, Information and Capitalism. London: Palgrave
Macmillan.
Chadwick A (2013) The hybrid media system. Oxford: Oxford University Press.
Chun WHK (2006) Control and freedom. Cambridge: MIT Press.
Cooper R (1996) The status and future of audience duplication research: An assessment
of ratings‐based theories of audience behavior. Journal of Broadcasting &
Electronic Media 40: 96–111.
Caputi J (1991) Charting the Flow: The Construction of Meaning through Juxtaposition
in Media Texts. Journal of Communication Inquiry 15(2): 32–47.
Downing JDH (2011) Media Ownership, Concentration, and Control: The Evolution of
Debate. In: Wasko J, Murdock G, Sousa H (eds) The Handbook of Political
Economy of Communications. Oxford: Wiley-Blackwell, 140–168.
Eastman ST and Ferguson DA (2013) Media programming: Strategies and practices.
Boston: Wadsworth.
Flichy P (1980) Les industries de l’imaginaire. Grenoble: Presses Universitaires de
Grenoble.
Gentzkow M and Shapiro JM (2011) Ideological segregation online and offline. The
Quarterly Journal of Economics 126(4): 1799-1839.
Goodhardt GJ, Ehrenberg ASC and Collins MA (1987) The television audience: patterns
of viewing: an update. Aldershot: Gower.
Hindman M (2018) The Internet Trap. Princeton: Princeton University Press.
Jamieson KH and Cappella JN (2008) Echo Chamber. Oxford University Press.
Kreiss D (2016) Seizing the Moment: The Presidential Campaigns’ Use of Twitter during
the 2012 Electoral Cycle. New Media & Society 18(8): 1473–90.
McPherson T (2006) Reload: Liveness, mobility, and the web. In: Keenan T and Chun
WHK (eds) New Media, Old Media. New York: Routledge, 199–208.
Mansell R (2012) Imagining the Internet. Oxford: Oxford University Press.
Möller J, van de Velde RN, Merten L and Puschmann C (2019) Explaining Online News
Engagement Based on Browsing Behavior: Creatures of Habit? Social Science
Computer Review. OnlineFirst. DOI: 10.1177/0894439319828012.
Murdock G (2018) Media Materialities: For A Moral Economy of Machines. Journal of
Communication 68(2): 359–368.
Napoli PM (2014) Automated media: An institutional theory perspective on algorithmic
media production and consumption. Communication Theory 24(3): 340-360.
Nieborg DB and Helmond A (2019) The political economy of Facebook’s
platformization in the mobile ecosystem: Facebook Messenger as a platform
instance. Media Culture & Society 41(2): 196–218.
O'Keefe DJ (2016) Persuasion. London: Sage.
Owen BM and Wildman SS (1992) Video Economics. Cambridge: Harvard University
Press.
Pariser E and Helsper DE (2011) The Filter Bubble. New York: Penguin.
Peters C and Schrøder KC (2018) Beyond the Here and Now of News Audiences: A
Process-Based Framework for Investigating News Repertoires. Journal of
Communication 68(6): 1079–1103.
Peterson RA (1992) Understanding audience segmentation: From elite and popular to
omnivore and univore. Poetics 21(4): 243–258.
Plantin JC, Lagoze C, Edwards PN and Sandvig C (2018) Infrastructure studies meet
platform studies in the age of Google and Facebook. New Media & Society 20(1):
293–310.
Rust RT and Eechambadi NV (1989) Scheduling Network Television Programs: A
Heuristic Audience Flow Approach to Maximizing Audience Share. Journal of
Advertising 18(2): 11–18.
Sandvig C (2015) The Internet as the Anti-Television: Distribution Infrastructure as
Culture and Power. In: Parks L, Starosielski N (eds) Signal traffic. Chicago:
University of Illinois Press, 225–245.
Stroud NJ (2010) Polarization and partisan selective exposure. Journal of Communication
60(3): 556-576.
Sunstein CR (2014) Nudging: A Very Short Guide. Journal of Consumer Policy 37: 583–
588.
Taneja H, Wu AX and Edgerly S (2018) Rethinking the generational gap in online news
use: An infrastructural perspective. New Media & Society 20(5): 1792–1812.
Thaler RH and Sunstein CR (2009) Nudge. New York: Penguin.
Thorson K and Wells C (2016) Curated flows: A framework for mapping media exposure
in the digital age. Communication Theory 26(3): 309–328.
Turow J and Couldry N (2018) Media as Data Extraction: Towards a New Map of a
Transformed Communications Field. The Journal of Communication 68(2): 415–
423.
van Dijck J, Poell T and de Waal M (2018) The Platform Society. New York: Oxford
University Press.
Webster JG (2006) Audience Flow Past and Present: Television Inheritance Effects
Reconsidered. Journal of Broadcasting & Electronic Media 50(2): 323–337.
Webster JG (2014) The marketplace of attention. Cambridge: MIT Press.
Webster JG and Ksiazek TB (2012) The dynamics of audience fragmentation: Public
attention in an age of digital media. Journal of Communication 62(1): 39–56.
Williams R (1975) Television. New York: Schocken.
Wu AX and Taneja H (2020) Platform enclosure of human behavior and its
measurement: Using behavioral trace data against platform episteme. New Media
& Society. OnlineFirst.
Wu L and Ackland R (2014) How web 1.0 fails: the mismatch between hyperlinks and
clickstreams. Social Network Analysis and Mining 4(1): 1-7.
Zillmann D (2000) Mood Management in the Context of Selective Exposure Theory.
Annals of the International Communication Association 23(1): 103–123.
Tables and Figures
Figure 1. 14 Clickstream Constellations
Note. The “size” of each constellation is the number of its constituent web outlets.
Figure 2. Distribution of Online Choice Architectures in Each Constellation
Note. Each vertical bar is a constellation; the horizontal axis represents the number of
constituent websites. The parentheses in the series legends indicate the total number of
websites of the said type.