Characterizing the Demographics Behind the #BlackLivesMatter Movement

Article · December 2015
Source: arXiv
Abstract
The debates on minority issues are often dominated by or held among the concerned minority: gender equality debates have often failed to engage men, while those about race fail to effectively engage the dominant group. To test this observation, we study the #BlackLivesMatter movement and hashtag on Twitter, which emerged and gained traction after a series of events typically involving the death of African-Americans as a result of police brutality, and aim to quantify the population biases across user types (individuals vs. organizations), and (for individuals) across various demographic factors (race, gender and age). Our results suggest that more African-Americans engage with the hashtag, and that they are also more active than other demographic groups. We also discuss ethical caveats with broader implications for studies on sensitive topics (e.g. discrimination, mental health, or religion) that focus on users.
Alexandra Olteanu
EPFL
alexandra.olteanu@epfl.ch
Ingmar Weber
QCRI
iweber@qf.org.qa
Daniel Gatica-Perez
Idiap and EPFL
gatica@idiap.ch
Introduction
While the growing number of discussions about minority¹ issues, including gender (O’Brien and Kelly 2013), income (Lashinsky 2015), or race (Moodie-Mills 2015), is good news, empirical evidence suggests that they are held mainly among the discriminated group: women dominate the debate on gender (Royles 2014), while African-Americans dominate the one on race (Pettit 2006). Although social media has led to a paradigm shift for advocacy by increasing the effectiveness, speed, and outreach of social campaigns, many still fail to reach far beyond the communities for which they advocate.
In this paper, we explore this observation in the context of the #BlackLivesMatter movement² on Twitter. We want to gain insights into the level of involvement across user demographics. What can be said about the demographic composition of the communities engaged in the discussions? Does the discriminated group dominate the debate? Ultimately, engaging diverse stakeholder groups is beneficial for the social campaign’s success (Ward 2013), and knowing the extent to which they contribute to the debate is helpful in learning how to alter the message to appeal to them.
Copyright © 2016, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
¹ Throughout the paper, by minority we refer to a group that is subordinate to a more dominant group in society.
² http://blacklivesmatter.com/contact/
#BlackLivesMatter is a movement (and a hashtag) created after the killing of Trayvon Martin in 2012, with over 1,000 demonstrations being held since then.³ The hashtag has been used during a number of events involving disproportionate police violence against African-Americans, as well as the disproportionate reaction of mainstream media when terror attacks occur in Western countries compared to when they occur in African countries (Zuckerman 2015).
Contributions. Our main contribution is a demographic characterization of users involved in the #BlackLivesMatter movement on Twitter. Our findings suggest that African-Americans are both more numerous and more active than other demographic groups. Young women are more likely to actively engage in the debate than men, yet the proportions of white and African-American women are similar. Among male users, we see a slightly different pattern: young adults still dominate the discussions, but they are largely African-American. Organizations account for 5% of profiles, yet they tweet at about 3 times the rate of individuals.
To run this study, we also created a collection of about 6,000 Twitter users annotated with demographic information such as race, age, and gender. In contrast with previous work that reports demographic information by automatically predicting demographic factors for each user based, e.g., on their profile picture or name (Minkus, Liu, and Ross 2015; Zagheni et al. 2014; Bakhshi, Shamma, and Gilbert 2014; Mislove et al. 2011), we crowdsourced these annotations. Although more expensive, we do so to work around known pitfalls of automated user classification such as low recall (Minkus, Liu, and Ross 2015) and classification errors (Yadav et al. 2014).
Limitations and Ethical Challenges. We note that such an endeavor is not without caveats. First, there are intrinsic issues with hashtag-based analyses and with the reliance on a single media platform and public APIs (Tufekci 2014; Boyd and Crawford 2012): the hashtag we focus on does not cover all the discussions and contributions around the core issue. The movement and hashtag use are also recent, so we cannot capture the long-term evolution of the demographics behind the core debate.
Second, there are important ethical challenges (Boyd and Crawford 2012): although publicly available, user profile data is inherently sensitive, as users might not anticipate a particular use of their data, especially when it was created in a context-sensitive space and time. This becomes even more delicate when explicitly analyzing their demographic attributes. We discuss these challenges as we detail our methods and their implications.
³ https://www.elephrame.com/textbook/protests
Table 1: The basic stats of our dataset.
Movement            Tweets   Users   Start Day        End Day
#BlackLivesMatter   3.54M    0.88M   April 11, 2012   May 10, 2015
Data Collection and Annotation
The Movement On Twitter. The #BlackLivesMatter hashtag (whose usage over time is shown in Figure 1) was first used on Twitter in April 2012 in relation to the killing of Trayvon Martin (Graeff, Stempeck, and Zuckerman 2014). Yet, it grew into a movement only after the acquittal of George Zimmerman (the man who fatally shot Martin) in July 2013,⁴ and got consistent traction after the killing of Michael Brown and with the Ferguson unrest.⁵ The movement gained momentum after the killing of Tamir Rice,⁶ a 12-year-old schoolboy, and the decision of a grand jury not to indict the officer who put Eric Garner in a chokehold.⁷ Since then, the movement has periodically regained public attention after events involving police brutality, including the deaths of Walter Scott⁸ and Freddie Gray.⁹
Collecting Tweets. To collect tweets published from the day before the first use of the hashtag¹⁰ until May 10, 2015, we crawled Topsy¹¹ in April-May 2015; the resulting dataset is summarized in Table 1 and Figure 1. To maximize the coverage of our collection, we repeated the crawling with various time window sizes until its volume converged.
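As an illustration of the stopping rule described above, the following minimal sketch repeats a crawl with progressively smaller time windows and stops once a pass adds no new tweets. The function `fetch_ids` and the in-memory `fake_fetch` backend are hypothetical stand-ins; they do not reflect the Topsy interface or the crawler actually used.

```python
from typing import Callable, Iterable, List, Set


def crawl_until_converged(
    fetch_ids: Callable[[int], Iterable[str]],
    window_sizes_hours: Iterable[int],
) -> Set[str]:
    """Union tweet ids over repeated crawls; stop when a pass adds nothing new."""
    seen: Set[str] = set()
    for hours in window_sizes_hours:
        before = len(seen)
        seen.update(fetch_ids(hours))
        if before > 0 and len(seen) == before:
            break  # volume converged: this pass found no new tweets
    return seen


def fake_fetch(hours: int) -> List[str]:
    """Hypothetical backend: coarser windows return a sparser subset of ids."""
    corpus = [f"t{i}" for i in range(100)]
    step = max(1, hours // 6)
    return corpus[::step]


ids = crawl_until_converged(fake_fetch, [24, 12, 6, 3, 1])
```

Collecting ids in a set makes repeated passes idempotent, so the size of the set is a direct measure of whether another pass changed the collection.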
Data Collection and Annotation. User data (public profile data and crowdsourced annotations) were collected in June 2015. User profiles were annotated according to the entity behind the Twitter accounts via the crowdsourcing platform Crowdflower.¹² We asked crowdworkers to categorize users as individuals, governmental agencies, NGOs, media, or others; and then to categorize the individuals according to 3 perceived demographic attributes: race, age, and gender. Crowdworkers were shown automatically generated screenshots of the upper part of users’ public profiles, including the banner picture, the profile picture, the name and profile description, and the last one or two tweets. The screenshots were provided via short-lived URLs in order to limit access to user profile information and minimize the risk of privacy violations.
⁴ http://en.wikipedia.org/wiki/Black_Lives_Matter
⁵ http://en.wikipedia.org/wiki/Shooting_of_Michael_Brown
⁶ http://en.wikipedia.org/wiki/Shooting_of_Tamir_Rice
⁷ http://en.wikipedia.org/wiki/Death_of_Eric_Garner
⁸ http://en.wikipedia.org/wiki/Shooting_of_Walter_Scott
⁹ http://en.wikipedia.org/wiki/Death_of_Freddie_Gray
¹⁰ First tweet containing a term obtained via http://ctrlq.org/first/
¹¹ http://about.topsy.com/terms-and-conditions/
¹² https://crowdflower.com/
We annotated about 6,000 users from 6 random samples with various characteristics (e.g. from all users, from highly active ones, from users tweeting about the topic even when the media attention fades away). We showed crowdworkers 5-6 user profiles at a time, one of which had been labeled by one of the authors (gold standard) and was used to control the quality of the annotations. Given that we collect perceived attributes and some of them might be subjective, the profiles picked as gold standards were selected to be obvious cases for each of the categories. For all annotation jobs, we collected at least 3 independent annotations for each profile and categorization criterion, and kept the majority label. About 100 crowdworkers participated in each task. Full annotation instructions are included in our data release.
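The aggregation step can be sketched as follows. This is a minimal illustration, not the procedure run on the crowdsourcing platform: the function names are hypothetical, and returning no label when there is no strict majority is an assumption, since the text does not say how ties were handled.

```python
from collections import Counter
from typing import Dict, Optional, Sequence


def majority_label(labels: Sequence[str]) -> Optional[str]:
    """Return the label chosen by a strict majority of annotators, else None."""
    if not labels:
        return None
    label, count = Counter(labels).most_common(1)[0]
    return label if count > len(labels) / 2 else None


def gold_accuracy(answers: Dict[str, str], gold: Dict[str, str]) -> float:
    """Share of gold-standard profiles that one worker labeled correctly."""
    checked = [profile for profile in answers if profile in gold]
    if not checked:
        return 0.0
    correct = sum(answers[profile] == gold[profile] for profile in checked)
    return correct / len(checked)


# Three independent annotations for one profile (toy values).
label = majority_label(["individual", "individual", "NGO"])
```

A worker's `gold_accuracy` could then be compared against a threshold to decide whether to keep that worker's annotations; the threshold itself is not given in the text.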
Exploratory Analysis
The distribution of users according to their number of tweets¹³ is long-tailed (Figure 2): most users post only a few tweets on the topic (e.g. 62% of users have only one tweet in the collection), while only a few users post on the order of thousands of tweets (only 3 users have more than 10K tweets). This indicates that many users participate in the debate only incidentally. For our analysis, we split users according to their level of activity into 3 categories: a) non-active users: 769,231 users with fewer than 5 tweets; b) moderately active users: 96,905 users with 5 to 25 tweets; and c) highly active users: 14,033 users with more than 25 tweets. We make this categorization as we conjecture that the activity w.r.t. a topic is a proxy for a user’s interest in the topic and her level of involvement, and we are interested in the interplay between the activity level and users’ demographics.
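The three activity levels can be written down directly from the thresholds above. In this minimal sketch the function name is hypothetical; the group sizes are the ones reported in the text, and their sum matches the 0.88M users in Table 1.

```python
from collections import Counter


def activity_level(n_tweets: int) -> str:
    """Map a user's tweet count on the topic to one of the three categories."""
    if n_tweets < 5:
        return "non-active"
    if n_tweets <= 25:
        return "moderately active"
    return "highly active"


# Toy tweet counts for six users (illustrative values only).
levels = Counter(activity_level(n) for n in [1, 1, 4, 5, 25, 26])

# Group sizes reported in the text; they add up to the full user set.
reported_sizes = {
    "non-active": 769_231,
    "moderately active": 96_905,
    "highly active": 14_033,
}
total_users = sum(reported_sizes.values())
```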
Further, by briefly exploring the triggers behind the peaks of attention received by the movement,¹⁴ we find that most of them are generated by events involving the killing of African-Americans by police in the US (when the debate focuses on the discrimination against African-Americans); see Figure 1. In addition, the attention peaks for a topic may be indicative of the topic entering and exiting the public debate: when the topic is in the spotlight, a larger community tends to get involved in the debate, yet, as the topic fades away, only the concerned community might care. To this end, we define a peak window (or the spotlight interval) as a 4-day interval including the day of the peak, the day before the peak, and two days after the peak. Using this definition, we found 611,871 users tweeting in the peak times, compared to 268,298 users (less than half of that number) who were active before the topic “enters” or after it “exits” the public debate.
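The peak window definition can be sketched as below. The two user counts in the text sum to the total number of users, which suggests the two groups are disjoint; the rule used here (any tweet inside a window makes a user a peak-time user) is an assumption, as are the function names and the example dates.

```python
from datetime import date, timedelta
from typing import Iterable, Set, Tuple


def peak_window(peak_day: date) -> Set[date]:
    """4-day spotlight interval: the day before the peak, the peak, two days after."""
    return {peak_day + timedelta(days=offset) for offset in (-1, 0, 1, 2)}


def split_users(
    tweets: Iterable[Tuple[str, date]], peak_days: Iterable[date]
) -> Tuple[Set[str], Set[str]]:
    """Return (users with a tweet in some peak window, users active only outside)."""
    spotlight: Set[date] = set()
    for day in peak_days:
        spotlight |= peak_window(day)
    peak_users: Set[str] = set()
    other_users: Set[str] = set()
    for user, day in tweets:
        (peak_users if day in spotlight else other_users).add(user)
    return peak_users, other_users - peak_users


# Hypothetical peak day and tweets, for illustration only.
peak = date(2014, 11, 25)
toy_tweets = [
    ("a", date(2014, 11, 25)),
    ("a", date(2014, 12, 10)),
    ("b", date(2014, 12, 10)),
    ("c", date(2014, 11, 27)),
]
peak_users, non_peak_users = split_users(toy_tweets, [peak])
```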
User Characterization
To study the demographic composition of users involved in the debate we extracted 6 random samples¹⁵: 2,000 users sampled from all users in our dataset, and 5 samples of 1,000 users from: users tweeting during peak times, users tweeting outside the peak times, highly active users, moderately active users, and non-active users. The samples were labeled in two rounds: the first annotation task aimed to distill the accounts of individuals from those of organizations, while the second task was designed to categorize accounts of individuals along 3 demographic criteria: race, gender and age.
¹³ For simplicity, by tweet(ing) we refer to both the creation of an original tweet, as well as to passing on content, i.e. re-tweeting.
¹⁴ To detect peaks we used a readily available implementation: https://gist.github.com/endolith/250860#file-peakdet-m
¹⁵ Due to technical limitations related to how the screenshots were displayed (resulting in profiles not being shown correctly for annotation) we were able to label only 5,976 users.
Figure 1: The distribution of the volume of tweets for #BlackLivesMatter per day over time.
Figure 2: The distribution of number of tweets per user.
Table 2: Accounts of organizations vs. of individuals across samples. Asterisks indicate stat. signif. differences w.r.t. the distribution of all users at p<0.01 (**) and p<0.05 (*).
          All Users   Peak    Non Peak   High Activ.   Mod. Activ.   Low Activ.
Org.      5.0%        4.6%    4.9%       11.1%         5.5%          4.2%
Indiv.    95.0%       95.4%   95.1%      88.9%         94.5%         95.8%
** * **
Accounts of Organizations. Looking at the fraction of organization accounts w.r.t. those of individuals, we notice that the sample drawn from highly active users contains about twice as many organization accounts as the other samples. The fraction of organization accounts seems typically higher among more active users: e.g. there are more organization accounts among moderately active users than among non-active users. This is largely explained by a higher fraction of accounts associated with NGOs (7.4%, 3.6%, and 1% for highly active, moderately active, and non-active users, and 2.2% across all users) and with media organizations. The latter, however, attain their highest fraction among moderately active users (a possible artifact of the fact that media organizations tweet about many topics, while NGOs are typically focused on a handful of causes). Finally, accounts associated with governmental agencies make up less than half a percent in all samples.
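The text does not state which significance test underlies the asterisks in Table 2. As one plausible check, the sketch below runs a pooled two-proportion z-test comparing the share of organizations among highly active users (11.1%) with the share among all users (5.0%). The nominal sample sizes of 1,000 and 2,000 are assumptions, since the number of successfully labeled profiles per sample is not reported, and the function name is hypothetical.

```python
import math
from typing import Tuple


def two_proportion_z_test(x1: int, n1: int, x2: int, n2: int) -> Tuple[float, float]:
    """Pooled two-proportion z-test; returns (z statistic, two-sided p-value)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided tail probability of a standard normal variable.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# 11.1% of an assumed 1,000 profiles vs. 5.0% of an assumed 2,000 profiles.
z, p_value = two_proportion_z_test(111, 1000, 100, 2000)
```

Under these assumed sizes the difference is well beyond the p<0.01 threshold, in line with the marker the table attaches to the highly active sample; a chi-square test on the same counts would be an equally reasonable choice.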
User Demographics. For individuals, we looked at the distribution of demographic factors. (Age) Figure 3(a) shows that the fraction of young adults is lower among highly active users, while the fraction of adults between 30 and 64 years old is lowest outside the peak times; these users engage with the hashtag more actively during peak times, when the topic is in the public spotlight. (Race) Figure 3(b) shows the user distribution across racial groups and samples. We notice that the fraction of African-Americans is the highest within the sample of highly active users, and the smallest among the non-active users or during peak times. (Gender) Finally, in Figure 3(c) we see that the distribution of users by gender is relatively stable across samples.
Figure 3: Race, age, and gender distribution across samples. Asterisks indicate stat. signif. differences w.r.t. the distribution of all users at p<0.01 (**) and p<0.05 (*). (best seen in color)
(a) Distribution of users’ age per sample (<17, 18-29, 30-64, >65). Samples: all users, high active**, mod. active**, non peak**, non-active, peak**.
(b) Distribution of users’ race per sample (Black, White, Asian, Other). Samples: all users, high active**, mod. active**, non peak**, non-active, peak.
(c) Distribution of users’ gender per sample (Female, Male). Samples: all users, high active**, mod. active**, non peak, non-active*, peak.
Next, we looked at the distribution of users across age and race per gender; see Figure 4.¹⁶ We notice that the most active users are white and African-American adults between 18 and 64 years old. However, while for male African-American users the fraction of young adults (18 to 29 years old) is higher, for white users it is lower. Inspecting the differences between genders (Figure 4(b)), we see that women younger than 29 years old are more active than men in the same age category, while for users older than 30 years old, men tend to tweet more about the movement.
Figure 4: Race and age distribution for female vs. male users. (best seen in color)
(a) Distribution of male users as a function of age and race. All cells sum to 100%. (colorbar: 0.0 to 0.3)
(b) Male (M) to female (F) ratio. Red (resp. blue) indicates a higher fraction of female (resp. male) users w.r.t. the overall distribution (0.78, marked by white in the colorbar, which ranges from 0.0 (F) to 1.8 (M)). The percentages indicate the overall distribution of users:
          <17 years   18-29 years   30-64 years   >65 years
Black     0.7%        32.5%         14.4%         0.1%
White     1.2%        26.4%         19.2%         0.1%
Asian     0.4%        3.2%          0.5%          0.0%
Others    0.1%        0.7%          0.5%          0.0%
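The percentages reported with Figure 4(b) form a joint distribution over race and age, so the per-race and per-age shares follow by summing rows and columns. The sketch below does this arithmetic; it assumes the four rows of percentages appear in the same order as the listed race labels (Black, White, Asian, Others), which the extracted figure does not make explicit.

```python
AGES = ["<17", "18-29", "30-64", ">65"]
RACES = ["Black", "White", "Asian", "Others"]

# Overall distribution of users in percent; row order is an assumption (see above).
JOINT = [
    [0.7, 32.5, 14.4, 0.1],
    [1.2, 26.4, 19.2, 0.1],
    [0.4, 3.2, 0.5, 0.0],
    [0.1, 0.7, 0.5, 0.0],
]

# Marginal share of each race (row sums) and each age group (column sums).
race_share = {race: round(sum(row), 1) for race, row in zip(RACES, JOINT)}
age_share = {
    age: round(sum(row[i] for row in JOINT), 1) for i, age in enumerate(AGES)
}
total = round(sum(race_share.values()), 1)
```

These shares describe only the users annotated along all three demographic factors, so they need not match the proportions quoted for all individuals elsewhere in the paper.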
User Involvement. Finally, we checked if users belonging to specific demographic groups tend to be more vocal, or, in other words, if they generate more content on average. First, we find that organizations are more active than individuals (7:2). Then, depending on the demographic criterion, we see that: (a) African-Americans are most active, followed closely by white users; (b) women are more active than men (3.8:2.6); and (c) adults between 30 and 64 years old are the most active, followed by young adults (3.9:2.6:2).
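A ratio such as 7:2 can be read as a comparison of the average number of tweets per user in each group. The sketch below shows that computation on toy data; the tweet counts are invented so that the group means come out at 7 and 2, and the function name is hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def mean_tweets_per_group(users: Iterable[Tuple[str, int]]) -> Dict[str, float]:
    """Average tweets per user for each group, given (group, n_tweets) pairs."""
    totals: Dict[str, int] = defaultdict(int)
    counts: Dict[str, int] = defaultdict(int)
    for group, n_tweets in users:
        totals[group] += n_tweets
        counts[group] += 1
    return {group: totals[group] / counts[group] for group in totals}


# Toy data only: two organizations and three individuals.
toy_users = [
    ("organization", 9),
    ("organization", 5),
    ("individual", 1),
    ("individual", 3),
    ("individual", 2),
]
means = mean_tweets_per_group(toy_users)
ratio = means["organization"] / means["individual"]
```

Because the activity distribution is long-tailed, a median-based comparison could give a different picture than the mean used here.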
Concluding Remarks
We started the study after one of the related events, the shooting of Walter Scott, and based on empirical evidence we hypothesized that the debate would be held largely among African-Americans. While our findings support this premise, with African-Americans being the largest group (up to 60% among highly active users), overall, whites make up about 40% of individuals and Asians 4%. Future work naturally includes an analysis of demographic factors across various movements related to minority groups’ issues in order to validate and broaden the observations we make here.
Parting Thoughts on Ethics. Although important, studies investigating social media to understand public opinion and various narratives on minority issues across stakeholders are scant, but growing. One reason is the limits in collecting and annotating users accurately and at scale (either manually or automatically). Yet, as we learn to work around these limits, we also need to develop protocols to mindfully study such user collections while protecting the users.
¹⁶ This is based on users annotated along all demographic factors, as only some factors may be perceptible based on user profile info.
Data Release. The list of tweet ids is available for research purposes at http://crisislex.org/. The list of annotated users is available upon signing an agreement not to use it to study users in isolation or to single them out for their demographic attributes.
Acknowledgements. We thank Carlos Castillo for feedback
on an early draft. A.O. was partially supported by the grant
Sinergia (SNF 147609).
References
Bakhshi, S.; Shamma, D. A.; and Gilbert, E. 2014. Faces engage us: Photos with faces attract more likes and comments on Instagram. In CHI.
Boyd, D., and Crawford, K. 2012. Critical questions for big data: Provocations for a cultural, technological, and scholarly phenomenon. Information, Communication and Society.
Graeff, E.; Stempeck, M.; and Zuckerman, E. 2014. The battle for ‘Trayvon Martin’: Mapping a media controversy online and off-line. First Monday.
Lashinsky, A. 2015. Seven signs you are clueless about income inequality. http://fortune.com/2015/03/20/anand-giridharadas-ted-inequality/.
Minkus, T.; Liu, K.; and Ross, K. W. 2015. Children seen but not heard: When parents compromise children’s online privacy. In WWW.
Mislove, A.; Lehmann, S.; Ahn, Y.-Y.; Onnela, J.-P.; and Rosenquist, J. N. 2011. Understanding the demographics of Twitter users. ICWSM.
Moodie-Mills, D. 2015. Black lives matter: A tale of two covers. http://www.nbcnews.com/news/nbcblk/black-lives-matter-tale-two-covers-n339796.
O’Brien, S., and Kelly, T. 2013. Gender equality won’t happen unless men speak up. http://edition.cnn.com/2013/04/17/business/sandberg-gender-equality/.
Pettit, J. 2006. Can we talk about race? A few rules of engagement. http://articles.baltimoresun.com/2006-08-01/news/0608010135_1_racial-inequality-political-change-problem-of-racial.
Royles, D. 2014. What’s missing from the debate about women leaders in the NHS? Men. http://www.theguardian.com/healthcare-network/2014/jan/08/female-managers-gender-equality-nhs.
Tufekci, Z. 2014. Big questions for social media big data: Representativeness, validity and other methodological pitfalls. In ICWSM.
Ward, J. A. 2013. The next dimension in public relations campaigns: A case study of the It Gets Better Project. Public Relations Journal.
Yadav, D.; Singh, R.; Vatsa, M.; and Noore, A. 2014. Recognizing age-separated face images: Humans and machines. PLoS ONE.
Zagheni, E.; Garimella, V. R. K.; Weber, I.; et al. 2014. Inferring international and internal migration patterns from Twitter data. In WWW companion publication.
Zuckerman, E. 2015. Paying attention to Garissa. http://www.ethanzuckerman.com/blog/2015/04/04/paying-attention-to-garissa/.
  • ... Over the last decades Twitter has turned out to be the most pervasive platform for communities to openly debate, converse with varied perspectives. The enormous data generated from Twitter has been used in many studies to understand social dynamics of interaction on various prevalent issues in society such as racial discrimination [20], political injustice and violence [21] [22]. Examining Twitter dataset can provide crucial information on user engagement and attitudes in their online discourse on climate change and gender, to imply their awareness in this sensitive subject. ...
    Research
    Full-text available
    The move towards sustainable development has intensified with growing ramifications of climate change leading to extensive online and offline promotions. Since climate change is a generalised theme under sustainable development there is a possibility that some of its vulnerability contexts such as gender differentiated impacts may go unnoticed in social discourse. The reasons are meagre role of civil society in global environmental governance, excessive media coverage on the observable destructs obscuring gender vulnerabilities and opposition to the contemporary social norms rooted with gender inequalities. The discourse is also hindered due to limited access to scientific and official reports on women vulnerabilities and virtuousness in relation to climate change. The tendency to assert gender vulnerabilities in climate mitigation as a problem of developing countries also understates the importance of gender issues in climate change and a reason for limited global discourse. This research investigates how online communities associate gender and climate change in their discourse. We use keyword based query method to extract Twitter datasets and examine user engagement, demographic, geographic coverage and views expressed. Our finding suggests the need for extensive online awareness campaigns and involvement of higher male participation alongside female and youths. We also suggest aggregated views among organization and media required to sensitize climate change and gender issues. Region specific campaigns to target online user communities from climate vulnerable developing countries is also suggested. This research contributed by uncovering online policies to raise awareness on gender differentiated impacts of climate change as an enabler for social change.
  • Article
    This research examines Twitter discourse related to #BlackLivesMatter and police-related shooting events in 2016 through a mixed-method, interpretative approach. We construct a "shared audience graph", revealing structural and ideological disparities between two groups of participants (one on the political left, the other on the political right). We utilize an integrated networked gatekeeping and framing lens to examine how #BlackLivesMatter frames were produced--and how they were contested--by separate communities of supporters and critics. Among other empirical findings, this work demonstrates hashtags being used in diverse ways--e.g. to mark participation, assert individual identity, promote group identity, and support or challenge a frame. Considered from a networked gatekeeping perspective, we illustrate how hashtags can serve as channeling mechanisms, shaping trajectories of information flow. This analysis also reveals a right-leaning community of BlackLivesMatter critics to have a more well-defined group of crowdsourced elite who largely define their side's counter-frame.
  • ... On the other hand, in Black Lives Matter images, the gender ratio was 0.46 : 0.54. Recent studies have attempted to measure the demographics of social media users involved in a social movement by analyzing their profile photographs[2,34]. In contrast, our measures and data focus on the demographics of actual protesters.Fig. ...
    ... On the other hand, in Black Lives Matter images, the gender ratio was 0.46 : 0.54. Recent studies have attempted to measure the demographics of social media users involved in a social movement by analyzing their profile photographs [2,34]. In contrast, our measures and data focus on the demographics of actual protesters. ...
    Article
    Full-text available
    We develop a novel visual model which can recognize protesters, describe their activities by visual attributes and estimate the level of perceived violence in an image. Studies of social media and protests use natural language processing to track how individuals use hashtags and links, often with a focus on those items' diffusion. These approaches, however, may not be effective in fully characterizing actual real-world protests (e.g., violent or peaceful) or estimating the demographics of participants (e.g., age, gender, and race) and their emotions. Our system characterizes protests along these dimensions. We have collected geotagged tweets and their images from 2013-2017 and analyzed multiple major protest events in that period. A multi-task convolutional neural network is employed in order to automatically classify the presence of protesters in an image and predict its visual attributes, perceived violence and exhibited emotions. We also release the UCLA Protest Image Dataset, our novel dataset of 40,764 images (11,659 protest images and hard negatives) with various annotations of visual attributes and sentiments. Using this dataset, we train our model and demonstrate its effectiveness. We also present experimental results from various analysis on geotagged image data in several prevalent protest events. Our dataset will be made accessible at https://www.sscnet.ucla.edu/comm/jjoo/mm-protest/.
  • Conference Paper
    Full-text available
    The political discourse in Western European countries such as Germany has recently seen a resurgence of the topic of refugees, fueled by an influx of refugees from various Middle Eastern and African countries. Even though the topic of refugees evidently plays a large role in online and offline politics of the affected countries, the fact that protests against refugees stem from the right-wight political spectrum has lead to corresponding media to be shared in a decentralized fashion, making an analysis of the underlying social and mediatic networks difficult. In order to contribute to the analysis of these processes, we present a quantitative study of the social media activities of a contemporary nationwide protest movement against local refugee housing in Germany, which organizes itself via dedicated Facebook pages per city. We analyse data from 136 such protest pages in 2015, containing more than 46,000 posts and more than one million interactions by more than 200,000 users. In order to learn about the patterns of communication and interaction among users of far-right social media sites and pages, we investigate the temporal characteristics of the social media activities of this protest movement, as well as the connectedness of the interactions of its participants. We find several activity metrics such as the number of posts issued, discussion volume about crime and housing costs, negative polarity in comments, and user engagement to peak in late 2015, coinciding with chancellor Angela Merkel’s much criticized decision of September 2015 to temporarily admit the entry of Syrian refugees to Germany. Furthermore, our evidence suggests a low degree of direct connectedness of participants in this movement, (i.a., indicated by a lack of geographical collaboration patterns), yet we encounter a strong affiliation of the pages’ user base with far-right political parties.
  • ... Topically, close to our work is the recent work of Olteanu et al. (2016), who studied the demographics of Twitter users who used the hashtag #BlackLivesMatter and found blacks to be more engaged with the hashtag. Though this work examines an offline context of the movement using demo- graphics, our work differs in that we use statistics about po- lice violence as well as about the protests as offline data, and include nuanced linguistic analysis for the social media data. ...
    Article
    Full-text available
    From the Arab Spring to the Occupy Movement, social media has been instrumental in driving and supporting socio-political movements throughout the world. In this paper, we present one of the first social media investigations of an activist movement around racial discrimination and police violence, known as "Black Lives Matter". Considering Twitter as a sensor for the broader community's perception of the events related to the movement, we study participation over time, the geographical differences in this participation, and its relationship to protests that unfolded on the ground. We find evidence for continued participation across four temporally separated events related to the movement, with notable changes in engagement and language over time. We also find that participants from regions of historically high rates of black victimization due to police violence tend to express greater negativity and make more references to loss of life. Finally, we observe that social media attributes of affect, behavior and language can predict future protest participation on the ground. We discuss the role of social media in enabling collective action around this unique movement and how social media platforms may help understand perceptions on a socially contested and sensitive issue like race.
  • ... The social media usage of political movements is of interest to many studies, e.g., with a focus on the Black Lives Matter movement in the United States [6,11]. However, the German far-right has seen little attention so far, with current research mostly focusing on exploratory analysis of the social media activities of the AfD party on Facebook [14] and the topics discussed by the local anti- immigrant movement Pegida [12] on Twitter, as well as their corresponding news sources [13]. ...
    Article
    Full-text available
    The political discourse in Western European countries such as Germany has recently seen a resurgence of the topic of refugees, fueled by an influx of refugees from various Middle Eastern and African countries. Even though the topic of refugees evidently plays a large role in online and offline politics of the affected countries, the fact that protests against refugees stem from the right-wight political spectrum has lead to corresponding media to be shared in a decentralized fashion, making an analysis of the underlying social and mediatic networks difficult. In order to contribute to the analysis of these processes, we present a quantitative study of the social media activities of a contemporary nationwide protest movement against local refugee housing in Germany, which organizes itself via dedicated Facebook pages per city. We analyse data from 136 such protest pages in 2015, containing more than 46,000 posts and more than one million interactions by more than 200,000 users. In order to learn about the patterns of communication and interaction among users of far-right social media sites and pages, we investigate the temporal characteristics of the social media activities of this protest movement, as well as the connectedness of the interactions of its participants. We find several activity metrics such as the number of posts issued, discussion volume about crime and housing costs, negative polarity in comments, and user engagement to peak in late 2015, coinciding with chancellor Angela Merkel's much criticized decision of September 2015 to temporarily admit the entry of Syrian refugees to Germany. Furthermore, our evidence suggests a low degree of direct connectedness of participants in this movement, (i.a., indicated by a lack of geographical collaboration patterns), yet we encounter a strong affiliation of the pages' user base with far-right political parties.
  • ... The social media usage of political movements is of interest to many studies, e.g., with a focus on the Black Lives Matter movement in the United States [6,11]. However, the German far-right has seen little attention so far, with current research mostly focusing on exploratory analysis of the social media activities of the AfD party on Facebook [14] and the topics discussed by the local anti-immigrant movement Pegida [12] on Twitter, as well as their corresponding news sources [13]. ...
    Conference Paper
    Full-text available
    We present a quantitative study of the social media activities of a contemporary nationwide protest movement against local refugee housing in Germany, which organizes itself via dedicated city-level Facebook pages. We analyse data from 2015, containing more than one million interactions by more than 200,000 users. We investigate the temporal characteristics of the social media activities of this protest movement, as well as the connectedness of the interactions of its participants. We find that several activity metrics, such as the number of posts issued, negative polarity in comments, and user engagement, peak in late 2015, coinciding with chancellor Angela Merkel's much-criticized decision of September 2015 to temporarily admit the entry of Syrian refugees to Germany. Furthermore, our evidence suggests a low degree of direct connectedness of participants in this movement (indicated, among other things, by a lack of geographical collaboration patterns), yet we encounter a strong affiliation of the pages' user base with far-right political parties.
  • Article
    Full-text available
    Since the shooting of Black teenager Michael Brown by White police officer Darren Wilson in Ferguson, Missouri, the protest hashtag #BlackLivesMatter has amplified critiques of extrajudicial killings of Black Americans. In response to #BlackLivesMatter, other Twitter users have adopted #AllLivesMatter, a counter-protest hashtag whose content argues that equal attention should be given to all lives regardless of race. Through a multi-level analysis of over 860,000 tweets, we study how these protests and counter-protests diverge by quantifying aspects of their discourse. We find that #AllLivesMatter facilitates opposition between #BlackLivesMatter and hashtags such as #PoliceLivesMatter and #BlueLivesMatter in a way that echoes the historical tension between Black protesters and law enforcement. In addition, we show that a significant portion of #AllLivesMatter use stems from hijacking by #BlackLivesMatter advocates. Beyond simply injecting #AllLivesMatter with #BlackLivesMatter content, these hijackers use the hashtag to directly confront the counter-protest notion of “All lives matter.” Our findings suggest that the Black Lives Matter movement was able to grow, exhibit diverse conversations, and avoid derailment on social media by making discussion of counter-protest opinions a central topic of #AllLivesMatter, rather than of the movement itself.
  • Conference Paper
    Children's online privacy has garnered much attention in media, legislation, and industry. Adults are concerned that children may not adequately protect themselves online. However, relatively little discussion has focused on the privacy breaches that may occur to children at the hands of others, namely, their parents and relatives. When adults post information online, they may reveal personal information about their children to other people, online services, data brokers, or surveillant authorities. This information can be gathered in an automated fashion and then linked with other online and offline sources, creating detailed profiles which can be continually enhanced throughout the children's lives. In this paper, we conduct a study to see how widespread these behaviors are among adults on Facebook and Instagram. We use a number of methods. Firstly, we automate a process to examine 2,383 adult users on Facebook for evidence of children in their public photo albums. Using the associated comments in combination with publicly available voter registration records, we are able to infer children's names, faces, birth dates, and addresses. Secondly, in order to understand what additional information is available to Facebook and the users' friends, we survey 357 adult Facebook users about their behaviors and attitudes with regard to posting their children's information online. Thirdly, we analyze 1,089 users on Instagram to infer facts about their children. Finally, we make recommendations for privacy-conscious parents and suggest an interface change through which Facebook can nudge parents towards better stewardship of their children's privacy.
  • Article
    Full-text available
    Humans utilize facial appearance, gender, expression, aging pattern, and other ancillary information to recognize individuals. It is interesting to observe how humans perceive facial age. Analyzing these properties can help in understanding the phenomenon of facial aging, and incorporating the findings can help in designing effective algorithms. Such a study has two components: facial age estimation and age-separated face recognition. Age estimation involves predicting the age of an individual given his/her facial image. On the other hand, age-separated face recognition consists of recognizing an individual given his/her age-separated images. In this research, we investigate which facial cues are utilized by humans for estimating the age of people belonging to various age groups, and we analyze the effect of one's gender, age, and ethnicity on age estimation skills. We also analyze how various facial regions such as binocular and mouth regions influence age estimation and recognition capabilities. Finally, we propose an age-invariant face recognition algorithm that incorporates the knowledge learned from these observations. Key observations of our research are: (1) the age group of newborns and toddlers is easiest to estimate, (2) gender and ethnicity do not affect the judgment of age group estimation, (3) the face, as a global feature, is essential to achieve good performance in age-separated face recognition, and (4) the proposed algorithm yields improved recognition performance compared to existing algorithms and also outperforms a commercial system in the young-image-as-probe scenario.
  • Article
    Full-text available
    One of the biggest news stories of 2012, the killing of Trayvon Martin, nearly disappeared from public view, initially receiving only cursory local news coverage. But the story gained attention and controversy over Martin’s death dominated headlines, airwaves, and Twitter for months, thanks to a savvy publicist working on behalf of the victim’s parents and a series of campaigns off-line and online. Using the theories of networked gatekeeping and networked framing, we map out the vast media ecosystem using quantitative data about the content generated around the Trayvon Martin story in both off-line and online media, as well as measures of engagement with the story, to trace the interrelations among mainstream media, nonprofessional and social media, and their audiences. We consider the attention and link economies among the collected media sources in order to understand who was influential when, finding that broadcast media is still important as an amplifier and gatekeeper, but that it is susceptible to media activists working through participatory or nonprofessional media to co-create the news and influence the framing of major controversies. Our findings have implications for social change organizations that seek to harness advocacy campaigns to news stories, and for scholars studying media ecology and the networked public sphere.
  • Article
    Full-text available
    Photos are becoming a prominent means of communication online. Despite photos' pervasive presence in social media and the online world, we know little about how people interact and engage with their content. Understanding how photo content might signify engagement can impact both science and design, influencing production and distribution. One common type of photo content shared on social media is photos of people. From studies of offline behavior, we know that human faces are powerful channels of non-verbal communication. In this paper, we study this behavioral phenomenon online. We ask how the presence of a face, its age and gender might impact social engagement on the photo. We use a corpus of 1 million Instagram images and organize our study around two social engagement feedback factors, likes and comments. Our results show that photos with faces are 38% more likely to receive likes and 32% more likely to receive comments, even after controlling for social network reach and activity. We find, however, that the number of faces, their age and gender do not have an effect. This work presents the first results on how photos with human faces relate to engagement on large-scale image sharing communities. In addition to contributing to the research around online user behavior, our findings offer a new line of future work using visual analysis.
  • Conference Paper
    Full-text available
    Data about migration flows are largely inconsistent across countries, typically outdated, and often nonexistent. Despite the importance of migration as a driver of demographic change, there is limited availability of migration statistics. Generally, researchers rely on census data to indirectly estimate flows. However, little can be inferred for specific years between censuses and for recent trends. The increasing availability of geolocated data from online sources has opened up new opportunities to track recent trends in migration patterns and to improve our understanding of the relationships between internal and international migration. In this paper, we use geolocated data for about 500,000 users of the social network website "Twitter". The data are for users in OECD countries during the period May 2011 to April 2013. We evaluated, for the subsample of users who have posted geolocated tweets regularly, the geographic movements within and between countries for independent periods of four months. Since Twitter users are not representative of the OECD population, we cannot infer migration rates at a single point in time. However, we proposed a difference-in-differences approach to reduce selection bias when we infer trends in out-migration rates for single countries. Our results indicate that our approach is relevant to address two longstanding questions in the migration literature. First, our methods can be used to predict turning points in migration trends, which are particularly relevant for migration forecasting. Second, geolocated Twitter data can substantially improve our understanding of the relationships between internal and international migration. Our analysis relies solely on publicly available data that could potentially be available in real time and that could be used to monitor migration trends. The Web Science community is well-positioned to address, in future work, a number of methodological and substantive questions that we discuss in this article.
  • Article
    Full-text available
    Large-scale databases of human activity in social media have captured scientific and policy attention, producing a flood of research and discussion. This paper considers methodological and conceptual challenges for this emergent field, with special attention to the validity and representativeness of social media big data analyses. Persistent issues include the over-emphasis of a single platform, Twitter, sampling biases arising from selection by hashtags, and vague and unrepresentative sampling frames. The socio-cultural complexity of user behavior aimed at algorithmic invisibility (such as subtweeting, mock-retweeting, use of "screen captures" for text, etc.) further complicates interpretation of big data social media. Other challenges include accounting for field effects, i.e. broadly consequential events that do not diffuse only through the network under study but affect the whole society. The application of network methods from other fields to the study of human social activity may not always be appropriate. The paper concludes with a call to action on practical steps to improve our analytic capacity in this promising, rapidly-growing field.
  • Conference Paper
    Full-text available
    Every second, the thoughts and feelings of millions of people across the world are recorded in the form of 140-character tweets using Twitter. However, despite the enormous potential presented by this remarkable data source, we still do not have an understanding of the Twitter population itself: Who are the Twitter users? How representative of the overall population are they? In this paper, we take the first steps towards answering these questions by analyzing data on a set of Twitter users representing over 1% of the U.S. population. We develop techniques that allow us to compare the Twitter population to the U.S. population along three axes (geography, gender, and race/ethnicity), and find that the Twitter population is a highly non-uniform sample of the population.
  • The next dimension in public relations campaigns: A case study of the It Gets Better Project
    WARD, J. A. The next dimension in public relations campaigns: A case study of the It Gets Better Project. Public Relations Journal (2013).
  • Critical questions for big data: Provocations for a cultural, technological, and scholarly phenomenon
    • D Boyd
    • K Crawford
    BOYD, D., AND CRAWFORD, K. Critical questions for big data: Provocations for a cultural, technological, and scholarly phenomenon. Information, Communication & Society (2012).
  • What's missing from the debate about women leaders in the NHS? Men
    • D Royles
    ROYLES, D. What's missing from the debate about women leaders in the NHS? Men. http://www.theguardian.com/healthcare-network/2014/jan/08/female-managers-gender-equality-nhs (2014).