Pinned it! A Large Scale Study of the Pinterest Network
Sudip Mittal, Neha Gupta, Prateek Dewan, Ponnurangam Kumaraguru
Indraprastha Institute of Information Technology, Delhi (IIIT-D)
{sudip09068, neha1209, prateekd, pk}@iiitd.ac.in
ABSTRACT
Pinterest is an image-based online social network that was
launched in 2010 and has gained considerable traction since.
Within 3 years, Pinterest attained 48.7 million unique users.
This rapid growth makes Pinterest interesting to study, and
raises multiple questions about its users and content. We
characterized Pinterest on the basis of large-scale crawls of
3.3 million user profiles and 58.8 million pins. In particular,
we explored various attributes of users, pins, boards, pin
sources, and user locations in detail, and performed topical
analysis of user-generated textual content. The characterization
revealed the most prominent topics among users and pins, the top
image sources, and the geographical distribution of users on
Pinterest. We then tried to predict the gender of American users
based on a set of profile, network, and content features, and
achieved an accuracy of 73.17% with a J48 Decision Tree
classifier. We then exploited the users’ names by comparing them
to a corpus of top male and female names in the U.S.A., and
achieved an accuracy of 86.18%. To the best of our knowledge,
this is the first attempt to predict gender on Pinterest.
Categories and Subject Descriptors
H.3.5 [Online Information Services]: Web-based services
Keywords
Online social networks, Pin, Classification
1. INTRODUCTION
Online Social Networks (OSNs) like Facebook, Twitter,
LinkedIn, and Google+ are web-based platforms that help
users interact and share thoughts, interests, and activities.
These OSNs allow their users to imitate real-life connections
over the Internet. A report by the International
Telecommunication Union states that the total number of online
social media users crossed the 1 billion mark in May 2012 [27].
According to Nielsen’s Social Media Report, users continue to
spend more time on social networks than on any other kind of
website on the Internet [31]. With this surge in the number of
social media users across the world, online social media has
moved to the next level of innovation. While the aforementioned
conventional social media services are mostly text-intensive,
other services have gone beyond text and introduced images as
their building blocks. Services like Instagram and Tumblr have
gained immense popularity in recent years, with Instagram (now
part of Facebook) attaining 100 million monthly active users
and 40 million photo uploads per day [17]. These numbers
indicate the successful entrance of image-based social networks
into the world of online social media.
The first two authors contributed equally to this work.
Pinterest is one of the most recent additions to this popular
category of image-based online social networks. Within
a year of its launch, Pinterest was listed among the “50 Best
Websites of 2011” by Time Magazine [29]. It was also the
fastest site to break the 10 million unique visitors mark [8].
The number of users has increased since then, with Reuters
reporting 48.7 million unique users in February
2013 [37]. Although fairly new to the social media fraternity,
Pinterest is being heavily used by many big business houses
like Etsy, The Gap, Allrecipes, Jetsetter, etc. to advertise
their products.¹ Further, Pinterest drives more revenue per
click than Twitter or Facebook, and is currently valued at
USD 2.5 billion [37, 45].
The immense upsurge and popularity of Pinterest has
given rise to multiple basic questions about this network.
What is the general user behavior on Pinterest? What are
the most common characteristics of users, pins, and boards?
What is the sentiment associated with user-generated tex-
tual content? What is the geographical distribution of users?
Is it possible to predict the gender of Pinterest users? There
is little research work on Pinterest [2, 5, 22, 32, 42, 43], and
none of it addresses the aforementioned basic questions.
To answer these questions, and get deeper insights
into Pinterest, we collected and analyzed a dataset
comprising user details (3,323,054), pin details (58,896,156),
board details (777,748), and images (498,433). We applied
multiple machine learning algorithms to predict gender on
a ground-truth dataset of 6,309 male and 6,309 female
Pinterest users living in the U.S.A.
Based on our analysis, some of our key contributions are
summarized as follows:
1. Topical analysis of user generated textual content on
Pinterest: We found that the most common topics
across users and pins were design, fashion, photography,
food, and travel.
2. User, pin, and board characterization: We analyzed
various user profile attributes, their geographical
distribution, top pin sources, and board categories. Less
than 5% of all images on Pinterest are uploaded by
users; over 95% are pinned from pre-existing web
sources.
3. Gender prediction for American users: We extracted
ground-truth gender information from Facebook for
over 66,000 Pinterest users from the U.S.A., and were able
to achieve an accuracy of 86.18% while predicting gender.
We applied various machine learning algorithms
using multiple content and network based features from
Pinterest.
¹ http://business.pinterest.com/stories/
The rest of the paper is organized as follows. We discuss
the related work in Section 2. We then discuss Pinterest as
a social network in Section 3. In Section 4, we describe our
data collection methodology. Analysis of the collected data
and its results are covered in Section 5. Section 6 contains
discussion, limitations, and future work.
2. RELATED WORK
Social network characterization.
Online social networks, in general, have been studied in
detail by various researchers in the computer science com-
munity. Mislove et al. conducted a large scale measure-
ment study and analysis of Flickr, YouTube, LiveJournal,
and Orkut [30]. Their results confirmed power-law, small-
world, and scale-free properties of online social networks.
In a more recent work, Magno et al. performed a detailed
analysis of the Google+ network, and identified some key
differences and similarities between Google+, and existing
social networks like Facebook, and Twitter [28]. Ugander et
al. performed a large-scale analysis of the entire Facebook
social graph and found that 99.91% of all the users belonged
to a single large connected component [40]. They confirmed
the ‘six degrees of separation’ phenomenon and showed that
the value had dropped to 3.74 degrees of separation in the
entire Facebook network of active users.
Pinterest Introduction.
Despite the rapid growth of Pinterest since its
launch, only a few studies of this social network exist.
In closely related work, Gilbert et al. presented
a statistical overview of the Pinterest network, and
showed that female users get more repins but fewer
followers on Pinterest [12]. Their analysis was based on a smaller
dataset of 2.9 million pins and 989,355 users, in contrast to
our dataset of over 58 million pins and 3.3 million users.
Chang et al. worked towards finding activity patterns for
attracting attention on Pinterest. Some of the key findings
of this work revealed that male users were not particularly
interested in stereotypically male topics; sharing diverse con-
tent increases attention to a certain level; and homophily
drives repinning. Their dataset consisted of 46,365 users,
and 3.1 million pins [6]. Ottoni et al. analyzed Pinterest in
a gender-sensitive fashion, and found that the network was
heavily dominated by female users. Authors of this work
found that females on Pinterest make more use of lightweight
interactions than males, invest more effort in reciprocating
social links, are more active and generalist in content gen-
eration, and describe themselves using words of affection
and positive emotions. This study spanned across a large
dataset consisting of over 2 million users [32]. Kamath et al.
described a supervised model for board recommendation on
Pinterest. They used a content-based filtering approach for
recommending high quality information to users [21]. Du-
denhoffer et al. tried to use Pinterest as a library marketing
and information literacy tool at the Central Methodist Uni-
versity. They reported that the number of followers viewing
the library pinboards had outpaced usage of the text-based
lists in just one semester [10]. In another similar work by
Zarro et al., authors talked about how digital libraries and
other organizations could take advantage of Pinterest to ex-
pand the reach of their material, allowing users to create
personalized collections, incorporating their content [42]. In
their next piece of work, Zarro et al. found that Pinter-
est serves as infrastructure for repository building that sup-
ports discovery, collection, collaboration and publishing of
content, especially for professionals [43].
Gender prediction on other social networks.
Rao et al. attempted to predict the gender of Twitter users
based on a rich set of profile, content, and network
attributes, and achieved an accuracy of 72.33% using an SVM
classifier. This was the first attempt to predict gender on
Twitter [36]. Burger et al. achieved 74% accuracy using a
Balanced Winnow2 classifier for predicting the gender of
Twitter users. Their corpus comprised 4.1 million tweets and
15.6 million distinct features [4]. Pennacchiotti et al. tried
to extract gender information from Twitter users’ profiles
by applying regular expressions to users’ bio field. The authors
were able to extract gender information for 80% of users from
a sample of 14 million users, but with very low accuracy.
A manual annotation of over 15,000 users using only the
profile / avatar picture revealed that only 57% of images were
correlated with a specific gender [33]. Tang et al. applied
a name-centric approach for predicting the gender of New York
City Facebook users, and achieved an accuracy of 95.2% [39].
Zheleva and Getoor [44] proposed techniques to predict the
private attributes of users in four real-world datasets
(including Facebook) using general relational classification and
group-based classification. Their accuracy for gender
inference on their Facebook dataset was 77.2% based on users’
group affiliations, and the sample dataset used in their study
was quite small (1,598 Facebook users). Other papers [15,
16, 26, 41] have also attempted to infer private information
inside social networks. The methods they used are mainly based
on traditional link-based Naive Bayes classifiers.
3. UNDERSTANDING PINTEREST
Pinterest is an image-based social bookmarking medium,
where users share images which are of interest to them, in
the form of pins on a pinboard. It emphasizes discovery
and curation of images rather than original content
creation.² This makes Pinterest a very promising conduit for
the promotion of commercial activities online.
² http://blogs.constantcontact.com/product-blogs/social-media-marketing/what-the-heck-is-pinterest-and-why-should-you-care/
Similar to other OSNs, Pinterest also uses some specific
terminology to refer to various elements and services it pro-
vides. Some terms are as follows:
1. Pins: A pin is an image that has some meta-data in-
formation associated with it. Pins can be thought of
as basic building blocks of Pinterest. The act of post-
ing a pin is known as pinning, and the user who posts
a pin is the pinner. Similar to images on Facebook,
pins can be liked and shared. Each of these pins has
the following meta-data associated with it – unique
pin number, description, number of likes, number of
comments, number of repins, board name, source, and
content in comments. The act of sharing an already
existing pin is referred to as repinning.
2. Pinboards: They are a themed collection of pins, or-
ganized by a user. Each board (“boards” and “pin-
boards” are used interchangeably) has a name, a de-
scription (optional), category (optional, e.g. Animals,
Art, Celebrities, Food and Drink, Design, Education,
Gardening), and an option to make it Secret. Secret
boards are only visible to the users who create them.
This analogy of pins and pin-boards replicates the real-
world concept of images on a scrapbook.
3. Source: Each pin on Pinterest has a source URL as-
sociated with it. As the name suggests, this is the
actual URL from which the image has been pinned by
a user. Images uploaded by users directly to Pinter-
est from their local computer, have pinterest.com as
their source, whereas images which are pinned from
an existing website (e.g. flickr.com) have this source
website (flickr.com) as the source.
4. Pin-It button: A Pin-It button is a browser book-
mark used to upload content to Pinterest. Some pop-
ular websites like Amazon, eBay, BHG, and Etsy also
provide their own pin-it button next to their product
images. This pin-it button makes it easier for a user
to share the content that she likes on Pinterest.
3.1 User Accounts
A user begins by creating an account using her Facebook
ID, Twitter ID, or an email address. On account creation,
Pinterest asks each new user to follow 5 boards to complete
the creation process, as a mandatory step to get started.
Each user has a profile page (Figure 1) that is publicly visible
to everyone, listing the user’s name, a description, location,
connected Facebook account (if available), connected Twit-
ter account (if available), a profile website, boards (which
are not secret), and associated pins, likes, followers, and fol-
lowees. A user also has a timeline where all pins from the
users she follows, are displayed.
3.2 Social Ties
A user has the option to follow a particular user or a
specific board of any other user. If a user follows another
user, she gets updates about all the boards owned by that
user. But, in case a user follows specific boards, she gets
updates only from those particular boards. This relationship
is quite similar to Twitter’s follower / followee relationship.
Interactions on Pinterest are in the form of pins. A user pins
an image, and can add a pin description to better describe
the pin. Other users can then repin the shared pin, like
it, or share their views through a comment. These features
are similar to Facebook’s share, like, and comment features
respectively.
Figure 1: User Profile on Pinterest. The profile
description, websites (personal website, Facebook ID, Twitter
ID), location, board, and pin are marked separately in the
screen snapshot.
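The difference between following a user and following a single board can be sketched as a small set computation. This is only an illustration of the semantics described above: the function name, the (owner, board) pair representation, and the `boards_by_owner` mapping are hypothetical, and secret boards are ignored.

```python
def boards_on_timeline(followed_users, followed_boards, boards_by_owner):
    """Return the (owner, board) pairs whose pins would reach a user's timeline.

    followed_users:  owners the user follows as a whole (all their boards).
    followed_boards: individual (owner, board) pairs the user follows.
    boards_by_owner: hypothetical mapping owner -> list of public board names.
    """
    visible = set(followed_boards)
    for owner in followed_users:
        # Following a user implies updates from every board of that user.
        for board in boards_by_owner.get(owner, ()):
            visible.add((owner, board))
    return visible


boards_by_owner = {"alice": ["food", "travel"], "bob": ["diy", "art"]}
timeline = boards_on_timeline({"alice"}, {("bob", "art")}, boards_by_owner)
```

Here the timeline covers both of alice's boards but only the single followed board of bob.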
4. DATA COLLECTION
In this section, we discuss the methodology that we
applied for data collection, and describe the data that we
collected. Given the size of the entire Pinterest network (48.7
million users), capturing the entire network would have been
difficult and computationally very expensive.
Pinterest does not provide a public API for data collec-
tion. Therefore, in order to collect data, we designed and
implemented a breadth first search (BFS) crawler in Python.
All data was collected using a Dell PowerEdge R620 server,
with 64 Gigabytes of RAM, 24 core processor, connected to a
1 Gbps Internet connection. The entire data collection pro-
cess spanned from December 26, 2012 to February 1, 2013.
Broadly, this process (Figure 2) was split into three phases
as described below:
4.1 User Handles Collection
The data collection process was initiated by selecting the
top 5 profiles in terms of the number of followers on
Pinterest as initial seeds, and feeding them into the crawler. The
crawler first extracted 4,995,974 direct followers of these 5
seeds, and then repeatedly crawled through the “followers
of followers”. We collected a total of 17,964,574 unique
user handles through this process, which is slightly over 36%
of the entire Pinterest population [37]. We call this the
userhandles dataset. This technique of snowball sampling
is commonly used in online social media research [32].
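The breadth-first expansion from seed profiles described above can be sketched as follows. This is a minimal illustration, not the crawler used in the study: `fetch_followers` is a hypothetical callable standing in for whatever page scraping returns a user's followers, and the optional `max_handles` cap is an added assumption.

```python
from collections import deque


def crawl_user_handles(seeds, fetch_followers, max_handles=None):
    """Breadth-first traversal of the follower graph, starting from the seeds.

    fetch_followers(handle) is a hypothetical callable that returns an
    iterable of follower handles for one user. The returned set contains
    the seeds themselves plus every handle discovered.
    """
    seen = set()
    queue = deque()
    for seed in seeds:
        if seed not in seen:
            seen.add(seed)
            queue.append(seed)

    while queue:
        user = queue.popleft()  # FIFO order gives breadth-first expansion
        for follower in fetch_followers(user):
            if follower in seen:
                continue
            if max_handles is not None and len(seen) >= max_handles:
                return seen
            seen.add(follower)
            queue.append(follower)
    return seen


# Toy follower graph standing in for real profile pages.
toy_graph = {"seed": ["a", "b"], "a": ["c"], "b": ["a", "seed"], "c": []}
handles = crawl_user_handles(["seed"], lambda u: toy_graph.get(u, []))
```

Keeping a `seen` set is what makes the handles unique and prevents the crawl from looping on mutual follows.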
4.2 User Data Collection
Next, we started data collection for user profiles of the
17.96 million user handles collected in the previous step,
and obtained a total of 3,323,054 user profiles, called the
userprofile dataset (we present the analysis on 3.3 million
userprofiles in this paper; though our data collection
process is still active). This userprofile dataset includes user
display name, description field, profile picture, number of
followers, number of followees, number of boards, number
of pins, boards, profile website, Facebook handle, Twitter
handle, location, pins, and likes. Along with user profiles,
we extracted 777,748 boards and their corresponding details
(called the boards dataset). These details include board
category, number of followers, and number of pins for each
pinboard.
Figure 2: Flow diagram depicting the flow sequence
of our data collection process. The darkened
blocks represent our initial seed users, and primary
datasets. The lighter blocks denote the additional
information extracted through the primary dataset.
Blocks in the diagram: Seed Users (5); Userhandles
(17,964,574); Userprofiles (3,323,054); Location (331,530);
Facebook profile data (1,667,973); Twitter profile data
(49,416); Board Details (777,748); Pins (58,896,156);
Source Details (58,896,156); Images (498,433);
EXIF Data (9,950).
Users often mention their Facebook and / or
Twitter profile URLs on Pinterest. Using this information
from the userprofile dataset, we collected publicly available
Facebook information of 1,667,973 users (50.19% of the
userprofile dataset) and Twitter information of 49,416 users (1.4%
of the userprofile dataset). Many Pinterest users also
mention a location in their profile. We found location details
for 331,530 users (9.93% of the userprofile dataset). Some
users mentioned only their country, whereas others
mentioned their city as well. Some users gave their location
as “The beach”, “mentally in lala land”, etc. In order to
verify the credibility of such location information, we used
the Yahoo PlaceFinder API³ and obtained the correct details
for 192,261 (57.99%) of these locations.
4.3 Pin Data Collection
Using user profiles as seeds, we collected 58,896,156 unique
pins and their related information. We call this the pin
dataset. This information consists of the pin description,
number of likes, number of comments, number of repins,
board name, and source for each pin. We also collected a
random sample of 498,433 images (called the images dataset)
from these pins. For each of these images, we extracted
their Exchangeable Image File Format (EXIF) information
for further analysis.⁴ The most common pieces of EXIF
information available were date, time, image description, artist,
copyright, and camera make / model. We also extracted
information about pin sources for each pin, referred to as the
source dataset.
³ http://developer.yahoo.com/boss/geo/docs/requests-pf.html
⁴ http://fotoforensics.com/tutorial-meta.php#EXIF
5. ANALYSIS
We now present our analysis of the users, pins, and boards
in detail.
5.1 User characterization
5.1.1 Profile description
From our userprofile dataset of 3,323,054 user profiles,
we found that only 589,193 (17.73%) users had a profile
description. We observed that users revealed private details
through this field, like age, marital status, personal traits,
email IDs, phone numbers, etc. The profile description of
one user said, “I am 35, happily married, love kids & cats,
and have a disturbing sense of humor!” We extracted the 100
most frequently occurring words from the profile
descriptions, and found topics like fashion, design, food, music, art,
photography, and travel to be the most popular user interests
(Figure 3). We observed that the most common interests
were in line with the most common professions (like artist,
designer, cook, photographer) mentioned by the users. This
suggests that a large proportion of Pinterest users make
use of the network for professional activities.
Figure 3: Tag cloud of the top 100 words taken from
user’s profile description field.
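The extraction of the most frequent words behind such a tag cloud can be sketched with a simple counter. This is an illustrative sketch under assumptions, not the procedure used in the study: the tokenization rule and the small stopword list are hypothetical choices, since the text does not specify how words were split or filtered.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list for illustration only.
STOPWORDS = frozenset(
    {"a", "an", "and", "the", "i", "am", "of", "to", "in", "my", "is",
     "for", "with", "have"}
)


def top_words(descriptions, n=100, stopwords=STOPWORDS):
    """Return the n most frequent (word, count) pairs across descriptions."""
    counts = Counter()
    for text in descriptions:
        # Lower-case and keep runs of letters/apostrophes as tokens.
        for word in re.findall(r"[a-z']+", text.lower()):
            if len(word) > 1 and word not in stopwords:
                counts[word] += 1
    return counts.most_common(n)


sample = [
    "I love fashion and design",
    "Fashion blogger",
    "Food, fashion, travel",
]
most_common = top_words(sample, n=1)
```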
5.1.2 Social and commercial links
Another source of information on the user profile is the
“website” field, where users can provide URLs to their per-
sonal websites, and blogs. In our dataset, we found that
177,462 (5.34%) users had mentioned a website. The top-
most domain was Facebook, where 9,697 (5.46%) users had
mentioned a link to their Facebook profiles. Twitter, Etsy,
YouTube, Flickr, About.me, LinkedIn, etc. were the other
domains which constitute the top 10. Apart from the web-
site field, Pinterest separately provides users with an option
to connect their Facebook and / or Twitter accounts with
their Pinterest profiles. Out of over 3.3 million user profiles
that we collected, over 2.71 million users (81.78%) had con-
nected their Facebook profiles with Pinterest. Only 328,570
(9.88%) users connected their Twitter accounts with
Pinterest. Fewer than 4% (132,553) of users had connected both
Facebook and Twitter, while 12.3% (409,399) of users had
connected neither. Further, we found that 86,641 (26.36%) out
of 328,570 users had identical usernames on Twitter and
Pinterest. However, only 5,419 (5.02%) out of 107,910 users
had identical usernames on Facebook and Pinterest. Two
hundred and ninety seven (0.22%) users had identical user-
names on all three networks. Analysis of usernames for the
same user on various social networks can be useful for iden-
tity resolution across multiple OSNs [19].
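A username comparison of this kind can be sketched as below. The helper names are hypothetical, and two details are assumptions rather than facts from the study: the handle is taken to be the first path segment of the profile URL, and the comparison is case-insensitive. URLs of the form `profile.php?id=...` carry no handle and would need separate treatment.

```python
from urllib.parse import urlparse


def handle_from_url(profile_url):
    """Return the first path segment of a profile URL in lower case, or None."""
    url = profile_url if "://" in profile_url else "https://" + profile_url
    segments = [part for part in urlparse(url).path.split("/") if part]
    return segments[0].lower() if segments else None


def same_handle(pinterest_handle, profile_url):
    """True if the handle on the other network matches the Pinterest handle."""
    other = handle_from_url(profile_url)
    return other is not None and other == pinterest_handle.lower()


match = same_handle("Jane_Doe", "http://twitter.com/jane_doe")
```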
5.1.3 Connections and popularity
The maximum number of followers for a user was found
to be 11,992,745 (as of January 2013). Table 1 lists the
description, number of followers, and number of followees for
the 10 most followed users on Pinterest (to maintain users’
privacy, we do not mention usernames anywhere). The
average number of followers per Pinterest user was found to be
approximately 176, as compared to 208 followers per
Twitter user [38]. Given that Pinterest has only one-tenth as many
users as Twitter, this average suggests that the
Pinterest network is very well connected.
Followers    Followees  Interests / Profession
11,992,745   149        Designer / Blogger / Food
9,099,998    143        Designer / Magic / Food
8,056,723    1,176      Interior Designer
7,519,854    205        Not Mentioned
6,004,793    1,106      Lifestyle Blog
5,023,007    242        Beauty Enthusiast / Blogger
4,793,914    310        Architecture Student / Blogger
4,409,097    66         Not Mentioned
4,126,895    1,001      Artist
3,658,844    383        Freelancer / Blogger
Table 1: Top 10 user profiles on Pinterest based
on number of followers (as of January, 2013). The
table also shows number of followees for users, and
interests / profession as captured from the about
field.
We then plotted the ratio of the number of followers to
the number of followees for all users (except users
with 0 followees) on a log scale, as shown in Figure 4(a), and
found that more than 70% of users had more followees than
followers. The graph shows that only a very small fraction of
users had a heavily skewed ratio, and that most users on Pinterest
in our dataset had a comparable number of followers and
followees. Krishnamurthy et al. [23] found a similar relation
between followers and followees for Twitter users.
From the 328,570 users who had connected their Twitter
accounts with Pinterest, we extracted the number of Twitter
followers and followees for 93,659 users. We then plotted
the ratio of followers / followees for these users on both
Pinterest and Twitter, on a log scale, as shown in Figure 4(b).
As the plot suggests, the ratio of followers / followees on
Pinterest was only weakly correlated with the ratio of followers /
followees on Twitter (correlation = 0.32). Users who were
popular on Pinterest were not necessarily popular on Twitter
(and vice versa).
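The comparison of follower / followee ratios across the two networks can be sketched as follows. The text does not state which correlation coefficient was used or whether it was computed on raw or log-scaled ratios, so the Pearson coefficient on log10 ratios below is an assumption; the function names are hypothetical, and users with a zero count on either side are skipped because their log ratio is undefined.

```python
import math


def log_ratio(followers, followees):
    """log10(followers / followees), or None if either count is zero."""
    if followers <= 0 or followees <= 0:
        return None
    return math.log10(followers / followees)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def ratio_correlation(users):
    """users: iterable of (p_followers, p_followees, t_followers, t_followees)."""
    pinterest, twitter = [], []
    for pf, pe, tf, te in users:
        p, t = log_ratio(pf, pe), log_ratio(tf, te)
        if p is not None and t is not None:
            pinterest.append(p)
            twitter.append(t)
    return pearson(pinterest, twitter)
```

A value near 1 would mean that users popular on one network are popular on the other; a value such as the reported 0.32 indicates only a weak relationship.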
5.1.4 Gender distribution
We extracted gender information from the Facebook profiles
of over 1.85 million users who had linked their Pinterest
profiles with Facebook. Over 1.61 million users (87.15%) were
female, and only 130,945 users (7.04%) were male. The
rest (5.81%) did not have their gender information publicly
available. This gender distribution is quite similar to the one
observed by Ottoni et al. in their work on Pinterest [32].
5.2 Pin characterization
5.2.1 Pin description
To understand the most common type of pins on Pinter-
est, we extracted the textual content present in the “pin
description” fields from all the pins, and analyzed the most
frequently occurring terms. Figure 4(c) represents the tag
cloud of the top 100 terms present in pin description. Similar
to user descriptions, terms related to food and creative arts
dominated the pin description. Other than food, decoration
and wedding related pins were also found to be very com-
mon in pin description. For example, “Printable Snowflake
Wedding Invitations”, “Silk Bride Bouquet Peony Flowers
Pink Cream Lavender Shabby Chic Wedding Decor. $94.99,
via Etsy.”, “Wedding dresses and bridals gowns by David
Tutera for Mon Cheri for every bride at an affordable price
Wedding Dress Style”, “Vintage Wedding Decorating Ideas”.
5.2.2 Statistics and topical analysis
From our dataset of over 58 million pins, the average
number of pins per user was 444.86 (min = 0, max = 100,135).
The average number of repins per pin was found to be 0.72
(min = 0, max = 20,212). Almost 79% of pins in our dataset
were never repinned. The average number of likes per pin was
0.21 (min = 0, max = 5,640). Also, 90.32% of pins were not
“liked” by anyone. These low rates of repins and likes
show that only a limited set of pins become popular,
and that a majority of pins go unnoticed. In the case of
comments, the results are even more skewed than for repins and likes.
The average number of comments on a pin was 0.0065 (min
= 0, max = 3,345), and 99.53% of pins had no comments. This
suggests that the comment feature is rarely used on Pinterest.
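Summary figures of this kind (mean, minimum, maximum, and the share of pins with a count of zero) can be computed with a few lines. The function name and the returned dictionary keys are hypothetical, and the input is assumed to be a plain list of per-pin counts such as repins, likes, or comments.

```python
def summarize_counts(counts):
    """Mean, min, max, and fraction of zeros for a list of per-pin counts."""
    n = len(counts)
    if n == 0:
        raise ValueError("counts must not be empty")
    return {
        "mean": sum(counts) / n,
        "min": min(counts),
        "max": max(counts),
        # Share of pins that never received this interaction.
        "zero_fraction": sum(1 for c in counts if c == 0) / n,
    }


summary = summarize_counts([0, 0, 0, 4])
```

A low mean combined with a high zero fraction and a large maximum is the skewed pattern described above: a few pins attract most of the interactions.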
To gain insight into the content of these comments, we
randomly crawled 643,653 (1.1%) pins from our pin dataset,
and were able to extract 2,544 comments. We then applied
the Linguistic Inquiry and Word Count (LIWC) tool [34] to
these comments, pin descriptions (Section 5.2.1), and user
profile descriptions (Section 5.1.1). We found that a large
portion of the comments reflected positive emotion
(Figure 5). A similar pattern of positive emotion was observed
for user descriptions, as well as board names. In general, the
network was found to have a large fraction of social content,
suggesting active human interaction. The presence of sad
emotion, anger, anxiety, and swear words was found to be
minimal. Textual content depicting biological processes, work,
and leisure activities was also found in substantial quantity.
From this analysis, we conclude that a user usually leaves
a positive remark for a pin on Pinterest, and posts positive
textual content in general.
5.3 Source Analysis
Each pin has a source embedded in it. This source is the
original URL of the image from where it is “pinned”.⁵ However,
if the user has directly uploaded an image to Pinterest,
the source field is set as “pinterest.com”. Table 2 shows that
the single largest source of images on Pinterest is the users
themselves, i.e. images that are directly uploaded
and pinned by users. Out of all the pins in our dataset,
2,768,851 pins (4.7%) were uploaded by users. The second spot
was taken by Google, which included images from Google
Image Search and other Google products, followed by Etsy
at the third spot. Not surprisingly, free image sharing
platforms dominated the top 10 sources. Six of the top 10
sources on Pinterest were among the 1,000 most visited
websites in the world [1]. The high rank of Etsy, a commercial
website, shows that a reasonable amount of user traffic
on Pinterest comes from e-commerce websites, and suggests
that commercial activity is widespread on Pinterest.
⁵ Example of an image source: www.cookingchanneltv.com/recipes/spanish-tortilla-recipe/index.html
Figure 4: (a) Followers / followees for the users on Pinterest, on a log scale. (b) The follower / followee ratio
on Pinterest was only weakly correlated with the ratio on Twitter (correlation = 0.32). (c) Tag cloud of the top
100 terms in pin descriptions on Pinterest. Similar to user descriptions, pin descriptions were also dominated
by terms related to food and creative arts, and partially overlapped with terms present in user descriptions.
Figure 5: LIWC analysis of textual content on Pinterest.
The majority of the content comprised positive
sentiment words, or words indicating social interactions.
Source         Count      W.A.R.  Category
Pinterest.com  2,768,851  N/A     N/A
Google         1,293,749  1       Search engine
Etsy           1,157,815  164     Commercial
Flickr         625,686    70      Image sharing
Tumblr         486,984    31      Image sharing
Imgfave        376,179    9,462   Image sharing
Weheartit      306,443    970     Image sharing
Someecards     296,908    6,648   E-cards
Houzz          294,065    958     Home decor.
Marthastewart  292,128    2,439   Food / Art
Table 2: Top 10 image sources on Pinterest.
W.A.R. = Worldwide Alexa Rank. Apart from
free image sharing / social network platforms, top
sources include commercial platforms like Etsy.
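A ranking of pin sources like the one in Table 2 can be sketched by reducing each source URL to its host name and counting. This is an illustrative sketch with hypothetical function names: it strips only a leading "www." and does not merge related hosts, whereas the table groups several Google products under one entry, so the grouping rules used in the study are not reproduced here.

```python
from collections import Counter
from urllib.parse import urlparse


def source_domain(source_url):
    """Reduce a pin's source URL to a lower-case host without a leading 'www.'."""
    url = source_url if "://" in source_url else "http://" + source_url
    host = (urlparse(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def top_sources(source_urls, n=10):
    """Return the n most common source domains as (domain, count) pairs."""
    counts = Counter(source_domain(url) for url in source_urls)
    counts.pop("", None)  # drop URLs from which no host could be extracted
    return counts.most_common(n)


ranking = top_sources(
    ["http://etsy.com/a", "https://www.etsy.com/b", "flickr.com/c"], n=1
)
```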
5.4 Pinboard analysis
In addition to the above Pin analysis, we also analyzed
the names of Pinboards. The most common terms occurring
in board names were home, style, recipes, food, wedding,
crafts, etc. Pinterest also provides 33 predefined
categories to choose from during board creation. We analyzed
the popularity of all these categories based on 3 factors:
the number of boards in each category, the number of pins on
these boards, and the number of followers of these boards.
We saw that 69.37% of boards were created with
no standard category selected. Apart from these, the top
three categories for board creation were food drink (5.6%)
followed by diy crafts (2.3%), and hair beauty (2.4%).
Boards in the “travel” category had far more followers than
boards in any other category, and had the highest ratio
of followers per pin (23.69 followers per pin). The next most
popular categories in terms of followers per pin were education
(10.34 followers per pin), health fitness (5.37 followers per
pin), and home decor (4.71 followers per pin).
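As an illustration of the followers-per-pin ratio, the following sketch aggregates boards by category. It assumes the ratio is total followers divided by total pins within a category; the text does not say whether per-board ratios were averaged instead. The function name and the toy records are hypothetical.

```python
from collections import defaultdict


def followers_per_pin(boards):
    """Aggregate boards by category and return followers-per-pin ratios.

    boards: iterable of (category, n_pins, n_followers) tuples.
    Categories with zero pins are skipped to avoid division by zero.
    """
    pins = defaultdict(int)
    followers = defaultdict(int)
    for category, n_pins, n_followers in boards:
        pins[category] += n_pins
        followers[category] += n_followers
    return {c: followers[c] / pins[c] for c in pins if pins[c] > 0}


# Hypothetical toy data, not taken from the crawl.
toy_boards = [
    ("travel", 10, 200),
    ("travel", 10, 300),
    ("education", 20, 100),
    ("empty", 0, 5),
]
ratios = followers_per_pin(toy_boards)
ranked = sorted(ratios, key=ratios.get, reverse=True)
print(ranked, ratios)
```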
5.5 Location analysis
We investigated location information to find the Pinterest
population distribution across the world. From our dataset,
we collected 192,261 valid user locations, and performed a
lookup using the Yahoo PlaceFinder API. We inferred the top 10
countries in terms of number of users (Table 3) from the
API output. Similar to Facebook and Twitter [23], a majority
of Pinterest users came from the U.S.A., Canada,
U.K., Brazil, India, and Europe. We found very few users
from Africa, Russia, and China. Table 3 also lists Pinterest’s
regional traffic ranks taken from Alexa on 2nd June 2013.
These ranks show that Pinterest is among the most popular
sites in countries like the U.S.A., Canada, U.K., Australia,
Brazil, India, etc., which are also the top user locations in
our dataset. After analyzing the country-wise distribution, we
did a city level location analysis for these top 10 countries
(Table 3), and found that most Pinterest users belonged to
big metropolitan cities. More than half of the cities in the top
20 were from the U.S.A. Pinterest’s penetration was found
to be quite low in smaller cities.
As most Pinterest users in our dataset were female (Section
5.1.4), we analyzed gender distribution with respect to
location. We observed that approximately 88% of users from
the U.S.A. were female, and approximately 7% were male.
A similar trend was observed in the U.K., Australia, Europe,
and Brazil (Table 3). India was the only country in the top
Countries                                          Cities
Country           P.R.R.  Females (%)  Males (%)   City               Count   City               Count
1. U.S.A.         15      83.88        8.80        1. New York        5597    11. Dallas         1275
2. Canada         21      82.73        10.66       2. London          3424    12. Austin         1249
3. U.K.           38      72.79        18.47       3. Los Angeles     3194    13. San Diego      1213
4. Australia      23      80.59        11.05       4. Chicago         2593    14. Houston        1169
5. Brazil         73      73.94        18.47       5. Toronto         1752    15. Sydney         1157
6. Spain          54      66.83        24.56       6. San Francisco   1659    16. Paris          1078
7. Italy          142     62.91        27.04       7. Atlanta         1472    17. Melbourne      1034
8. France         183     70.36        22.53       8. Washington      1428    18. Portland       1010
9. India          20      45.30        46.64       9. Seattle         1332    19. Vancouver      959
10. Netherlands   29      75.88        16.52       10. Boston         1329    20. Philadelphia   851
Table 3: Top 10 countries, and top 20 cities in decreasing order of Pinterest population. Apart from India, all
other countries were dominated by female users. The penetration of Pinterest is maximum in big metropolitan
cities. P.R.R. = Pinterest Regional Rank.
10, where the number of male users (46.64%) was greater
than the number of female users (45.30%).
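A per-country gender breakdown of this kind can be sketched as follows. The female and male columns of Table 3 do not sum to 100%, which suggests that users without a known gender were kept in the denominator; that reading is an assumption on our part, and the function name and toy records below are hypothetical.

```python
from collections import Counter, defaultdict


def gender_share_by_country(users):
    """Return {country: {gender: percentage}} from (country, gender) pairs.

    A gender of None is counted as "unknown" and stays in the denominator
    (an assumption about how the percentages were computed).
    """
    counts = defaultdict(Counter)
    for country, gender in users:
        counts[country][gender or "unknown"] += 1
    result = {}
    for country, per_gender in counts.items():
        total = sum(per_gender.values())
        result[country] = {
            g: round(100.0 * n / total, 2) for g, n in per_gender.items()
        }
    return result


# Hypothetical toy records, not taken from the dataset.
toy_users = (
    [("India", "male")] * 3
    + [("India", "female")] * 2
    + [("India", None)]
    + [("U.S.A.", "female")] * 4
    + [("U.S.A.", "male")]
)
gender_shares = gender_share_by_country(toy_users)
print(gender_shares)
```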
5.6 Gender Prediction
As mentioned in Section 5.1.4, we collected gender information
from Facebook for over 1.85 million Pinterest users
(130,945 males, and 1.61 million females) who had connected
their Facebook accounts with their Pinterest profiles.
Treating this information as ground truth, we attempted to
predict the gender of Pinterest users on the basis of profile,
network, and content based features. For this experiment, we
limited our analysis to users from the USA only.6
5.6.1 Dataset and feature description
For gender prediction, our training dataset comprised
6,309 male users, and 60,047 female users from the USA.
To balance the class sizes for machine learning, and to
achieve better confidence, we drew six random samples of
6,309 female users each from the 60,047 total female users,
and calculated the average accuracy over
all of them. A similar technique was used by Benevenuto
et al. while classifying spam on Twitter using unbalanced
training data samples [3]. We used a total of 9 features for
classification, as listed below:
1. Number of followers: The number of users who fol-
low a given user.
2. Number of followees: The number of users whom the
given user follows.
3. Number of pins: The number of pins pinned by the
given user.
4. Number of boards: The number of boards created
by the given user.
5. Content from “about” field: We extracted the top
1000 most frequently occurring terms in the “about”
field of male and female users’ profiles, and normalized
these frequencies with the number of users in their
respective categories (male / female). Each data instance
was then assigned a male-female ratio score as follows:

\[
\text{About Ratio}_{m/f} = \frac{\sum_{i=1}^{n} W_i \times \mathit{NMW}_i}{\sum_{i=1}^{n} W_i \times \mathit{NFW}_i}
\]

where
$W_i$ = words in the about field
$\mathit{NMW}_i$ = normalized frequency for word $W_i$ for males
$\mathit{NFW}_i$ = normalized frequency for word $W_i$ for females

6 We picked USA, since it had the largest proportion of users
in terms of country-wise user distribution. See Table 3.
6. Board names: Similar to the previous feature, we
extracted the top 1000 most frequently occurring board
names from male and female users’ profiles separately,
and normalized these frequencies with the total number
of users in their respective categories. Each data
instance was then assigned a male-female ratio score
as follows:

\[
\text{Board Desc}_{m/f} = \frac{\sum_{i=1}^{n} P_i \times \mathit{NMP}_i}{\sum_{i=1}^{n} P_i \times \mathit{NFP}_i}
\]

where
$P_i$ = individual pinboard in user's set of pinboards
$\mathit{NMP}_i$ = normalized frequency for pinboard $P_i$ for males
$\mathit{NFP}_i$ = normalized frequency for pinboard $P_i$ for females
7. Presence of a linked Twitter account with pro-
file: True, if the Pinterest user has connected his / her
Twitter account; false otherwise.
8. Presence of personal website: True, if the Pinter-
est user has mentioned a website in their website field;
false otherwise.
9. Name: We obtained a list of the most common male and
female first names in the US population during the
1990 census,7 and assigned a ternary integer score to
each data instance according to whether the user’s name
was present in the list of males, females, or both / none.
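The two ratio features (about-field terms and board names) share the same form, so one helper can illustrate both. This is a minimal sketch under stated assumptions: W_i (or P_i) is read as the number of times token i appears for the user, the top-1000 lists are built with a simple counter, and a small smoothing constant is added to avoid division by zero. None of these details are spelled out in the text, and the function names and toy data are hypothetical.

```python
from collections import Counter


def normalized_frequencies(token_lists, top_k=1000):
    """Top-k token counts over all users, divided by the number of users."""
    counts = Counter(tok for tokens in token_lists for tok in tokens)
    n_users = len(token_lists)
    return {tok: c / n_users for tok, c in counts.most_common(top_k)}


def ratio_score(tokens, male_freq, female_freq, smoothing=1e-6):
    """Male/female ratio score for one user's tokens.

    Each token is weighted by how often it occurs for this user (our
    reading of W_i). `smoothing` is an added assumption that keeps the
    score finite when no token appears in the female list.
    """
    occurrences = Counter(tokens)
    male = sum(n * male_freq.get(t, 0.0) for t, n in occurrences.items())
    female = sum(n * female_freq.get(t, 0.0) for t, n in occurrences.items())
    return (male + smoothing) / (female + smoothing)


# Hypothetical toy "about" fields, already tokenized.
male_abouts = [["tech", "photography"], ["tech", "travel"]]
female_abouts = [["crafts", "recipes"], ["recipes", "travel"]]
male_freq = normalized_frequencies(male_abouts)
female_freq = normalized_frequencies(female_abouts)

print(ratio_score(["tech", "travel"], male_freq, female_freq))
print(ratio_score(["recipes", "crafts"], male_freq, female_freq))
```

Scores above 1 indicate vocabulary that is more typical of male users in the training data, and scores below 1 indicate vocabulary more typical of female users. For the board-name feature, the token list would hold whole board names rather than single words.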
The Name feature is independent of the Pinterest network.
We wanted to compare the performance of Pinterest-specific
features for predicting gender against that of the name
alone, which is a completely independent feature in
itself. We performed all classification tasks using WEKA [14].
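The ternary name score can be sketched as below. The integer encoding (1, -1, 0) and the use of the first whitespace-separated token as the first name are assumptions, since the text only says that the score is ternary; the name sets shown are tiny hypothetical stand-ins for the census-derived lists.

```python
def name_score(full_name, male_names, female_names):
    """Ternary name feature.

    Returns 1 if the first name is only in the male list, -1 if it is only
    in the female list, and 0 if it is in both lists or in neither.
    The specific integer values are an assumption.
    """
    parts = full_name.strip().lower().split()
    if not parts:
        return 0
    first = parts[0]
    in_male = first in male_names
    in_female = first in female_names
    if in_male == in_female:  # present in both lists, or in neither
        return 0
    return 1 if in_male else -1


# Hypothetical tiny name lists for illustration only.
MALE_NAMES = {"james", "john", "jordan"}
FEMALE_NAMES = {"mary", "linda", "jordan"}

for name in ["John Smith", "Mary Jones", "Jordan Lee", "Xyz"]:
    print(name, name_score(name, MALE_NAMES, FEMALE_NAMES))
```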
7 http://names.mongabay.com/
5.6.2 Classification results
First, we attempted to predict gender using a feature set
F8 of only the first 8 features, i.e. features extracted from
Pinterest. We applied 3 classifiers on our dataset of 12,618
users, and achieved a maximum average accuracy of 73.17%
with 10-fold cross validation using the J48 Decision Tree
classifier. To enhance the prediction accuracy, we added
the Name feature Fname to our feature set. Note that this
feature relies completely on the name of the user, and is
independent of Pinterest. With this feature added, the
J48 Decision Tree classifier achieved a better accuracy
of 86.18%. However, using only the
Fname feature for classification, we still achieved an accuracy
of 83.64%. Table 4 summarizes the results.
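The balancing procedure (all male users plus six equally sized random draws of female users, with accuracy averaged over the draws) can be sketched in plain Python. The evaluate callable stands in for the 10-fold cross-validated classifier run in WEKA; here it is replaced by a trivial threshold rule on a single made-up feature, so the numbers it prints carry no meaning beyond showing the mechanics. All names and data are hypothetical.

```python
import random
import statistics


def balanced_samples(minority, majority, n_samples=6, seed=0):
    """Pair the full minority class with n_samples random majority subsets."""
    rng = random.Random(seed)
    k = len(minority)
    return [minority + rng.sample(majority, k) for _ in range(n_samples)]


def average_accuracy(samples, evaluate):
    """Mean and standard deviation of evaluate(sample) over all samples."""
    accuracies = [evaluate(sample) for sample in samples]
    return statistics.mean(accuracies), statistics.stdev(accuracies)


def threshold_accuracy(sample, threshold=0.5):
    """Placeholder evaluator: predict "M" when the feature exceeds threshold."""
    correct = sum((x > threshold) == (label == "M") for label, x in sample)
    return correct / len(sample)


# Hypothetical toy data: (label, single numeric feature).
males = [("M", 0.9), ("M", 0.8), ("M", 0.4)]
females = [("F", 0.1 * i) for i in range(10)]

samples = balanced_samples(males, females, n_samples=6, seed=1)
mean_acc, sd_acc = average_accuracy(samples, threshold_accuracy)
print(f"accuracy = {mean_acc:.3f} (sd {sd_acc:.3f})")
```

Reporting the standard deviation alongside the mean, as Table 4 does, shows how sensitive the result is to which female users happen to be drawn.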
Classifier   Feature set   Accuracy (σ)     F-Measure (σ)
NB           F8+Fname      62.96% (5.85)    0.586 (0.100)
NB           F8            60.88% (5.41)    0.554 (0.096)
NB           Fname         56.71% (0.18)    0.533 (0.001)
J48 DT       F8+Fname      86.18% (0.29)    0.861 (0.003)
J48 DT       F8            73.17% (0.39)    0.732 (0.004)
J48 DT       Fname         83.64% (0.27)    0.834 (0.003)
RF           F8+Fname      85.26% (0.31)    0.853 (0.003)
RF           F8            71.38% (0.41)    0.713 (0.004)
RF           Fname         83.64% (0.27)    0.834 (0.003)
Table 4: Classification results for Naive Bayesian (NB),
J48 Decision Tree (J48 DT), and Random Forest (RF)
classifiers. The accuracy and weighted average F-measure
scores are averaged over a labeled dataset of 6,309
male users, and 6 random samples of 6,309 instances
each, from 60,047 female users.
Among the six random samples of training data we picked
for female users, the J48 Decision Tree classifier performed
the best on every individual sample. Across all samples, we
achieved a maximum accuracy of 73.51% using F8 (Pinterest-specific
features), 86.53% using F8+Fname (all 9 features), and
83.99% using only Fname. Table 5 presents the confusion
matrix for these results. As expected,
Fname was the most informative feature, followed by board
names, number of pins, presence of personal website, number
of boards, content from about field, and presence of
linked Twitter account. The numbers of followers and
followees were the least informative features. Rao et al. [36]
achieved a similar score of 72.33% while predicting gender
of Twitter users, with the help of a rich feature-set using an
SVM classifier. Since the ratio of male to female users on
Pinterest is highly skewed [32] as opposed to Twitter (which
is fairly balanced8), the size of our training dataset was
limited. We believe that with this limited training data, our
classification accuracy is reasonable.
8 http://www.beevolve.com/twitter-statistics/#a1
Feature set   Cls   TP      FP      Precision   Recall   F-Meas
F8+Fname      F     0.842   0.112   0.883       0.842    0.862
F8+Fname      M     0.888   0.158   0.849       0.888    0.868
F8            F     0.76    0.29    0.724       0.76     0.742
F8            M     0.71    0.24    0.748       0.71     0.728
Fname         F     0.724   0.044   0.943       0.724    0.819
Fname         M     0.956   0.276   0.776       0.956    0.857
Table 5: Confusion matrix representing true positive,
false positive, precision, recall, and F-measure
scores for J48 Decision Tree classifier in the best
case. #F = Number of features; Cls = Class (Male
/ Female); TP = True Positive score; FP = False
Positive score.
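Because each evaluation sample contains 6,309 users of each class, the precision, recall, and F-measure columns of Table 5 follow from the TP and FP rates alone. The sketch below illustrates that relationship; the function name is ours, and equal class sizes are the only assumption.

```python
def metrics_from_rates(tp_rate, fp_rate):
    """Precision, recall and F-measure for one class with equal class sizes.

    With N instances per class, true positives = tp_rate * N and false
    positives = fp_rate * N, so N cancels out of the precision.
    """
    precision = tp_rate / (tp_rate + fp_rate)
    recall = tp_rate
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# (TP rate, FP rate) pairs copied from Table 5.
table5_rates = {
    ("F8+Fname", "F"): (0.842, 0.112),
    ("F8+Fname", "M"): (0.888, 0.158),
    ("F8", "F"): (0.76, 0.29),
    ("F8", "M"): (0.71, 0.24),
    ("Fname", "F"): (0.724, 0.044),
    ("Fname", "M"): (0.956, 0.276),
}

for (feature_set, cls), (tp, fp) in table5_rates.items():
    p, r, f = metrics_from_rates(tp, fp)
    print(f"{feature_set:<9} {cls}  P={p:.3f}  R={r:.3f}  F={f:.3f}")
```

The recomputed values agree with the table to within about 0.002, the difference being attributable to the rounding of the published rates.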
These results imply that even though gender is not a pub-
licly accessible attribute on Pinterest, it is not difficult to
predict gender using a small number of other publicly avail-
able attributes.
5.7 Privacy and security issues
To study privacy implications of the public nature of Pin-
terest, we attempted to extract email addresses and phone
numbers from the publicly available user description field
from users’ profiles. We found that a total of 9,926 users in
our dataset shared their email addresses publicly. We then
searched for phone numbers, which are widely considered to
be PII [25], and found a total of 1,046 phone numbers and
/ or BBM pins from the users’ profile description field. Re-
search shows that it is possible for third-parties to link PII,
which is leaked via OSNs, with user actions both within
OSN sites and elsewhere resulting in privacy leakage [24].
A recent study also investigated the risks of sharing phone
numbers publicly on Facebook, and Twitter, and highlighted
the extent to which these phone numbers could be exploited
to gather much more private information about a user [18].
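Extraction of this kind is typically done with regular expressions. The patterns below are a rough sketch and not the ones used in the study, which are not given in the text; they are deliberately loose, do not handle BBM pins, and would need manual validation of matches before any counts could be trusted.

```python
import re

# Loose e-mail pattern: local part, "@", domain, dot, alphabetic suffix.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

# Loose phone pattern: optional "+", then 8 to 15 digits that may be
# separated by single spaces, dots, or dashes.
PHONE_RE = re.compile(r"\+?\d(?:[\s.\-]?\d){7,14}")


def extract_contacts(description):
    """Return e-mail addresses and phone-like strings found in a description."""
    return {
        "emails": EMAIL_RE.findall(description),
        "phones": PHONE_RE.findall(description),
    }


# Hypothetical profile description.
sample = "Crafts lover. Mail me: jane.doe@example.com or call +1 555-010-9999"
print(extract_contacts(sample))
```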
While various brands are using Pinterest for legitimate
commercial purposes by promoting their work through pinboards,
Pinterest has also attracted spammers and malicious
users. With the growth in the number of users, there has
been a simultaneous growth in the number of spammers on
Pinterest.9 Numerous online scams have been reported in
recent times [7, 9, 20, 35], and Pinterest has taken measures
to solve this problem.10 To get a better understanding
of the presence of spam and malware on Pinterest, we used
Google’s Safe Browsing API11 to check for malicious source
URLs on the network. We analyzed the source URLs of a
random sample of 5.5 million pins from our pin dataset and
found 1,322 (0.024%) unique malware pins. Despite numerous
reported incidents of spam and malware, such a low number
suggests that the techniques deployed by Pinterest to
avert malware are indeed effective. Since we collected these
pins in January 2013, we wanted to check if the captured
malware continued to exist on Pinterest. We crawled
these 1,322 pins again in May 2013 and observed that 33
of these pins no longer existed. It is hard to tell whether the
users themselves deleted these pins, or Pinterest removed them.
We re-checked the source URLs of the 1,322 malware pins
in May 2013, and found that 223 source URLs no longer
existed. Corresponding to the 1,322 malware pins, we identified
1,171 unique users from our dataset. Re-crawling these user
accounts in May 2013 revealed that 100 out of these 1,171
user accounts did not exist. This suggests that other than
removing malicious content, Pinterest also takes measures to
remove malicious user profiles.
9 http://mashable.com/2012/12/06/pinterest-spam-accounts/
10 http://blog.pinterest.com/post/37347668045/fighting-spam
11 https://developers.google.com/safe-browsing/
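The proportions behind these counts are easy to reproduce. The snippet below uses only the figures quoted in this section; the variable and function names are our own.

```python
def pct(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


sampled_pins = 5_500_000   # random sample checked against Safe Browsing
malware_pins = 1_322       # unique pins flagged as malware
pins_gone = 33             # flagged pins missing in the May 2013 re-crawl
urls_gone = 223            # source URLs that no longer existed
malware_users = 1_171      # unique users behind the flagged pins
users_gone = 100           # of those accounts, missing in May 2013

print(f"malware pins in sample:   {pct(malware_pins, sampled_pins):.3f}%")
print(f"flagged pins removed:     {pct(pins_gone, malware_pins):.1f}%")
print(f"flagged source URLs gone: {pct(urls_gone, malware_pins):.1f}%")
print(f"flagged accounts gone:    {pct(users_gone, malware_users):.1f}%")
```

Roughly 2.5% of the flagged pins, 16.9% of their source URLs, and 8.5% of the associated accounts had disappeared by the second crawl, so the large majority of the flagged content was still present.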
6. DISCUSSION
In this work, we characterized the Pinterest social network,
and tried to predict its users’ gender using profile,
content, and network based features. We collected 17,964,574
unique user handles, 3,323,054 complete user profiles, 777,748
boards with their corresponding details, and 58,896,156 unique
pins with their related information, using Snowball sampling
[13]. Our analysis was based on a partial subgraph
of the Pinterest network, and suggests that Pinterest is a
social network dominated by “fancy” topics like fashion,
design, food, travel, love etc. across users, boards, and pins.
A large part of the network was found to have a comparable
number of followers and followees. Only a small fraction
of people had a large number of followers as compared
to followees, and vice-versa. The largest contributors of
content (images) on Pinterest were the users themselves, who
uploaded 2,768,851 pins (4.7%) as original content; the
remaining content (95.3%) was pinned from pre-existing web
sources. Google Images and Etsy followed as the next most
popular sources from which images are pinned onto Pinterest.
USA, Canada, and UK contributed the maximum
proportion of users, together accounting for over 73% of the
total Pinterest population.
We then focused our analysis on predicting the gender of
Pinterest users based on their profile, content, and network
features. Our labeled dataset consisted of a total of 12,618 user
profiles from the USA, split equally between male and female
users, and we were able to achieve an accuracy of 73.17% using
only Pinterest-specific features. Addition of the “name” feature
increased the accuracy to 86.18%. Using only the name feature, we
were able to achieve an accuracy of 83.64%, which shows that adding
Pinterest-specific features helps very little in predicting the
gender of a USA Pinterest user.
Finally, we did some preliminary analysis to explore the
privacy and security implications associated with Pinterest,
and found multiple instances of publicly available PII leakage
due to the all-public nature of Pinterest. We also found
malware on the network, and discovered that most of this
malware continued to exist for at least the 4 months between our
two crawls of the network. Given that Pinterest is fairly new
in the social media fraternity, we suspect that the amount
of malware would only grow in the near future.
We picked the initial seeds for our data collection process
as the top 5 most followed users on Pinterest. We under-
stand that this technique suffers from bias, and the sample
taken is not completely random. We crawled only partial
sub-graphs for all the 5 seed users. Similarly, on the next
level of our BFS crawl, we crawled not more than 48 fol-
lowers for each user. Since there is not much prior work
on Pinterest, we do not have enough academic literature to
claim that our dataset is representative of the whole Pin-
terest population. However, the previous work by Ottoni
et al. [32], Gilbert et al. [12], and a report from Engauge,
a digital marketing agency [11], show similar gender distri-
butions for users, and similar topic distributions for boards
and pins as our dataset.
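The crawl strategy described here, a breadth-first traversal from a few seed users with a cap on how many followers are expanded per user, can be sketched as follows. The in-memory graph, the followers_of callable, and the max_depth parameter are hypothetical stand-ins for the actual crawler, and the sketch applies the cap at every level, which may differ from the original setup.

```python
from collections import deque


def capped_bfs(followers_of, seeds, max_followers=48, max_depth=2):
    """Breadth-first crawl that expands at most max_followers per user.

    followers_of: callable mapping a user handle to a list of follower
                  handles (a stand-in for fetching a follower list).
    Returns the set of visited handles, including the seeds.
    """
    visited = set(seeds)
    queue = deque((seed, 0) for seed in seeds)
    while queue:
        user, depth = queue.popleft()
        if depth >= max_depth:
            continue  # do not expand users at the deepest level
        for follower in followers_of(user)[:max_followers]:
            if follower not in visited:
                visited.add(follower)
                queue.append((follower, depth + 1))
    return visited


# Hypothetical toy follower graph.
toy_graph = {"a": ["b", "c", "d"], "b": ["e"], "c": [], "d": ["f"], "e": ["g"]}
crawled = capped_bfs(lambda u: toy_graph.get(u, []), ["a"],
                     max_followers=2, max_depth=2)
print(sorted(crawled))
```

In the toy run, user "d" is never reached because the cap cuts the seed's follower list after two entries, which illustrates why a capped crawl yields only a partial subgraph.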
In future, we would like to perform a deeper analysis for
gender prediction on Pinterest. Our current feature set of
9 features can be expanded to accommodate features based
on natural language, content, profile attributes, and net-
work features. Users’ about field, and comments can be
utilized for this purpose. More network based features like
betweenness, and closeness centrality can also be explored.
We would also like to generalize gender prediction over the
entire geographic Pinterest population, rather than limiting
it to users from USA only.
To the best of our knowledge, this is one of the first
attempts to characterize Pinterest, and study its various
components in depth, on such a large scale. For this analysis,
we used profile information for only about 3.3 million users
from over 17 million unique user handles that we had in our
dataset. Our data collection process is still active, and we
would like to redo our analysis on the largest connected
component (LCC) of the complete Pinterest network. We would
also like to perform a more detailed analysis of image-spam,
and copyright violations on this network. Given that Pinterest
has been the fastest growing social network in recent
times, it would be interesting to see if malicious users are
targeting Pinterest for spiteful purposes.
7. REFERENCES
[1] Alexa Internet Inc. Alexa: The web information
company. http://www.alexa.com/, 2013.
[2] K. K. Ana-Maria Popescu and J. Caverlee. Mining top
users in pinterest categories. UMAP, 2013.
[3] F. Benevenuto, G. Magno, T. Rodrigues, and
V. Almeida. Detecting spammers on twitter. In CEAS,
volume 6, 2010.
[4] J. D. Burger, J. Henderson, G. Kim, and G. Zarrella.
Discriminating gender on twitter. In EMNLP, pages
1301–1309. Association for Computational Linguistics,
2011.
[5] C. Carpenter. Copyright infringement and the second
generation of social media websites: Why pinterest
users should be protected from copyright infringement
by the fair use defense. Available at SSRN 2131483,
2012.
[6] S. Chang, V. Kumar, E. Gilbert, and L. Terveen.
Specialization, homophily, and gender in a social
curation site: Findings from pinterest. 2014.
[7] G. Cluley. Pinterest spam promotes acai berry diet.
http://nakedsecurity.sophos.com/2012/04/02/pinterest-spam-acai-berry-diet/, 2012.
[8] J. Constine. Pinterest hits 10 million u.s. monthly
uniques faster than any standalone site ever - comscore.
http://techcrunch.com/2012/02/07/pinterest-monthly-uniques/, 2012.
[9] Consumer Threat Alerts. New wave of social scams
target pinterest users, mcafee warns.
http://blogs.mcafee.com/consumer/consumer-threat-notices/new-wave-of-social-scams-target-pinterest-users-mcafee-warns, 2012.
[10] C. Dudenhoffer. Pin it! pinterest as a library
marketing and information literacy tool. College &
Research Libraries News, 73(6):328–332, 2012.
[11] Engauge Insights. Pinterest: A review of social
media’s newest sweetheart.
http://www.engauge.com/assets/pdf/Engauge-Pinterest.pdf, 2012.
[12] E. Gilbert, S. Bakhshi, S. Chang, and L. Terveen. I
need to try this?: A statistical overview of pinterest.
In CHI, pages 2427–2436. ACM, 2013.
[13] L. A. Goodman. Snowball sampling. The annals of
mathematical statistics, 32(1):148–170, 1961.
[14] M. Hall, E. Frank, G. Holmes, B. Pfahringer,
P. Reutemann, and I. H. Witten. The weka data
mining software: an update. ACM SIGKDD
Explorations Newsletter, 11(1):10–18, 2009.
[15] J. He, W. W. Chu, and Z. V. Liu. Inferring privacy
information from social networks. In Intelligence and
Security Informatics, pages 154–165. Springer, 2006.
[16] R. Heatherly, M. Kantarcioglu, and
B. Thuraisingham. Preventing private information
inference attacks on social networks. 2009.
[17] Instagram Press Center.
http://instagram.com/press/, 2013.
[18] P. Jain, P. Jain, and P. Kumaraguru. Call me maybe:
Understanding nature and risks of sharing mobile
numbers on online social networks. ACM COSN, 2013.
[19] P. Jain, P. Kumaraguru, and A. Joshi. @ i seek ‘fb.
me’: identifying users across multiple online social
networks. In WoLE, pages 1259–1268. IW3C2, 2013.
[20] I. Jelea. Scammers blur lines between pinterest and
facebook.
http://www.hotforsecurity.com/blog/scammers-blur-lines-between-pinterest-and-facebook-1283.html, 2012.
[21] K. Kamath, A.-M. Popescu, and J. Caverlee. Board
recommendation in pinterest. UMAP, 2013.
[22] K. Y. Kamath, A.-M. Popescu, and J. Caverlee. Board
coherence in pinterest: non-visual aspects of a visual
site. In WWW, pages 49–50. IW3C2, 2013.
[23] B. Krishnamurthy, P. Gill, and M. Arlitt. A few chirps
about twitter. In WOSN, pages 19–24. ACM, 2008.
[24] B. Krishnamurthy and C. E. Wills. On the leakage of
personally identifiable information via online social
networks. In WOSN, pages 7–12. ACM, 2009.
[25] P. Kumaraguru and N. Sachdeva. Privacy in india:
Attitudes and awareness v 2.0. Available at SSRN
2188749, 2012.
[26] J. Lindamood, R. Heatherly, M. Kantarcioglu, and
B. Thuraisingham. Inferring private information using
social network data. In WWW, pages 1145–1146.
ACM, 2009.
[27] I. Lunden. There are now over 1 billion users of social
media worldwide, most on mobile.
http://techcrunch.com/2012/05/14/itu-there-are-now-over-1-billion-users-of-social-media-worldwide-most-on-mobile/, 2012.
[28] G. Magno, G. Comarela, D. Saez-Trumper, M. Cha,
and V. Almeida. New kid on the block: Exploring the
google+ social graph. In IMC, pages 159–170. ACM,
2012.
[29] H. McCracken. Pinterest - the 50 best websites of 2011
- time.
http:// www.time.com/ time/ specials/ packages/
article/ 0,28804, 2087815 2088159 2088155,00.html ,
2011.
[30] A. Mislove, M. Marcon, K. P. Gummadi, P. Druschel,
and B. Bhattacharjee. Measurement and analysis of
online social networks. In IMC, pages 29–42. ACM,
2007.
[31] Nielsen. Social media report 2012: Social media comes
of age.
http://www.nielsen.com/us/en/newswire/2012/social-media-report-2012-social-media-comes-of-age.html,
2012.
[32] R. Ottoni, J. P. Pesce, D. Las Casas,
G. Franciscani Jr, W. Meira Jr, P. Kumaraguru, and
V. Almeida. Ladies first: Analyzing gender roles and
behaviors in pinterest. ICWSM, 2013.
[33] M. Pennacchiotti and A.-M. Popescu. A machine
learning approach to twitter user classification. In
ICWSM, 2011.
[34] J. W. Pennebaker, C. K. Chung, M. Ireland,
A. Gonzales, and R. J. Booth. The development and
psychometric properties of liwc2007. Austin, TX,
LIWC. Net, 2007.
[35] A. Pichel. Survey scams find their way into pinterest.
http://blog.trendmicro.com/trendlabs-security-intelligence/survey-scams-find-their-way-into-pinterest/, 2012.
[36] D. Rao, D. Yarowsky, A. Shreevats, and M. Gupta.
Classifying latent user attributes in twitter. In SMUC,
pages 37–44. ACM, 2010.
[37] Reuters. Start-up pinterest wins new funding, $2.5
billion valuation.
http://www.reuters.com/article/2013/02/21/net-us-funding-pinterest-idUSBRE91K01R20130221, 2013.
[38] C. Smith. By the numbers: 16 amazing twitter stats.
Digital Marketing Ramblings.
http://expandedramblings.com/index.php/march-2013-by-the-numbers-a-few-amazing-twitter-stats/,
May, 2013.
[39] C. Tang, K. Ross, N. Saxena, and R. Chen. What’s in
a name: A study of names, gender inference, and
gender behavior in facebook. In DASFAA, pages
344–356. Springer, 2011.
[40] J. Ugander, B. Karrer, L. Backstrom, and C. Marlow.
The anatomy of the facebook social graph. arXiv
preprint arXiv:1111.4503, 2011.
[41] W. Xu, X. Zhou, and L. Li. Inferring privacy
information via social relations. In ICDEW, pages
525–530. IEEE, 2008.
[42] M. Zarro and C. Hall. Pinterest: Social collecting for
#linking #using #sharing. In JCDL, 2012, pages
417–418.
[43] M. Zarro, C. Hall, and A. Forte. Wedding dresses and
wanted criminals: Pinterest. com as an infrastructure
for repository building. In ICWSM, 2013.
[44] E. Zheleva and L. Getoor. To join or not to join: the
illusion of privacy in social networks with mixed
public and private user profiles. In WWW, pages
531–540. ACM, 2009.
[45] J. Zwelling. Pinterest drives more revenue per click
than twitter or facebook.
http://venturebeat.com/2012/04/09/pinterest-drives-more-revenue-per-click-than-twitter-or-facebook/.
... The final participatory steps are platform-specific: (a) having an account; (b) having saved (called pinning on the platform) something on the site; (c) having saved something from the web to Pinterest. These two forms of contribution are the most common on the platform and are clearly distinct from one another (Mittal et al., 2014). The interviews covered all modes of contribution possible on Pinterest, so that reasons and experiences related to their participation or lack thereof could become clear. ...
... Analyses of Pinterest accounts have found significant gender disparities on the site, with women making up the vast majority of users (Gilbert et al., 2013;Mittal et al., 2014;Ottoni et al., 2013). However, while the majority of studies found a gender gap on the website, the gender distribution varied greatly across them, presumably as a result of varying data collection methods. ...
... However, while the majority of studies found a gender gap on the website, the gender distribution varied greatly across them, presumably as a result of varying data collection methods. Some studies used web crawlers that started with popular pins or user profiles (Gilbert et al., 2013;Mittal et al., 2014;Ottoni et al., 2013), which may skew the gender distribution (cf. Hargittai, 2020). ...
Article
Full-text available
Digital inequality scholarship has highlighted the importance of sociodemographic factors and internet experiences in how people use digital media in their lives. Some of this research has focused specifically on the adoption and use of social media, but much of this work has only investigated text-based platforms. Image-based sites such as Pinterest have largely been ignored in work about online participation inequalities. It remains unclear how existing findings about participation inequalities on text-based social media translate to image-based platforms. The present paper fills this gap by exploring differences in user engagement on Pinterest, one of the most popular social media platforms. The paper uses a mixed methods approach and analyzes both survey and interview data. This approach allows for a deeper understanding of the pipeline of online participation inequalities, a digital inequalities framework introduced by Shaw and Hargittai (2018). The survey data reveal that age, gender and internet skills strongly relate to participation on the platform. The interviews add more nuance by providing insights into reasons and motivations for Pinterest use as well as reasons for dropping out of the pipeline, beyond those identified in the survey. This mixed-methods approach allows insights into how participation barriers apply to image-based social media.
... 1. The impact of gender on the use of Pinterest (Chang et al., 2014;Gilbert et al., 2013;Ottoni et al., 2013;Alperstein, 2015;Han et al., 2014;Phillips et al. 2014) 2. Pinterest as a social curation and information literacy tool (Linder et al., 2014;Hall & Zarro, 2012;Dudenhoffer, 2012;Robertson, 2012;Carpenter et al., 2018;Kaminski, 2018) 3. User motives to use Pinterest (Wang et al., 2016;Han et al., 2014;Mittal et al., 2014;Miller et al.,2015;Sashittal & Jassawalla, 2015;Mull & Lee, 2014;Schmidt & Evans, 2018) The main focus of these studies is laid on social, economic, socio-economic or technical as well as psychological aspects relating to Pinterest (Lewallen & Behm-Morawitz, 2016). Furthermore, most seem to view Pinterest only as a kind of social media platform, but no one has considered it as an information retrieval (IR) system. ...
... Pinterest has described itself as a search engine for a longer period of time (Xu, 2014). However, many users and scientists saw it differently and assigned Pinterest to different types of information services, including social media (Mittal et al., 2014;Mull & Lee, 2014;Miller et al. 2015;Büscher, 2018;Carpenter, 2012). One reason for these different classifications could be the variety of possible uses of Pinterest, which makes it difficult to define it as one specific system type. ...
... Pinterest is considered by many scientists to be a Social Networking Service (SNS) (Mittal et al., 2014;Mull & Lee, 2014;Miller et al. 2015;Büscher, 2018;Carpenter, 2012). Social media or SNSs are web and app services on which users can create and share content, as well as communicate and network with other users. ...
Conference Paper
Full-text available
Information services play an important role in a knowledge-based society. Due to the constant change in technology and its possibilities, more and more new information services are emerging with different focuses and contents. One of the most popular top 10 social media is Pinterest. With a steadily increasing course of success, Pinterest has received a lot of attention in recent years. In the scientific research, many studies label Pinterest differently, for example, it is known as a curation platform, bookmarking service, search engine and social media. Pinterest introduces itself as a search engine for inspiration. The definition of Pinterest is, therefore, rather fuzzy and it is unclear on which application of the service the focus is being set (?). The following study focused on the aspects of the perceived and objective quality as well as the classification of the information service. For this purpose an online survey with 365 participants (to examine the perceived service quality), a literature research (to compare different platform specifics and definitions with the ones of Pinterest) and sample analyses (to measure the objective quality of the service) were conducted. It has been found that to some extend both, the perceived and the objective quality of the information service, are very good. It was also found that Pinterest is a kind of visual hybrid media and is, therefore, a unicorn among social media.
... Although school psychologists can access helpful information in multiple ways including online continuing education, peer-reviewed journal articles, conferences, and workshop attendance, school psychologists are also turning to the Internet to find ways to support their students (Mittal, Gupta, Dewan, & Kumaraguru, 2014;Pham, 2014;Seaman & Tinti-Kane, 2013). Specifically, one avenue professionals may use is the use of extant social networking websites such as Pinterest. ...
... Specifically, one avenue professionals may use is the use of extant social networking websites such as Pinterest. With the rise in social media use as well as the growing presence of educational professionals on social networking sites (Mittal et al., 2014), understanding the content type and quality of material being shared online concerning internalizing disorders is vital as little research has been conducted on what content is actually being shared on these sites and if those resources stem from quality sources. Moreover, school psychologists are not well informed about evidence-based interventions, reporting less than one third of respondents in the study by McKevitt (2012) indicated that they used journal articles to learn about interventions. ...
... Research on online information has focused primarily on resources pertaining to autism, parent information, and platforms other than Pinterest, and a dearth of research exists on online resource utilization for educational professionals with the exception of instructional use of social media (Mittal et al., 2014;Seaman & Tinti-Kane, 2013) and ethics of social media use (Segool, Goforth, Bowman, & Pham, 2016). Currently, limited research regarding internalizing disorders is reflected in the content on Pinterest. ...
Article
Many efforts have been made to understand social media and the resources existing online. However, prior studies have not thoroughly assessed specific platforms and the content being shared. The present study examined Pinterest content sharing as a proxy for interest among school personnel. Using Hall, Breeden, and Giacobe’s coding scheme, 657 pins from 499 randomly selected pinners following the National Association of School Psychologists’ Pinterest account were coded by content area and assessed for level of evidence base. Significant associations were found in chi-square analyses between category of internalizing disorders, evidence base, and types of pins shared. In addition, the category of internalizing disorder and level of evidence base were found to have a significant interaction with the ease of implementation. Assessing the content shared on Pinterest may inform future evidence-based implementation difficulties in schools.
... Gonzalez et al. (2013) also discovered that Google+ was utilized by internet users for propagating messages, like Twitter. Similar to Google+, Pinterest, which was launched later in 2010, is a social media platform for sharing images of interest to users (Mittal, Gupta, Dewan, & Kumaraguru, 2014). According to Mittal et al. (2014), Pinterest is best used for promoting commercial activities online. ...
Article
Significant evidence has supported a positive association between consumption of sugar-sweetened beverage and increased risks of the non-communicable diseases of obesity, metabolic disorders, dental caries, and dental erosion. Thus, using social marketing concepts to change people’s attitude and behavior towards the consumption is imperative. Social media is considered as a cheap and quick tool to disseminate health messages in health communication. The use of social media has increased significantly but knowledge of its utilization in sugar-sweetened beverage health campaigns is limited. This study was conducted to identify social media health campaigns against the sugar-sweetened beverage consumption, their social media platforms and types of materials distributed, and to identify health messages being highlighted in the campaigns. The authors conducted a systematic search for the campaigns and employed content analysis to identify health messages. As a result, 34 campaigns were identified. Facebook and YouTube were commonly used to disseminate campaign materials—83% of them were videos and text articles. Obesity/overweight, diabetes, and cardiovascular diseases were the most frequently mentioned health messages in the campaigns. The increased use of social media with their low-cost operation and capacity to increase campaign reach makes them potential communication channels for health campaigns against sugar-sweetened beverage consumption.
... Previous studies found that one's gender can be predicted fairly accurately based on their collaboration patterns, specialization, and the style of code they produce (38,52). In a recent study, the gender of users was predicted on Pinterest, where the ratio of women is higher than that of men, based on the content they share (53). In a study exploring the digital music platform The Echo Nest (54), where only 25% of the solo artists are women, authors managed to predict the artist's gender based on the musical features of their songs with 90% accuracy. ...
Article
Digital collaborative platforms have become crucial venues of career advancement and individual success in many creative fields, from engineering to the arts. Gender discrimination related to behavioral choices of users is a key component to gendered disadvantage on platforms. Such platforms carried the promise of opening avenues of advancement to previously discriminated groups, such as women, as platforms lack managerial gatekeepers with conventional prejudice. We analyzed the extent of behavior-based gender discrimination on two digital platforms, GitHub and Behance, focused on software development and fine arts and design. We found that the main cause of women's disadvantage in attention, success, and survival is largely due to the gender typicality of their behavior that varies between 60-90% of the total disadvantage of women. Men and women are penalized if they follow highly female-like behavior, while categorical gender is no longer significant. As platforms employ algorithmic tools and AI systems to manage users' activity and visibility, and recommend new projects to collaborate, stereotypes associated with behavior can have long-lasting consequences.
Article
Purpose: To set out the culture of Pinterest by addressing the role it assumes in the field of communication, in light of the qualities of a platform that does not stand out much among popular social media networks yet is built on an entirely different principle. Method: Textual analysis, a qualitative research method aligned with the approach of gathering information about how the world is made sense of and understanding how various cultures adapt, is suitable for this study. Textual analysis offers a method in which analysis can proceed from the most superficial to the deepest level and in which many things can be treated as text, and it provides a functional path for the image-focused examinations in this research. As the data collection method, the first three images in each of the 6 categories in Pinterest's search engine were examined, and an attempt was made to make sense of Pinterest culture through 54 images in total. Findings: Pinterest is structured around an approach that offers a visual content stream according to users' tastes and interests, but grounds this in a collector's passion and an exploratory experience. This point corresponds to the concept of curation, which signals a new discipline, particularly for the communication literature. In addition, because Pinterest has an instrumental working mechanism related to curation, users can draw ideas from varied content and from the exploratory experience that content offers, and use these ideas as a source of inspiration. Conclusion: With its distinctive style and working mechanism, Pinterest is a medium that stands apart from other social media platforms and addresses a niche audience.
Originality: In the new world order created by the digital age, much research has been carried out as social media platforms have diversified and new phenomena have correspondingly entered our lives; yet when it comes to Pinterest's ecosystem and the novelty it offers users compared with other platforms, the literature is very narrow. Looking at the existing literature today, there are only 3 studies (theses) outside the field of communication. In the international literature, by contrast, there is a very wide field of study and the platform is given particular importance. It is therefore hoped that this study will make a modest contribution to the national communication literature.
Article
China is one of the foremost online markets owing to the rising popularity of online shopping, and being the most populated country in the world, it is expected to develop into the largest market in the foreseeable future. India is not far behind. Website quality (WQ) is regarded as a significant factor in conducting business online, and considerable analysis and discussion have accordingly been devoted to it. System quality (SQ), information quality (IQ), service quality, and website design are the strongest elements of WQ. Customer satisfaction (CS) is directly and positively affected by WQ. Similarly, CS has a direct and positive effect on purchase intentions (PIs), and CS significantly mediates the effect of WQ on PIs. Consequently, CS, WQ, and the hypotheses, together with the research methodology, are reviewed here. Website accuracy and consistency are also explained.
Chapter
The presence of a hidden enforcement is a matter in social media networks, whose contents are made attractive by rich images illustrating the rearrangement of the living spaces belonging to the followers of these networks. Every detail of private life including personal appearance, spaces where time is spent with friends, food is consumed, coffee is drunk, and houses are decorated, is presented through charming images. Inspired by these images, people have started to make their preferences regarding what mobile phone to use, what sports to practice, or what films to watch. The content of social media has begun to draw attention to “lifestyle advertising” and has provided a convenient ground for the advertising industry. Pinterest is a network where images reflecting modern people's daily habits, including consumption, are pinned in order to serve as sources of inspiration. In this study, the perfect living spaces which have been fictionalised as models in the images shared on Pinterest will be investigated in terms of “lifestyle advertising” and in comparison to real life.
Article
Online social networks (OSNs) have become popular platforms for people to connect and interact with each other. Among those networks, Pinterest has recently become noteworthy for its growth and promotion of visual over textual content. The purpose of this study is to analyze this image-based network in a gender-sensitive fashion, in order to understand (i) user motivation and usage pattern in the network, (ii) how communications and social interactions happen and (iii) how users describe themselves to others. This work is based on more than 220 million items generated by 683,273 users. We were able to find significant differences w.r.t. all mentioned aspects. We observed that, although the network does not encourage direct social communication, females make more use of lightweight interactions than males. Moreover, females invest more effort in reciprocating social links, are more active and generalist in content generation, and describe themselves using words of affection and positive emotions. Males, on the other hand, are more likely to be specialists and tend to describe themselves in an assertive way. We also observed that each gender has different interests in the network, females tend to make more use of the network's commercial capabilities, while males are more prone to the role of curators of items that reflect their personal taste. It is important to understand gender differences in online social networks, so one can design services and applications that leverage human social interactions and provide more targeted and relevant user experiences. Copyright © 2013, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
Conference Paper
This paper presents a detailed analysis of the Google+ social network. We identify the key differences and similarities with other popular networks like Facebook and Twitter, in order to determine whether Google+ is a new paradigm or yet another social network. This work is based on large-scale crawls of over 27 million user profiles that represented nearly 50% of the entire network in 2011. We observe that the average path length between users is slightly higher than other networks, possibly because Google+ is a new system where relationships are still rapidly growing. Google+ shows a higher level of reciprocity than Twitter, which also has directed social links. The newly available "places lived" field could be used to study how users are distributed around the world and how aggressively the service has been adopted in different countries. We find that Google+ is popular in countries with relatively low Internet penetration rate. Based on the amount and types of information publicly shared in user profiles, we also find that the notion of privacy varies significantly across different cultures.
Conference Paper
An online user joins multiple social networks in order to enjoy different services. On each joined social network, she creates an identity and constitutes its three major dimensions, namely profile, content, and connection network. She largely governs her identity formulation on any social network and therefore can manipulate multiple aspects of it. With no global identifier to mark her presence uniquely in the online domain, her online identities remain unlinked, isolated, and difficult to search. The literature has proposed identity search methods on the basis of profile attributes, but has left the other identity dimensions, e.g., content and network, unexplored. In this work, we introduce two novel identity search algorithms based on content and network attributes and improve on the traditional identity search algorithm based on profile attributes of a user. We apply the proposed identity search algorithms to find a user's identity on Facebook, given her identity on Twitter. We report that a combination of the proposed identity search algorithms found the Facebook identity for 39% of Twitter users searched, while the traditional method based on profile attributes found the Facebook identity for only 27.4%. Each proposed identity search algorithm accesses publicly accessible attributes of a user on any social network. We deploy an identity resolution system, Finding Nemo, which uses the proposed identity search methods to find a Twitter user's identity on Facebook. We conclude that inclusion of more than one identity search algorithm, each exploiting distinct dimensional attributes of an identity, helps in improving the accuracy of an identity resolution process.
Article
There is great concern about the potential for people to leak private information on OSNs, but there are few quantitative studies on this. This research explores the activity of sharing mobile numbers on OSNs, via public profiles and posts. We attempt to understand the characteristics and risks of mobile number sharing behaviour on OSNs and focus on Indian mobile numbers. We collected 76,347 unique mobile numbers posted by 85,905 users on Twitter and Facebook and analysed 2,997 numbers prefixed with +91. We observed that most users shared their own mobile numbers to spread urgent information, and to market products and escort business. Fewer female users shared mobile numbers on OSNs. Users utilized other OSN platforms and third-party applications like Twitterfeed to post mobile numbers on multiple OSNs. In contrast to the user's perception of numbers spreading quickly on OSNs, we observed that, except for emergencies, most numbers did not diffuse deep. To assess risks associated with mobile numbers exposed on OSNs, we used numbers to gain sensitive information about their owners (e.g. name, Voter ID) by collating publicly available data from OSNs, Truecaller, and OCEAN. On using the numbers on WhatsApp, we obtained a myriad of sensitive details (relationship status, BBM pins) of the number owner. We communicated the observed risks to the owners by calling. A few users were surprised to know about the online presence of their number, while a few others had intentionally posted it online for business purposes. We observed that 38.3% of users who were unaware of the online presence of their number had posted their number themselves on the social network. With these observations, we highlight that there is a need to monitor leakage of mobile numbers via profiles and public posts. To the best of our knowledge, this is the first exploratory study to critically investigate the exposure of Indian mobile numbers on OSNs.
Article
We present findings from a qualitative study of activity on Pinterest.com, in which we investigated professional and personal uses of the site using interview data and observations of online activity. We find that Pinterest serves as an infrastructure for repository building that supports a wide range of activities including: discovery, collecting, collaborating, and publishing. We discuss these concepts using the language of "boundary objects" from the sociology of science. We suggest that scale is a critical dimension of boundary objects for understanding how people make sense of Pinterest and their diverse goals for using it. Professionals often attempt to use Pinterest to create repositories that scale to groups, organizations and societies and interface with multiple social worlds whereas personal repositories often have highly localized meanings. Our approach builds on quantitative descriptions of Pinterest to understand how the site fits into a growing ecology of social network sites. Copyright © 2013, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
Conference Paper
Pinterest is a popular social curation site where people collect, organize, and share pictures of items. We studied a fundamental issue for such sites: what patterns of activity attract attention (audience and content reposting)? We organized our studies around two key factors: the extent to which users specialize in particular topics, and homophily among users. We also considered the existence of differences between female and male users. We found: (a) women and men differed in the types of content they collected and the degree to which they specialized; male Pinterest users were not particularly interested in stereotypically male topics; (b) sharing diverse types of content increases your following, but only up to a certain point; (c) homophily drives repinning: people repin content from other users who share their interests; homophily also affects following, but to a lesser extent. Our findings suggest strategies both for users (e.g., strategies to attract an audience) and maintainers (e.g., content recommendation methods) of social curation sites.
Conference Paper
Pinterest is a fast-growing interest network with significant user engagement and monetization potential. This paper explores quality signals for Pinterest boards, in particular the notion of board coherence. We find that coherence can be assessed with promising results and we explore its relation to quality signals based on social interaction.
Conference Paper
Over the past decade, social network sites have become ubiquitous places for people to maintain relationships, as well as loci of intense research interest. Recently, a new site has exploded into prominence: Pinterest became the fastest social network to reach 10M users, growing 4000% in 2011 alone. While many Pinterest articles have appeared in the popular press, there has been little scholarly work so far. In this paper, we use a quantitative approach to study three research questions about the site. What drives activity on Pinterest? What role does gender play in the site's social connections? And finally, what distinguishes Pinterest from existing networks, in particular Twitter? In short, we find that being female means more repins, but fewer followers, and that four verbs set Pinterest apart from Twitter: use, look, want and need. This work serves as an early snapshot of Pinterest that later work can leverage.