Tracking the Digital Footprints of Personality

INVITED PAPER
This paper reviews literature showing how pervasive records of digital footprints can be
used to infer personality.
By Renaud Lambiotte and Michal Kosinski
ABSTRACT | A growing portion of offline and online human
activities leaves digital footprints in electronic databases.
The resulting big social data offers unprecedented insights into
population-wide patterns and detailed characteristics of
individuals. The goal of this paper is to review the literature
showing how pervasive records of digital footprints, such as
Facebook profiles or mobile device logs, can be used to infer
personality, a major psychological framework describing
differences in individual behavior. We briefly introduce
personality, present a range of works focusing on
predicting it from digital footprints, and conclude with a
discussion of the implications of these results in terms of
privacy, data ownership, and opportunities for future research
in computational social science.
KEYWORDS | Big data; personality; psychology; social networks
I. INTRODUCTION
In recent years, a growing portion of human activities, such
as social interactions and entertainment, have become
mediated by digital services and devices. The records of
those activities, or ‘‘big social data,’’ are changing the
paradigm in the social sciences, which are undergoing a transition
from small-scale studies, typically employing questionnaires
or lab-based observations and experiments, to large-scale
studies, in which researchers observe the behavior of
thousands or millions of individuals and search for
statistical regularities and underlying principles [1]–[6].
These works provide empirical observations at an unprecedented
scale, offering the potential to radically improve
our understanding of individuals and social systems.
One of the major insights offered by big social data
research relates to the predictability of individuals’
psychological traits from their digital footprints [3]. The ability
to automatically assess psychological profiles opens the
way for improved products and services, such as personalized
search engines, recommender systems [7], and targeted
online marketing [8]. At the same time, it
creates significant challenges in the area of privacy [9],
[10]. The main goal of this paper is to review
the works investigating the potential of big social data
to predict the traits of the five-factor model of personality (the major
set of psychological traits), and thereby to support further studies of
the relationship between personality and digital footprints
and its implications for privacy and for new products and
services.
II. PERSONALITY
The most widespread and generally accepted model of
personality is the five-factor model (FFM;
[11]). The FFM has been shown to subsume most known personality
traits, and it is claimed to represent the basic structure
underlying the variations in human behavior and preferences,
providing a nomenclature and a conceptual
framework that unifies many of the research findings in
the psychology of individual differences. The FFM includes the
following traits.
1) Openness is related to imagination, creativity,
curiosity, tolerance, political liberalism, and
appreciation for culture. People scoring high on
openness like change, appreciate new and unusual
ideas, and have a good sense of aesthetics.
Manuscript received January 29, 2014; revised July 24, 2014; accepted September 9,
2014. Date of publication October 29, 2014; date of current version November 18, 2014.
The work of R. Lambiotte was supported by the F.R.S.–Fonds de la Recherche
Scientifique (FNRS), the European Union (EU) project Optimizr, and COST Action TD1210
KnowEscape. The work of M. Kosinski was supported by the Psychometrics Centre at
the University of Cambridge, Boeing Corporation, Microsoft Research, the National
Science Foundation (NSF), the Defense Advanced Research Projects Agency (DARPA),
and the Center for the Study of Language and Information at Stanford University (CSLI).
This paper presents results of the Belgian Network Dynamical Systems, Control, and
Optimization (DYSCO), funded by the Interuniversity Attraction Poles Programme,
initiated by the Belgian State, Science Policy Office.
R. Lambiotte is with the Namur Center for Complex Systems (naXys), University of
Namur, Namur 5000, Belgium (e-mail: renaud.lambiotte@unamur.be).
M. Kosinski is with InfoLab, Stanford University, Stanford, CA 94305 USA, and also
with the Psychometrics Centre, University of Cambridge, Cambridge CB2 1TN, U.K.
Digital Object Identifier: 10.1109/JPROC.2014.2359054
0018-9219 © 2014 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
1934 Proceedings of the IEEE | Vol. 102, No. 12, December 2014
2) Conscientiousness measures the preference for an
organized approach to life in contrast to a
spontaneous one. Conscientious people are more
likely to be well organized, reliable, and consis-
tent. They enjoy planning, seek achievements, and
pursue long-term goals. Nonconscientious indivi-
duals are generally more easygoing, spontaneous,
and creative. They tend to be more tolerant and
less bound by rules and plans.
3) Extroversion measures a tendency to seek stimu-
lation in the external world, the company of
others, and to express positive emotions. Extro-
verts tend to be more outgoing, friendly, and
socially active. They are usually energetic and
talkative; they do not mind being at the center of
attention and make new friends more easily.
Introverts are more likely to be solitary or
reserved and seek environments characterized by
lower levels of external stimulation.
4) Agreeableness relates to a focus on maintaining
positive social relations, being friendly, compas-
sionate, and cooperative. Agreeable people tend to
trust others and adapt to their needs. Disagreeable
people are more focused on themselves, less likely
to compromise, and may be less gullible. They also
tend to be less bound by social expectations and
conventions and are more assertive.
5) Emotional stability (its opposite is referred to as
neuroticism) is inversely related to the tendency to
experience mood swings and emotions such as guilt,
anger, anxiety, and depression. Emotionally
unstable (neurotic) people are more likely to
experience stress and nervousness, whereas
emotionally stable people (low neuroticism) tend to be
calmer and more self-confident.
Research has shown that personality is correlated with
many aspects of life, including job success [12],
attractiveness [13], drug use [14], marital satisfaction [15], infidelity
[16], and happiness [17]. The main limitations of classical
personality studies are, however, the size of the samples,
often too small for statistical validation, and their strong
bias toward Western, educated, industrialized, rich, and
democratic (WEIRD) people [18].
III. FROM OFFLINE TO ONLINE...
The increasingly prevalent access to digital media enables
large-scale online projects aimed at collecting personality
profiles and exploring their relations with digital foot-
prints. Personality has been investigated through different
types of online media, for instance, by focusing on website
browsing logs [2], [19], contents of personal websites [20],
music collections [21], or properties of Twitter profiles
[22], [23].
The most complete online social environment is
arguably Facebook, due to its popularity and rich social
and semantic data stored on its users’ profiles that can be
conveniently recorded. It is important to note that
Facebook profiles are increasingly becoming a channel
through which to form impressions about others, for
example, before dating [24] or before a job interview [25].
Moreover, research tends to show that a Facebook profile
reflects the actual personality of an individual rather than
an idealized role [26], and that personality can be
successfully judged by others based on Facebook
profiles [27], [28]. These results suggest that personality is
manifested not only in offline but also in online
behavior, and thus that digital footprints can be used to
predict it.
The most popular data set used to study the
relationship between personality and digital footprints
comes from the myPersonality project. myPersonality was
a Facebook application set up by David Stillwell in 2007
that offered participants access to 25 psychological tests
and attracted over six million users. myPersonality users
received immediate feedback on their results (see Fig. 1)
and could donate their Facebook profile information to
research. The resulting database, after anonymization,
is being shared with the academic community at
mypersonality.org, allowing for the study of hitherto
unanswered questions in a wide range of topics, such as
geographical variations in personality ([29]; see Fig. 2),
social networks [2], [22], [30], [31], privacy [32], language
([6]; see Fig. 3), predicting individual traits [3], [33],
computer science [34], happiness [35], music [36], and
delay discounting [37].
IV. SOCIAL NETWORK STRUCTURE
Social network structure is one of the major types of digital
footprint left by users, and a growing number of studies
show that it is predictive of often intimate personal traits.
For instance, it is known that the location within a
Facebook friendship network is predictive of sexual
orientation [38]. Similarly, it is possible to accurately
detect a user’s romantic partner by observing the overlap in
social circles [39].
Fig. 1. Snapshot of a personality profile generated by the
myPersonality Facebook App, representing an individual that is liberal
and open minded (high openness), well-organized (high
conscientiousness), contemplative and happy with own company (low
extroversion), of average competitiveness (average
agreeableness), and laid back and relaxed (low neuroticism).
Personality is expected to affect people’s social
network surroundings, as it affects the types and number
of social ties that people form. A number of
studies have explored this relationship. Neuroticism is usually
associated with negative social interactions, while extroversion
positively correlates with the size of the network
and greater social status [40], [41]. Results related to the
remaining traits tend to be inconsistent, perhaps due to
small sample sizes. More recently, Quercia et al. [31] used
the myPersonality data set to study the relation between
sociometric popularity and personality traits, at a scale
several orders of magnitude larger than in previous
studies. They showed that the strongest predictor of
the number of friends is extroversion, while other
personality traits do not play a significant role. On
average, extreme extroverts tend to have twice as many
friends as extreme introverts. A subsequent work [42]
went one step further and, for the first time, quantitatively
explained the way in which egocentric network topology is
shaped by personality. It confirmed that extroversion plays
a major role by showing that introverts are part of fewer
but larger communities, whereas extroverts tend to act as
bridges between more numerous but smaller communities
(see Fig. 4).
V. FACEBOOK LIKES
The Facebook profile of a user is not purely demographic,
as it also contains robust records of digital footprints. In
particular, Facebook likes exemplify a typical variety of
digital footprint: a connection between a user and a
piece of content, similar to other pervasive records such as
playlists (see Fig. 5), website browsing logs, purchase
records, or web search queries. A recent paper [3] based
on the myPersonality database and using relatively
straightforward methods (singular value decomposition
and linear regression) showed that Facebook likes are
highly predictive of personality and of a number of other
psychodemographic traits, such as age, gender,
intelligence, political and religious views, and sexual orientation
(see Fig. 6). The paper provided examples of likes most
strongly associated with given personality traits. For
example, users who liked the ‘Hello Kitty’ brand tended to
have high openness, low conscientiousness, and low
agreeableness.
Fig. 2. Personality maps of U.S. states for neuroticism (upper) and
extroversion (lower). Dark (light) blue indicates values higher (lower)
than average. Figure based on myPersonality data.
Fig. 3. Words, phrases, and topics most distinguishing extroversion
from introversion. Source: [6].
Fig. 4. Typical egocentric networks of introverts (left) and extroverts
(right). Introverts tend to belong to fewer but larger and denser
communities, while extroverts tend to act as bridges between more
numerous, smaller, and overlapping communities. Connections between
the ego and his or her friends are not depicted, for the sake of clarity.
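The two-step approach attributed to [3] in this section (singular value decomposition of the user-like matrix, followed by linear regression on the reduced dimensions) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the synthetic data, the function names `fit_trait_model`, `predict_trait`, and `demo`, the number of components, and the use of NumPy. It is not the code or the parameter choices of the cited study.

```python
import numpy as np


def fit_trait_model(likes, scores, n_components):
    """Reduce a user-by-like matrix with SVD, then fit a linear regression."""
    # Center each like column so the SVD captures co-variation between likes.
    col_means = likes.mean(axis=0)
    centered = likes - col_means
    # Rows of vt are orthogonal directions in "like space", ordered by importance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]
    reduced = centered @ components.T
    # Ordinary least squares on the reduced dimensions, with an intercept column.
    design = np.column_stack([np.ones(len(reduced)), reduced])
    coef, *_ = np.linalg.lstsq(design, scores, rcond=None)
    return col_means, components, coef


def predict_trait(model, likes):
    """Project new users onto the stored components and apply the regression."""
    col_means, components, coef = model
    reduced = (likes - col_means) @ components.T
    return coef[0] + reduced @ coef[1:]


def demo(seed=0):
    """Fit on synthetic users and return the held-out Pearson correlation."""
    rng = np.random.default_rng(seed)
    n_users, n_likes = 400, 60
    latent = rng.normal(size=n_users)      # hypothetical underlying trait
    loadings = rng.normal(size=n_likes)    # how strongly each like reflects it
    probs = 1.0 / (1.0 + np.exp(-(np.outer(latent, loadings) - 1.0)))
    likes = (rng.random((n_users, n_likes)) < probs).astype(float)
    scores = latent + rng.normal(scale=0.5, size=n_users)  # noisy questionnaire
    train, test = slice(0, 300), slice(300, None)
    model = fit_trait_model(likes[train], scores[train], n_components=5)
    predicted = predict_trait(model, likes[test])
    return float(np.corrcoef(predicted, scores[test])[0, 1])
```

Calling `demo()` returns the correlation between predicted and held-out scores, the same kind of accuracy measure that the paper reports. Reducing the dimensionality before the regression keeps the number of fitted coefficients small relative to the number of users, which is presumably why such a simple pipeline can cope with a very sparse like matrix.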
VI. SEMANTIC ANALYSIS
Similar predictions can be based on the textual analysis of
people’s posts and other samples of text. There is a long
tradition of using text to infer personality [44], [45], [46],
but never at the scale presented in [6]. This study
applied differential language analysis to
700 million words, phrases, and topic instances collected
by myPersonality from the Facebook status updates of 75 000
participants, in order to uncover features distinguishing
demographic and psychological attributes. It showed striking
variations of language
driven by personality, gender, and age. This work has not
only confirmed existing observations (such as neurotic
people’s tendency to use the word ‘‘depressed’’), but also
posed new hypotheses (such as a relationship between
physical activity and low neuroticism).
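The core idea of relating open-vocabulary word use to a trait can be sketched in a few lines. This is a deliberately simplified illustration and not the pipeline of [6]: whitespace tokenization, plain Pearson correlation without any demographic controls, the toy posts and scores, and the helper names `pearson` and `word_trait_correlations` are all assumptions.

```python
from collections import Counter
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if one is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def word_trait_correlations(posts, trait_scores, min_users=2):
    """Correlate each word's relative frequency per user with a trait score.

    posts holds one string of concatenated text per user; words used by
    fewer than min_users users are skipped.
    """
    counts = [Counter(text.lower().split()) for text in posts]
    totals = [sum(c.values()) or 1 for c in counts]
    vocabulary = {word for c in counts for word in c}
    result = {}
    for word in vocabulary:
        if sum(1 for c in counts if word in c) < min_users:
            continue
        freqs = [c[word] / total for c, total in zip(counts, totals)]
        result[word] = pearson(freqs, trait_scores)
    return result


# Hypothetical toy data: four users and their extroversion scores.
posts = [
    "party tonight with friends",
    "great party friends everywhere",
    "reading a book alone",
    "quiet evening reading alone",
]
extroversion = [4.5, 4.0, 2.0, 1.5]
correlations = word_trait_correlations(posts, extroversion)
```

In this toy sample, "party" correlates positively and "reading" negatively with extroversion, while words used by a single user are dropped. With real data, the large number of words tested would call for a correction for multiple comparisons.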
VII. ...AND BACK FROM ONLINE TO
OFFLINE
The proliferation of mobile devices loaded with sensors
means that offline human activities are also increasingly
leaving digital footprints [47], [48]. For instance, physical
states such as running or walking can be inferred from
accelerometer data; colocation with other devices can be
detected using Bluetooth; geolocation can be established
using WiFi, the Global Positioning System (GPS), or Global
System for Mobile Communications (GSM) triangulation; and social
interactions can be measured by records of text messages
and phone calls. These data can be recorded by dedicated
apps, such as EmotionSense [49], which measures
emotional states based on speech patterns and matches
them with physical activity, geolocation, and colocation with
other users. In the last few years, call data records (CDRs)
have been used to study the organization of social networks
and human mobility [50], [51], [52].
Like digital footprints left in the online
environment, offline activities recorded with mobile
devices’ sensors reflect users’ personality. A recent study
combined CDRs with personality profiles of mobile device
users and identified a number of mobility and social factors
correlated with personality [53]. For instance, mobility
indicators, such as distance traveled, correlated significantly
with neuroticism, while social life indicators, such as the
size of the social network, correlated with extroversion, in
agreement with the previous results based on online digital
footprints.
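A mobility indicator such as distance traveled can be derived from a time-ordered sequence of positions, for example the coordinates of the cell towers that handled a user's calls. The sketch below is only an illustration under stated assumptions: the haversine great-circle formula, the mean Earth radius of 6371 km, and the function names `haversine_km` and `distance_traveled_km` are choices made here and are not taken from [53].

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an assumed constant


def haversine_km(a, b):
    """Great-circle distance in km between two (latitude, longitude) points in degrees."""
    lat1, lon1 = map(radians, a)
    lat2, lon2 = map(radians, b)
    h = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * asin(sqrt(h))


def distance_traveled_km(positions):
    """Sum the distances between consecutive positions of one user."""
    return sum(haversine_km(p, q) for p, q in zip(positions, positions[1:]))
```

Because tower locations only approximate where a user actually was, a value computed this way is a coarse lower-resolution proxy for real movement rather than an exact path length.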
Fig. 6. Prediction accuracy of regression for numeric attributes and
traits expressed by the Pearson correlation coefficient between
predicted and actual attribute values; all correlations are significant at
the p < 0.001 level. The red outline bars indicate the questionnaire’s
baseline accuracy, expressed in terms of test-retest reliability.
Source: [3].
Fig. 5. Dendrogram illustrating the structure of music tastes and its
relationship to the personality trait of openness among myPersonality
users. The structure was produced using hierarchical clustering of
the most popular Facebook likes from the musician/band category. The
color scale represents the average openness of its subscribers, ranging
from conservative (cyan) to liberal (magenta). The height of the
nodes is proportional to the dissimilarity between individual likes or
clusters at both ends. The shorter the path between two musicians or
bands, the larger the overlap in their audiences. Source: [43].
VIII. CONCLUSION
The main purpose of this paper was to review the evidence
of the relationship between digital footprints and personality.
We have shown that a wide range of pervasive and
often publicly available digital footprints, such as Facebook
profiles or data from mobile devices, can be used to infer
personality. As our lives are increasingly interwoven with
digital services and devices, it is becoming critical to
understand the consequences of the apparent ability to
automatically and rapidly assess people’s psychological
traits.
Works cited in this paper indicate that the accuracy of
personality predictions is moderate, with typical
correlations between the prediction and personality in the
range between r = 0.2 and r = 0.4. It has to be noted, however,
that the ground truth (i.e., personality scores) is also
merely an approximation of the underlying latent traits. For
example, the accuracy of the personality scales used in [3],
expressed as a correlation between scores achieved by the
same person at two points in time (test-retest reliability),
ranged between r = 0.55 and r = 0.75. It is reasonable to
expect that, with an increasing amount of data available and
improved methods, assessment accuracy will improve.
Predicting users’ personality can be used to improve
numerous products and services. Digital systems and
devices (such as online stores or cars) could be designed to
adjust their behavior to best fit their users’ inferred profiles
[54]. For example, a car could adjust the parameters of the
engine and the music to the personality and current mood
of the driver. Also, the relevance of marketing and product
recommendations could be improved by adding psycho-
logical dimensions to current user models. For example,
online insurance advertisements might emphasize security
when facing emotionally unstable (neurotic) users but
stress potential threats when dealing with emotionally
stable ones. Moreover, digital footprint may provide a
convenient and reliable way to measure psychological
traits at a low cost. Such automated assessment could
prove to be more accurate and less prone to cheating and
misrepresentation than traditional questionnaires.
Furthermore, it is likely that new insights into
individual differences in human behavior offered by big
social data will fuel the emergence of new, more accurate,
robust models describing individuals and societies [5]. The
translation of big social data into models and policies calls
for a new wave of multidisciplinary collaborations between
fields as diverse as psychology, social sciences, linguistics,
computer science, and applied mathematics (perhaps
under the banner of computational social psychology).
On the other hand, the results presented here may
have considerable negative implications, because such
inference can easily be applied to large numbers of people without
obtaining their individual consent and without them
noticing. Commercial companies, governmental institutions,
or even one’s Facebook friends could use software
to infer personality (and other attributes, such as
intelligence or sexual orientation) that an individual
may not have intended to share. There is a risk that the
growing awareness of such digital exposure may decrease
people’s trust in digital technologies, or even deter
them from using such technologies altogether. We hope that researchers, policy
makers, and customers will find solutions to address those
challenges and retain the balance between the promises
and perils of the Digital Age.
REFERENCES
[1] N. Aharony, W. Pan, C. Ip, I. Khayal, and
A. Pentland, ‘‘The social FMRI: Measuring,
understanding, designing social mechanisms
in the real world,’’ in Proc. 13th Int. Conf.
Ubiquitous Comput., 2011, pp. 445–454.
[2] M. Kosinski, Y. Bachrach, P. Kohli,
D. Stillwell, and T. Graepel, ‘‘Manifestations
of user personality in website choice and
behaviour on online social networks,’’ Mach.
Learn., vol. 95, pp. 357–380, 2013.
[3] M. Kosinski, D. Stillwell, and T. Graepel,
‘‘Private traits and attributes are predictable
from digital records of human behavior,’’ Proc.
Nat. Acad. Sci., vol. 110, pp. 5802–5805, 2013.
[4] A. Kramer, J. Guillory, and J. Hancock,
‘‘Experimental evidence of massive-scale
emotional contagion through social
networks,’’ Proc. Nat. Acad. Sci., vol. 111,
pp. 8788–8790, 2014.
[5] D. Lazer et al., ‘‘Social science: Computational
social science,’’ Science, vol. 323, no. 5915,
pp. 721–723, 2009.
[6] H. Schwartz et al., ‘‘Personality, gender, age in
the language of social media: The
open-vocabulary approach,’’ PloS One, vol. 8,
no. 9, 2013, DOI: 10.1371/journal.pone.0073791.
[7] Y. Koren, R. Bell, and C. Volinsky, ‘‘Matrix
factorization techniques for recommender
systems,’’ Computer, vol. 42, no. 8, pp. 30–37,
2009.
[8] Y. Chen, D. Pavlov, and J. F. Canny,
‘‘Large-scale behavioral targeting,’’ in Proc.
Conf. Knowl. Disc. Data Mining, 2009,
pp. 209–217.
[9] D. Butler, ‘‘Data sharing threatens privacy,’’
Nature, vol. 449, no. 7163, pp. 644–645,
2007.
[10] A. Narayanan and V. Shmatikov, ‘‘Robust
de-anonymization of large sparse
datasets,’’ in Proc. Symp. Security Privacy,
2008, pp. 111–125.
[11] L. Goldberg, ‘‘The structure of phenotypic
personality traits,’’ Amer. Psychol., vol. 48,
no. 1, pp. 26–34, 1993.
[12] T. Judge, C. Higgins, C. Thoresen, and
M. Barrick, ‘‘The Big Five personality traits,
general mental ability, career success across
the life span,’’ Personnel Psychol., vol. 52,
no. 3, pp. 621–652, 1999.
[13] D. Byrne, W. Griffitt, and D. Stefaniak,
‘‘Attraction and similarity of personality
characteristics,’’ J. Personality Social Psychol.,
vol. 5, no. 1, pp. 82–90, 1967.
[14] B. W. Roberts, O. S. Chernyshenko, S. Stark,
and L. R. Goldberg, ‘‘The structure of
conscientiousness: An empirical investigation
based on seven major personality
questionnaires,’’ Personnel Psychol., vol. 58,
no. 1, pp. 103–139, 2005.
[15] E. Kelly and J. Conley, ‘‘Personality and
compatibility: A prospective analysis of
marital stability and marital satisfaction,’’
J. Personality Social Psychol., vol. 52, no. 1,
pp. 27–40, 1987.
[16] T. Orzeck and E. Lung, ‘‘Big-Five personality
differences of cheaters and non-cheaters,’’
Current Psychol., vol. 24, pp. 274–287, 2005.
[17] D. Ozer and V. Benet-Martinez, ‘‘Personality
and the prediction of consequential
outcomes,’’ Annu. Rev. Psychol., vol. 57,
pp. 401–421, 2006.
[18] J. Henrich, S. Heine, and A. Norenzayan,
‘‘The WEIRDest people in the world,’’ Behav.
Brain Sci., vol. 33, no. 2–3, pp. 61–83, 2010.
[19] J. Hu, H.-J. Zeng, H. Li, C. Niu, and Z. Chen,
‘‘Demographic prediction based on user’s
browsing behavior,’’ in Proc. Int. World-Wide
Web Conf., 2007, pp. 151–160.
[20] B. Marcus, F. Machilek, and A. Schütz,
‘‘Personality in cyberspace: Personal web sites
as media for personality expressions and
impressions,’’ J. Personality Social Psychol.,
vol. 90, no. 6, pp. 1014–1031, 2006.
[21] P. Rentfrow and S. Gosling, ‘‘The do re mi’s
of everyday life: The structure and
personality correlates of music preferences,’’
J. Personality Social Psychol., vol. 84, no. 6,
pp. 1236–1256, 2003.
[22] D. Quercia, M. Kosinski, D. Stillwell, and
J. Crowcroft, ‘‘Our Twitter profiles, our
selves: Predicting personality with Twitter,’’
in Proc. Int. Conf. Privacy Security Risk
Trust/IEEE Int. Conf. Social Comput., 2011,
pp. 180–185.
[23] J. Golbeck, C. Robles, M. Edmondson, and
K. Turner, ‘‘Predicting personality from
Twitter,’’ in Proc. Int. Conf. Privacy Security
Risk Trust/IEEE Int. Conf. Social Comput., 2011,
pp. 149–156.
[24] S. Zhao, S. Grasmuck, and J. Martin, ‘‘Identity
construction on Facebook: Digital
empowerment in anchored relationships,’’
Comput. Human Behav., vol. 24, no. 5,
pp. 1816–1836, 2008.
[25] A. Finder, ‘‘For some, online persona
undermines a résumé,’’ New York Times,
Jun. 2006. [Online]. Available: http://www.nytimes.com/2006/06/11/us/11recruit.html?pagewanted=all&_r=0.
[26] M. Back et al., ‘‘Facebook profiles reflect
actual personality, not self-idealization,’’
Psychol. Sci., vol. 21, no. 3, 2010, DOI: 10.1177/0956797609360756.
[27] D. Evans, S. Gosling, and A. Carroll, ‘‘What
elements of an online social networking
profile predict target-rater agreement in
personality impressions,’’ in Proc. Conf.
Weblogs Social Media, 2008, pp. 45–50.
[28] S. Gosling, A. Augustine, S. Vazire,
N. Holtzman, and S. Gaddis, ‘‘Manifestations
of personality in online social networks:
Self-reported Facebook-related behaviors and
observable profile information,’’
Cyberpsychol. Behav. Social Netw., vol. 14,
no. 9, pp. 483–488, 2011.
[29] P. Rentfrow et al., ‘‘Divided we stand: Three
psychological regions of the United States and
their political, economic, social, health
correlates,’’ J. Personality Social Psychol.,
vol. 105, pp. 996–1012, 2013.
[30] D. Quercia, R. Lambiotte, M. Kosinski,
D. Stillwell, and J. Crowcroft, ‘‘The
personality of popular Facebook users,’’ in
Proc. Conf. Comput. Supported Cooperative
Work, 2012, pp. 955–964.
[31] D. Quercia, M. Bodaghi, and J. Crowcroft,
‘‘Loosing friends on Facebook,’’ in Proc. Web
Science Conf., 2012, pp. 251–254.
[32] D. Quercia et al., ‘‘Facebook and privacy: The
balancing act of personality, gender,
relationship currency,’’ in Proc. Conf. Weblogs
Social Media, 2012, pp. 306–313.
[33] Q. He, C. Glas, M. Kosinski, D. Stillwell, and
B. Veldkamp, ‘‘Predicting self-monitoring
skills using textual posts on Facebook,’’
Comput. Human Behav., vol. 33, pp. 69–78,
2014.
[34] B. Bi, M. Shokouhi, M. Kosinski, and
T. Graepel, ‘‘Inferring the demographics of
search users,’’ in Proc. Int. World-Wide Web
Conf., 2013, pp. 131–140.
[35] J. Wang, S. Faridani, and P. Ipeirotis,
‘‘Estimating completion time for
crowdsourced tasks using survival analysis
models,’’ in Proc. Workshop Crowdsourcing
Search Data Mining/ACM Int. Conf. Web Search
Data Mining, M. Lease, V. Carvalho, and
E. Yilmaz, Eds., Hong Kong, China,
Feb. 2011, pp. 31–34.
[36] P. Rentfrow et al., ‘‘The song remains the
same: A replication and extension of the
music model,’’ Music Percept., vol. 30, no. 2,
pp. 161–185, 2012.
[37] V. Mahalingam, D. Stillwell, M. Kosinski,
J. Rust, and A. Kogan, ‘‘Who can wait for the
future? A personality perspective,’’ Social
Psychol. Personality Sci., 2013, DOI: 10.1177/1948550613515007.
[38] C. Jernigan and B. F. T. Mistree, ‘‘Gaydar:
Facebook friendships expose sexual
orientation,’’ First Monday, vol. 14, no. 10,
2009. [Online]. Available: http://firstmonday.org/ojs/index.php/fm/rt/printerFriendly/2611/2302.
[39] L. Backstrom and J. Kleinberg, ‘‘Romantic
partnerships and the dispersion of social ties:
A network analysis of relationship status on
Facebook,’’ in Proc. 17th ACM Conf. Comput.
Supported Cooperative Work Social Comput.,
2014, pp. 831–841.
[40] C. Anderson, O. John, D. Keltner, and
A. Kring, ‘‘Who attains social status? Effects of
personality and physical attractiveness in
social groups,’’ J. Personality Social Psychol.,
vol. 81, no. 1, pp. 116–132, 2001.
[41] R. Swickert, C. Rosentreter, J. Hittner, and
J. Mushrush, ‘‘Extraversion, social support
processes, stress,’’ Personality Individual
Differences, vol. 32, no. 5, pp. 877–891, 2002.
[42] A. Friggeri, R. Lambiotte, M. Kosinski, and
E. Fleury, ‘‘Psychological aspects of social
communities,’’ in Proc. Int. Conf. Privacy
Security Risk Trust/Int. Conf. Social Comput.,
2012, pp. 195–202.
[43] M. Kosinski, ‘‘Measurement and prediction of
individual and group differences in the digital
environment,’’ Ph.D. dissertation,
Dept. Psychol., Cambridge Univ.,
Cambridge, U.K., 2014.
[44] S. Gosling, S. Gaddis, and S. Vazire,
‘‘Personality impressions based on Facebook
profiles,’’ in Proc. Int. Conf. Weblogs Social
Media, 2007, vol. 7, pp. 1–4.
[45] A. Kramer and K. Rodden, ‘‘Word usage and
posting behaviors: Modeling blogs with
unobtrusive data collection methods,’’ in Proc.
SIGCHI Conf. Human Factors Comput. Syst.,
2008, pp. 1125–1128.
[46] S. Vazire and S. D. Gosling, ‘‘E-perceptions:
Personality impressions based on personal
websites,’’ J. Personality Social Psychol., vol. 87,
pp. 123–132, 2004.
[47] N. Lane et al., ‘‘Enabling large-scale human
activity inference on smartphones using
community similarity networks (CSN),’’ in
Proc. 13th Int. Conf. Ubiquitous Comput., 2011,
pp. 355–364.
[48] H. Lu et al., ‘‘StressSense: Detecting stress in
unconstrained acoustic environments using
smartphones,’’ in Proc. ACM Conf. Ubiquitous
Comput., 2012, pp. 351–360.
[49] K. K. Rachuri et al., ‘‘EmotionSense: A mobile
phones based adaptive platform for
experimental social psychology research,’’ in
Proc. 12th ACM Int. Conf. Ubiquitous Comput.,
2010, pp. 281–290.
[50] J.-P. Onnela et al., ‘‘Structure and tie strengths
in mobile communication networks,’’ Proc.
Nat. Acad. Sci., vol. 104, no. 18, pp. 7332–7336, 2007.
[51] R. Lambiotte et al., ‘‘Geographical
dispersal of mobile communication
networks,’’ Physica A, Stat. Mech. Appl.,
vol. 387, no. 21, pp. 5317–5325, 2008.
[52] M. Gonzalez, C. Hidalgo, and A.-L. Barabasi,
‘‘Understanding individual human
mobility patterns,’’ Nature, vol. 453, no. 7196,
pp. 779–782, 2008.
[53] Y.-A. de Montjoye, J. Quoidbach, F. Robic,
and A. Pentland, ‘‘Predicting personality using
novel mobile phone-based metrics,’’ Social
Computing, Behavioral-Cultural Modeling and
Prediction, vol. 7812, Berlin, Germany:
Springer-Verlag, 2013, pp. 48–55.
[54] C. Nass and K. M. Lee, ‘‘Does
computer-generated speech manifest
personality? An experimental test of
similarity-attraction,’’ J. Exp. Psychol., vol. 7,
no. 3, pp. 171–181, 2000.
ABOUT THE AUTHORS
Renaud Lambiotte received the Ph.D. degree in
theoretical physics from the Université Libre de
Bruxelles, Brussels, Belgium, in 2004.
He is a Professor in the Department of Mathematics,
University of Namur, Namur, Belgium. He
was a Research Associate at the École normale
supérieure de Lyon (ENS Lyon), Lyon, France;
Université de Liège, Liège, Belgium; Université
catholique de Louvain, Louvain-la-Neuve, Belgium;
and Imperial College London, London, U.K. His
research interests include network science, data mining, stochastic
processes, social dynamics, and neuroimaging.
Michal Kosinski received the Ph.D. degree in psychology and computer science from the University of Cambridge, Cambridge, U.K., in 2014.
He is a Research Associate at the Computer Science Department, Stanford University, Stanford, CA, USA, and the Deputy Director of the Psychometrics Centre, University of Cambridge. He studies big social data and its consequences for privacy, occupational markets, and wellbeing. He also coordinates the myPersonality project, which involves global collaboration between over 150 researchers analyzing a sample of over eight million Facebook users.
Lambiotte and Kosinski: Tracking the Digital Footprints of Personality
Vol. 102, No. 12, December 2014 | Proceedings of the IEEE 1939