Michal Kosinski
Stanford University | SU · Graduate School of Business
PhD
Publications: 172
Reads: 216,644
Citations: 15,273
Publications (172)
Eleven large language models (LLMs) were assessed using 40 bespoke false-belief tasks, considered a gold standard in testing theory of mind (ToM) in humans. Each task included a false-belief scenario, three closely matched true-belief control scenarios, and the reversed versions of all four. An LLM had to solve all eight scenarios to solve a single...
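The all-or-nothing scoring rule described in this abstract, where a task counts only if every one of its eight scenarios is solved, can be sketched as follows. This is a minimal illustration and not the authors' code: the function names, the dictionary layout, and the toy results are assumptions made here, and the abstract is truncated before it finishes describing the rule.

```python
from typing import Dict, List

# One false-belief scenario, three true-belief controls,
# and the reversed versions of all four (as stated in the abstract).
SCENARIOS_PER_TASK = 8


def task_solved(scenario_results: List[bool]) -> bool:
    """Return True only if all eight scenarios of one task were answered correctly."""
    if len(scenario_results) != SCENARIOS_PER_TASK:
        raise ValueError("expected exactly 8 scenario results per task")
    return all(scenario_results)


def fraction_of_tasks_solved(results: Dict[str, List[bool]]) -> float:
    """Share of tasks for which every scenario was solved."""
    if not results:
        raise ValueError("no tasks given")
    solved = sum(task_solved(r) for r in results.values())
    return solved / len(results)


# Hypothetical toy data: the second task fails on a single scenario,
# so it does not count as solved.
toy = {
    "task_01": [True] * 8,
    "task_02": [True] * 7 + [False],
}
print(fraction_of_tasks_solved(toy))  # 0.5
```

Under this rule a model that gets seven of eight scenarios right earns no credit for the task, which makes the measure stricter than per-scenario accuracy.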
Big Data is everywhere. Globally, billions of people use online platforms and digital devices to communicate, socialize, study, shop, and work. In the process, they generate a vast trove of digital footprints, documenting their thoughts, feelings, actions, and interactions. Social exchanges, for example, are captured in emails, instant messages, and s...
The progress in artificial intelligence (AI), fueled by big data, is transforming the economy, culture, society, and lives of individuals. It is also transforming behavioral science: We can ask new kinds of questions and answer old ones in new ways. This special issue aims to highlight some of the most exciting trends in behavioral science fueled b...
Friends and spouses tend to be similar in a broad range of characteristics, such as age, educational level, race, religion, attitudes, and general intelligence. Surprisingly, little evidence has been found for similarity in personality—one of the most fundamental psychological constructs. We argue that the lack of evidence for personality similarit...
We show how the quality of decisions based on the aggregated opinions of the crowd can be conveniently studied using a sample of individual responses to a standard IQ questionnaire. We aggregated the responses to the IQ questionnaire using simple majority voting and a machine learning approach based on a probabilistic graphical model. The score for...
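The simple majority-voting aggregation mentioned in this abstract can be sketched as below. This is an illustration under stated assumptions, not the study's implementation: the function names, the tie-breaking rule (first answer seen wins), and the toy responses are hypothetical, and the probabilistic graphical model the abstract also mentions is not shown.

```python
from collections import Counter
from typing import List, Sequence


def majority_vote(answers: Sequence[str]) -> str:
    """Return the most common answer; ties go to the answer seen first."""
    if not answers:
        raise ValueError("no answers to aggregate")
    return Counter(answers).most_common(1)[0][0]


def crowd_score(responses: List[Sequence[str]], key: Sequence[str]) -> int:
    """Score the crowd's aggregated answer sheet against the answer key.

    responses[i][q] is person i's answer to question q.
    """
    crowd_answers = [
        majority_vote([person[q] for person in responses])
        for q in range(len(key))
    ]
    return sum(a == k for a, k in zip(crowd_answers, key))


# Hypothetical toy data: three people, three multiple-choice questions.
responses = [
    ["A", "B", "C"],
    ["A", "C", "C"],
    ["B", "B", "D"],
]
key = ["A", "B", "D"]
print(crowd_score(responses, key))  # 2
```

In the toy data the crowd is right on the first two questions but wrong on the third, where the majority shares the same error.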
A growing number of studies have linked facial width-to-height ratio (fWHR) with various antisocial or violent behavioral tendencies. However, those studies have predominantly been laboratory based and low powered. This work reexamined the links between fWHR and behavioral tendencies in a large sample of 137,163 participants. Behavioral tendencies...
Manual code reviews are an essential but time-consuming part of software development, often leading reviewers to prioritize technical issues while skipping valuable assessments. This paper presents an algorithmic model that automates aspects of code review typically avoided due to their complexity or subjectivity, such as assessing coding time, imp...
ChatGPT-4 and 600 human raters evaluated 226 public figures’ personalities using the Ten-Item Personality Inventory. The correlation between ChatGPT-4 and aggregate human ratings ranged from r=.76 to .87, outperforming the models specifically trained to make such predictions. Notably, the model was not provided with any training data or feedback on...
Can large multimodal models have a human-like ability for emotional and social reasoning, and if so, how does it work? Recent research has discovered emergent theory-of-mind (ToM) reasoning capabilities in large language models (LLMs). LLMs can reason about people's mental states by solving various text-based ToM tasks that ask questions about the...
Politicians invest heavily in social media to amplify narratives about their nations, but the effectiveness of such approaches remains unclear. Analyzing 758,222 posts from US and UK politicians on X (formerly Twitter), we found that right-wing politicians’ posts portraying the nation as exceptional and entitled (defensive identity rhetoric) receiv...
Carefully standardized facial images of 591 participants were taken in the laboratory while controlling for self-presentation, facial expression, head orientation, and image properties. They were presented to human raters and a facial recognition algorithm: both humans (r = .21) and the algorithm (r = .22) could predict participants’ scores on a po...
We show that people’s perceptions of public figures’ personalities can be accurately predicted from their names’ location in GPT-3’s semantic space. We collected Big Five personality perceptions of 226 public figures from 600 human raters. Cross-validated linear regression was used to predict human perceptions from public figures’ name embeddings e...
We design a battery of semantic illusions and cognitive reflection tests, aimed to elicit intuitive yet erroneous responses. We administer these tasks, traditionally used to study reasoning and decision-making in humans, to OpenAI’s generative pre-trained transformer model family. The results show that as the models expand in size and linguistic pr...
What environmental factors are associated with individual differences in political ideology, and do such associations change over time? We examine whether reductions in pathogen prevalence in U.S. states over the past 60 years are associated with reduced associations between parasite stress and conservatism. We report a positive association between...
We show that ChatGPT can predict public figures’ perceived personalities without being provided with any training data or feedback on its performance. ChatGPT and 600 human raters evaluated 300 public figures’ personalities using the Ten Item Personality Inventory. The correlation between ChatGPT and humans’ ratings ranged from r=.81 to .96, outper...
Modern large language models generate texts that are virtually indistinguishable from those written by humans and achieve near-human performance in comprehension and reasoning tests. Yet, their complexity makes it difficult to explain and predict their functioning. We examined a state-of-the-art language model (GPT-3) using lexical decision tasks w...
A facial recognition algorithm was used to extract face descriptors from carefully standardized images of 591 neutral faces taken in the laboratory setting. Face descriptors were entered into a cross-validated linear regression to predict participants' scores on a political orientation scale (Cronbach's alpha=.94) while controlling for age, gender,...
The digital revolution has had a momentous impact on almost every facet of social life. This sea change offers social psychologists new tools to deploy in their quest to understand human behavior and new types of social interaction to study. In a short time span, large data sets have emerged containing “digital footprints” of people’s behavior, com...
Theory of mind (ToM), or the ability to impute unobservable mental states to others, is central to human social interactions, communication, empathy, self-consciousness, and morality. We administer classic false-belief tasks, widely used to test ToM in humans, to several language models, without any examples or pre-training. Our results show that m...
We show that people’s perceptions of public figures’ personalities can be accurately predicted from their names’ location in GPT-3’s semantic space. We collected Big Five personality perceptions of 300 public figures from 600 human raters. Cross-validated linear regression was used to predict human perceptions from public figures’ name embeddings...
Modern language models generate texts that are virtually indistinguishable from those written by humans and achieve near-human performance in comprehension and reasoning tests. Yet, their complexity makes it difficult to explain and predict their functioning. We examined a state-of-the-art language model (GPT-3) using lexical decision tasks widely...
Artificial intelligence (AI) technologies revolutionize vast fields of society. Humans using these systems are likely to expect them to work in a potentially hyperrational manner. However, in this study, we show that some AI systems, namely large language models (LLMs), exhibit behavior that strikingly resembles human-like intuition - and the many...
Objective
We explore the personality of counties as assessed through linguistic patterns on social media. Such studies were previously limited by the cost and feasibility of large-scale surveys; however, language-based computational models applied to large social media datasets now allow for large-scale personality assessment.
Method
We applied a...
Beyond being facilitators of human interactions, social networks have become an interesting target of research, providing rich information for studying and modeling users’ behavior. Identification of personality-related indicators encrypted in Facebook profiles and activities is of special concern in our current research efforts. This paper explor...
Social media profiles are telling examples of the everyday need for disclosure and concealment. The balance between concealment and disclosure varies across individuals, and personality traits might partly explain this variability. Experimental findings on the relationship between information disclosure and personality have been so far inconsistent...
In the Workshop on Computational Personality Recognition (Shared Task), we released two datasets, varying in size and genre, annotated with gold standard personality labels. This allowed participants to evaluate features and learning techniques, and even to compare the performances of their systems for personality recognition on a common benchmark....
Ubiquitous facial recognition technology can expose individuals’ political orientation, as faces of liberals and conservatives consistently differ. A facial recognition algorithm was applied to naturalistic images of 1,085,795 individuals to predict their political orientation by comparing their similarity to faces of liberal and conservative other...
The widely disseminated convergence in physical appearance hypothesis posits that long-term partners’ facial appearance converges with time due to their shared environment, emotional mimicry, and synchronized activities. Although plausible, this hypothesis is incompatible with empirical findings pertaining to a wide range of other traits—such as pe...
Powered by better hardware and software, and fueled by the emergence of computational social science, digital traces of human activity can be used to make highly personal inferences about their owner’s preferences, habits and psychological characteristics. The gained insights allow the application of psychological targeting and make it possible to...
Psychological targeting describes the practice of extracting people's psychological profiles from their digital footprints (e.g. their Facebook Likes, Tweets or credit card records) in order to influence their attitudes, emotions or behaviors through psychologically informed interventions at scale. We discuss how the increasingly blurred lines betw...
The convergence in physical appearance hypothesis posits that long-term partners’ faces become more similar with time as a function of the shared environment, diet, and synchronized facial expressions. While this hypothesis has been widely disseminated in psychological literature, it is supported by a single study of 12 married couples. Here, we ex...
The parasite stress hypothesis predicts that individuals living in regions with higher infectious disease rates will show lower openness, agreeableness, and extraversion, but higher conscientiousness. This article, using data from more than 250,000 U.S. Facebook users, reports tests of these predictions at the level of both U.S. states and individu...
Over the past century, personality theory and research have successfully identified core sets of characteristics that consistently describe and explain fundamental differences in the way people think, feel and behave. Such characteristics were derived through theory, dictionary analyses, and survey research using explicit self-reports. The availabil...
Predictive performance on questionnaire based tasks for factors without residualization of age and gender for 10 and 30 factors.
For comparison, with the questionnaire items, we calculate the 10 aspect scores and 30 facet based scores, using the relevant IPIP items. Demog indicates that age and gender were also added as co-variates to learn predict...
Predictive performance as a function of vocabulary size.
We show mean Pearson’s R over 10 random train-test splits for FriendSize and IQ, while for Likes we show the mean area under the curve (AUC) over all 20 categories. In particular, we learn factors by restricting the vocabulary size to the top K words and evaluate these learned factors on thei...
Predictive performance on social media based tasks for factors with residualization of age and gender.
We show mean Pearson’s R over 10 random train-test splits for FriendSize, Income and IQ while for Likes we show the mean area under the curve (AUC) over all 20 categories. Language based factors (FA) perform competitively and even outperform quest...
Predictive performance on social media tasks for factors with residualization of age and gender for 10 and 30 factors.
Demog indicates that age and gender were also added as co-variates to learn predictive models. We show mean Pearson’s R over 10 random train-test splits for FriendSize, Income and IQ while for Likes we show the mean area under the...
Predictive performance on questionnaire based tasks for factors with residualization of age and gender for 10 and 30 factors.
Demog indicates that age and gender were also added as co-variates to learn predictive models. We show mean Pearson’s R over 10 random train-test splits. Language based factors (FA) do not outperform questionnaire bas...
Predictive performance on social media tasks for factors without residualization of age and gender for 10 and 30 factors.
For comparison, with the questionnaire items, we calculate the 10 aspect scores and 30 facet based scores, using the relevant IPIP items. Demog indicates that age and gender were also added as co-variates to learn predictive mod...
Predictive performance on questionnaire based tasks for factors with residualization of age and gender.
Demog indicates that age and gender were also added as co-variates to learn predictive models. We show mean Pearson’s R over 10 random train-test splits. Language based factors (FA) do not outperform questionnaire based factors.
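The evaluation protocol named in these captions, a mean Pearson correlation over 10 random train-test splits, can be sketched as follows. This is a simplified illustration and not the paper's pipeline: it assumes a single predictor fitted by ordinary least squares, and the function names, the 20% test fraction, the seeds, and the synthetic data are all assumptions made here.

```python
import math
import random
from typing import List, Sequence, Tuple


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(vx * vy)


def fit_line(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least squares for one predictor; returns (slope, intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum(
        (a - mx) ** 2 for a in x
    )
    return slope, my - slope * mx


def mean_r_over_splits(
    x: Sequence[float],
    y: Sequence[float],
    n_splits: int = 10,
    test_fraction: float = 0.2,  # assumed; the captions do not state the split size
    seed: int = 0,
) -> float:
    """Fit on a random training part, correlate predictions with held-out values,
    and average the correlation over n_splits random splits."""
    rng = random.Random(seed)
    indices = list(range(len(x)))
    n_test = max(2, int(len(x) * test_fraction))
    scores: List[float] = []
    for _ in range(n_splits):
        rng.shuffle(indices)
        test, train = indices[:n_test], indices[n_test:]
        slope, intercept = fit_line([x[i] for i in train], [y[i] for i in train])
        predicted = [slope * x[i] + intercept for i in test]
        scores.append(pearson_r(predicted, [y[i] for i in test]))
    return sum(scores) / len(scores)


# Synthetic toy data: a noisy linear relationship.
data_rng = random.Random(42)
x = [float(i) for i in range(50)]
y = [2.0 * xi + data_rng.gauss(0.0, 5.0) for xi in x]
score = mean_r_over_splits(x, y)
print(round(score, 3))
```

Averaging over several random splits reduces the dependence of the reported correlation on any single partition of the data.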
Word clouds showing the most/least correlated words for each FA factor as obtained using differential language analysis with age and gender residualized.
Residualizing out demographics like age and gender appears to reveal other dimensions of variance (such as geography and ethnicity), as illustrated by F5 that reveals a factor highlighting language use o...
Background: Research suggests that humans have the tendency to increase the valence of events when these are imagined to happen in the future, but to decrease the valence when the same events are imagined to happen in the past. This line of research, however, has mostly been conducted by asking participants to value imagined, yet probable, events....
Research over the past decade has shown that various personality traits are communicated through musical preferences. One limitation of that research is external validity, as most studies have assessed individual differences in musical preferences using self-reports of music-genre preferences. Are personality traits communicated through behavioral...
This study empirically examines context collapse on Facebook by examining audience influences on content and language in self-disclosures. Context collapse is the process of disparate audiences being conjoined into one. Using a public longitudinal behavioral data set of 6,378 Facebook users, the study found that the size and heterogeneity of people...
Elderly people are exposed to information technologies to keep them in touch with younger generations. Among various technologies, social network sites (SNSs) are seldom used by the majority of elderly people. To bridge the digital divide, it is necessary to dig deeply into the minority elderly users of SNSs. This study explores usage patterns of e...
We show that faces contain much more information about sexual orientation than can be perceived or interpreted by the human brain. We used deep neural networks to extract features from 35,326 facial images. These features were entered into a logistic regression aimed at classifying sexual orientation. Given a single facial image, a classifier could...
For the past 40 years, the conventional univariate model of self-monitoring has reigned as the dominant interpretative paradigm in the literature. However, recent findings associated with an alternative bivariate model challenge the conventional paradigm. In this study, item response theory is used to develop measures of the bivariate model of acqu...
Subjective well-being includes ‘affect’ and ‘satisfaction with life’ (SWL). This study proposes a unified approach to construct a profile of subjective well-being based on social media language in Facebook status updates. We apply sentiment analysis to generate users’ affect scores, and train a random forest model to predict SWL using affect scores...
Variable importance table, ranked in descending order.
The variables are ranked in descending order according to the mean decrease in accuracy.
Investigation of 30-day period immediately prior to SWL survey completion.
Here we repeat the SWL pipeline for a limited time period (30 days) for each user. We also study the impact of two different thresholds for minimum number of status updates per user.
Topic and topic words for Facebook activities and the ratings from two raters.
The list of words relating to activities used for activity sentiment analysis.
Variable importance graph.
The graph shows the top 50 important topics in the random forest model.
Significance
Building on recent advancements in the assessment of psychological traits from digital footprints, this paper demonstrates the effectiveness of psychological mass persuasion—that is, the adaptation of persuasive appeals to the psychological characteristics of large groups of individuals with the goal of influencing their behavior. On t...
People are exposed to persuasive communication across many different contexts: governments, companies, and political parties use persuasive appeals to encourage people to eat healthier, purchase a particular product, or vote for a specific candidate. Laboratory studies show that such persuasive appeals are more effective in influencing behavior whe...
Inspired by the work of the AAAS Science and Human Rights Coalition (AAAS is the publisher of Science), we asked young scientists this question: Describe how applications of knowledge in your field (information, methodologies, services, and/or products) could support civil, political, economic, social, or cultural rights. We received responses fro...
People spend considerable effort managing the impressions they give others. Social psychologists have shown that people manage these impressions differently depending upon their personality. Facebook and other social media provide a new forum for this fundamental process; hence, understanding people's behaviour on social media could provide interes...
We show that faces contain much more information about sexual orientation than can be perceived and interpreted by the human brain. We used deep neural networks to extract features from 35,326 facial images. These features were entered into a logistic regression aimed at classifying sexual orientation. Given a single facial image, a classifier coul...
Religious affiliation is an important identifying characteristic for many individuals and relates to numerous life outcomes including health, well-being, policy positions, and cognitive style. Using methods from computational linguistics, we examined language from 12,815 Facebook users in the United States and United Kingdom who indicated their rel...
As participant recruitment and data collection over the Internet have become more common, numerous observers have expressed concern regarding the validity of research conducted in this fashion. One growing method of conducting research over the Internet involves recruiting participants and administering questionnaires over Facebook, the world’s lar...
Research has typically examined the link between activity patterns and affect among late middle-aged and older people, in the context of continuity and activity theory. The aim of the present research was to test continuity and activity theory among adults of younger employed age (25-54) and of late middle and older age (over 55 years) in the online cont...
Friends and spouses tend to be similar in a broad range of characteristics, such as age, educational level, race, religion, attitudes, and general intelligence. Surprisingly, little evidence has been found for similarity in personality, one of the most fundamental psychological constructs. We argue that the lack of evidence for personality similarit...
Do others perceive the personality changes that take place between the ages of 14 and 29 in a similar fashion as the aging person him- or herself? This cross-sectional study analyzed age trajectories in self- versus other-reported Big Five personality traits and in self-other agreement in a sample of more than 10,000 individuals from the myPersonal...
This article aims to fill some gaps in theory and research on age trends in musical preferences in adulthood by presenting a conceptual model that describes three classes of determinants that can affect those trends. The Music Preferences in Adulthood Model (MPAM) posits that some psychological determinants that are extrinsic to the music (individu...
There are two conflicting perspectives regarding the relationship between profanity and dishonesty. These two forms of norm-violating behavior share common causes and are often considered to be positively related. On the other hand, however, profanity is often used to express one’s genuine feelings and could therefore be negatively related to disho...