Conference Paper

Using Machine Learning Techniques to Examine the Relationship between Money, Personality, and Well-being.



This presentation examines two case studies in which machine learning methods enhanced psychological investigation compared with an explanation-based approach. Case 1: Using random forests to identify which personality variables are most predictive of plastic bag purchases. Case 2: Using decision trees to understand the subgroups and non-linear relationships among the variables predicting well-being.
Rosa Lavelle-Hill
Aim: To illustrate how machine learning tools can aid
psychological investigation
Collaborators: James Goulding (2), David D. Clarke (1), Anya Skatova (3), and Peter A. Bibby (1)
(1) Department of Psychology, University of Nottingham
(2) N-LAB, Business School, University of Nottingham
(3) School of Experimental Psychology, University of Bristol
Why a Machine Learning Approach?
The goal of understanding human behaviour involves being able to explain behaviour,
and then to predict it.
Prediction and explanation are not synonymous (Shmueli, 2010).
Increased false-positive results and inflated effect sizes (Yarkoni & Westfall, 2017).
The rate of successful replication in psychology is relatively low (Open Science Collaboration, 2015; Yarkoni &
Westfall, 2017).
Machine learning models can be optimised to the point at which they generalise best.
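One common way to find the point at which a model generalises best is k-fold cross-validation: every candidate setting is scored only on data the model did not see during fitting, and the best-scoring setting is kept. The sketch below is illustrative only. The slides do not say which tuning procedure was used, and the nearest-neighbour classifier, the synthetic data, and all function names are assumptions made for the example.

```python
import random

def k_fold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k roughly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def knn_predict(train, x, k):
    """Majority vote among the k training points nearest to x (one feature)."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > len(nearest) else 0

def cv_accuracy(data, k_neighbours, folds=5):
    """Out-of-sample accuracy pooled over folds: each point is predicted
    by a model fitted without that point's fold."""
    correct = 0
    for held_out in k_fold_indices(len(data), folds):
        held = set(held_out)
        train = [d for i, d in enumerate(data) if i not in held]
        correct += sum(
            knn_predict(train, data[i][0], k_neighbours) == data[i][1]
            for i in held_out
        )
    return correct / len(data)

# Synthetic data (NOT the study's data): label is 1 when x > 0.5,
# with roughly 20% of labels flipped to mimic noise.
rng = random.Random(42)
data = []
for _ in range(200):
    x = rng.random()
    label = int(x > 0.5)
    if rng.random() < 0.2:
        label = 1 - label
    data.append((x, label))

candidates = [1, 5, 15, 45]  # odd values avoid tied votes
scores = {k: cv_accuracy(data, k) for k in candidates}
best_k = max(scores, key=scores.get)
print(scores, "-> chosen k:", best_k)
```

The setting is chosen by its held-out score rather than by its fit to the training data, which is what guards against over-optimistic estimates.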
Our Data
Loyalty card transaction data
Large health and beauty retailer
80,000 customers
12,968 questionnaires fully completed and matched to participants’ purchasing data
2,474,011 individual transactions
91% were female
Participants gave informed consent for their questionnaire to be matched with
pseudo-anonymized loyalty card data prior to full anonymization
Case study 1: Predicting Plastic Bag Purchases
Q: What profile of person continues to buy plastic bags consistently after the 5p
levy was brought in?
Bottom-up approach:
What can the best predictive model tell us about the motivations for the
76% accuracy on a binary classification task
Most important predictors were not the questions relating to environmental
People who bought more plastic bags were:
Less frugal
(questions on saving and disciplined spending)
More impulsive
And had lower self-control
Implications: Future interventions or plastic reduction campaigns need to appeal
to/target these personality profiles
Methodological Highlights
Not restricted by theory, as the model makes no prior assumptions
Duration was predictive – hypothesis generating
Confident we have the best estimate of variable importance
Confident our findings generalise
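Variable importance in a tree ensemble can be estimated by permutation: shuffle one predictor in held-out data and measure how much accuracy drops. The sketch below is a simplified stand-in, not the authors' pipeline. It uses a small bagged ensemble of one-split trees (stumps) with random feature subsets instead of a full random forest, the data are synthetic, and every function name is hypothetical.

```python
import random

def fit_stump(rows, labels, feature):
    """Find the threshold on one feature that misclassifies the fewest rows."""
    majority = int(2 * sum(labels) >= len(labels))
    values = sorted(set(r[feature] for r in rows))
    # Fallback: no split, predict the majority class for every row.
    best = (sum(y != majority for y in labels), feature,
            values[0] - 1.0, majority, majority)
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2
        left = [y for r, y in zip(rows, labels) if r[feature] <= t]
        right = [y for r, y in zip(rows, labels) if r[feature] > t]
        l_lab = int(2 * sum(left) >= len(left))
        r_lab = int(2 * sum(right) >= len(right))
        errors = sum(y != l_lab for y in left) + sum(y != r_lab for y in right)
        if errors < best[0]:
            best = (errors, feature, t, l_lab, r_lab)
    return best  # (errors, feature, threshold, left_label, right_label)

def fit_forest(rows, labels, n_trees=51, n_try=2, seed=0):
    """Bagging: each stump sees a bootstrap sample and a random feature subset."""
    rng = random.Random(seed)
    n, n_features = len(rows), len(rows[0])
    forest = []
    for _ in range(n_trees):
        sample = [rng.randrange(n) for _ in range(n)]
        b_rows = [rows[i] for i in sample]
        b_labels = [labels[i] for i in sample]
        tried = rng.sample(range(n_features), n_try)
        forest.append(min(fit_stump(b_rows, b_labels, f) for f in tried)[1:])
    return forest

def predict(forest, row):
    """Majority vote over all stumps."""
    votes = sum(l if row[f] <= t else r for f, t, l, r in forest)
    return int(2 * votes > len(forest))

def accuracy(forest, rows, labels):
    return sum(predict(forest, r) == y for r, y in zip(rows, labels)) / len(rows)

def permutation_importance(forest, rows, labels, seed=1):
    """Accuracy drop on held-out data after shuffling each feature in turn."""
    rng = random.Random(seed)
    baseline = accuracy(forest, rows, labels)
    drops = []
    for f in range(len(rows[0])):
        column = [r[f] for r in rows]
        rng.shuffle(column)
        shuffled = [r[:f] + [v] + r[f + 1:] for r, v in zip(rows, column)]
        drops.append(baseline - accuracy(forest, shuffled, labels))
    return baseline, drops

# Synthetic data (NOT the study's data): only feature 0 drives the label;
# features 1 and 2 are pure noise, and about 5% of labels are flipped.
rng = random.Random(7)
rows = [[rng.random() for _ in range(3)] for _ in range(250)]
labels = [int(r[0] > 0.5) if rng.random() > 0.05 else int(r[0] <= 0.5)
          for r in rows]
forest = fit_forest(rows[:150], labels[:150])
baseline, importances = permutation_importance(forest, rows[150:], labels[150:])
print("held-out accuracy:", baseline, "importances:", importances)
```

Because importance is measured on rows the ensemble never saw, a large drop reflects predictive value that generalises rather than a pattern memorised from the training data.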
Case study 2: Predicting Well-being
Personality Well-being
Understanding the Variables Predicting Well-being
In psychology, we want not only to build strong predictive models but also to
know why they work.
Decision Tree Analysis Predicting Well-being
Any data, no
Missing data
60% less input data
Additional insights…
Insight 1: ‘Protective’ effect of Extroversion
The positive effect of social
Insight 2: Age and Well-being
There is only one part of
the age variable which is
highly predictive of
Insight 3: Money and Happiness
The data supports a ‘basic needs’
hypothesis, where after a
threshold amount, money doesn’t
strongly relate to happiness.
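A threshold of this kind is what a decision tree's first split recovers: the tree searches for the cut point on a predictor that best separates low from high outcome values. A minimal sketch of that search for a single regression split follows. The income and well-being values are made up for illustration (arbitrary units, not the study's data), and the function names are hypothetical.

```python
def sse(values):
    """Sum of squared deviations from the mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def best_split(xs, ys):
    """Return the cut point on xs that minimises within-group squared error in ys."""
    pairs = sorted(zip(xs, ys))
    best_cost, best_threshold = float("inf"), None
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # cannot split between identical predictor values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        cost = sse(left) + sse(right)
        if cost < best_cost:
            best_cost = cost
            best_threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
    return best_threshold

# Hypothetical values: well-being is low below a certain income and flat above it.
income = [10, 20, 30, 40, 50, 60, 70, 80]
wellbeing = [3.0, 3.5, 3.2, 6.0, 6.1, 5.9, 6.0, 6.2]
threshold = best_split(income, wellbeing)
print(threshold)  # 35.0
```

A linear model would spread this effect across the whole income range, whereas the split places it at a single cut point, which is why trees suit threshold-style hypotheses.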
Decision Trees
More (not less) interpretable
Subgroup analysis can inform where to target interventions
Inform a future reduction in data collection
(i.e. sensitive questions)
Gosling, S. D., Rentfrow, P. J., & Swann, W. B., Jr. (2003). A Very Brief Measure of the
Big Five Personality Domains. Journal of Research in Personality, 37, 504-528.
Open Science Collaboration. (2015). Estimating the reproducibility of psychological
science. Science, 349(6251), aac4716.
Shmueli, G. (2010). To explain or to predict? Statistical Science, 25(3), 289-310.
Yarkoni, T., & Westfall, J. (2017). Choosing prediction over explanation in psychology:
Lessons from machine learning. Perspectives on Psychological Science, 12(6),