Replicability of Trait-Outcome Associations 1
Soto, C. J. (2019). How replicable are links between personality traits and consequential life
outcomes? The Life Outcomes Of Personality Replication Project. Psychological
Science, 30, 711-727.
How Replicable Are Links Between Personality Traits and Consequential Life Outcomes?
The Life Outcomes Of Personality Replication Project
Christopher J. Soto
Colby College
Corresponding Author
Christopher J. Soto, Department of Psychology, Colby College, 5550 Mayflower Hill,
Waterville, ME 04901. E-mail: christopher.soto@colby.edu.
Abstract
The Big Five personality traits have been linked with dozens of life outcomes. However,
metascientific research has raised questions about the replicability of behavioral science. The
Life Outcomes Of Personality Replication (LOOPR) Project was therefore conducted to estimate
the replicability of the personality-outcome literature. Specifically, we conducted preregistered,
high-powered (median N = 1,504) replications of 78 previously published trait-outcome
associations. Overall, 87% of the replication attempts were statistically significant in the
expected direction. The replication effects were typically 77% as strong as the corresponding
original effects, which represents a significant decline in effect size. The replicability of
individual effects was predicted by the effect size and design of the original study, as well as the
sample size and statistical power of the replication. These results indicate that the personality-
outcome literature provides a reasonably accurate map of trait-outcome associations, but also
stands to benefit from efforts to improve replicability.
Keywords: Big Five; life outcomes; metascience; personality traits; replication
How Replicable Are Links Between Personality Traits and Consequential Life Outcomes?
The Life Outcomes Of Personality Replication Project
Do personality characteristics reliably predict consequential life outcomes? A sizable
research literature has identified links between the Big Five personality traits and dozens of
outcomes (Ozer & Benet-Martinez, 2006; Roberts, Kuncel, Shiner, Caspi, & Goldberg, 2007).
Based on this personality-outcome literature, economists, educators, and policymakers have
proposed initiatives to promote well-being through positive personality development
(Chernyshenko, Kankaraš, & Drasgow, 2018; Kautz, Heckman, Diris, ter Weel, & Borghans,
2014; OECD, 2015; Primi, Santos, John, & De Fruyt, 2016). However, recent metascientific
research has raised questions about the replicability of behavioral science (Button et al., 2013;
Camerer et al., 2016; Cova et al., in press; Open Science Collaboration, 2015; Simmons, Nelson,
& Simonsohn, 2011; Vul, Harris, Winkielman, & Pashler, 2009). We therefore conducted the
Life Outcomes Of Personality Replication (LOOPR) Project, an effort to estimate the
replicability of the personality-outcome literature. Specifically, we attempted preregistered, high-
powered replications of 78 previously published associations between the Big Five traits and a
diverse set of consequential life outcomes.
Personality Traits and Consequential Life Outcomes
A personality trait is a characteristic pattern of thinking, feeling, or behaving that tends to
be consistent over time and across relevant situations (Allport, 1961). The world’s languages
include thousands of adjectives for describing personality, many of which can be organized in
terms of the Big Five trait dimensions: Extraversion (e.g., sociable, assertive, energetic vs. quiet,
reserved), Agreeableness (compassionate, respectful, trusting vs. rude, suspicious),
Conscientiousness (orderly, hard-working, responsible vs. disorganized, unreliable), Negative
Emotionality (or Neuroticism; worrying, pessimistic, temperamental vs. calm, stable), and Open-
Mindedness (or Openness to Experience; intellectual, artistic, imaginative vs. incurious,
uncreative) (De Raad, Perugini, Hrebícková, & Szarota, 1998; Goldberg, 1993; John, Naumann,
& Soto, 2008).
The Big Five constitute the most widely used framework for conceptualizing and
measuring personality traits (Almlund, Duckworth, Heckman, & Kautz, 2011; John et al., 2008).
This scientific consensus reflects their usefulness for organizing personality-descriptive
language, as well as a substantial research literature linking them with life outcomes. The most
comprehensive literature review conducted to date summarized associations between the Big
Five and dozens of individual, interpersonal, and social institutional outcomes (Ozer & Benet-
Martinez, 2006). For example, high Extraversion has been linked with social status and
leadership capacity, Agreeableness with volunteerism and relationship satisfaction,
Conscientiousness with job performance and health, Negative Emotionality with relationship
conflict and psychopathology, and Open-Mindedness with spirituality and political liberalism.
The Replicability of Behavioral Science
Drawing on both conceptual and empirical evidence, recent metascientific research (i.e.,
the scientific study of science itself) has raised questions about the replicability of behavioral
science: the likelihood that independent researchers conducting similar studies will obtain similar
results. Conceptually, this work has focused on researcher degrees of freedom, statistical power,
and publication bias. Researcher degrees of freedom represent undisclosed flexibility in the
design, analysis, and reporting of a scientific study (Simmons et al., 2011). Statistical power is
the probability of obtaining a statistically significant result when the effect being tested truly
exists in the population (Cohen, 1988). Publication bias occurs when journals selectively publish
studies with statistically significant results, thereby producing a literature that under-represents
null results (Sterling, Rosenbaum, & Weinkam, 1995). Multiple observers have expressed
concern that much behavioral science is characterized by many researcher degrees of freedom,
modest statistical power, and strong publication bias, leading to the publication of numerous
false-positive results: statistical flukes that are unlikely to replicate (Fraley & Vazire, 2014;
Franco, Malhotra, & Simonovits, 2014; Rossi, 1990; Simmons et al., 2011; Sterling et al., 1995;
Tversky & Kahneman, 1971).
Recently, large-scale replication projects have begun to empirically test these concerns.
For example, the Reproducibility Project: Psychology (RP:P) attempted to replicate 100 studies
published in high-impact psychology journals. Despite high statistical power, the RP:P observed
a replication success rate of only 36% (when success was defined as a statistically significant
result in the expected direction), and found that the replication effects were only half as strong as
the original effects, on average (Open Science Collaboration, 2015). Similar projects in
economics and experimental philosophy have also obtained replicability estimates considerably
lower than would be expected in the absence of published false positives, although results have
varied somewhat across projects (Camerer et al., 2016; Cova et al., in press). These findings
reinforce concerns about the replicability of behavioral science, and suggest that replicability
may vary both between and within disciplines. For example, replicability appears to be higher for
original studies that (a) examined main effects rather than interactions, (b) reported intuitive
rather than surprising results, and (c) obtained a greater effect size, sample size, and strength of
evidence (Camerer et al., 2016; Cova et al., in press; Open Science Collaboration, 2015).
The LOOPR Project
In sum, previous research suggests that the Big Five personality traits relate with many
consequential life outcomes, but also raises questions about the replicability of behavioral
science. We therefore conducted the LOOPR Project to estimate the replicability of the
personality-outcome literature. Specifically, we attempted to replicate 78 previously published
trait-outcome associations, and then used the replication results to test two descriptive
hypotheses. First, we hypothesized that trait-outcome associations would be less than perfectly
replicable, due to the likely presence of published false positives and biased reporting of effect
sizes. Second, we hypothesized that the replicability of the personality-outcome literature may be
greater than the estimates obtained by previous large-scale replication projects in psychology,
due to normative practices in personality research such as using relatively large samples to
examine the main effects of personality traits. We also conducted exploratory analyses to search
for predictors of replicability, tentatively hypothesizing that original studies with greater effect
size, sample size, and strength of evidence, as well as replication attempts with greater sample
size and statistical power, may yield greater replicability.
Method
The LOOPR Project was conducted in six phases, which are briefly described below. An
extended description is available in the Supplemental Online Material. Additional materials,
including coded lists of the selected trait-outcome associations, original sources, and measures,
as well as the final survey materials, preregistration protocol and revisions, data, and analysis
code are available at https://osf.io/d3xb7. This research was approved by the Colby Institutional
Review Board.
The first phase of the project was to select a set of trait-outcome associations for
replication. We selected these from a published review of the personality-outcome literature
(Ozer & Benet-Martinez, 2006), whose Table 1 summarizes 86 associations between the Big
Five traits and 49 life outcomes. The author and a research assistant examined the summary
table, main text, and citations of this review to identify the empirical evidence supporting each
trait-outcome association. We then selected 78 associations, spanning all of the Big Five and 48
life outcomes, that could be feasibly replicated. These 78 hypothesized trait-outcome
associations served as the LOOPR Project’s primary units of analysis for estimating
replicability.1
The second phase was to code the empirical sources supporting each association, so that
our replication attempts could follow the original studies as closely as was feasible. We therefore
coded information about the sample, measures, analytic method, and results of one empirical
study or meta-analysis for each of the 78 trait-outcome associations, which resulted in the coding
of 38 original sources. Some sources assessed multiple traits, outcomes, sub-outcomes, or
subsamples; when results differed across these components, we coded each one separately.
Supplemental Appendix A lists citations for the 38 original sources. Detailed coding of the
original studies, including information about their samples, measures, and design, is available at
https://osf.io/mc3z7.
Footnote 1: Previous large-scale replication projects have typically treated the individual study as the primary unit of analysis. Because personality-outcome studies often examine multiple trait-outcome associations, we selected the individual association as the most appropriate unit of analysis for estimating replicability in this literature.
The third phase was to develop a survey procedure for assessing the Big Five traits and
48 selected life outcomes. We assessed personality using a brief, consensus measure of the Big
Five: the Big Five Inventory–2 (BFI-2; Soto & John, 2017). This 60-item questionnaire uses
short phrases to assess the prototypical facets of each Big Five trait domain. The 48 target life
outcomes were assessed using a battery of measures selected to follow the original studies as
closely as was feasible. For most outcomes, this involved administering the same outcome
measure used in the original study, or a subset of the original measures. For some outcomes, it
involved adapting interview items to a questionnaire format, or constructing items based on the
information available in the original source. To conserve assessment time, lengthy outcome
measures were abbreviated to approximately six items per outcome, sampling equally across
subscales or content domains to preserve content validity. After developing this assessment
battery, we used the Qualtrics platform to construct two online surveys; each survey included the
BFI-2 and approximately half of the outcome measures. Supplemental Table S1 lists the outcome
measures used in the original studies and replications, and Supplemental Appendix B lists
citations for these measures. Detailed coding of the original and replication outcome measures is
available at https://osf.io/mc3z7, and the final LOOPR surveys can be viewed at
https://osf.io/9nzxa (Survey 1) and https://osf.io/vdb6w (Survey 2).
The fourth phase was data collection. We used the Qualtrics Online Sample service to
administer our surveys to four samples of adults (ages 18 and older; used to replicate studies that
analyzed adult community samples) and young adults (ages 18-25; used to replicate studies that
analyzed student or young-adult samples). This yielded samples of 1,559 adults and 1,550 young
adults who completed Survey 1, and samples of 1,512 adults and 1,505 young adults who
completed Survey 2. Quota sampling was used to ensure that each sample would be
approximately representative of the United States population in terms of sex, race, and ethnicity,
and that the adult samples would also be representative in terms of age, educational attainment,
and household income. Participants were compensated approximately $3 per 25-minute survey.
A minimum sample size of 1,500 participants per sample was selected to maximize statistical
power within our budgetary constraints; this sample size provides power of 97.3% to detect a
small true correlation (.10), and greater than 99.9% power to detect a medium-sized (.30) or
large (.50) correlation, using two-tailed tests and a .05 significance level (Cohen, 1988).
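These power figures can be reproduced with the standard Fisher-z approximation for a two-tailed test of a correlation. The helper below is an illustrative sketch, not the project's analysis code:

```python
from math import atanh, sqrt
from scipy.stats import norm

def correlation_power(r, n, alpha=0.05):
    """Approximate power of a two-tailed test of H0: rho = 0,
    via the Fisher r-to-z transformation."""
    z_r = atanh(r)                    # Fisher z of the true correlation
    se = 1 / sqrt(n - 3)              # standard error of z
    z_crit = norm.ppf(1 - alpha / 2)  # two-tailed critical value
    # Probability that the test statistic lands beyond either bound
    return norm.sf(z_crit - z_r / se) + norm.cdf(-z_crit - z_r / se)

print(round(correlation_power(0.10, 1500), 3))  # → 0.973, matching the text
```

With N = 1,500 this reproduces the 97.3% figure for r = .10, and power for r = .30 or .50 is effectively 1.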
The fifth phase was preregistration. We registered our hypotheses, design, materials, and
planned analyses on the Open Science Framework, an online platform for sharing scientific
projects (see https://osf.io/d3xb7). The preregistration protocol was submitted during data
collection, and prior to data analysis, thereby minimizing the influence of researcher degrees of
freedom.
The final phase was data analysis. Descriptive statistics for all personality and outcome
variables are presented in Supplemental Table S2. We conducted two key sets of planned
analyses and one set of exploratory analyses. The first set attempted to replicate each of the 78
hypothesized trait-outcome associations. The second set aggregated the results of these 78
replication attempts to estimate the overall replicability of the personality-outcome literature. We
examined replicability in terms of both statistical significance and effect size, using Pearson’s r
(or standardized regression coefficients when the original results could not be converted to r) as
our common effect size metric, and using Fisher’s r-to-z transformation to aggregate effects. The
final set of analyses searched for predictors of replicability by correlating indicators of
replication success with characteristics of the original study and replication attempt.
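The r-to-z aggregation step can be sketched as follows; this is a generic illustration of averaging correlations in the Fisher-z metric, not the project's analysis code:

```python
import numpy as np

def mean_effect(rs):
    """Aggregate correlations by averaging their Fisher
    z-transformed values, then back-transforming to r."""
    z = np.arctanh(rs)          # r-to-z transformation
    return float(np.tanh(z.mean()))  # back-transform the mean

print(round(mean_effect([0.20, 0.40, 0.60]), 3))  # ≈ .414
```

Because the z transformation stretches large correlations, the aggregate (.414) differs slightly from the simple arithmetic mean (.40).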
Results
Testing the Hypothesized Trait-Outcome Associations
Did the trait-outcome associations replicate? Our first set of planned analyses attempted
to replicate each of the 78 hypothesized associations. For each association, we conducted a
preregistered analysis specified to parallel the original study. For outcomes that included
multiple sub-outcomes or subsamples, we conducted a separate analysis for each component,
then aggregated these results (e.g., effect size, number of statistically significant results) to the
outcome level. For analyses involving outcome measures that had been abbreviated to conserve
assessment time, we computed the observed trait-outcome associations, and also estimated the
associations that would be expected if the outcome measure had not been abbreviated.
Specifically, we used the Spearman-Brown prediction formula and Spearman disattenuation
formula to estimate the trait-outcome associations that would be expected if our outcome
measure had used the same number of items or indicators as the original study (Lord & Novick,
1968). These corrected associations address the possibility that some failures to replicate could
simply reflect the attenuated reliability and validity of the abbreviated measures.
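One plausible reading of this correction can be sketched as follows: the Spearman-Brown formula projects the full-length measure's reliability from the short form's reliability, and the observed correlation is rescaled accordingly. The exact procedure in the original analyses may differ, and the numbers below (a 6-item short form with alpha = .70, abbreviated from an 18-item original) are hypothetical:

```python
from math import sqrt

def spearman_brown(rel_short, k):
    """Predicted reliability of a measure lengthened by factor k
    (Spearman-Brown prophecy formula)."""
    return k * rel_short / (1 + (k - 1) * rel_short)

def corrected_r(r_obs, rel_short, k):
    """Correlation expected had the outcome measure not been
    abbreviated: disattenuate for the short form's reliability,
    then re-attenuate for the predicted full-length reliability."""
    rel_full = spearman_brown(rel_short, k)
    return r_obs * sqrt(rel_full / rel_short)

print(round(spearman_brown(0.70, 3), 3))       # → 0.875
print(round(corrected_r(0.30, 0.70, 3), 3))    # → 0.335
```

The correction nudges an observed r = .30 up to about .34, consistent with the generally small observed/corrected gaps in Table 1.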
Table 1 presents the basic results of these analyses, including the number of significance
tests conducted for each hypothesized association, the (mean) sample size, the proportion of tests
that were statistically significant (i.e., two-tailed p-value < .05) in the hypothesized direction, the
(mean) original effect size, the (mean) replication effect size, and the ratio of the replication
effect size to the original effect size. To check the robustness of these results to variations in
sample size, Table 2 presents the replication success rates that would be expected using different
sample sizes: the sample size used in the original study, a sample size 2.5 times as large as the
original study (as recommended by Simonsohn, 2015), and a sample size with 80% power to
detect the original effect size (a heuristic that is often used to plan follow-up studies). More
detailed information about all of these analyses, including complete results by sub-outcome and
subsample, is available at https://osf.io/mc3z7.
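The third benchmark, the sample size giving 80% power to detect the original effect, can be computed by inverting the Fisher-z power approximation. A sketch, where r = .21 is a hypothetical original effect size used only for illustration:

```python
from math import atanh, ceil
from scipy.stats import norm

def n_for_power(r, power=0.80, alpha=0.05):
    """Sample size needed for a two-tailed test of H0: rho = 0
    to reach the target power, via the Fisher z approximation."""
    z_alpha = norm.ppf(1 - alpha / 2)  # two-tailed critical value
    z_power = norm.ppf(power)          # quantile for target power
    return ceil(((z_alpha + z_power) / atanh(r)) ** 2 + 3)

print(n_for_power(0.30))  # → 85
print(n_for_power(0.21))  # → 176
```

Small original effects demand much larger follow-up samples: 80% power requires N = 85 for r = .30 but N = 176 for r = .21.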
The results shown in Tables 1 and 2 indicate that many of the 78 replication attempts
obtained statistically significant support for the hypothesized associations, with effect sizes
comparable to the original results. However, these tables also suggest substantial variability in
the results of the replication attempts, in terms of both statistical significance and effect size.
Table 1
Summary of the Hypothesized Trait-Outcome Associations and Replication Results

| Outcome | Expected trait association | Number of tests | Replication sample size | Replication success rate | Replication effect size | Effect size ratio |
|---|---|---|---|---|---|---|
| Individual outcomes | | | | | | |
| Subjective well-being | E+ | 4 | 1,559 | 100/100 | .37/.39 | 2.18/2.31 |
| | N– | 4 | 1,559 | 100/100 | .52/.54 | 2.64/2.78 |
| Religious beliefs and behavior | A+ | 2 | 1,550 | 100/100 | .18/.19 | 0.63/0.69 |
| | C+ | 2 | 1,550 | 100/100 | .14/.15 | 0.60/0.65 |
| Existential/phenomenological concerns | O+ | 2 | 1,550 | 100/100 | .18/.20 | 0.50/0.56 |
| Existential well-being | E+ | 1 | 1,550 | 100/100 | .35/.37 | 1.12/1.18 |
| | N– | 1 | 1,550 | 100/100 | .60/.63 | 0.87/0.93 |
| Gratitude | E+ | 1 | 1,559 | 100/100 | .37/.37 | 1.17/1.17 |
| | A+ | 1 | 1,559 | 100/100 | .54/.54 | 1.39/1.39 |
| Forgiveness | A+ | 1 | 1,550 | 100/100 | .48/.57 | 0.79/0.97 |
| Inspiration | E+ | 1 | 1,514 | 100/100 | .39/.39 | 2.04/2.04 |
| | O+ | 1 | 1,514 | 100/100 | .35/.35 | 0.80/0.80 |
| Humor | A+ | 1 | 1,550 | 100/100 | .16/.16 | n.r. |
| | N– | 1 | 1,550 | 100/100 | .13/.13 | n.r. |
| Heart disease | A– | 1 | 1,235 | 0/0 | .04/.04 | 0.24/0.24 |
| Risky behavior | C– | 15 | 1,336 | 72/72 | .08/.08 | 0.31/0.31 |
| Coping | E+ | 2 | 1,505 | 50/100 | .17/.19 | 1.04/1.19 |
| | N– | 2 | 1,505 | 100/100 | .21/.24 | 1.32/1.50 |
| Resilience | E+ | 1 | 1,505 | 100/100 | .18/.18 | 0.96/0.96 |
| Substance abuse | C– | 1 | 1,505 | 100/100 | .06/.06 | 0.25/0.25 |
| | O+ | 1 | 1,505 | 0/0 | .02/.02 | 0.12/0.12 |
| Anxiety | N+ | 1 | 1,505 | 100/100 | .31/.31 | 0.90/0.90 |
| Depression | E– | 1 | 1,505 | 100/100 | .13/.13 | 0.28/0.28 |
| | N+ | 1 | 1,505 | 100/100 | .31/.31 | 0.64/0.64 |
| Personality disorders | E(+/–) | 4 | 1,505 | 75/75 | .30/.41 | 0.66/0.93 |
| | A– | 3 | 1,505 | 100/100 | .42/.58 | 0.95/1.40 |
| | C(+/–) | 5 | 1,505 | 100/100 | .30/.42 | 0.71/1.03 |
| | N(+/–) | 4 | 1,505 | 100/100 | .31/.41 | 0.82/1.11 |
| Identity achievement | C+ | 1 | 1,550 | 100/100 | .23/.25 | 0.75/0.83 |
| Identity foreclosure | O– | 1 | 1,550 | 100/100 | .33/.35 | 0.63/0.66 |
| Identity integration/consolidation | N– | 1 | 804 | 100/100 | .47/.57 | 2.31/2.86 |
| | O+ | 1 | 804 | 100/100 | .21/.25 | 0.77/0.92 |
| Ethnic culture identification (for minorities) | C+ | 1 | 181 | 100/100 | .18/.18 | 0.91/0.91 |
| Majority culture identification (for minorities) | E+ | 1 | 181 | 0/0 | .10/.10 | 0.28/0.28 |
| | O+ | 1 | 181 | 0/0 | .12/.12 | 0.41/0.41 |
| Interpersonal outcomes | | | | | | |
| Family satisfaction | C+ | 2 | 1,466 | 0/0 | -.07/-.08 | -0.69/-0.77 |
| | N– | 1 | 1,489 | 100/100 | .17/.19 | 1.74/1.89 |
| Peers' acceptance and friendship | E+ | 1 | 1,549 | 100/100 | .35/.35 | 0.84/0.84 |
| Dating variety | E+ | 1 | 1,284 | 100/100 | .12/.12 | 0.73/0.73 |
| Attractiveness | E+ | 1 | 1,550 | 100/100 | .33/.33 | 1.39/1.39 |
| Peer status | E+ | 2 | 775 | 100/100 | .39/.39 | 0.93/0.93 |
| Peer status (men) | N– | 1 | 749 | 100/100 | .31/.31 | 0.69/0.69 |
| Romantic satisfaction | E+ | 2 | 795 | 100/100 | .15/.18 | 0.53/0.63 |
| | N– | 2 | 795 | 100/100 | .20/.23 | 0.62/0.73 |
| Romantic satisfaction (dating couples) | A+ | 1 | 757 | 100/100 | .18/.22 | 0.51/0.63 |
| | C+ | 1 | 757 | 100/100 | .16/.19 | 0.44/0.53 |
| Romantic conflict | N+ | 1 | 1,154 | 0/0 | .01/.01 | 0.02/0.02 |
| Romantic abuse | N+ | 1 | 1,154 | 100/100 | .09/.09 | 0.35/0.37 |
| Romantic dissolution | N+ | 1 | 1,098 | 100/100 | .10/.10 | 0.45/0.45 |
| Social institutional outcomes | | | | | | |
| Investigative occupational interests | O+ | 1 | 1,503 | 100/100 | .15/.16 | 0.58/0.63 |
| Artistic occupational interests | O+ | 1 | 1,503 | 100/100 | .41/.43 | 1.39/1.51 |
| Social occupational interests | E+ | 1 | 1,503 | 100/100 | .15/.17 | 0.96/1.05 |
| | A+ | 1 | 1,503 | 100/100 | .08/.09 | 0.77/0.84 |
| Enterprising occupational interests | E+ | 1 | 1,503 | 100/100 | .18/.20 | 1.14/1.23 |
| Occupational performance | C– | 3 | 829 | 33/33 | .03/.03 | 0.31/0.31 |
| Occupational satisfaction | E+ | 1 | 747 | 100/100 | .19/.21 | 1.09/1.17 |
| | N– | 1 | 747 | 100/100 | .17/.18 | 0.72/0.77 |
| Occupational commitment | E+ | 1 | 748 | 100/100 | .32/.32 | 1.96/1.96 |
| | N– | 1 | 748 | 100/100 | .26/.26 | 1.38/1.38 |
| Extrinsic success | A– | 1 | 481 | 100/100 | .15/.15 | 0.63/0.63 |
| | C+ | 1 | 481 | 0/0 | -.07/-.07 | -0.13/-0.13 |
| | N– | 1 | 481 | 100/100 | .10/.10 | 0.28/0.28 |
| Intrinsic success | C+ | 1 | 512 | 100/100 | .24/.25 | 1.22/1.25 |
| | N– | 1 | 512 | 100/100 | .31/.32 | 1.20/1.24 |
| Job attainment | A+ | 1 | 838 | 0/0 | -.02/-.02 | -0.09/-0.09 |
| Occupational involvement | E+ | 1 | 944 | 100/100 | .17/.17 | 0.93/0.95 |
| Financial security | N– | 1 | 944 | 100/100 | .33/.33 | 1.52/1.52 |
| Right-wing authoritarianism | O– | 1 | 1,549 | 100/100 | .29/.32 | 0.80/0.92 |
| Conservatism | C+ | 1 | 1,559 | 100/100 | .14/.18 | 0.56/0.75 |
| | O– | 1 | 1,550 | 100/100 | .17/.25 | 0.49/0.74 |
| Volunteerism | E+ | 1 | 1,504 | 100/100 | .20/.20 | 1.41/1.41 |
| | A+ | 1 | 1,504 | 100/100 | .17/.17 | 0.74/0.74 |
| Leadership | E+ | 1 | 747 | 100/100 | .45/.47 | 2.16/2.28 |
| | A+ | 1 | 747 | 100/100 | .27/.28 | 1.00/1.05 |
| Antisocial behavior | C– | 1 | 1,550 | 100/100 | .26/.29 | 0.92/1.04 |
| | N+ | 1 | 1,550 | 100/100 | .06/.07 | 0.20/0.23 |
| Criminal behavior | A– | 1 | 1,550 | 100/100 | .23/.23 | 1.14/1.17 |
| | C– | 1 | 1,550 | 100/100 | .18/.19 | 0.58/0.59 |

Note. E = Extraversion. A = Agreeableness. C = Conscientiousness. N = Negative Emotionality. O = Open-Mindedness. + = Hypothesized positive association. – = Hypothesized negative association. n.r. = Not reported (original effect size unavailable, so no ratio could be computed). For replication success rate, replication effect size, and effect size ratio, values left of the forward slash represent the observed trait-outcome associations, and values right of the slash represent the corrected associations. All effect sizes are in the correlation metric or standardized regression coefficient metric, and oriented so that positive values represent effects in the hypothesized direction. For outcomes that include multiple sub-outcomes or subsamples, results are aggregated within each outcome. Mean effect sizes and effect size ratios were computed using Fisher's r-to-z transformation.
Table 2
Obtained and Expected Replication Success Rates for Varying Sample Sizes

All values in the four rightmost columns are replication success rates under the corresponding sample-size scenario.

| Outcome | Expected trait association | Number of tests | Replication sample size | Original sample size | Original sample size × 2.5 | Sample size with 80% power |
|---|---|---|---|---|---|---|
| Individual outcomes | | | | | | |
| Subjective well-being | E+ | 4 | 100/100 | 100/100 | 100/100 | 100/100 |
| | N– | 4 | 100/100 | 100/100 | 100/100 | 100/100 |
| Religious beliefs and behavior | A+ | 2 | 100/100 | 100/100 | 100/100 | 50/50 |
| | C+ | 2 | 100/100 | 100/100 | 100/100 | 50/50 |
| Existential/phenomenological concerns | O+ | 2 | 100/100 | 100/100 | 100/100 | 0/0 |
| Existential well-being | E+ | 1 | 100/100 | 100/100 | 100/100 | 100/100 |
| | N– | 1 | 100/100 | 100/100 | 100/100 | 100/100 |
| Gratitude | E+ | 1 | 100/100 | 100/100 | 100/100 | 100/100 |
| | A+ | 1 | 100/100 | 100/100 | 100/100 | 100/100 |
| Forgiveness | A+ | 1 | 100/100 | 100/100 | 100/100 | 100/100 |
| Inspiration | E+ | 1 | 100/100 | 100/100 | 100/100 | 100/100 |
| | O+ | 1 | 100/100 | 100/100 | 100/100 | 100/100 |
| Humor | A+ | 1 | 100/100 | 100/100 | 100/100 | n.a. |
| | N– | 1 | 100/100 | 0/0 | 100/100 | n.a. |
| Heart disease | A– | 1 | 0/0 | 0/0 | 0/0 | 0/0 |
| Risky behavior | C– | 15 | 72/72 | 61/61 | 89/89 | 33/33 |
| Coping | E+ | 2 | 50/100 | 50/50 | 50/100 | 50/50 |
| | N– | 2 | 100/100 | 100/100 | 100/100 | 100/100 |
| Resilience | E+ | 1 | 100/100 | 100/100 | 100/100 | 100/100 |
| Substance abuse | C– | 1 | 100/100 | 0/0 | 100/100 | 0/0 |
| | O+ | 1 | 0/0 | 0/0 | 0/0 | 0/0 |
| Anxiety | N+ | 1 | 100/100 | 100/100 | 100/100 | 100/100 |
| Depression | E– | 1 | 100/100 | 100/100 | 100/100 | 0/0 |
| | N+ | 1 | 100/100 | 100/100 | 100/100 | 0/0 |
| Personality disorders | E(+/–) | 4 | 75/75 | 75/75 | 75/75 | 25/75 |
| | A– | 3 | 100/100 | 100/100 | 100/100 | 100/100 |
| | C(+/–) | 5 | 100/100 | 100/100 | 100/100 | 60/80 |
| | N(+/–) | 4 | 100/100 | 100/100 | 100/100 | 50/100 |
| Identity achievement | C+ | 1 | 100/100 | 100/100 | 100/100 | 100/100 |
| Identity foreclosure | O– | 1 | 100/100 | 100/100 | 100/100 | 0/0 |
| Identity integration/consolidation | N– | 1 | 100/100 | 100/100 | 100/100 | 100/100 |
| | O+ | 1 | 100/100 | 100/100 | 100/100 | 100/100 |
| Ethnic culture identification (for minorities) | C+ | 1 | 100/100 | 100/100 | 100/100 | 100/100 |
| Majority culture identification (for minorities) | E+ | 1 | 0/0 | 0/0 | 100/100 | 0/0 |
| | O+ | 1 | 0/0 | 0/0 | 100/100 | 0/0 |
| Interpersonal outcomes | | | | | | |
| Family satisfaction | C+ | 2 | 0/0 | 0/0 | 0/0 | 0/0 |
| | N– | 1 | 100/100 | 100/100 | 100/100 | 100/100 |
| Peers' acceptance and friendship | E+ | 1 | 100/100 | 100/100 | 100/100 | 100/100 |
| Dating variety | E+ | 1 | 100/100 | 100/100 | 100/100 | 100/100 |
| Attractiveness | E+ | 1 | 100/100 | 100/100 | 100/100 | 100/100 |
| Peer status | E+ | 2 | 100/100 | 100/100 | 100/100 | 100/100 |
| Peer status (men) | N– | 1 | 100/100 | 100/100 | 100/100 | 0/0 |
| Romantic satisfaction | E+ | 2 | 100/100 | 100/100 | 100/100 | 50/50 |
| | N– | 2 | 100/100 | 50/50 | 100/100 | 50/50 |
| Romantic satisfaction (dating couples) | A+ | 1 | 100/100 | 100/100 | 100/100 | 0/0 |
| | C+ | 1 | 100/100 | 100/100 | 100/100 | 0/0 |
| Romantic conflict | N+ | 1 | 0/0 | 0/0 | 0/0 | 0/0 |
| Romantic abuse | N+ | 1 | 100/100 | 100/100 | 100/100 | 0/0 |
| Romantic dissolution | N+ | 1 | 100/100 | n.a. | n.a. | 0/0 |
| Social institutional outcomes | | | | | | |
| Investigative occupational interests | O+ | 1 | 100/100 | 100/100 | 100/100 | 0/0 |
| Artistic occupational interests | O+ | 1 | 100/100 | 100/100 | 100/100 | 100/100 |
| Social occupational interests | E+ | 1 | 100/100 | 100/100 | 100/100 | 100/100 |
| | A+ | 1 | 100/100 | 100/100 | 100/100 | 100/100 |
| Enterprising occupational interests | E+ | 1 | 100/100 | 100/100 | 100/100 | 100/100 |
| Occupational performance | C– | 3 | 33/33 | 33/33 | 67/67 | 33/33 |
| Occupational satisfaction | E+ | 1 | 100/100 | 100/100 | 100/100 | 100/100 |
| | N– | 1 | 100/100 | 100/100 | 100/100 | 100/100 |
| Occupational commitment | E+ | 1 | 100/100 | 100/100 | 100/100 | 100/100 |
| | N– | 1 | 100/100 | 100/100 | 100/100 | 100/100 |
| Extrinsic success | A– | 1 | 100/100 | 100/100 | 100/100 | 0/0 |
| | C+ | 1 | 0/0 | 0/0 | 0/0 | 0/0 |
| | N– | 1 | 100/100 | 0/0 | 100/100 | 0/0 |
| Intrinsic success | C+ | 1 | 100/100 | 100/100 | 100/100 | 100/100 |
| | N– | 1 | 100/100 | 100/100 | 100/100 | 100/100 |
| Job attainment | A+ | 1 | 0/0 | 0/0 | 0/0 | 0/0 |
| Occupational involvement | E+ | 1 | 100/100 | 100/100 | 100/100 | 100/100 |
| Financial security | N– | 1 | 100/100 | 100/100 | 100/100 | 100/100 |
| Right-wing authoritarianism | O– | 1 | 100/100 | 100/100 | 100/100 | 100/100 |
| Conservatism | C+ | 1 | 100/100 | 0/0 | 100/100 | 0/100 |
| | O– | 1 | 100/100 | 100/100 | 100/100 | 0/100 |
| Volunteerism | E+ | 1 | 100/100 | 100/100 | 100/100 | 100/100 |
| | A+ | 1 | 100/100 | 100/100 | 100/100 | 100/100 |
| Leadership | E+ | 1 | 100/100 | 100/100 | 100/100 | 100/100 |
| | A+ | 1 | 100/100 | 100/100 | 100/100 | 100/100 |
| Antisocial behavior | C– | 1 | 100/100 | 100/100 | 100/100 | 100/100 |
| | N+ | 1 | 100/100 | 0/0 | 0/0 | 0/0 |
| Criminal behavior | A– | 1 | 100/100 | 100/100 | 100/100 | 100/100 |
| | C– | 1 | 100/100 | 100/100 | 100/100 | 0/0 |

Note. Replication sample size = Sample size obtained in the replication study (cf. Table 1). Original sample size = Sample size obtained in the original study. Original sample size × 2.5 = Sample size 2.5 times as large as the original study (cf. Simonsohn, 2015). Sample size with 80% power = Sample size required to provide 80% statistical power to detect the original effect size. E = Extraversion. A = Agreeableness. C = Conscientiousness. N = Negative Emotionality. O = Open-Mindedness. + = Hypothesized positive association. – = Hypothesized negative association. n.a. = Not applicable, because required information was not available from the original study. For replication success rates, values left of the forward slash represent the observed trait-outcome associations, and values right of the slash represent the corrected associations. For outcomes that include multiple sub-outcomes or subsamples, results are aggregated within each outcome.
Testing Overall Replicability
How replicable is the personality-outcome literature, overall? Our second set of planned
analyses addressed this question by aggregating the results of the 78 replication attempts
summarized in Table 1. These analyses compared the results of the LOOPR Project with two
benchmarks: (a) the results that would be expected if all of the original findings represented true
effects (i.e., if the personality-outcome literature did not include any false positive results), and
(b) the results of a previous large-scale replication project conducted to estimate the overall
replicability of psychological science, the RP:P (Open Science Collaboration, 2015).2
We began by examining the rate of successful replication, defined simply as the
proportion of replication attempts that yielded statistically significant results in the hypothesized
direction. The results of this analysis are presented in Figure 1. Across the 76 trait-outcome
associations with an original effect size available for power analysis, the present research
obtained successful replication rates of 87.2% (66.3 successes, 95% CI [79.7%, 94.7%]) in tests
of the observed associations, and 87.9% (66.8 successes, 95% CI [80.6%, 95.2%]) after partially
correcting for the unreliability of abbreviated outcome measures. These success rates were
significantly lower than the rate of 99.3% (75.5 successes, 95% CI [97.4%, 100.0%]) expected
from power analyses of the original effect sizes and replication sample sizes (for observed
associations, χ2(1) = 8.79, p = .003; for corrected associations, χ2(1) = 8.23, p = .004). However,
they were significantly higher than the success rate of 36.1% (35 successes in 97 attempts, 95%
CI [26.5%, 45.6%]) obtained in the RP:P (for observed associations, χ2(1) = 45.96, p < .001; for
corrected associations, χ2(1) = 47.25, p < .001). These significant differences from the RP:P also
held for the complete set of 78 trait-outcome associations, with success rates of 87.6% (68.3
successes, 95% CI [80.2%, 94.9%]) for the observed associations, and 88.2% (68.8 successes,
95% CI [81.1%, 95.4%]) for the corrected associations (for observed associations, χ2(1) = 47.39,
p < .001; for corrected associations, χ2(1) = 48.69, p < .001).
Footnote 2: Because some of our replication attempts were dependent (due to a shared Big Five trait) rather than independent, or aggregated results across multiple sub-outcomes or subsamples, the p-values for these analyses should be considered approximate rather than exact.
The results presented in Table 2 indicate that these findings were also fairly robust to
variations in sample size. Specifically, the expected replication success rates would be 80.9%
(60.7 successes in 75 attempts, 95% CI [72.0%, 89.8%]) when using the same sample size as the
original study,3 89.1% (66.8 successes in 75 attempts, 95% CI [82.0%, 96.1%]) when using a
sample size 2.5 times as large as the original study, and 59.9% (45.5 successes in 76 attempts,
95% CI [48.9%, 70.9%]) when using a sample size that provides 80% statistical power to detect
the original effect. After partially correcting for unreliability, these expected success rates were
80.9% (60.7 successes, 95% CI [72.0%, 89.8%]), 89.7% (67.3 successes, 95% CI [82.9%,
96.6%]), and 64.1% (48.7 successes, 95% CI [53.3%, 74.9%]), respectively. All of these success
rates were significantly lower than would be expected from power analyses (all χ2(1) 4.77, p
.029), but significantly higher than those obtained in the RP:P (all χ2(1) 9.71, p .002).
Next, we examined the frequency with which the replication attempts obtained a trait-
outcome association weaker than the corresponding original effect, or not in the expected
direction. Across the 76 trait-outcome associations with an original effect size available for
comparison, the observed replication effect was weaker than the original effect 71.1% of the time
3 The original sample size was not available for one outcome.
Replicability of Trait-Outcome Associations 20
(54 cases, 95% CI [60.9%, 81.2%]); after partially correcting for the unreliability of abbreviated
outcome measures, the rate was 63.2% (48 cases, 95% CI [52.3%, 74.0%]). Binomial tests
indicated that both of these rates were significantly higher than the 50% rate that would be
expected if all of the original effect sizes represented true effects (for observed associations, p <
.001; for corrected associations, p = .029). However, Fisher’s exact tests indicated that the rate of
weaker replication effects obtained in the present research was less than the corresponding rate
of 82.8% (82 of 99 cases, 95% CI [75.4%, 90.3%]) obtained in the RP:P, and that this difference
was significant after correcting for unreliability (for observed associations, p = .070; for
corrected associations, p = .005).
Focusing on cases where the observed replication effect was substantially weaker than
the original effect (i.e., the z-transformed replication effect was at least 0.10 less than the
transformed original effect; Cohen, 1988), or not in the expected direction, yielded a similar
pattern of results. In the present research, the observed replication effect was substantially
weaker than the original effect 42.1% of the time (32 of 76 cases, 95% CI [31.0%, 53.2%]); after
correcting for unreliability, the rate was 30.3% (23 of 76 cases, 95% CI [19.9%, 40.6%]).
Fisher’s exact tests indicated that both of these rates were significantly lower than the
corresponding rate of 69.1% (67 of 97 cases, 95% CI [59.9%, 78.3%]) obtained in the RP:P (for
observed associations, p = .001; for corrected associations, p < .001).
Finally, we tested whether the mean and median of the z-transformed replication effect
sizes differed from the transformed original effect sizes, and whether the median effect size ratio
(i.e., the ratio of the replication effect size to the original effect size) differed between the present
research and the RP:P. Paired-samples t-tests indicated that the mean original effect size of 0.29
(95% CI [.26, .32]) was significantly stronger than both the mean observed replication effect of
Replicability of Trait-Outcome Associations 21
0.23 (95% CI [.20, .27], t(75) = 3.46, p = .001) and the mean corrected replication effect of 0.26
(95% CI [.22, .29], t(75) = 2.06, p = .043). Similarly, Wilcoxon signed-rank tests indicated that
the median original effect of 0.27 (95% CI [.23, .31]) was significantly stronger than the median
observed replication effect of 0.19 (95% CI [.17, .26], z = 3.59, p < .001) and the median
corrected replication effect of 0.22 (95% CI [.18, .27], z = 2.40, p = .016). However, Mann-
Whitney U tests indicated that the median effect size ratios of 0.77 (95% CI [.63, .92]) for
observed trait-outcome associations and 0.87 (95% CI [.73, .97]) for corrected associations
obtained in the present research were both significantly greater than the corresponding median
ratio of 0.43 (95% CI [,28, .62]) obtained in the RP:P (for observed effects, z = 4.22, p < .001;
for corrected effects, z = 4.86, p < .001). The results of this analysis, presented in Figure 2,
indicate that the replication effects obtained in the LOOPR Project were typically about 80% as
large as the corresponding original effects.
Taken together, these results support our hypothesis that the personality-outcome
literature is less replicable than would be expected if it did not include any false positive results,
but more replicable than the broader set of psychology studies examined by the RP:P. This
conclusion held whether replicability was assessed in terms of statistical significance or effect
size.
Replicability of Trait-Outcome Associations 22
Figure 1. Replication success rates obtained in the LOOPR Project, compared with the rate
expected from power analyses of the original effect size and replication sample size, and the rate
obtained in the Reproducibility Project: Psychology. A successful replication was defined as a
statistically significant effect (i.e., two-tailed p-value < .05) in the hypothesized direction.
Corrected associations were partially disattenuated to correct for the unreliability of abbreviated
outcome measures. Error bars represent 95% confidence limits.
Replicability of Trait-Outcome Associations 23
Figure 2. Median effect size ratios obtained in the LOOPR Project, compared with the ratio
expected if all original effect sizes represented true effects, and the median ratio obtained in the
Reproducibility Project: Psychology. Effect size ratios were computed as the ratio of the z-
transformed replication effect size to the transformed original effect size. Corrected associations
were partially disattenuated to correct for the unreliability of abbreviated outcome measures.
Error bars represent 95% confidence limits.
Replicability of Trait-Outcome Associations 24
Predictors of Replicability
What factors might influence the replicability of a trait-outcome association? Our final,
exploratory set of analyses searched for predictors of replicability. Specifically, we computed
Spearman’s rank correlations (ρ) across the set of 78 hypothesized trait-outcome associations to
correlate three characteristics of the original studies (effect size, sample size, and obtained p-
value4), two characteristics of the replication attempts (sample size and statistical power to detect
the original effect), and three aspects of similarity between the original study and the replication
attempt (whether the outcome was measured using the same indicators, the same data source, and
the same assessment timeline), with five indicators of replicability (statistical significance of the
replication effect, replication effect size, whether the replication effect was stronger than the
original effect, whether the replication effect was not substantially weaker than the original
effect, and ratio of the replication effect size to the original effect size).
These correlations, presented in Table 3, suggest three noteworthy patterns. First, the
original effect size positively predicted the replication effect size (for observed effects, ρ(74) =
.34, 95% CI [.12, .53], p = .002; for corrected effects, ρ(74) = .39, 95% CI [.18, .58], p < .001).
The original effect size also negatively predicted the likelihood that the replication effect would
be stronger than the original effect (for observed effects, ρ(74) = -.40, 95% CI [-.58, -.18], p <
.001; for corrected effects, ρ(74) = -.30, 95% CI [-.49, -.07], p = .009), as well as the likelihood
that the replication effect would not be substantially weaker than the original effect (for observed
effects, ρ(74) = -.40, 95% CI [-.58, -.18], p < .001; for corrected effects, ρ(74) = -.22, 95% CI [-
4 Because many original studies did not report exact p-values, we estimated these from the
reported effect size and degrees of freedom.
Replicability of Trait-Outcome Associations 25
.43, .01], p = .053). This pattern, illustrated in Figure 3, indicates that strong original effects were
more likely to yield strong replication effects, but also provided more room for the replication
effect to be weaker than the original effect.
The second noteworthy pattern was that the likelihood of successful replication (i.e., a
statistically significant effect in the hypothesized direction) was positively predicted by the
statistical power (for observed effects, ρ(74) = .37, 95% CI [.15, .55], p = .001; for corrected
effects, ρ(74) = .33, 95% CI [.11, .52], p = .003) and sample size (for observed effects, ρ(76) =
.25, 95% CI [.03, .45], p = .026; for corrected effects, ρ(76) = .27, 95% CI [.05, .47], p = .017) of
the replication attempt. This pattern likely reflects the influence of sample size on statistical
significance, especially when attempting to detect small effects.
The final pattern was that the replication effect size and the effect size ratio were both
positively predicted by whether the original study and the replication measured the target
outcome using the same items or indicators, as well as the same data source and format (i.e., a
self-report questionnaire) (all ρ .19, all p .107, see Table 3 for 95% CIs). This pattern,
although weaker and less consistent than the previous two, indicates that replications using
assessment methods more similar to the original studies tended to obtain trait-outcome
associations that were somewhat stronger and more comparable to the original effects.
Taken together, the results presented in Table 3 and Figure 3 suggest that the predictors
of replicability vary depending on how replicability is indexed: original effect size was the best
predictor of replication effect size, whereas replication power and sample size were the best
predictors of statistical significance. However, the conclusions that can be drawn from these
results should be tempered by the limited variability of some predictors (e.g., replication sample
Replicability of Trait-Outcome Associations 26
size and statistical power were generally quite high) and some replicability indicators (e.g.,
relatively few replication effects were not statistically significant).
Replicability of Trait-Outcome Associations 27
Table 3
Predictors of Replicability across the 78 Hypothesized Trait-Outcome Associations
Replication success
Replication
effect size
Replication
effect stronger
Replication effect not
substantially weaker
Effect size ratio
Observed
Corrected
Observed
Corrected
Observed
Corrected
Observed
Corrected
Observed
Corrected
Original study characteristics
Effect size
.12
.07
.34
**
.39
***
-.40
***
-.30
**
-.40
***
-.22
-.26
*
-.22
[-.11,
.33]
[-.15,
.29]
[.12,
.53]
[.18,
.58]
[-.58,
-.18]
[-.49,
-.07]
[-.58,
-.18]
[-.43,
.01]
[-.46,
-.03]
[-.43,
.01]
Sample size
-.13
-.12
-.17
-.18
.26
*
.14
.10
.05
.05
.04
[-.35,
.10]
[-.33,
.11]
[-.38,
.06]
[-.39,
.05]
[.03,
.46]
[-.09,
.36]
[-.13,
.32]
[-.18,
.28]
[-.18,
.27]
[-.19,
.26]
P-value
.09
.08
.02
-.01
.07
-.02
.14
.09
.16
.13
[-.14,
.31]
[-.15,
.30]
[-.21,
.25]
[-.24,
.22]
[-.16,
.29]
[-.25,
.21]
[-.10,
.35]
[-.14,
.31]
[-.07,
.37]
[-.10,
.35]
Replication characteristics
Sample size
.25
*
.27
*
.29
*
.31
**
.07
.09
.02
.20
.14
.17
[.03,
.45]
[.05,
.47]
[.06,
.48]
[.08,
.50]
[-.16,
.29]
[-.14,
.31]
[-.21,
.24]
[-.03,
.41]
[-.09,
.35]
[-.06,
.39]
Statistical power
.37
**
.33
**
.27
*
.30
**
-.21
-.09
-.12
.03
-.05
-.03
[.15,
.55]
[.11,
.52]
[.05,
.47]
[.08,
.50]
[-.41,
.02]
[-.31,
.14]
[-.34,
.11]
[-.20,
.25]
[-.28,
.17]
[-.26,
.20]
Similarity of original study and replication
Outcome indicators
-.04
-.05
.24
*
.19
.14
.08
.23
*
.17
.24
*
.19
[-.26,
.19]
[-.27,
.18]
[.01,
.44]
[-.03,
.40]
[-.09,
.36]
[-.14,
.30]
[.00,
.43]
[-.06,
.38]
[.02,
.45]
[-.04,
.40]
Outcome data source
.16
.19
.25
*
.29
*
.09
.11
.06
.27
*
.19
.23
*
[-.07,
.37]
[-.04,
.40]
[.03,
.45]
[.06,
.48]
[-.13,
.31]
[-.12,
.33]
[-.17,
.28]
[.05,
.47]
[-.04,
.40]
[.00,
.44]
Assessment timeline
.03
.04
-.04
.00
-.16
-.06
-.20
-.07
-.19
-.16
[-.20,
.25]
[-.18,
.26]
[-.26,
.18]
[-.23,
.22]
[-.37,
.07]
[-.28,
.17]
[-.41,
.03]
[-.29,
.16]
[-.40,
.03]
[-.38,
.07]
Note. *p < .05. **p < .01. ***p < .001. N = 73 to 78. Values are Spearman’s rank correlations. Values in brackets are 95% confidence limits. Replication success =
Replication effect was statistically significant in the hypothesized direction. Replication effect stronger = Replication effect was in the hypothesized direction and
stronger than the original effect. Replication effect not substantially weaker = Replication effect was in the hypothesized direction and not substantially weaker
than the corresponding original effect (i.e., Cohen’s q > -.10). Effect size ratio = Ratio of the z-transformed replication effect to the transformed original effect.
Observed = Analyses of observed trait-outcome associations. Corrected = Analyses of trait-outcome associations partially corrected for the unreliability of
abbreviated outcome measures. Outcome indicators = Whether the original study and replication used the same items or indicators to measure the outcome (1 =
Both used the same indicators; 0.5 = Replication used a subset of the original indicators; 0 = Replication used different indicators). Outcome data source =
Whether the original study and replication used the same data source and format to measure the outcome (1 = Both used self-report questionnaire data; 0.5 =
Original study used either self-report or questionnaire data. 0 = Original study used neither self-report nor questionnaire data). Assessment timeline = Whether the
original study and replication used the same timeline to assess the trait and outcome (1 = Both used concurrent assessment of the trait and outcome. 0.5 = Original
study aggregated results from concurrent and non-concurrent assessments. 0 = Original study did not assess the trait and outcome concurrently.)
Replicability of Trait-Outcome Associations 28
Figure 3. Scatterplot of the z-transformed original and (observed) replication effect sizes, by
success of the replication attempt. Successful replication = Replication effect was statistically
significant in the hypothesized direction. Unsuccessful replication = Replication effect was not
statistically significant or not in the hypothesized direction. Partial replication = Replication was
successful for some sub-outcomes or subsamples, but not for others. The solid, diagonal line
represents replication effect sizes equal to the original effect sizes. The dashed, horizontal line
represents a replication effect size of 0, and points below this line represent replication effects
that were not in the hypothesized direction.
Replicability of Trait-Outcome Associations 29
Discussion
The LOOPR Project was conducted to estimate the replicability of the personality-
outcome literature by attempting preregistered, high-powered replications of 78 previously
published trait-outcome associations. When replicability was defined in terms of statistical
significance, we successfully replicated 87% of the hypothesized effects, or 88% after partially
correcting for the unreliability of abbreviated outcome measures. A replication effect was
typically 77% as strong as the corresponding original effect, or 87% after correcting for
unreliability. Moreover, the statistical significance of a replication attempt was best predicted by
the sample size and statistical power of the replication, whereas the strength of a replication
effect was best predicted by the original effect size.
These results can be interpreted either optimistically or pessimistically. An optimistic
interpretation is that replicability estimates of 77% to 88% (across statistical significance and
effect size criteria) are fairly high. These findings suggest that the extant personality-outcome
literature provides a reasonably accurate map of how the Big Five traits relate with consequential
life outcomes (Ozer & Benet-Martinez, 2006). In contrast, a pessimistic interpretation is that our
replicability estimates are lower than would be expected if all the originally published findings
were unbiased estimates of true effects. This suggests that the personality-outcome literature
includes some false-positive results, and that reported effect sizes may be inflated by researcher
degrees of freedom and publication bias. Thus personality psychology, like other areas of
behavioral science, stands to benefit from efforts to improve replicability by constraining
researcher degrees of freedom, increasing statistical power, and reducing publication bias. Taken
together, these interpretations leave us cautiously optimistic about the current state and future
prospects of the personality-outcome literature (cf. Nelson, Simmons, & Simonsohn, 2018).
Replicability of Trait-Outcome Associations 30
Compared with previous large-scale replication projects in the behavioral sciences, the
LOOPR Project obtained relatively high replicability estimates. Why was this? When evaluating
replicability in terms of statistical significance, one likely contributor to our high success rates
was the large sample size (median N = 1,504) and correspondingly high statistical power
(median > 99.9%) of the replication attempts. When evaluating replicability in terms of relative
effect size, we speculate that the relatively high estimates obtained here may reflect
methodological norms in personality-outcome research, which typically examines the main
effects of traits using samples of several hundred participants and standardized measures (Fraley
& Vazire, 2014; Open Science Collaboration, 2015; Simmons et al., 2011). However, we note
that comparisons between replication projects should be tempered by the fact that different
projects have used different approaches to select the original studies and design the replication
attempts. Additional research is clearly needed to further investigate variation in replicability
across scientific disciplines and research literatures.
The present findings also have implications for understanding why replication attempts in
the behavioral sciences might generally succeed or fail. Failures to replicate are sometimes
attributed to unmeasured moderators: subtle differences between the original study and the
replication attempt that cause an effect to be observed in the former but not the latter (e.g.,
Stroebe & Strack, 2014). In the LOOPR Project, there were unavoidable differences between the
original studies and the replication attempts in terms of historical context (original studies
conducted from the 1980s to 2000s vs. replication in 2017), local context (many original research
sites vs. national American samples), sampling method (mostly student or community samples
vs. survey panels), administration method (mostly in-person surveys or interviews vs. online
surveys), and personality measures (many original measures vs. the BFI-2). The relatively high
Replicability of Trait-Outcome Associations 31
replicability estimates obtained despite these differences converge with previous results
suggesting that unmeasured moderators are not generally powerful enough to explain many
failures to replicate (Ebersole et al., 2016; Klein et al., 2014).
Strengths, Limitations, and Future Directions
The LOOPR Project had a number of important strengths, including its broad sample of
life outcomes, representative samples, preregistered design, and high statistical power. However,
it also had some noteworthy limitations that suggest promising directions for future research.
Most notably, all of the present data come from cross-sectional, self-report surveys completed by
online research panels, whereas some of the original studies used longitudinal designs or other
data sources (e.g., interviews, informant-reports, community samples). Indeed, our analyses of
replicability predictors indicated that replication effect sizes tended to be somewhat stronger
when the original study had also used a self-report survey to measure the target outcome. Thus,
the present research is only a first step toward establishing the replicability of these trait-outcome
associations, and future research using longitudinal designs, as well as alternative sampling and
assessment methods, is clearly needed.
A broader issue is that large-scale replication projects can be conducted using different
approaches (Tackett, McShane, Bockenholt, & Gelman, 2017). Any particular approach will
have advantages and disadvantages, and the choice of an optimal approach will depend on the
goals of a particular project. The main goal of the LOOPR Project was to estimate the overall
replicability of the personality-outcome literature. We therefore adopted a many-studies
approach that attempted to replicate a large number of original effects, with one replication
attempt per effect and relatively brief outcome measures (cf. Camerer et al., 2016; Cova et al., in
press; Open Science Collaboration, 2015). An alternative approach would be to replicate a
Replicability of Trait-Outcome Associations 32
smaller number of effects, with lengthier measures or multiple replication attempts per effect
(i.e., a many-labs approach; cf. Ebersole et al., 2016; Hagger et al., 2016; Klein et al., 2014).
Such an approach would be less well suited for estimating the overall replicability of a literature,
but better suited for achieving other goals. For example, future research can complement the
LOOPR Project by testing individual trait-outcome associations more robustly, and by directly
investigating factors—such as location, sampling method, mode of administration, measures, and
analytic methodthat might moderate these associations.
Conclusion
The results of the LOOPR Project provide grounds for cautious optimism about the
personality-outcome literature. Optimism, because we successfully replicated most of the
hypothesized trait-outcome associations, with many replication effect sizes comparable to the
original effects. Caution, because these replicability estimates were lower than would be
expected in the absence of published false positives. We therefore conclude that the extant
literature provides a reasonably accurate map of how the Big Five personality traits relate with
consequential life outcomes, but that personality psychology still stands to gain from ongoing
efforts to improve the replicability of behavioral science.
Replicability of Trait-Outcome Associations 33
Author Contributions
Christopher J. Soto designed the study, collected and analyzed the data, and drafted and
revised the manuscript.
Acknowledgements
The author thanks Alison Russell and Samantha Rizzo for their assistance with this
research.
Declaration of Conflicting Interests
Christopher J. Soto is a copyright holder for the Big Five Inventory–2 (BFI-2), which was
used in the present research. The BFI-2 is freely available for research use at
http://www.colby.edu/psych/personality-lab.
Funding
This research was supported by a faculty research grant from Colby College to
Christopher J. Soto.
Open Practices
Supporting materials, including the preregistration protocol and revisions, list of original
effects selected for replication, materials, data, analysis code, and detailed results are publicly
available at https://osf.io/d3xb7.
Replicability of Trait-Outcome Associations 34
References
Allport, G. W. (1961). Pattern and growth in personality. Oxford, England: Holt, Reinhart &
Winston.
Almlund, M., Duckworth, A. L., Heckman, J., & Kautz, T. (2011). Personality psychology and
economics. In E. A. Hanushek, S. Machin, & L. Woessmann (Eds.), Handbook of the
economics of education (Vol. 4, pp. 1-181). Amsterdam, Netherlands: Elsevier.
Button, K. S., Ioannidis, J. P., Mokrysz, C., Nosek, B. A., Flint, J., Robinson, E. S., & Munafò,
M. R. (2013). Power failure: Why small sample size undermines the reliability of
neuroscience. Nature Reviews Neuroscience, 14, 365-376.
Camerer, C. F., Dreber, A., Forsell, E., Ho, T.-H., Huber, J., Johannesson, M., . . . Chan, T.
(2016). Evaluating replicability of laboratory experiments in economics. Science, 351,
1433-1436.
Chernyshenko, O. S., Kankaraš, M., & Drasgow, F. (2018). Social and emotional skills for
student success and well-being. Paris, France: OECD Publishing.
Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ:
Lawrence Erlbaum Associates.
Cova, F., Strickland, B., Abatista, A., Allard, A., Andow, J., Attie, M., . . . Zhou, X. (in press).
Estimating the Reproducibility of Experimental Philosophy. Review of Philosophy and
Psychology.
De Raad, B., Perugini, M., Hrebícková, M., & Szarota, P. (1998). Lingua franca of personality:
Taxonomies and structures based on the psycholexical approach. Journal of Cross-
Cultural Psychology, 29, 212-232.
Replicability of Trait-Outcome Associations 35
Ebersole, C. R., Atherton, O. E., Belanger, A. L., Skulborstad, H. M., Allen, J. M., Banks, J. B., .
. . Boucher, L. (2016). Many Labs 3: Evaluating participant pool quality across the
academic semester via replication. Journal of Experimental Social Psychology, 67, 68-82.
Fraley, R. C., & Vazire, S. (2014). The N-pact factor: Evaluating the quality of empirical
journals with respect to sample size and statistical power. PLoS ONE, 9, e109019.
Franco, A., Malhotra, N., & Simonovits, G. (2014). Publication bias in the social sciences:
Unlocking the file drawer. Science, 345, 1502-1505.
Goldberg, L. R. (1993). The structure of phenotypic personality traits. American Psychologist,
48, 26-34.
Hagger, M. S., Chatzisarantis, N. L., Alberts, H., Anggono, C. O., Batailler, C., Birt, A. R., . . .
Zwienenberg, M. (2016). A multilab preregistered replication of the ego-depletion effect.
Perspectives on Psychological Science, 11, 546-573.
John, O. P., Naumann, L. P., & Soto, C. J. (2008). Paradigm shift to the integrative big five trait
taxonomy Handbook of personality: Theory and research (3rd ed., pp. 114-158). New
York, NY: Guilford.
Kautz, T., Heckman, J. J., Diris, R., ter Weel, B., & Borghans, L. (2014). Fostering and
measuring skills: Improving cognitive and non-cognitive skills to promote lifetime
success. NBER Working Paper 20749.
Klein, R. A., Ratliff, K. A., Vianello, M., Adams Jr, R. B., Bahník, Š., Bernstein, M. J., . . .
Brumbaugh, C. C. (2014). Investigating variation in replicability: A “many labs”
replication project. Social Psychology, 45, 142-152.
Lord, F. M., & Novick, M. R. (1968). Statistical theories of mental test scores. Oxford, England:
Addison-Wesley.
Replicability of Trait-Outcome Associations 36
Nelson, L. D., Simmons, J., & Simonsohn, U. (2018). Psychology's renaissance. Annual Review
of Psychology, 69, 511-534.
OECD. (2015). Skills for social progress: The power of social and emotional skills. Paris,
France: OECD Publishing.
Open Science Collaboration. (2015). Estimating the reproducibility of psychological science.
Science, 349, aac4716.
Ozer, D. J., & Benet-Martinez, V. (2006). Personality and the prediction of consequential
outcomes. Annual Review of Psychology, 57, 401-421.
Primi, R., Santos, D., John, O. P., & De Fruyt, F. (2016). Development of an inventory assessing
social and emotional skills in Brazilian youth. European Journal of Psychological
Assessment, 32, 5-16.
Roberts, B. W., Kuncel, N. R., Shiner, R., Caspi, A., & Goldberg, L. R. (2007). The power of
personality: The comparative validity of personality traits, socioeconomic status, and
cognitive ability for predicting important life outcomes. Perspectives on Psychological
Science, 2, 313-345.
Rossi, J. S. (1990). Statistical power of psychological research: What have we gained in 20
years? Journal of Consulting and Clinical Psychology, 58, 646-656.
Simmons, J. P., Nelson, L. D., & Simonsohn, U. (2011). False-positive psychology: Undisclosed
flexibility in data collection and analysis allows presenting anything as significant.
Psychological Science, 22, 1359-1366.
Simonsohn, U. (2015). Small telescopes: Detectability and the evaluation of replication results.
Psychological Science, 26, 559-569.
Replicability of Trait-Outcome Associations 37
Soto, C. J., & John, O. P. (2017). The next Big Five Inventory (BFI-2): Developing and
assessing a hierarchical model with 15 facets to enhance bandwidth, fidelity, and
predictive power. Journal of Personality and Social Psychology, 113, 117-143.
Sterling, T. D., Rosenbaum, W. L., & Weinkam, J. J. (1995). Publication decisions revisited: The
effect of the outcome of statistical tests on the decision to publish and vice versa. The
American Statistician, 49, 108-112.
Stroebe, W., & Strack, F. (2014). The alleged crisis and the illusion of exact replication.
Perspectives on Psychological Science, 9, 59-71.
Tackett, J. L., McShane, B. B., Bockenholt, U., & Gelman, A. (2017). Large scale replication
projects in contemporary psychological research. Unpublished manuscript.
Tversky, A., & Kahneman, D. (1971). Belief in the law of small numbers. Psychological
Bulletin, 76, 105-110.
Vul, E., Harris, C., Winkielman, P., & Pashler, H. (2009). Puzzlingly high correlations in fMRI
studies of emotion, personality, and social cognition. Perspectives on Psychological
Pcience, 4, 274-290.
... Decades of research have demonstrated that human personality can be meaningfully and efficiently summarized by five broad trait factors known as the Big Five: Extraversion (which encompasses traits such as sociability, assertiveness, and energy level), agreeableness (compassion, respectfulness, and trust), conscientiousness (organization, productiveness, and responsibility), neuroticism (anxiety, depression, and volatility) and openness to experience (imagination, curiosity, and aesthetic sensitivity) (John et al., 2008;Matthews et al., 2009;Soto & John, 2017). The Big Five predict academic performance and educational attainment, labor market outcomes, and career choices, at comparable levels to cognitive ability and socioeconomic status (Anni et al., 2024;Borghans et al., 2016;Bucher et al., 2019;Deary et al., 2010;Hampson, 2012;Roberts et al., 2007;Soto, 2019). They relate to a wide range of health behaviors, such as eating habits, drug use, and physical activity (Bogg & Roberts, 2004;De Moor et al., 2006;Hampson et al., 2007;Willroth et al., 2023) and to disease burden (Yoneda et al., 2023) and longevity (Beck & Jackson, 2022;Graham et al., 2017). ...
... Building on epidemiological and longitudinal research that has demonstrated the breadth of personality's associations, we quantified the widespread relevance of personality genetics to a plethora of socially relevant behaviors and important life outcomes (Bleidorn et al., 2019;Roberts et al., 2007;Soto, 2019;Wright et al., 2023) across tests of genetic correlation, PGI prediction, and Mendelian Randomization (MR) applied to EUR GWAS data. Results for AFR GWAS data are reported in the Supplementary Text and Supplementary Figures S30-S31. ...
Preprint
Full-text available
Personality traits describe stable differences in how individuals think, feel, and behave and how they interact with and experience their social and physical environments. We assemble data from 46 cohorts including 611K-1.14M participants with European-like and African-like genomes for genome-wide association studies (GWAS) of the Big Five personality traits (extraversion, agreeableness, conscientiousness, neuroticism, and openness to experience), and data from 51K participants for within-family GWAS. We identify 1,257 lead genetic variants associated with personality, including 823 novel variants. Common genetic variants explain 4.8%-9.3% of the variance in each trait, and 10.5%-16.2% accounting for measurement unreliability. Genetic effects on personality are highly consistent across geography, reporter (self vs. close other), age group, and measurement instrument, and we find minimal spousal assortment for personality in recent history. In stark contrast to many other social and behavioral traits, within-family GWAS and polygenic index analyses indicate little to no shared environmental confounding in genetic associations with personality. Polygenic prediction, genetic correlation, and Mendelian randomization analyses indicate that personality genetics have widespread, potentially causal associations with a wide range of consequential behaviors and life outcomes. The genetic architecture of personality is robust and fundamental to being a human.
... Both levels and changes in personality traits are related to important life outcomes in many life domains such as occupation, relationships, and health (Allemand et al., 2019; Beck & Jackson, 2022; Ozer & Benet-Martínez, 2006; Roberts et al., 2007; Soto, 2019, 2021; Wright & Jackson, 2023). Therefore, the prospect of changing one's personality intentionally is highly attractive for individuals, as also evidenced by the popularity of commercial self-help products and services (Bergsma, 2008). ...
Article
Full-text available
Volitional personality change interventions have been shown to help people change their current personality toward their ideal personality. Here, we address three limitations of this literature. First, we contrast the dominant theoretical perspective of self-improvement with self-acceptance as pathways to reduce the discrepancy between current and ideal personality. Second, we test how well-being aspects change as a by-product of targeting personality. Third, we use a waitlist control group to account for expectancy and demand effects. Across three studies (combined N = 2,094; 1,044 women, 1,050 men; Mage = 30.74, SDage = 9.57, rangeage = 18–75), we implemented randomized online interventions of self-improvement or self-acceptance over a 3-month period, with another follow-up 6 months after baseline and a waitlist control group added in Study 2. Across Studies 1 and 2, participants in both intervention groups reduced discrepancies between current and ideal personality and increased in well-being. In both intervention groups, current personality increased, whereas ideal personality remained stable. Critically, however, control group participants changed similarly, with no significant differences in change compared to participants who received the interventions. Study 3 compared different control group specifications and highlighted that the intervention recruitment framing might have induced selection effects and expectancy and demand effects leading to positive changes in neuroticism, conscientiousness, and extraversion as well as life satisfaction and self-esteem. Thus, we demonstrate both shortcomings of previous intervention designs and imprecisions in theoretical frameworks of personality change mechanisms. We discuss future directions including multimethod studies, measurement advances, and microrandomization of intervention components.
... The Big Five taxonomy organises personality traits into five domains: neuroticism (tendency to experience negative emotions), extraversion (sociability and assertiveness), openness to experience (tendency to be intellectual and creative), agreeableness (tendency to be compassionate and empathetic), and conscientiousness (organisation and responsibility; McCrae & Costa, 2008). These personality traits have been associated with a variety of life outcomes such as career success, divorce, and mortality (Roberts et al., 2007; Soto, 2019). Research shows that the Big Five can predict the life outcomes of annual income and overall life satisfaction better than childhood socio-economic status (Kajonius & Carlander, 2017). In roles related to adulthood, attaining higher status in a career and achieving more at work is associated with increases in the social dominance facet of extraversion and in conscientiousness (Le et al., 2014; Roberts et al., 2003). ...
Preprint
Full-text available
Adulthood is often inferred from age and the attainment of milestones such as marriage, parenthood, and career. Yet, personality traits likely influence whether we feel like adults and how we evaluate adulthood. Here we tested associations between personality and perceptions of adulthood in a UK sample (N = 714, age 18–77 years, Mage = 39.2). Regression analyses indicated that personality independently explained an incremental 15% of variance in subjective adult status, and an additional 27% of variance in attitudes towards adulthood, after age, gender, SES, and the attainment of the traditional adult social roles of marriage, parenthood, and career were accounted for. Regression results also illustrated the presence of individual differences in perceived importance of markers of adulthood, with personality explaining between 5–11% of unique variance across the sample. Traditional markers of adulthood including marriage, parenthood, and career were predicted by high Neuroticism and low Openness, whereas psychosocial markers of adulthood such as ‘taking responsibility for the consequences of my actions’ were predicted by high Openness. This study demonstrates that personality traits shape perceptions of adulthood and calls for the consideration of personality and other individual differences in future studies exploring associations between perceptions of adulthood and life outcomes.
... Over the last few decades, the questions of whether and how personality traits change over time have been widely studied in personality science (Bleidorn et al., 2020; Rauthmann & Kuper, 2025). These questions are of theoretical and practical relevance, as personality traits and personality trait changes have been found to predict various important life outcomes (Beck & Jackson, 2022; Bleidorn et al., 2019; Friedman & Kern, 2014; Jokela et al., 2018; Soto, 2019). ...
Preprint
Full-text available
Personality development has become one of the most widely studied topics in personality science. However, existing research has mostly focused on the Big Five domains, typically measured across long intervals between assessments using data from non-representative samples. Here, we examined personality trait changes at the domain level and at the level of lower-order aspects in a representative Swiss sample (N = 4,495). Participants in this sample rated their personality traits, life satisfaction, and self-esteem five times over 2 years. Using local structural equation models, we found high rank-order stabilities across the adult lifespan, with similar 1-year stabilities for Big Five domains and aspects (domains: raverage = .88, aspects: raverage = .87). Mean-level changes of aspects belonging to the same Big Five domain differed in timing and direction, and cumulative mean-level changes in personality traits were comparable to changes in self-esteem and life satisfaction. Finally, we found medium to strong correlated changes among Big Five domains (r = .33) and among aspects belonging to the same Big Five domain (r = .42), but confidence intervals of these correlated changes were broad. Our results contribute to a fine-grained picture of personality development and help to advance theoretical perspectives on personality trait changes.
... For instance, to provide empirical grounding to the debate on replicability, Nosek et al. (2022) summarized evidence regarding this topic within the field of psychological science. The results indicate varying degrees of success for both systematic and multi-site replications (Camerer et al., 2018; Open Science Collaboration, 2015; Soto, 2019). Among 307 replications considered, 64% reported statistically significant evidence in the same direction, with effect sizes 68% as large as in the original studies. ...
Article
Full-text available
Are scientific papers providing all essential details necessary to ensure the replicability of study protocols? Are authors effectively conveying study design, data analysis, and the process of drawing inferences from their results? These represent only a fraction of the pressing questions that cognitive psychology and neuropsychology face in addressing the “crisis of confidence.” This crisis has highlighted numerous shortcomings in the journey from research to publication. To address these shortcomings, we introduce PECANS (Preferred Evaluation of Cognitive And Neuropsychological Studies), a comprehensive checklist tool designed to guide the planning, execution, evaluation, and reporting of experimental research. PECANS emerged from a rigorous consensus-building process through the Delphi method. We convened a panel of international experts specialized in cognitive psychology and neuropsychology research practices. Through two rounds of iterative voting and a proof-of-concept phase, PECANS evolved into its final form. The PECANS checklist is intended to serve various stakeholders in the fields of cognitive sciences and neuropsychology, including: (i) researchers seeking to ensure and enhance reproducibility and rigor in their research; (ii) journal editors and reviewers assessing the quality of reports; (iii) ethics committees and funding agencies; (iv) students approaching methodology and scientific writing. PECANS is a versatile tool intended not only to improve the quality and transparency of individual research projects but also to foster a broader culture of rigorous scientific inquiry across the academic and research community.
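Aggregate replication statistics of the kind quoted above (the share of replications significant in the same direction, and replication effects some percentage "as strong as" the originals) can be sketched in a few lines. The numbers below are hypothetical, invented for illustration; they are not the actual Nosek et al. (2022) or LOOPR data, and averaging on the Fisher-z scale is one common convention, not necessarily the one used in those papers.

```python
import math

# Hypothetical (original r, replication r) pairs for five trait-outcome effects.
effects = [(0.30, 0.21), (0.15, 0.12), (0.40, 0.33), (0.10, -0.02), (0.25, 0.20)]

def fisher_z(r):
    """Fisher r-to-z transformation, the usual scale for averaging correlations."""
    return 0.5 * math.log((1 + r) / (1 - r))

# Share of replication effects with the same sign as the original effect.
same_direction = sum(1 for orig, rep in effects if orig * rep > 0) / len(effects)

# Relative effect size: mean replication z over mean original z, analogous
# to statements like "replication effects were 77% as strong as the originals".
relative_size = (sum(fisher_z(rep) for _, rep in effects) /
                 sum(fisher_z(orig) for orig, _ in effects))

print(f"same direction: {same_direction:.0%}")
print(f"relative effect size: {relative_size:.0%}")
```

With these invented pairs, four of five replications land in the original direction and the pooled replication effect is roughly two-thirds the pooled original effect, mirroring the shape (though not the values) of the statistics reported above.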
... Among personality trait research's main justifications and implications are the traits' associations with consequential life outcomes. When these outcomes involve self-reported psychological constructs such as life satisfaction or gratitude, the associations can range up to r ≈ .50 but are more commonly in the range of r ≈ .10–.30 (e.g., the associations in the data reported in Soto, 2019). With more objective outcomes characterizing specific choices, achievements, health conditions, or behavior patterns, the associations are usually considerably smaller (e.g., Seeboth & Mõttus, 2018); for example, in the mega-analysis of Beck and Jackson (2022), all associations were below r = .05. ...
Preprint
Full-text available
Despite theoretical reasons to think that occupation plays a role in life satisfaction (LS), empirical evidence on the association is surprisingly limited. Moreover, because LS closely tracks personality traits, not controlling for these has left the existing results inconclusive. In pre-registered analyses, we examined occupational differences in LS and job satisfaction (JS) among 59,000 Estonian Biobank participants who represented 263 occupations, controlling for demographic variables and comprehensively assessed personality traits. Jobs differed in LS and JS before (η2 = .05 and η2 = .07, respectively) and even after adjusting for all covariates (η2 = .01 to .02 and η2 = .06, respectively). Various medical professionals, psychologists, special needs teachers, and self-employed individuals tended to have the highest LS levels, whereas security guards, survey interviewers, waiters, sales workers, mail carriers, carpenters, and chemical engineers tended to score the lowest. Jobs with the highest JS included religious professionals, various medical professionals, and authors, while kitchen, transport, storage, and manufacturing labourers, survey interviewers, and sales workers were among the least satisfied with their jobs. Exploratory analyses suggested that income and O*NET-derived job characteristics, such as interest orientations like realistic, enterprising, and conventional, could explain some of the satisfaction variance among occupations. We conclude that occupation ranks among satisfaction’s strongest correlates, besides, and partly net of, personality traits.
Article
Full-text available
Mental health challenges in high-pressure corporate environments are rising globally, yet personality-based approaches to workplace well-being remain underutilized, especially in emerging economies like Bangladesh. This study investigates how the Big Five personality traits (Neuroticism, Extraversion, Openness, Agreeableness, and Conscientiousness) predict psychological distress (depression, anxiety, and stress) among 300 corporate professionals in Dhaka. Using a cross-sectional design and validated Bangla versions of the BFI-44 and DASS-21, data were analyzed through multiple regression, correlation, and ANOVA. Findings reveal neuroticism as the strongest positive predictor of all distress outcomes, while extraversion and conscientiousness show consistent protective effects. Openness and agreeableness display complex roles, buffering some symptoms but amplifying others under specific workplace conditions. Gender emerged as a moderator for anxiety, though job position and education level showed no significant effects. These results underscore the need for personality-sensitive mental health strategies in collectivist, performance-driven settings. The study offers practical recommendations for integrating personality assessments into organizational wellness policies to foster resilient and productive workforces.
Article
Introduction Most research investigating relationships between the Big Five and emotional states has focused on how emotional attributes relate to Extraversion and Neuroticism. However, the potential for discrete emotional states to enable a richer understanding of the emotive nature of all Big Five traits and their subtraits has been neglected. Methods Participants ( N = 203) completed the Big Five Aspects Scale, watched six emotionally stimulating video clips, and self‐reported their experience of basic emotions before (Baseline) and after (Reaction) each video. Spearman correlations identified state–trait relationships, followed by regression analyses to assess the unique contribution of each trait to emotional experiences. Results Conscientiousness negatively correlated with Baseline Sadness, while Agreeableness positively correlated with Reaction Disgust, Fear, and Sadness. Extraversion predicted higher Joy, and Neuroticism was linked to greater Fear and Sadness. Conclusion Findings reinforce Extraversion and Neuroticism's links to positive and negative emotionality, respectively, while also showing that Agreeableness predicts heightened sensitivity to negative affect. Conscientiousness, particularly Orderliness, appears protective against Baseline Sadness, and Openness to Experience, especially Intellect, is linked to lower sensitivity to Surprise. Potential mechanisms underlying these relationships are discussed.
Article
Full-text available
Responding to recent concerns about the reliability of the published literature in psychology and other disciplines, we formed the X-Phi Replicability Project (XRP) to estimate the reproducibility of experimental philosophy (osf.io/dvkpr). Drawing on a representative sample of 40 x-phi studies published between 2003 and 2015, we enlisted 20 research teams across 8 countries to conduct a high-quality replication of each study in order to compare the results to the original published findings. We found that x-phi studies – as represented in our sample – successfully replicated about 70% of the time. We discuss possible reasons for this relatively high replication rate in the field of experimental philosophy and offer suggestions for best research practices going forward.
Article
Full-text available
We discuss replication, specifically large scale replication projects such as the Many Labs project (Klein et al., 2014), the Open Science Collaboration (OSC) project (Open Science Collaboration, 2015), and Registered Replication Reports (RRRs; Simons, Holcombe, and Spellman (2014)), in contemporary psychology. Our focus is on a key difference between them, namely whether they feature many studies each replicated by one lab (OSC) or one study replicated by many labs (Many Labs, RRRs), and the implication of this difference for assessing heterogeneity (or between-study variation) in effect sizes, generalizability, and robustness. We also offer recommendations for similar projects going forward. We then discuss what large scale replication might look like in other domains of psychological research where existing models of large scale replication are not feasible. We conclude by discussing the implications of these projects for creativity, productivity, and progress.
Article
Full-text available
Whereas the structure of individual differences in personal attributes is well understood in adults, much less work has been done in children and adolescents. On the assessment side, numerous instruments are in use for children but they measure discordant attributes, ranging from one single factor (self-esteem; grit) to three factors (social, emotional, and academic self-efficacy) to five factors (strength and difficulties; Big Five traits). To construct a comprehensive measure for large-scale studies in Brazilian schools, we selected the eight most promising instruments and studied their structure at the item level (Study 1; N = 3,023). The resulting six-factor structure captures the major domains of child differences represented in these instruments and resembles the well-known Big Five personality dimensions plus a negative self-evaluation factor. In a large representative sample in Rio de Janeiro State (Study 2; N = 24,605), we tested a self-report inventory (SENNA1.0) assessing these six dimensions of socio-emotional skills with less than 100 items and found a robust and replicable structure and measurement invariance across grades, demonstrating feasibility for large-scale assessments across diverse student groups in Brazil. Discussion focuses on the contribution to socio-emotional research in education and its measurement as well as on limitations and suggestions for future research.
Article
Full-text available
Good self-control has been linked to adaptive outcomes such as better health, cohesive personal relationships, success in the workplace and at school, and less susceptibility to crime and addictions. In contrast, self-control failure is linked to maladaptive outcomes. Understanding the mechanisms by which self-control predicts behavior may assist in promoting better regulation and outcomes. A popular approach to understanding self-control is the strength or resource depletion model. Self-control is conceptualized as a limited resource that becomes depleted after a period of exertion resulting in self-control failure. The model has typically been tested using a sequential-task experimental paradigm, in which people completing an initial self-control task have reduced self-control capacity and poorer performance on a subsequent task, a state known as ego depletion. Although a meta-analysis of ego-depletion experiments found a medium-sized effect, subsequent meta-analyses have questioned the size and existence of the effect and identified instances of possible bias. The analyses served as a catalyst for the current Registered Replication Report of the ego-depletion effect. Multiple laboratories (k = 23, total N = 2,141) conducted replications of a standardized ego-depletion protocol based on a sequential-task paradigm by Sripada et al. Meta-analysis of the studies revealed that the size of the ego-depletion effect was small with 95% confidence intervals (CIs) that encompassed zero (d = 0.04, 95% CI [−0.07, 0.15]). We discuss implications of the findings for the ego-depletion effect and the resource depletion model of self-control.
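The pooled estimate reported above (d = 0.04 with a confidence interval spanning zero) comes from meta-analytically combining per-lab effect sizes. A minimal fixed-effect sketch of that pooling step is shown below; the study values are invented for illustration and are not the actual Registered Replication Report data, and the published analysis may have used a different (e.g., random-effects) model.

```python
import math

# Hypothetical per-lab results: (Cohen's d, sampling variance of d).
studies = [(0.10, 0.04), (-0.05, 0.05), (0.08, 0.03), (0.00, 0.04)]

# Fixed-effect pooling: weight each lab by inverse variance.
weights = [1 / v for _, v in studies]
pooled_d = sum(w * d for (d, _), w in zip(studies, weights)) / sum(weights)

# Standard error of the pooled estimate and a 95% confidence interval.
se = math.sqrt(1 / sum(weights))
ci = (pooled_d - 1.96 * se, pooled_d + 1.96 * se)

print(f"pooled d = {pooled_d:.2f}, 95% CI [{ci[0]:.2f}, {ci[1]:.2f}]")
```

With these invented inputs the pooled d is small and its interval straddles zero, which is the pattern that leads a Registered Replication Report to conclude the effect may be absent.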
Article
In 2010–2012, a few largely coincidental events led experimental psychologists to realize that their approach to collecting, analyzing, and reporting data made it too easy to publish false-positive findings. This sparked a period of methodological reflection that we review here and call Psychology’s Renaissance. We begin by describing how psychologists’ concerns with publication bias shifted from worrying about file-drawered studies to worrying about p-hacked analyses. We then review the methodological changes that psychologists have proposed and, in some cases, embraced. In describing how the renaissance has unfolded, we attempt to describe different points of view fairly but not neutrally, so as to identify the most promising paths forward. In so doing, we champion disclosure and preregistration, express skepticism about most statistical solutions to publication bias, take positions on the analysis and interpretation of replication failures, and contend that meta-analytical thinking increases the prevalence of false positives. Our general thesis is that the scientific practices of experimental psychologists have improved dramatically.
Article
The university participant pool is a key resource for behavioral research, and data quality is believed to vary over the course of the academic semester. This crowdsourced project examined time of semester variation in 10 known effects, 10 individual differences, and 3 data quality indicators over the course of the academic semester in 20 participant pools (N = 2696) and with an online sample (N = 737). Weak time of semester effects were observed on data quality indicators, participant sex, and a few individual differences—conscientiousness, mood, and stress. However, there was little evidence for time of semester qualifying experimental or correlational effects. The generality of this evidence is unknown because only a subset of the tested effects demonstrated evidence for the original result in the whole sample. Mean characteristics of pool samples change slightly during the semester, but these data suggest that those changes are mostly irrelevant for detecting effects.