
The Science of Startups: The Impact of Founder Personalities on Company Success

Fabian Braesemann ( fabian.braesemann@oii.ox.ac.uk )
University of Oxford
Paul McCarthy ( paul@onlinegravity.com )
UNSW Sydney
Xian Gong ( elaine@onlinegravity.com )
University of Technology Sydney
Fabian Stephany ( fabian.stephany@oii.ox.ac.uk )
University of Oxford
Marian-Andrei Rizoiu ( Marian-Andrei.Rizoiu@uts.edu.au )
University of Technology Sydney
Margaret Kern ( Peggy.Kern@unimelb.edu.au )
The University of Melbourne
DOI: https://doi.org/
License: This work is licensed under a Creative Commons Attribution 4.0 International License. 
Additional Declarations: No competing interests reported.
Paul X. McCarthy1,2, Xian Gong3, Fabian Stephany4,5,
Fabian Braesemann4,5, Marian-Andrei Rizoiu3, Margaret L. Kern6
March 1, 2023
Abstract
Startup companies solve many of today’s most complex and challenging scientific, tech-
nical and social problems, such as the decarbonisation of the economy[1], air pollution[2],
and the development of novel life-saving vaccines[3]. Startups are a vital source of social,
scientific and economic innovation, yet the most innovative are also the least likely to sur-
vive[4]. The probability of success of startups has been shown to relate to several firm-level
factors such as industry, location and the economy of the day[5]. Still, attention has in-
creasingly considered internal factors relating to the firm’s founding team, including their
previous experiences and failures[6], their centrality in a global network of other founders
and investors[7] as well as the team’s size[8]. The effects of founders’ personalities on the
success of new ventures are mainly unknown. Here we show that founder personality traits
are a significant feature of a firm’s ultimate success. We draw upon detailed data about the
success of a large-scale global sample of startups (n=26,781). We found that the Big 5 personality
traits of startup founders across 30 dimensions differed significantly from those of the
population at large. We can train a classifier to distinguish founders from employees with
82.5% accuracy. Key personality facets that distinguish successful entrepreneurs include a
preference for variety, novelty and starting new things (openness to adventure), enjoying being
the centre of attention (lower levels of modesty) and being exuberant (higher activity levels).
However, we do not find one “Founder-type” personality; instead, six different personality
types appear, with startups founded by a “Hipster, Hacker and Hustler” being twice as likely
to succeed. Our results also demonstrate the benefits of larger, personality-diverse teams
in startups, a finding that could be extended through further research into other team
settings within business, government and research.
1 The Data Science Institute, University of Technology Sydney, NSW, Australia. paul@onlinegravity.com
2 School of Computer Science and Engineering, UNSW Sydney, NSW, Australia.
3 Faculty of Engineering and Information Technology, University of Technology Sydney, Australia.
4 Oxford Internet Institute, University of Oxford, Oxford, UK. fabian.stephany@oii.ox.ac.uk
5 DWG Datenwissenschaftliche Gesellschaft Berlin, Germany. fabian.braesemann@oii.ox.ac.uk
6 Melbourne Graduate School of Education, The University of Melbourne, Parkville, VIC, Australia.
Background
The success of startups is vital to economic growth and renewal, with a small number of young,
high-growth firms creating a disproportionately large share of all new net jobs[9]. Startups create
jobs and drive economic growth, and they are also an essential vehicle for solving some of
society’s most pressing challenges.
As a poignant example, six centuries ago, the German city of Mainz was abuzz as the birth-
place of the world’s first moveable-type press created by Johannes Gutenberg. However, in the
early part of this century, it faced several economic challenges, including rising unemployment
and a significant and growing municipal debt. Then in 2008, two Turkish immigrants formed
the company BioNTech in Mainz with another university research colleague. Together they pi-
oneered new mRNA-based technologies. In 2020, BioNTech partnered with US pharmaceutical
giant Pfizer to create one of only a handful of vaccines worldwide for Covid-19, saving an es-
timated six million lives[10]. The economic benefit to Europe and, in particular, the German
city where the vaccine was developed has been significant, with windfall tax receipts to the government
clearing Mainz’s €1.3bn debt and enabling tax rates to be reduced, attracting other
businesses to the region as well as inspiring a whole new generation of startups[11].
While stories such as the success of BioNTech are often retold and remembered, their success
is the exception rather than the rule. The overwhelming majority of startups ultimately fail. One
study of 775 startups in Canada that successfully attracted external investment found only 35%
were still operating seven years later[12]. An industry “autopsy” into 101 tech startup failures
found 23% were due to not having the right team, the number three cause of failure, ahead of
running out of cash or not having a product that meets the market need[13].
Introduction
In this project, we aimed to understand whether certain combinations of founder personalities are
related to startup success, defined as when the firm has been acquired, acquired another firm or is
listed on a public stock exchange. The project provides a large-scale quantitative perspective on
the colloquial “Hacker, Hustler, Hipster”[14] dream team that is envisaged to form the optimal
combination of personalities to accomplish business success. For the quantitative analysis, we
draw on a previously published methodology[15], which matched people to their ideal jobs based
on social media-predicted personality traits.
Here, we applied the same methodology to another set of Twitter users: founders and ex-
ecutives with a Crunchbase profile. Crunchbase is the world’s largest directory on startups. It
provides information about more than 1 million companies, primarily focused on funding and
investors. A company’s Crunchbase profile can be considered a digital business card of an early-
stage venture. As such, the founding teams tend to provide information about themselves, in-
cluding their educational background or a link to their Twitter account. Again, as with Twitter,
all information on Crunchbase is publicly available.
In this project, we inferred the personality profiles of the founding teams of early-stage ventures
from their publicly available Twitter profiles using the methodology described above. Then, we
correlated this information with funding data from Crunchbase to determine whether particular
combinations of personality traits correspond to the success of early-stage ventures.
What makes for a successful startup?
Venture capitalists and other investors, especially in early-stage unproven startup companies,
each have their perspective on the key factors that make for likely success. Three different
schools of thought can mostly characterise these different perspectives:
Supply-side or product investors: those who prioritise investing in firms they consider to have
novel and superior products and services, investing in companies with intellectual property
such as patents and trademarks.
Demand-side or market-based investors: those who prioritise investing in areas of highest market
interest such as in hot areas of technology like quantum computing or recurrent or
emerging large-scale social and economic challenges such as decarbonisation of the economy.
Talent investors: those who prioritise the foundation team above the startup’s initial products or
what industry or problem it is looking to address.
Getting to the point at which the startup has demonstrated the market is willing to use and
pay for its novel products and services regularly, known as product-market fit, is seen as a vital
milestone for investors and founders alike, and is often a conditional trigger for additional rounds
of investment.
Much focus in recent years has been on reconciling the first two of these investor perspectives
to achieve product-market fit as quickly as possible and with the least possible capital invested in creating a
minimum viable product.
However, investors who adopt the third perspective and prioritise talent recognise that a good
team can overcome many challenges in the lead-up to product-market fit. And while the initial
products of a startup may or may not work, a successful and well-functioning team has the
potential to pivot to new markets and new products, even if the initial ones prove untenable.
Some of today’s most prominent startup success stories, such as Twitter, were not the star-
tups’ first idea for a product or service but the result of trying several other things that failed.
This story is common in product innovation, with many well-known consumer products emerg-
ing from previous “failures”. For example, the renowned engineering lubricant WD-40 is so
named as the result of the 40th attempt to create the formula, and 3M’s Post-It notes were a
product made from a “failed” adhesive project.
In this article, we analyse a variety of firm-level, founder-level and founder-team-level
determinants of the success of startups, which are by their very nature experimental, high risk and
likely to fail.
Firstly, we examine a range of firm-level determinants of startup success, including loca-
tion (Fig. 1A), industry (Fig. 1B) and age of startup (Fig. 1C) to explore to what extent these
factors are associated with success. Then building on our previous occupation-personality fit
research[16], we use a large collection of public data on startup companies from Crunchbase to
examine the detailed personality profiles of founders. Finally, in a series of experiments with
large-scale samples, we explore three fundamental questions:
1. What, if any, personality features distinguish founders as entrepreneurs? And if so, what types
of personality combinations exist among startup entrepreneurs?
2. Does the personality of its founders play a role in a startup’s success when accounting for
other external factors known to influence it, such as location, industry and company age?
3. Does the combination of founders and their personalities play a role in startup success, and
is there any evidence to support the commonly held view in the venture capital investment
community that startups require three types of founders: a Hacker, a Hustler and a Hipster?
[Figure 1. Panel A: frequency of success by country (low, medium, high). Panel B: relative frequency of successful startups (0 to 0.25) and company counts (2k to 10k) by industry. Panel C: relative frequency of successful startups (0 to 0.6) over the years 1990 to 2020.]
Fig. 1: Firm-Level Factors of Startup Success. a, On a country level, chances for success are
highest in the US, Japan, Western Europe, and Scandinavian countries. b, Firms from the payment
and software industries have high chances of success. c, Chances of success are positively related
to a firm’s maturity, with firms that are seven years or older having higher chances of success.
The rise of the hipster in startups
Clear functional roles have evolved in established industries such as film and television, con-
struction and advertising.
In advertising, there is a long-established functional distinction between the categorical roles
of creatives (people who devise the words, images and music for advertisements, including copy-
writers and creative directors), suits (client-facing account managers and sales executives) and
quants (strategy and planning roles associated with audience measurement and the buying and
placements of advertisements across different media).
The necessary tension, especially between suits and creatives in advertising, is well under-
stood, as “there is an enduring oppositional culture between the creatives and the suits within
agencies. From the point of view of the creatives’, the lifeblood of the agency is considered to lie
in the creative team with the other functions either considered inferior or unavoidable evils”[17].
In technology, the categorical roles of Hackers (skilful computer programmers and devel-
opers) and Hustlers (entrepreneurial leaders able to win over customers and investors to new
products and ideas) have been around for decades, with similar oppositional tension. For exam-
ple, when Steve Jobs announced he would take medical leave from Apple in January 2009, Mat
“Wilto” Marquis described him as a hacker and a hustler in a well-wishing tweet.
However, the combination of Hacker and Hustler with Hipster in the context of
the putative startup founder dream team was coined by influential venture capitalist Elias Bizannes
in 2011. It was then popularised in 2012 by an address at the influential technology conference
South by Southwest by Rei Inamoto and in a subsequent Forbes article, “The Dream Team:
Hipster, Hacker, and Hustler”[14].
Hipster is a broad term used to describe members of an urban subculture in many cities
in the US and other countries who are design conscious and favour non-mainstream fashions,
trendy foods and alternative music. Bizannes co-opted the term to reflect what he perceived
was the increasing need for successful startups to have a founder with design-savvy, aesthetic
imagination and insider knowledge (Hipster) in addition to the traditional roles of someone good
at selling things (Hustler) and creating technology products (Hacker).
Founders are not like most other people
As a first step, we explore whether the personalities of successful startup founders are measurably
different from those of people in other occupations.
While recent research has demonstrated that many employees in the same occupations share
similar personality traits[15], being a startup founder is not a conventional job. So while we now
have maps of the personality signatures of many jobs, startup founders’ personality signatures
have yet to be identified.
Employing established methods[18, 19, 20], we inferred the personality traits across 30 di-
mensions (Big 5 facets) of a large global sample (n=4.4k) of successful startup founders. The
successful startup founders cohort was created from a subset of self-identified founders from the
global startup industry directory Crunchbase, who are also active on the social media platform
Twitter and have a record of successfully attracting external venture capital investment. Success
in a startup is typically staged and can appear in different forms and times. For example, a startup
may be seen to be successful when it finds a clear solution to a widely recognised problem, such
as developing a successful vaccine. On the other hand, it could be achieving some measure of
commercial success, such as rapidly accelerating sales or becoming profitable or at least cash
positive. Or it could be reaching an exit for foundation investors via a trade sale, acquisition
or listing of its shares for sale on a public stock exchange via an Initial Public Offering (IPO).
However, one commonly agreed measure of success is the attraction of external investment by
venture capitalists.
Many startup founders wear multiple hats. In addition to being startup founders or co-
founders, they often perform functional roles (and sometimes hold the titles of conventional
C-Suite leaders) such as CEO, CTO or CFO. Some also have full-time or part-time jobs as
engineers, managers or consultants in other companies unrelated to their startup while in the early
stages of developing their fledgling business.
While not all CEOs are founders (and indeed, most are not), some founders are also CEOs. Founders
hold some occupations, such as CEO, much more commonly than others, and some job types are
rarely held by founders. We use this overlap between startup founders and holders of conventional
roles to create a complementary sample of successful employees unlikely to be founders.
To begin, we leveraged data from previous occupation-fit research on the personality traits of
successful employees in 624 different occupations across various industries.
Then, we developed an Entrepreneurial Occupational Index (EOI; see Extended Data Fig. 15)
based on LinkedIn data that measures the percentage of people currently employed in a given role
worldwide who also hold or have previously held the position of founder or co-founder. We
computed EOI values for each of the 624 occupations for which we have personality profiles. We ranked the
occupations from most entrepreneurial (public speaker 21.11%, chief technology officer 20.75%,
and creative director 19.33%) to least entrepreneurial (cashier 0.02%, palaeontologist 0.00%,
furniture removalist 0.00%, aged carer 0.00%, bacteriologist 0.00%).
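As a minimal sketch of how such an index could be computed, assume each profile is reduced to a pair of (current occupation, whether the person holds or has held a founder or co-founder title). The record layout, the function names and the toy data are hypothetical; the source does not describe the actual data pipeline.

```python
from collections import defaultdict

def entrepreneurial_occupation_index(records):
    """Return {occupation: percentage of people in it who are or were founders}.

    `records` is an iterable of (occupation, is_or_was_founder) pairs; this
    layout is an assumption made for illustration only.
    """
    totals = defaultdict(int)
    founders = defaultdict(int)
    for occupation, is_or_was_founder in records:
        totals[occupation] += 1
        if is_or_was_founder:
            founders[occupation] += 1
    return {occ: 100.0 * founders[occ] / totals[occ] for occ in totals}

def rank_by_eoi(eoi):
    """Occupations sorted from most to least entrepreneurial."""
    return sorted(eoi, key=eoi.get, reverse=True)

# Toy data, not taken from the study.
toy_records = (
    [("chief technology officer", True)] * 2
    + [("chief technology officer", False)] * 8
    + [("cashier", False)] * 10
)
toy_eoi = entrepreneurial_occupation_index(toy_records)
toy_ranking = rank_by_eoi(toy_eoi)
```

An occupation could then be flagged as low EOI by comparing its value against a chosen threshold.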
We then created a list of low EOI occupations (n=112), in each of which fewer than 0.5% of
people also held the title of founder or co-founder in their LinkedIn profile. People in these roles
may still be founders and co-founders, but it is unlikely that they are. Any individual in even the
most entrepreneurial of these 112 occupations (internal auditor) is still five times less likely
to also be a founder or co-founder than the global average (2.5%) across all 624 occupations. From
our previous study, we randomly selected a sample of Successful Employees (n=6k) for whom
we have inferred personality data and who are unlikely to be entrepreneurs as they are drawn
from the 112 low EOI occupations.
Using the two samples together: Successful Entrepreneurs and Successful Employees (un-
likely to be founders), we trained and tested a machine learning random forest classifier to dis-
tinguish and classify entrepreneurs from employees and vice-versa using inferred personality
vectors alone. As a result, we found we could correctly predict Entrepreneurs with 77% accuracy
and Employees with 88% accuracy (Fig. 2A). Thus, based on personality information alone,
we correctly predict unseen new samples with 82.5% accuracy (see Extended Data Fig. 1 for
details on modelling and prediction accuracy).
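The 82.5% figure coincides with the unweighted mean of the two per-class accuracies. The sketch below reproduces that arithmetic from the row-normalised confusion matrix shown in Fig. 2A. Reading the headline number as a balanced accuracy is an interpretation, not something the source states, and the function names are illustrative.

```python
# Row-normalised confusion matrix as reported in Fig. 2A:
# outer keys are true labels, inner keys are predicted labels.
confusion = {
    "Employee": {"Employee": 0.88, "Entrepreneur": 0.12},
    "Entrepreneur": {"Employee": 0.23, "Entrepreneur": 0.77},
}

def per_class_accuracy(matrix):
    """Fraction of each true class that was predicted correctly (recall)."""
    return {
        label: row[label] / sum(row.values())
        for label, row in matrix.items()
    }

def balanced_accuracy(matrix):
    """Unweighted mean of the per-class accuracies."""
    recalls = per_class_accuracy(matrix)
    return sum(recalls.values()) / len(recalls)

print(round(balanced_accuracy(confusion), 3))
```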
Adventurousness the key feature
We explored in greater detail which personality features are the most important in distinguishing
successful entrepreneurs from successful employees and found that the subdomain or facet of
Adventurousness within the Big 5 Domain of Openness was both significant and had the largest
effect size. The facets of Modesty within the Big 5 Domain of Agreeableness and Activity Level
within the Big 5 Domain of Extraversion had the next-largest effects (Fig. 2B).
All thirty Big 5 facets were found to differ significantly in their distribution,
with ten features having large effect sizes. (See Extended Data Table 1 for more details
of the Cohen’s d analysis, with a complete list of features and their effect sizes, and Extended Data
Fig. 2 for Big 5 personality facets of Employees and Entrepreneurs visualised as a heatmap and
dendrogram.)
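For readers unfamiliar with the effect-size measure, the following is a small sketch of Cohen's d for one facet, using the pooled sample standard deviation. Which variant of the statistic the study used is not stated, so the pooled form is an assumption, and the facet scores below are invented toy values.

```python
from math import sqrt
from statistics import mean, variance

def cohens_d(group_a, group_b):
    """Standardised mean difference using the pooled sample standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)

# Invented facet scores for illustration only.
toy_entrepreneurs = [0.6, 0.7, 0.8]
toy_employees = [0.4, 0.5, 0.6]
toy_effect = cohens_d(toy_entrepreneurs, toy_employees)
```

A positive value indicates that the first group scores higher on the facet; swapping the arguments flips the sign.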
This is important because, to our knowledge, this is the first study to show differences be-
tween employees and entrepreneurs at the facet level of the Big 5 personality domains and the
largest-scale study (n=10.4k) of any kind in this field.
In our sample, Successful Entrepreneurs were defined as founders or co-founders of companies
that have attracted over USD $100k in investments from venture capitalists. The prominence
of Openness facets is consistent with previous research that found higher values in the personality
trait Openness significantly predict VC financing even after accounting for observable founder and
firm characteristics[21] and that identified Openness as the key Big 5 Domain distinguishing
entrepreneurs from non-entrepreneurs[22].
Adventurousness in the Big 5 framework is defined as the preference for variety, novelty
and starting new things, which is consistent with the role of a startup founder who,
especially in the early life of the company, must explore things that do not scale easily[23] and
develop and test new products, services and business models with the market.
Six types of startup founders
Once we understood that startup founders have distinctive personality features that are different
from regular employees, we explored whether there are distinct types of personalities among
startup founders.
First, we examined whether there is evidence that startup founders naturally cluster
according to their personality features using a Hopkins test. We discovered clear clustering
tendencies in the data compared with renowned reference data sets known to have clusters.
Specifically, we found that founders’ personalities have higher clustering tendency scores
than those of two well-known scientific data sets with known in-built clustering: Edgar Anderson’s
classic detailed measurements of three species of Irises[24] and the more recent size
measurements for three species of Pygoscelis penguins that breed on islands throughout the Palmer
Archipelago[25] (see Extended Data Fig. 3).
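To illustrate the idea behind a Hopkins test, here is a simplified sketch of the Hopkins statistic. It compares nearest-neighbour distances from uniformly random probe points to the data with nearest-neighbour distances between real data points. This version sums plain Euclidean distances (some definitions raise them to the power of the dimension) and uses the convention where values near 1 suggest clustering and values near 0.5 suggest random data. It is not the study's implementation, and all names are illustrative.

```python
import math
import random

def nearest_distance(point, others):
    """Euclidean distance from `point` to its nearest neighbour in `others`."""
    return min(math.dist(point, other) for other in others)

def hopkins_statistic(data, sample_size, seed=0):
    """Simplified Hopkins statistic: near 1 = clustered, near 0.5 = random."""
    rng = random.Random(seed)
    dims = len(data[0])
    lows = [min(p[d] for p in data) for d in range(dims)]
    highs = [max(p[d] for p in data) for d in range(dims)]

    # u: distances from uniformly random probes in the bounding box to the data.
    u_total = 0.0
    for _ in range(sample_size):
        probe = [rng.uniform(lows[d], highs[d]) for d in range(dims)]
        u_total += nearest_distance(probe, data)

    # w: distances from sampled real points to their nearest other real point.
    w_total = 0.0
    for index in rng.sample(range(len(data)), sample_size):
        rest = data[:index] + data[index + 1:]
        w_total += nearest_distance(data[index], rest)

    return u_total / (u_total + w_total)
```

When the data are tightly clustered, real points sit much closer to each other than random probes sit to the data, which pushes the ratio towards 1.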
Then, once we had established that the founder data cluster, we used agglomerative hierarchical
clustering, a “bottom-up” clustering technique that initially treats each observation as an individual
cluster and then merges them to create a hierarchy of possible cluster schemes with differing
numbers of groups (see Extended Data Fig. 4).
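The bottom-up procedure can be sketched with a deliberately naive implementation. The source does not say which linkage criterion was used, so average linkage with Euclidean distance is an assumption here, and the function names and toy points are illustrative only.

```python
import math

def agglomerative_merges(points):
    """Naive bottom-up clustering with average linkage (an assumed criterion).

    Returns the partition (a list of clusters, each a sorted list of point
    indices) after every merge, from n singletons down to a single cluster.
    """
    clusters = [[i] for i in range(len(points))]
    history = [[list(c) for c in clusters]]

    def average_linkage(a, b):
        total = sum(math.dist(points[i], points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > 1:
        # Find the pair of clusters with the smallest average pairwise distance.
        x, y = min(
            ((x, y) for x in range(len(clusters)) for y in range(x + 1, len(clusters))),
            key=lambda pair: average_linkage(clusters[pair[0]], clusters[pair[1]]),
        )
        merged = sorted(clusters[x] + clusters[y])
        clusters = [c for k, c in enumerate(clusters) if k not in (x, y)] + [merged]
        history.append([list(c) for c in clusters])
    return history

def cut(history, n_clusters):
    """Pick the partition in the hierarchy that has exactly `n_clusters` groups."""
    return next(p for p in history if len(p) == n_clusters)

# Toy two-dimensional points; real inputs would be 30-dimensional facet vectors.
toy_points = [(0, 0), (0, 1), (10, 10), (10, 11), (20, 0)]
toy_history = agglomerative_merges(toy_points)
toy_three_groups = cut(toy_history, 3)
```

Each entry of the returned history is one candidate clustering scheme, so the hierarchy can be cut at any number of groups before scoring.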
Lastly, we identified the optimum number of clusters based on the outcome of four
different clustering performance measurements: Davies-Bouldin Index, Silhouette coefficients,
Calinski-Harabasz Index and Dunn Index. We found that the optimum number of clusters of
startup founders based on their personality features is six (labelled #0 through to #5).
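As one example of such a measurement, the sketch below computes the mean silhouette coefficient for a given labelling; candidate cluster counts could be compared by scoring each labelling and preferring the highest value. It covers only one of the four indices, assumes Euclidean distance and at least two clusters, and is not the study's code.

```python
import math

def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (higher = better separated).

    Assumes at least two distinct labels.
    """
    clusters = {}
    for index, label in enumerate(labels):
        clusters.setdefault(label, []).append(index)

    def mean_distance(i, members):
        others = [j for j in members if j != i]
        return sum(math.dist(points[i], points[j]) for j in others) / len(others)

    scores = []
    for i, label in enumerate(labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)  # common convention for singleton clusters
            continue
        a = mean_distance(i, own)  # cohesion: mean distance within own cluster
        b = min(  # separation: mean distance to the nearest other cluster
            mean_distance(i, members)
            for other, members in clusters.items()
            if other != label
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

# Toy example: two obvious groups, labelled well and labelled badly.
toy_xy = [(0, 0), (0, 1), (10, 10), (10, 11)]
good_score = silhouette_score(toy_xy, [0, 0, 1, 1])
bad_score = silhouette_score(toy_xy, [0, 1, 0, 1])
```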
Personality footprints of founders
To better understand the unique personality characteristics of each of the six different clusters of
founders and co-founders we:
1. Analysed the personality footprints of each cluster. We examined the distinctive per-
sonality traits of each group and identified which clusters were home to the maximums in
each of the 30 personality facets (See summary in Table 1) and also created a heat map
revealing the complete personality footprint of each of the six types (Fig. 2D).
2. Matched the occupation closest to the centre of each cluster using the personality-
occupation matrix from our previous research in two separate studies based on 128,279
people in 3,513 professions using ten dimensions[15] and a second more recent study
based on 99,897 people in 624 occupations using 30 personality dimensions[16].
3. Identified which of the eight occupation-tribes from previous research[16] each founder
or co-founder belonged to. Leveraging previous research, we then looked at the distribu-
tion of tribe membership of each founder within each cluster.
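The occupation-matching step in item 2 can be sketched as a nearest-neighbour lookup. The sketch takes the cluster centre to be the component-wise mean and uses Euclidean distance; both choices, the function names and the three-dimensional toy profiles are assumptions for illustration, since the source does not specify the similarity measure.

```python
import math
from statistics import mean

def centroid(vectors):
    """Component-wise mean of equally long personality vectors."""
    return [mean(component) for component in zip(*vectors)]

def closest_occupations(cluster_vectors, occupation_profiles, top_n=2):
    """Names of the `top_n` occupations nearest to the cluster centroid."""
    centre = centroid(cluster_vectors)
    return sorted(
        occupation_profiles,
        key=lambda name: math.dist(centre, occupation_profiles[name]),
    )[:top_n]

# Invented three-facet profiles; the study uses 30 facets per occupation.
toy_cluster = [[0.8, 0.2, 0.5], [0.6, 0.4, 0.5]]
toy_profiles = {
    "materials engineer": [0.7, 0.3, 0.6],
    "chemical engineer": [0.5, 0.3, 0.5],
    "cashier": [0.1, 0.9, 0.5],
}
toy_matches = closest_occupations(toy_cluster, toy_profiles)
```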
Founders within the personality-occupation landscape
To better understand the context of different founder types, we positioned the personality
footprint of each of the six founder types within an occupation-personality matrix (n=624 jobs)
established from previous research[16]. Prior research showed that “each job has its own personality”
using a substantial sample of employees (n=99k) across various jobs. Furthermore, we found
that the occupations themselves clustered into eight different groups based on their personality
alone, which we refer to as occupation tribes. The key personality attributes of each of
these tribes from this prior research are reproduced in Extended Data Fig. 16.
Tab. 1: Typology of Founders by Personality. Six different types of founders are revealed by clustering
founders (n=32k) by their Big 5 personality facets. Each type (Fighter,
Operator, Accomplisher, Leader, Engineer and Developer; FOALED)
has its distinctive personality footprint, but three are equivalent to variations
of Hackers (Fighters, Operators and Developers), two are variations
of Hustlers (Leaders and Accomplishers) and one can be characterised as
equivalent to a Hipster (Engineer).

| Founder Type (clustered by personality) | Distinctive Personality Traits (Big 5 facets of founders in this cluster) | Closest Occupation (occupation maps, Repec[16] and PNAS[15]) | 3H Typology (Hipster / Hacker / Hustler) |
|---|---|---|---|
| Leaders (#2) | Highest in openness in the facets of artistic interests and emotionality; also highest in agreeableness in facets of altruism and sympathy. | Executive Director, Medical Director | Hustler (Pure) |
| Accomplisher (#0) | Highly extraverted (all facets) and Conscientious (five facets) | Chief Information Officer, Export Manager | Hustler (Technology focus) |
| Operator (#4) | Highest in conscientiousness in the facet of orderliness and high agreeableness in the facet of humility for founders in this cluster. | Bicycle Mechanic, Mechanic and Service Manager. | Hacker (Operations focus) |
| Developer (#3) | “Middle child” cluster: no facets are maximums or minimums, but it shares characteristics similar to fighters but higher in extraversion. | Application Developer and related technology roles such as Business Systems Analyst and Product Manager. | Hacker (Product focus) |
| Fighters (#5) | Emotional range (anger, anxiety, depression, immoderation, self-consciousness, vulnerability) | Software Developer, Computer Engineer | Hacker (Pure) |
| Engineer (#1) | Highest in openness in the facets of imagination and intellect. | Materials Engineer and Chemical Engineer. | Hipster |

For each founder and co-founder, we found the closest corresponding occupation tribe
based on personality similarity. Then we tallied the founders within each cluster by tribe to
reveal the level of coherence or the extent to which most founders within each group belonged
to one occupation tribe.
This revealed three “purebred” clusters: #0, #2 and #5, whose members are dominated by a
single tribe (larger than 60%). Thus these clusters represent and share personality attributes of
these previously identified[16] occupation-personality tribes, which have the following known
distinctive personality attributes:
Accomplishers (#0): Organised and outgoing, confident, down-to-earth, content, accommodating,
mild-tempered and self-assured.
Leaders (#2): Adventurous, persistent, dispassionate, assertive, self-controlled, calm
under pressure, philosophical, excitement-seeking and confident.
Fighters (#5): Spontaneous and impulsive, tough, sceptical and uncompromising.
These labels also accord with the distribution of roles that founders in each of these clusters hold.
Accomplishers are often CEOs, CFOs or COOs, while Fighters tend to be CTOs, CPOs and CCOs
(see Extended Data Fig. 6 for more details).
We labelled these clusters with these tribe names, acknowledging that labels are somewhat
arbitrary, based on our best interpretation of the data (See Extended Data Fig. 5 for more details).
For the remaining three clusters #1, #3 and #4, we can see they are “hybrids”, meaning that
the founders within them come from a mix of different tribes, with no one tribe representing
more than 50% of the members of that cluster. However, the tribes with the largest share were
noted as #1 Experts; #3 Fighters and #4 Accomplishers.
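The distinction between “purebred” and “hybrid” clusters amounts to tallying tribe membership and checking the share of the largest tribe. A minimal sketch follows, using the thresholds quoted in the text (more than 60% for purebred, no tribe above 50% for hybrid). The “intermediate” label for shares in between, the function names and the toy tallies are additions made here for completeness and do not come from the source.

```python
from collections import Counter

def dominant_tribe(tribes_in_cluster):
    """Return (tribe, share) for the most common tribe among a cluster's founders."""
    counts = Counter(tribes_in_cluster)
    tribe, count = counts.most_common(1)[0]
    return tribe, count / len(tribes_in_cluster)

def classify_cluster(tribes_in_cluster, purebred_share=0.6, hybrid_share=0.5):
    """Label a cluster by how strongly a single tribe dominates it."""
    tribe, share = dominant_tribe(tribes_in_cluster)
    if share > purebred_share:
        return "purebred", tribe
    if share <= hybrid_share:
        return "hybrid", tribe
    return "intermediate", tribe  # hypothetical label, not used in the source

# Invented tallies for illustration only.
toy_purebred = ["Fighters"] * 7 + ["Experts"] * 3
toy_hybrid = ["Experts"] * 4 + ["Fighters"] * 3 + ["Leaders"] * 3
```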
To label these three hybrid clusters, we examined the closest occupations to the median per-
sonality features of each cluster. We selected a name that reflected the common themes of these
occupations, namely:
Engineers (#1) as the closest roles included Materials Engineers and Chemical Engineers.
This is consistent with this cluster’s personality footprint, which is highest in openness in
the facets of imagination and intellect.
Developers (#3) as the closest roles include Application Developers and related technology
roles such as Business Systems Analysts and Product Managers.
Operators (#4) as the closest roles include service, maintenance and operations functions,
including Bicycle Mechanic, Mechanic and Service Manager. This is also consistent with
one of the key personality traits of high conscientiousness in the facet of orderliness and
high agreeableness in the facet of humility for founders in this cluster.
Together, these six different types of startup founders (Fig. 2C) represent a framework we
call the FOALED model of founder types, an acronym of Fighters, Operators, Accomplishers,
Leaders, Engineers and Developers.
Each founder Personality-Type has its distinct facet footprint. Also, we observe a central core
of correlated features that are high for all types of entrepreneurs, including intellect, adventur-
ousness and activity level (Fig. 2D).
Evidence for the “Hipster, Hacker, and Hustler” thesis
By analysing the six types of startup founders in our FOALED model within the broader
Occupation-Personality landscape, we identify three types that can be characterised as types of Hackers
(Fighters, Operators and Developers) and two as Hustlers (Accomplishers and Leaders). The
remaining type differs in personality from both Hackers and Hustlers. It is more of a subject
matter expert whose insider field knowledge and problem-solving design strengths can be seen
as a type of Hipster (Engineer).
When we subsequently explored the combinations of personality types among founders and
their relationship to the probability of the firm’s success, adjusted for a range of other factors in a
multi-factorial analysis, we found significantly increased chances of startup success for Hipster,
Hacker and Hustler foundation teams (Fig. 3C).
[Figure 2, panels A-D. A: confusion matrix of True Label versus Predicted Label (Employee,
Entrepreneur), with cell values 0.88, 0.12 (first row) and 0.23, 0.77 (second row). C: t-SNE
dimension 1 versus t-SNE dimension 2, with clusters labelled Accomplisher (Hustler), Developer
(Hacker), Engineer (Hipster), Fighter (Hacker), Leader (Hustler) and Operator (Hacker). D:
Personality Score (0.2 to 0.8) on each of the 30 Big 5 facets for each founder type.]
Fig. 2: Founder-Level Factors of Startup Success. a, Successful entrepreneurs differ from
successful employees. They can be accurately distinguished using a classifier with personality
information alone. b, Successful entrepreneurs have different Big 5 facet distributions, especially
on adventurousness, modesty and activity level. c, Founders come in six different types: Fighters,
Operators, Accomplishers, Leaders, Engineers and Developers (FOALED). d, Each founder
Personality-Type has its distinct facet footprint.
[Figure 3, panels A-C. A: Relative Frequency of Successful Startups (0 to 0.5) by Number of
Founders (1 to 7). B: Percentile Score of Personality Domain (Openness, Conscientiousness,
Extraversion, Agreeableness, Emotional Stability) for Success (IPO/Bought/Sold) versus Others.
C: Odds Ratio (0 to 14) for the founder-type combinations Accomplisher (x2); Developer and
Operator; Accomplisher (x3); Engineer and Leader and Developer; Developer (x2) and Operator;
Leader (x2) and Developer.]
Fig. 3: The Ensemble Theory of Team-Level Factors of Startup Success. a, Having a
larger founder team elevates the chances of success. This can be due to multiple reasons, e.g., a
more extensive network or knowledge base but also personality diversity. b, We show that joint
personality combinations of founders are significantly related to higher chances of success. This
is because it takes more than one founder to cover all beneficial personality traits that “breed”
success. c, In our multifactor model, we show that firms with diverse and specific combinations
of types of founders, including Hipster, Hustler and Hacker, have significantly higher odds of
success.
Ensemble Theory of Success
Definition of success
The success of startups is uncertain, depends on many factors and can be measured in various
ways, such as the amount of external investment attracted, the number of new products shipped
or the annual growth in revenue. Given the likelihood of failure in startups, some large-scale
studies have examined which features predict startup survival rates[26], while others focus on
fundraising from external investors at various stages[27]. But external investments are sometimes
misguided, revenue growth can be short-lived, and new products may fail to find traction.
The definition used by Bonaventura et al. [7], namely that a startup is acquired, acquires
another company or has an initial public offering (IPO), treats any of these major capital
liquidation events as a clear threshold signal that the company has matured, or is on its way to
maturing, from an early-stage venture into a mature company with clear and often significant
business growth prospects.
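
The definition above can be read as a simple labelling rule. The following is a minimal sketch, assuming each startup carries a list of recorded events; the event labels (`acquired`, `acquired_another`, `ipo`), the function name and the example startups are hypothetical and chosen only for illustration, not taken from the underlying data.

```python
# Hypothetical labels for the three capital liquidation events in the definition.
SUCCESS_EVENTS = {"acquired", "acquired_another", "ipo"}


def is_successful(events):
    """Return True if any recorded event is one of the success events."""
    return bool(SUCCESS_EVENTS.intersection(events))


# Illustrative (made-up) event histories.
examples = {
    "startup_a": ["seed_round", "ipo"],
    "startup_b": ["seed_round", "series_a"],
}
labels = {name: is_successful(events) for name, events in examples.items()}
```

Under this rule, fundraising on its own does not count as success; only an acquisition in either direction or an IPO does.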
Rather than looking at the association of any one factor with success, we use a quantitative
multifactor analysis of success that incorporates a range of firm-level factors such as where a
startup is located, when it was founded and what industry it is in, combined with founder-level
factors such as the inferred Big 5 personality features in 30 dimensions for each founder and,
lastly, founder-team level factors that look at the number of founders and the combinations of
their personalities. We look at these factors independently and in combination to explore their
relative impacts on the likelihood of startup firm success.
Factors associated with startup success
Using multifactor analysis and a binary classification prediction model of startup success, we
looked at many variables together and their relative influence on the probability of startup
success. We looked at seven categories of factors through three lenses. Firm-level factors:
1) Location, 2) Industry, 3) Age of Startup. Founder-level factors: 4) Number of Founders, 5)
Gender of Founders, 6) Personality characteristics of Founders. Team-level factors: 7)
Founder-team personality combinations.
The model performance and the relative impact of each of these categories of factors on the
probability of startup success are illustrated in more detail in Extended Data Fig. 13 and
Extended Data Fig. 14, respectively.
In total, we considered over three hundred variables (n=323) and the significance of their
association with success.
Firm-level factors and success
The first lens we looked through was at the firm-level. Much of the previous literature on startups
has been focused on firm-level or external factors and their influence on success[5]. Startup
success has been shown to relate to how much capital the startup has raised, how old it is and
what industry it is in, among other things[28].
Here we show startup success is influenced strongly by its location (firms from Japan, Scan-
dinavia, USA, France, and Germany are more likely to be successful than those from Turkey,
Argentina, Mexico or other countries); industry (firms in Payment Systems and Privacy & Secu-
rity are most successful) and a company’s age (more details in the SI).
Founder-level factors and success
The second lens we looked through was that of founder-level factors, or those internal to the firm,
i.e. the personality features of founders and their association with success. Our modelling shows
that firms with multiple founders are more likely to succeed (Fig. 3A): firms with three or more
founders are more than twice as likely to succeed as solo-founded startups. This finding is
consistent with investors’ advice to founders and previous studies[8].
(We also noted that some types of additional founders increase the probability of success more
than others as shown in Extended Data Fig. 10 and Extended Data Fig. 11).
Access to more extensive networks and capital could explain the benefits of having more
founders. Still, as we find here, more founders also offer a greater diversity of combined
personalities, which naturally raises the highest level of each trait present in the team. So,
for example, one founder may be more
open and adventurous, and another could be highly agreeable and trustworthy, thus potentially
complementing each other’s particular strengths associated with startup success.
The benefits of larger and more personality-diverse foundation teams can be seen in the
apparent differences between successful and unsuccessful firms based on their combined Big 5
personality team footprints, illustrated in Fig. 3B. Here the maximum value of each Big 5 trait
across a startup’s cofounders is mapped, and the spread of these values is shown for successful
firms (those that have IPOed, been acquired or acquired another firm) and for the other
firms.
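
The team footprint described above, the per-trait maximum across cofounders, can be sketched as follows. This is an illustrative sketch only: the trait keys, the function name and the scores for the two cofounders are hypothetical and not drawn from the study's data.

```python
TRAITS = ("openness", "conscientiousness", "extraversion",
          "agreeableness", "emotional_stability")


def team_footprint(founders):
    """Per-trait maximum across all cofounders of one startup."""
    return {trait: max(f[trait] for f in founders) for trait in TRAITS}


# Hypothetical percentile scores for two cofounders.
team = [
    {"openness": 0.82, "conscientiousness": 0.55, "extraversion": 0.40,
     "agreeableness": 0.20, "emotional_stability": 0.35},
    {"openness": 0.60, "conscientiousness": 0.70, "extraversion": 0.52,
     "agreeableness": 0.31, "emotional_stability": 0.45},
]
footprint = team_footprint(team)
```

Because the footprint takes a maximum, adding a founder can only keep or raise each trait's value, which is one way to see why larger teams tend to cover a broader range of traits.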
Team-level factors and success
Lastly, we considered team-level factors, i.e. founder-team personality combinations, and how
they related to startup success.
We found that ten combinations of founders with different personality types were significantly
correlated with greater chances of startup success when accounting for other variables in
the model. The coefficient of each of these factors is illustrated relative to other features that
were also found to be significantly associated with success in Figure 3C (see Supplementary
Figure 14 for more details on the performance of modelling).
Three combinations of trio-founder companies were more than twice as likely to succeed
as other combinations, namely teams with:
A Leader and two Developers (a hustler and two hackers)
An Operator and two Developers (three hackers of two different types)
An Engineer, Leader and Developer (a hipster, hustler and hacker)
The last of these aligns with, and provides evidence for, the Hipster, Hustler & Hacker
hypothesis. Developers, or “purebred hackers”, are also common to all three of the most
successful combinations.
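
The mapping from FOALED founder types to Hipster, Hacker and Hustler roles stated earlier can be used to check which roles a founding team covers. The sketch below is illustrative; the dictionary and function names are hypothetical, while the type-to-role assignments follow the text.

```python
# Type-to-role assignments as described in the text.
ROLE_OF_TYPE = {
    "Fighter": "Hacker",
    "Operator": "Hacker",
    "Developer": "Hacker",
    "Accomplisher": "Hustler",
    "Leader": "Hustler",
    "Engineer": "Hipster",
}
ALL_ROLES = {"Hipster", "Hacker", "Hustler"}


def team_roles(founder_types):
    """Set of roles represented in a team, given each founder's type."""
    return {ROLE_OF_TYPE[t] for t in founder_types}


def covers_all_roles(founder_types):
    """True if the team contains a Hipster, a Hacker and a Hustler."""
    return team_roles(founder_types) == ALL_ROLES


trio_a = ["Leader", "Developer", "Developer"]
trio_b = ["Operator", "Developer", "Developer"]
trio_c = ["Engineer", "Leader", "Developer"]
```

Of the three trios listed above, only the Engineer, Leader and Developer team covers all three roles.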
Discussion
Startups are one of the key mechanisms for brilliant ideas to become solutions to some of the
world’s most challenging economic and social problems. Examples include the Google search
algorithm, disability technology startup FingerWorks’ touchscreen technology that became the
basis of the Apple iPhone, or the BioNTech mRNA technology that powered Pfizer’s COVID-19
vaccine.
We have shown that founders’ personalities and the combination of personalities in the found-
ing team of a startup have a material and significant impact on its likelihood of success. We have
also shown that successful startup founders’ personality traits are significantly different from
those of successful employees, so much so that a simple predictor can be trained to distinguish
between employees and entrepreneurs with more than 80% accuracy using personality trait data
alone.
Just as occupation-personality maps derived from data based on people already successful
in those roles can provide career-guidance tools, so too can data on successful entrepreneurs’
personality traits help others decide whether becoming a founder may be a good choice for
them.
We have learnt through this research that there is not one type of ideal “entrepreneurial”
personality but six different types. Many successful startups have multiple co-founders with a
combination of these different personality types.
Startups are, to a large extent, a team sport; as such, diversity and complementarity of
personalities matter in the foundation team and have an outsized impact on the company’s likelihood
of success. While all startups are high risk, the risk becomes lower with more founders, particularly
if they have distinct personality traits. Our work demonstrates the benefits of diversity among
the founding team of startups. Greater awareness of these benefits may help create more resilient
startups capable of more significant innovation and impact.
Biases and Limitations
While each of the principal data sources used (namely Crunchbase, Twitter and LinkedIn) is large
and comprehensive, each has some known and likely sample biases.
Crunchbase is the principal public chronicle of Venture Capital funding, and so there is some
likely sample bias toward:
Startup companies that are funded externally. Self-funded or bootstrapped companies are
less likely to be represented in Crunchbase.
Technology companies, as that is where Crunchbase’s roots lie.
Multifounder companies. As it’s a public social record, companies with multiple founders
are likely better represented in Crunchbase than those with one founder.
Male founders. Like the technology industry itself, founders represented in Crunchbase
are overwhelmingly male. Although the representation of female founders is now double
that of the mid-2000s, women still represent less than 25% of the sample (see Extended
Data Fig. 12 for more detail of how this manifests in the data).
Companies that succeed. Companies that fail, especially those that fail early, are likely to
be less represented in the data.
Samples were also limited to those whose founders are active on Twitter, which adds addi-
tional selection biases. For example, Twitter users typically are younger, more educated and have
a higher median income[29].
In addition to sampling biases within the data, there are also significant historical biases in
startup culture. For many aspects of the entrepreneurship ecosystem, women, for example, are at
a disadvantage[30]. Male-founded companies have historically dominated most startup ecosys-
tems worldwide, representing the majority of founders and the overwhelming majority of venture
capital investors. As a result, startups with women have historically attracted significantly fewer
funds[31], in part due to the male bias among venture investors, although this is now changing,
albeit slowly[32].
Opportunities and Future research questions
The global startup ecosystem is evolving, bringing a variety of questions for the dynamics of
startups. For instance:
Will the recent growing focus on promoting and investing in female founders change the
nature, composition and dynamics of startups and their personalities?
Will the growth of startups outside of the United States change what success looks like to
investors and hence the role of different personality traits and their association to diverse
success metrics?
Many of today’s most renowned entrepreneurs are either Baby Boomers (Gates, Branson,
Bloomberg) or Generation Xers (Benioff, Cannon-Brookes, Musk). However, as we
can see, personality is both a predictor and driver of success in entrepreneurship. Will
generation-wide differences in personality and outlook affect startups and their success?
The findings of this research have natural extensions and applications beyond startups, such
as for new projects within large established companies. While not technically startups, many
large enterprises and industries such as construction, engineering and the film industry rely on
forming new project-based, cross-functional teams that are often new ventures and share many
characteristics of startups.
There is also potential for extending this research in other settings in government, NGOs and
within research itself. In scientific research, for example, team diversity in terms of age, ethnicity
and gender has been shown to be predictive of impact, and personality diversity may be another
critical dimension[33].
This study demonstrates that successful startup founders have significantly different person-
alities than many successful employees. It also shows that many factors influence startup success.
The methods and data described here reveal that firm-level factors such as the startup’s context
within geography (where it is located), the economy (which industry it addresses), and timing
(when it was founded and how old it is) all have a significant influence on the likelihood of firm
success. In addition to these more well-understood factors, we showed that a range of founder-
level factors, notably the character traits of its founders, as revealed by their personality features,
have a significant impact on a startup’s likelihood of success. Lastly, we looked at team-level
factors and discovered in a multifactor analysis that personality-diverse teams have the most
considerable impact of all those examined on the probability of a startup’s success.
References
1. Goldstein, A., Doblinger, C., Baker, E. & Anadón, L. D. Patenting and business outcomes
for cleantech startups funded by the Advanced Research Projects Agency-Energy. Nature
Energy 5, 803–810 (2020).
2. Lewis, A. & Edwards, P. Validate personal air-pollution sensors. Nature 535, 29–31 (2016).
3. Mulligan, M. J. et al. Phase I/II study of COVID-19 RNA vaccine BNT162b1 in adults.
Nature 586, 589–593 (2020).
4. Hyytinen, A., Pajarinen, M. & Rouvinen, P. Does innovativeness reduce startup survival
rates? Journal of business venturing 30, 564–581 (2015).
5. Żbikowski, K. & Antosiuk, P. A machine learning, bias-free approach for predicting business
success using Crunchbase data. Information Processing & Management 58, 102555
(2021).
6. Yin, Y., Wang, Y., Evans, J. A. & Wang, D. Quantifying the dynamics of failure across
science, startups and security. Nature 575, 190–194 (2019).
7. Bonaventura, M. et al. Predicting success in the worldwide start-up network. Scientific
reports 10, 1–6 (2020).
8. Klotz, A. C., Hmieleski, K. M., Bradley, B. H. & Busenitz, L. W. New venture teams: A
review of the literature and roadmap for future research. Journal of management 40, 226–
255 (2014).
9. Henrekson, M. & Johansson, D. Gazelles as job creators: a survey and interpretation of the
evidence. Small business economics 35, 227–244 (2010).
10. Which vaccine saved the most lives in 2021?: Covid-19. English. The Economist (Online).
Name - AstraZeneca; Pfizer Inc; BioNTech SE; Copyright - Copyright The Economist
Newspaper NA, Inc. Jul 14, 2022; Last updated - 2022-11-29.
http://ezproxy.lib.uts.edu.au/login?url=https://www.proquest.com/magazines/which-vaccine-saved-most-lives-2021/docview/2689254523/se-2
(July 2022).
11. Oltermann, P. Pfizer/BioNTech tax windfall brings Mainz an early Christmas present. English.
Name - Pfizer Inc; BioNTech SE; Copyright - Copyright Guardian News & Media
Limited Dec 27, 2021; Last updated - 2021-12-28.
http://ezproxy.lib.uts.edu.au/login?url=https://www-proquest-com.ezproxy.lib.uts.edu.au/blogs-podcasts-websites/pfizer-biontech-tax-windfall-brings-mainz-early/docview/2614496082/se-2.
12. Grant, K. A., Croteau, M. & Aziz, O. The Survival Rate of Startups Funded by Angel
Investors. I-INC WHITE PAPER SERIES: MAR 2019, 1–21 (2019).
13. Top 20 reasons start-ups fail CB Insights version English. Copyright - Copyright Newstex
Oct 21, 2019; Last updated - 2022-10-25. Oct. 2019.
http://ezproxy.lib.uts.edu.au/login?url=https://www-proquest-com.ezproxy.lib.uts.edu.au/blogs-podcasts-websites/top-20-reasons-start-ups-fail-cb-insights-version/docview/2307120037/se-2.
14. Ellwood, A. The Dream Team: Hipster, Hacker, and Hustler. Forbes (2012).
15. Kern, M. L., McCarthy, P. X., Chakrabarty, D. & Rizoiu, M.-A. Social media-predicted
personality traits and values can help match people to their ideal jobs. Proceedings of the
National Academy of Sciences 116, 26459–26464 (2019).
16. McCarthy, P. X., Kern, M. L., Gong, X., Parker, M. & Rizoiu, M.-A. Occupation-personality
fit is associated with higher employee engagement and happiness (2022).
17. Pratt, A. C. Advertising and creativity, a governance approach: a case study of creative
agencies in London. Environment and planning A 38, 1883–1899 (2006).
18. Schwartz, H. A. et al. Personality, gender, and age in the language of social media: The
open-vocabulary approach. PloS one 8, e73791 (2013).
19. Plank, B. & Hovy, D. Personality traits on twitter—or—how to get 1,500 personality tests
in a week in Proceedings of the 6th workshop on computational approaches to subjectivity,
sentiment and social media analysis (2015), 92–98.
20. Arnoux, P.-H. et al. 25 tweets to know you: A new model to predict personality with social
media in Eleventh international AAAI conference on web and social media (2017).
21. Chapman, G. & Hottenrott, H. Founder Personality and Start-up Subsidies. Founder Per-
sonality and Start-up Subsidies (2021).
22. Antoncic, B., Bratkovic Kregar, T., Singh, G. & DeNoble, A. F. The big five personality–
entrepreneurship relationship: Evidence from Slovenia. Journal of small business manage-
ment 53, 819–841 (2015).
23. Graham, P. Do Things That Don’t Scale. Paul Graham (2013).
24. Anderson, E. The irises of the Gaspe Peninsula. Bull. Am. Iris Soc. 59, 2–5 (1935).
25. Horst, A. M., Hill, A. P. & Gorman, K. B. Palmer Archipelago Penguins Data in the palmer-
penguins R Package-An Alternative to Anderson’s Irises. R JOURNAL 14, 244–254 (2022).
26. Antretter, T., Blohm, I. & Grichnik, D. Predicting startup survival from digital traces: To-
wards a procedure for early stage investors (2018).
27. Dworak, D. Analysis of Founder Background as a Predictor for Start-up Success in Achiev-
ing Successive Fundraising Rounds (2022).
28. Corea, F., Bertinetti, G. & Cervellati, E. M. Hacking the venture industry: An Early-stage
Startups Investment framework for data-driven investors. Machine Learning with Applica-
tions 5, 100062 (2021).
29. Duggan, M., Ellison, N. B., Lampe, C., Lenhart, A. & Madden, M. Demographics of key
social networking platforms. Pew Research Center 9 (2015).
30. Brush, C., Edelman, L. F., Manolova, T. & Welter, F. A gendered look at entrepreneurship
ecosystems. Small Business Economics 53, 393–408 (2019).
31. Kanze, D., Huang, L., Conley, M. A. & Higgins, E. T. We ask men to win and women not
to lose: Closing the gender gap in startup funding. Academy of Management Journal 61,
586–614 (2018).
32. Fan, J. S. Startup Biases. UC Davis Law Review (2022).
33. AlShebli, B. K., Rahwan, T. & Woon, W. L. The preeminence of ethnic diversity in scien-
tific collaboration. Nature communications 9, 1–10 (2018).
Methods
Data Sources
Entrepreneurs Only (EO) Dataset. Data about the founders of startups were collected from
Crunchbase (Table 2), an open reference platform for business information about private and
public companies, primarily early-stage startups. It is one of the largest and most comprehensive
data sets of its kind and has been used in over 100 peer-reviewed research articles about economic
and managerial research.
Crunchbase contains data on over two million companies - mainly startup companies and the
companies who partner with them, acquire them and invest in them, as well as profiles on well
over one million individuals active in the entrepreneurial ecosystem worldwide from over 200
countries and spans. While Crunchbase started in the technology startup space, it now covers all
sectors, specifically focusing on entrepreneurship, investment and high-growth companies.
While Crunchbase contains data on over one million individuals in the entrepreneurial ecosys-
tem, some are not entrepreneurs or startup founders but play other roles, such as investors,
lawyers or executives at companies that acquire startups. To create a subset of only entrepreneurs,
we selected a subset of 32,732 who self-identify as founders and co-founders (by job title) and
who are also publicly active on the social media platform Twitter. We removed those who are also
venture capitalists to distinguish between investors and founders.
We selected founders active on Twitter so that we could use natural language processing to infer
their Big 5 personality features using an open-vocabulary approach, shown to be accurate in
previous research, that analyses users’ unstructured text, such as Twitter posts in our case. For
this project, as with previous research (Kern et al. 2019), we employed a commercial service,
IBM Watson Personality Insights, to infer personality facets. This service provides raw scores and
percentile scores for the Big Five domains (Openness, Conscientiousness, Extraversion,
Agreeableness and Emotional Stability) and the corresponding 30 subdomains, or facets. The
public content of Twitter posts was collected, yielding 32,732 profiles that each had enough
Twitter posts (more than 150 words) to obtain relatively accurate personality scores (less than
12.7% Average Mean Absolute Error).
The “Entrepreneurs Only” (EO) dataset is analysed in combination with other data about the
companies they founded to explore questions around the nature and patterns of personality traits
of entrepreneurs and the relationships between these patterns and company success.
For the multifactor analysis, we cleaned the EO data, filtering by a number of factors to ensure
the sample was robust and consistent. More details on this data wrangling are included in Extended
Data Fig. 7 and Extended Data Fig. 8.
Tab. 2: Summary of the basic information of the Entrepreneurs Only (EO) dataset: the number of
founders and associated startups in the population, how many countries those startups are
across, the time span the collected data covers, and the number of features included.

Founders with      Associated   Countries   Date Range   Founders Individual
Personality Data   Startups                              Features
32,732             23,292       215         2008-2021    100
Successful Entrepreneurs and Successful Employees (SESE) Dataset. The EO data set
contains two categories of founders: those who have raised funds or attracted external investment
to their companies, the Funded Founders (n=17,057), and those who have not, the Unfunded Founders
(n=16,675). The attraction of a significant investment from outside, especially from specialist
venture capitalists, is seen as one measure that indicates a startup has had some degree of success
or, at the very least, shows promise of future success. Therefore, we filtered the EO Funded
Founders by those whose companies had attracted more than US$100k in investment to create a
reference set of Successful Entrepreneurs (n=4,400).
Most company founders also adopt regular occupation titles such as CEO or CTO. Many
founders will be Founder and CEO or Co-founder and CTO. While founders are often CEOs or
CTOs, the reverse is not necessarily true, as many CEOs are professional executives that were
not involved in the establishment or ownership of the firm.
To create a control group of Successful Employees, who are not also entrepreneurs or are very
unlikely to be or have been entrepreneurs, we leveraged the fact that while some occupational
titles like CEO, CTO and Public Speaker are commonly shared by founders and co-founders,
some others such as Cashier, Zoologist and Detective very rarely co-occur with founder or
co-founder. Using data from LinkedIn, we created an Entrepreneurial Occupation Index (EOI)
based on the ratio of entrepreneurs for each of the 624 occupations used in a previous study of
occupation-personality fit. It was calculated based on the percentage of all people working in the
occupation from LinkedIn compared to those who shared the title Founder or Co-founder (See SI
for more detail). A reference set of Successful Employees (n=6,685) was then selected across 112
different occupations with the lowest propensity for entrepreneurship (less than 0.5% EOI) from
a large corpus of Twitter users with known occupations, also from the previous
occupation-personality fit study (McCarthy et al. 2022).
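
The EOI calculation and the 0.5% cut-off can be sketched as below. This is one reading of the description above (the percentage of people in an occupation who also share the title Founder or Co-founder); the function name and all counts are hypothetical, and the exact calculation is given in the SI.

```python
def entrepreneurial_occupation_index(total_in_occupation, also_founders):
    """Percentage of people in an occupation who also hold a founder title."""
    if total_in_occupation <= 0:
        raise ValueError("total_in_occupation must be positive")
    return 100.0 * also_founders / total_in_occupation


# Hypothetical counts: (people in the occupation, of whom also founders).
counts = {
    "CEO": (200_000, 60_000),
    "Cashier": (500_000, 1_000),
    "Zoologist": (20_000, 40),
}
eoi = {occ: entrepreneurial_occupation_index(total, founders)
       for occ, (total, founders) in counts.items()}

# Occupations below the 0.5% threshold form the employee control group.
low_eoi_occupations = sorted(occ for occ, value in eoi.items() if value < 0.5)
```

With these made-up counts, Cashier and Zoologist fall below the threshold while CEO does not, mirroring the contrast drawn in the text.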
The Successful Entrepreneurs and Successful Employees were combined to create the SESE
dataset, which was used to test whether it may be possible to distinguish successful entrepreneurs
from successful employees based on the different patterns of personality traits alone.
Hierarchical Clustering
We applied a number of clustering techniques and tests to the personality vectors of the EO data
set to determine whether there are natural clusters and, if so, what the optimum number is.
Firstly, to determine if there is a natural typology to founder personalities, we applied the
Hopkins statistic - a statistical test we used to answer whether the “EO” dataset contains in-
herent clusters. It measures the clustering tendency based on the ratio of the sum of distances
of real points within a sample of the “EO” dataset to their nearest neighbours and the sum of
distances of randomly selected artificial points from a simulated uniform distribution to their
nearest neighbours in the real “EO” dataset. The ratio measures the difference between the “EO”
data distribution and the simulated uniform distribution, which tests the randomness of the data.
The Hopkins statistic ranges from 0 to 1. Scores close to 0, 0.5 and 1 indicate that the dataset
is uniformly distributed, randomly distributed or highly clustered, respectively.
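
A minimal sketch of the Hopkins statistic as described above, using only the Python standard library. It assumes plain Euclidean nearest-neighbour distances and a uniform reference distribution over the bounding box of the data (some formulations raise the distances to a power); the function names and the toy data are illustrative and are not the EO dataset or the authors' implementation.

```python
import math
import random


def nearest_distance(point, others):
    """Euclidean distance from point to its nearest neighbour in others."""
    return min(math.dist(point, other) for other in others)


def hopkins_statistic(data, sample_size, seed=0):
    """Hopkins statistic: about 0.5 for random data, near 1 for clustered data."""
    rng = random.Random(seed)
    lows = [min(col) for col in zip(*data)]
    highs = [max(col) for col in zip(*data)]
    # w: distances from sampled real points to their nearest other real point.
    w = 0.0
    for i in rng.sample(range(len(data)), sample_size):
        w += nearest_distance(data[i], data[:i] + data[i + 1:])
    # u: distances from artificial uniform points to their nearest real point.
    u = 0.0
    for _ in range(sample_size):
        fake = [rng.uniform(lo, hi) for lo, hi in zip(lows, highs)]
        u += nearest_distance(fake, data)
    return u / (u + w)


# Toy data: two tight, well-separated blobs.
toy_rng = random.Random(1)
blobs = [[toy_rng.gauss(c, 0.1), toy_rng.gauss(c, 0.1)]
         for c in (0.0, 10.0) for _ in range(50)]
h_clustered = hopkins_statistic(blobs, sample_size=20)
```

For strongly clustered data, the real points sit much closer to their neighbours than the artificial points do, so the ratio approaches 1.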
To cluster the founders by personality facets, we used Agglomerative Hierarchical Clustering
(AHC), a bottom-up approach that treats each individual data point as a singleton cluster and
then iteratively merges pairs of clusters until all data points belong to a single cluster.
Ward’s linkage method is used to choose the pair of clusters whose merger minimises the increase
in the within-cluster variance. AHC is widely applied in clustering analysis since
its tree hierarchy output is more informative and interpretable than that of K-means. Dendrograms
were used to visualise the hierarchy to provide a perspective on the optimal number of clusters. The
heights of the dendrogram represent the distance between groups, where the lower heights repre-
sent more similar groups of observations. A horizontal line through the dendrogram was drawn
to distinguish the number of significantly different clusters with higher heights. However, as it
is not possible to determine the optimum number of clusters from the dendrogram, we applied
other clustering performance metrics to analyse the optimal number of clusters.
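
The bottom-up merging with Ward's criterion can be sketched in a few lines of plain Python. This is a naive illustration for small inputs, not the implementation used in the study; the function names and toy points are hypothetical, and the merge cost shown is the increase in within-cluster sum of squares, which dendrogram software may rescale.

```python
from itertools import combinations


def centroid(points):
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(points) for col in zip(*points)]


def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b merge."""
    ca, cb = centroid(a), centroid(b)
    sq_dist = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * sq_dist


def ward_clustering(points, n_clusters):
    """Merge singleton clusters bottom-up until n_clusters remain."""
    clusters = [[p] for p in points]
    merge_costs = []  # cost of each merge, in the order performed
    while len(clusters) > n_clusters:
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: ward_cost(clusters[ij[0]], clusters[ij[1]]))
        merge_costs.append(ward_cost(clusters[i], clusters[j]))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters, merge_costs


toy = [[0.1, 0.2], [0.15, 0.22], [0.9, 0.8], [0.88, 0.85], [0.5, 0.1]]
clusters, merge_costs = ward_clustering(toy, n_clusters=2)
```

Cutting the hierarchy at a given number of clusters corresponds to stopping the loop early, which is what the `n_clusters` argument does here.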
A range of clustering performance metrics was used to help determine the optimal number
of clusters in the dataset after an obvious clustering tendency was confirmed. The following
metrics were implemented to comprehensively evaluate the differences between within-cluster
and between-cluster distances: Dunn Index, Calinski-Harabasz Index, Davies-Bouldin Index and
Silhouette Index. The Dunn Index measures the ratio of the minimum inter-cluster separation to
the maximum intra-cluster diameter. The Calinski-Harabasz Index instead calculates the ratio of
the between-cluster dispersion to the within-cluster dispersion, each measured as an average sum
of squares. The Davies-Bouldin Index treats each cluster individually: for each pair of clusters,
it compares the sum of the average distances of their data points to their respective cluster
centres with the distance between the two centres. Finally, the Silhouette Index is the overall
average of the silhouette coefficients for each sample. The coefficient measures the similarity of
a data point to its own cluster compared with the other clusters. Higher scores of the Dunn,
Calinski-Harabasz and Silhouette Index and
a lower score of the Davies-Bouldin Index indicate better clustering configuration.
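
As an illustration of one of these metrics, the Silhouette Index can be computed directly from its definition. The sketch below assumes Euclidean distance and at least two clusters; the function name and toy points are hypothetical and the code is not the implementation used in the study.

```python
import math


def silhouette_index(points, labels):
    """Mean silhouette coefficient over all points (needs >= 2 clusters)."""
    scores = []
    for i, p in enumerate(points):
        # Distances from p to every other point, grouped by cluster label.
        dist_by_cluster = {}
        for j, q in enumerate(points):
            if i != j:
                dist_by_cluster.setdefault(labels[j], []).append(math.dist(p, q))
        own = dist_by_cluster.get(labels[i])
        if not own:  # singleton cluster: coefficient is defined as 0
            scores.append(0.0)
            continue
        a = sum(own) / len(own)  # mean distance within own cluster
        b = min(sum(d) / len(d)  # mean distance to the nearest other cluster
                for lab, d in dist_by_cluster.items() if lab != labels[i])
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


toy_points = [[0, 0], [0, 1], [10, 10], [10, 11]]
good = silhouette_index(toy_points, [0, 0, 1, 1])
bad = silhouette_index(toy_points, [0, 1, 0, 1])
```

A labelling that matches the two visible groups scores close to 1, while one that splits each group across clusters scores below 0.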
Classification Modelling
Classification algorithms. To obtain a comprehensive and robust conclusion, we explored the
following classifiers: Naïve Bayes, Elastic Net regularisation, Support Vector Machine, Random
Forest, Gradient Boosting and Stacked Ensemble. The Naïve Bayes classifier is a probabilistic
algorithm based on Bayes’ theorem with assumptions of independent features and equiprobable
classes. Compared with other more complex classifiers, it saves computing time for large datasets
and performs better if the assumptions hold. However, in the real world, those assumptions
are generally violated. Elastic Net regularisation combines the penalties of Lasso and Ridge
to regularise the Logistic classifier. It eliminates the limitation of multicollinearity in the Lasso
method and improves the limitation of feature selection in the Ridge method. Even though Elastic
Net is as simple as the Na¨
ıve Bayes classifier, it is more time-consuming. The Support Vector
Machine (SVM) aims to find the ideal line or hyperplane to separate successful entrepreneurs and
employees in this study. With a non-linear kernel, such as the Radial Basis Function kernel,
the dividing boundary can be non-linear. SVM therefore performs well on high-dimensional data,
although the choice of kernel needs to be tuned. Random Forest (RF) and Gradient Boosting Trees
(GBT) are ensembles of decision trees. In RF, all trees are trained independently and in
parallel, whereas in GBT the trees are trained one at a time, each new tree correcting the
errors of the previously trained trees. RF is the simpler and more robust model since it has
few hyperparameters to tune. GBT optimises the objective function and can learn a more
accurate model through this successive learning and correction process. Stacked Ensemble
combines all existing classifiers through a Logistic Regression. Whereas bagging mainly
reduces variance and boosting mainly reduces bias, the stacked ensemble leverages model
diversity to lower both variance and bias. All the above classification algorithms
distinguish successful entrepreneurs from employees based on the personality matrix.
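To make the Elastic Net penalty described above concrete, the following minimal sketch writes it out as a weighted mix of the Lasso (L1) and Ridge (L2) penalties. The parametrisation by an overall strength `lam` and a mixing weight `l1_ratio`, as well as the function name, are assumptions for illustration; the study does not state which parametrisation was used.

```python
def elastic_net_penalty(weights, lam, l1_ratio):
    """Elastic Net penalty on a coefficient vector (assumed parametrisation).

    l1_ratio = 1 gives the pure Lasso penalty, l1_ratio = 0 the pure Ridge penalty.
    """
    l1 = sum(abs(w) for w in weights)      # Lasso term: sum of absolute values
    l2_squared = sum(w * w for w in weights)  # Ridge term: sum of squares
    return lam * (l1_ratio * l1 + (1.0 - l1_ratio) * 0.5 * l2_squared)


# Hypothetical coefficients of a logistic classifier.
coefficients = [1.0, -2.0, 0.0]

print(elastic_net_penalty(coefficients, lam=1.0, l1_ratio=1.0))  # Lasso only
print(elastic_net_penalty(coefficients, lam=1.0, l1_ratio=0.0))  # Ridge only
print(elastic_net_penalty(coefficients, lam=1.0, l1_ratio=0.5))  # equal mix
```

In a regularised logistic classifier, this penalty would be added to the negative log-likelihood before optimisation.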
Evaluation metrics. We use a range of evaluation metrics to assess the performance of a
classification prediction comprehensively. The most straightforward metric is accuracy, which
measures the overall proportion of correct predictions; it can be misleading on an imbalanced
dataset. The F1 score improves on accuracy by combining precision and recall, thereby
accounting for both False Negatives and False Positives. Specificity is the true negative
rate, that is, the proportion of employees that are correctly identified, while Positive
Predictive Value (PPV) is the probability that a predicted successful entrepreneur is in fact
one. The Area Under the Receiver Operating Characteristic Curve (AUROC) measures the
capability of the algorithm to distinguish between successful entrepreneurs and employees.
A higher value means the classifier separates the classes better.
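The metrics above can be sketched from their definitions as follows. This is an illustrative example only: the function names, labels and scores are hypothetical, with 1 standing for a successful entrepreneur and 0 for an employee, and AUROC is computed through its rank interpretation (the probability that a randomly chosen positive scores higher than a randomly chosen negative).

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, PPV (precision), recall, specificity and F1 for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)

    def ratio(num, den):
        # Return 0.0 instead of failing when a denominator is empty.
        return num / den if den else 0.0

    ppv = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "ppv": ppv,
        "recall": recall,
        "specificity": ratio(tn, tn + fp),
        "f1": ratio(2 * ppv * recall, ppv + recall),
    }


def auroc(y_true, scores):
    """Share of positive/negative pairs ranked correctly (ties count one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical imbalanced sample: 3 entrepreneurs, 7 employees.
labels = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
always_employee = [0] * 10

# Accuracy looks acceptable, yet recall and F1 reveal that no
# entrepreneur is ever detected.
print(classification_metrics(labels, always_employee))
```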
Feature importance. To further understand and interpret the classifier, it is critical to
identify variables with significant predictive power on the target. For tree-based models,
feature importance is measured by the Gini importance score of each predictor, which
quantifies the feature's overall contribution to the model as the total reduction in Gini
impurity achieved by the splits on that feature. The measurements consider all interactions
among features. However, they do not provide insights into the direction of each feature's
impact, since the importance only indicates the ability to distinguish between classes.
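The building block of Gini importance is the reduction in Gini impurity achieved by a single split; a feature's importance accumulates these reductions over all splits that use it, averaged over the trees of the ensemble. The sketch below computes one such reduction. It is illustrative only: the function names and the four labelled samples are hypothetical and not taken from the study.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def impurity_decrease(parent, left, right):
    """Impurity of the parent node minus the size-weighted impurity of its children."""
    n = len(parent)
    weighted_children = (
        len(left) / n * gini_impurity(left)
        + len(right) / n * gini_impurity(right)
    )
    return gini_impurity(parent) - weighted_children


# Hypothetical node with two founders and two employees.
node = ["founder", "founder", "employee", "employee"]

# A split on an informative feature separates the classes perfectly.
informative = impurity_decrease(node, ["founder", "founder"], ["employee", "employee"])

# A split on an uninformative feature leaves both children as mixed as the parent.
uninformative = impurity_decrease(node, ["founder", "employee"], ["founder", "employee"])

print(informative, uninformative)
```

Note that the decrease is the same whichever class ends up in which child, which is why the score says nothing about the direction of a feature's effect.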
Statistical analysis. We use the t-test, Cohen’s d and the two-sample Kolmogorov-Smirnov test
to explore how the mean values and distributions of personality facets differ between
successful entrepreneurs and employees. The t-test determines whether the means of a
personality facet in the two groups differ significantly from one another. The facets with
significant differences detected by the hypothesis test are critical for separating the two
groups. Cohen’s d measures the effect size of the t-test result as the ratio of the mean
difference to the pooled standard deviation. A larger Cohen’s d indicates that the mean
difference is large relative to the variability of the whole sample. Moreover, we use the
two-sample Kolmogorov-Smirnov test to check whether the personality facets of the two groups
are drawn from the same probability distribution. The test makes no assumption about the
distributions, but it is more sensitive to deviations near the centre than in the tails.
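The three statistics can be sketched from their definitions as follows. This is an illustration under stated assumptions: the facet scores are invented, the function names are hypothetical, and the pooled-variance (Student) form of the t-test is assumed because the text does not specify the variant. Only the test statistics are computed; p-values would additionally require the reference distributions.

```python
from math import sqrt
from statistics import mean, variance


def pooled_sd(x, y):
    """Pooled standard deviation of two samples (sample variances, n - 1)."""
    nx, ny = len(x), len(y)
    return sqrt(((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2))


def cohens_d(x, y):
    """Effect size: mean difference divided by the pooled standard deviation."""
    return (mean(x) - mean(y)) / pooled_sd(x, y)


def t_statistic(x, y):
    """Two-sample t statistic with pooled variance (assumed variant)."""
    return (mean(x) - mean(y)) / (pooled_sd(x, y) * sqrt(1 / len(x) + 1 / len(y)))


def ks_statistic(x, y):
    """Largest absolute gap between the two empirical distribution functions."""
    def ecdf(sample, value):
        return sum(s <= value for s in sample) / len(sample)

    return max(abs(ecdf(x, v) - ecdf(y, v)) for v in sorted(set(x) | set(y)))


# Hypothetical scores on one personality facet for the two groups.
entrepreneurs = [0.70, 0.80, 0.75, 0.90, 0.85]
employees = [0.50, 0.60, 0.55, 0.65, 0.70]

print("t =", t_statistic(entrepreneurs, employees))
print("d =", cohens_d(entrepreneurs, employees))
print("KS =", ks_statistic(entrepreneurs, employees))
```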
Privacy and ethics
The focus of this research is to provide high-level insights about groups of startups, founders
and types of founder teams rather than on specific individuals or companies. While we used unit
record data from the publicly available data of company profiles from Crunchbase, we removed
all identifiers from the underlying data on individual companies and founders and generated
aggregate results, which formed the basis for our analysis and conclusions.
Data and Code Availability
A dataset which includes only aggregated statistics about the success of startups and the factors
that influence it is released as part of this research. Underlying data for all figures and the code to
reproduce them are also available.
Please contact Fabian Braesemann (fabian.braesemann@oii.ox.ac.uk) in case you
require access to any data and code.
Acknowledgements
We thank Gary Brewer from BuiltWith, Leni Mayo from Influx, Rachel Slattery from TeamSlatts
and Daniel Petre from AirTree Ventures for their ongoing generosity and insights about startups,
founders and venture investments. We also thank Tim Li from Crunchbase for advice and liaison
regarding data on startups and Richard Slatter for advice and referrals in Twitter.
Author contributions
All authors designed research; all authors analysed data and undertook investigation; FB and
FS led multi-factor analysis; PM, XG and MAR led the founder/employee prediction; MLK led
personality insights; XG collected and tabulated the data; XG, FB, and FS created figures; XG
created final art, and all authors wrote the paper.
Competing interests
The authors have declared that no competing interests exist.
Supplementary Files
This is a list of supplementary files associated with this preprint.
McCarthyetal2023ScienceofStartupsSUPPLEMENT.pdf