All content in this area was uploaded by Moreno Bonaventura on Apr 27, 2019

Predicting success in the worldwide start-up network

Moreno Bonaventura1,2,†, Valerio Ciotti1,2,†, Pietro Panzarasa2

Silvia Liverani1,3, Lucas Lacasa1, Vito Latora1,3,4,5

1School of Mathematical Sciences, Queen Mary University of London,

Mile End Road, E14NS, London (UK)

2School of Business and Management, Queen Mary University of London,

Mile End Road, E14NS, London (UK).

3The Alan Turing Institute, The British Library NW12DB, London (UK)

4Dipartimento di Fisica e Astronomia, Università di Catania and INFN, 95123 Catania (Italy)

5Complexity Science Hub Vienna (CSHV), Vienna (Austria)

†M.B. and V.C. contributed equally to this work.

By drawing on large-scale online data we construct and analyze the time-varying worldwide network of professional relationships among start-ups. The nodes of this network represent companies, while the links model the flow of employees and the associated transfer of know-how across companies. We use network centrality measures to assess, at an early stage, the likelihood of the long-term positive performance of a start-up, showing that the start-up network has predictive power and provides valuable recommendations doubling the current state of the art performance of venture funds. Our network-based approach not only offers an effective alternative to the labour-intensive screening processes of venture capital firms, but can also enable entrepreneurs and policy-makers to conduct a more objective assessment of the long-term potentials of innovation ecosystems and to target interventions accordingly.

arXiv:1904.08171v1 [physics.soc-ph] 17 Apr 2019

Recent years have witnessed an unprecedented growth of interest in start-up companies. Policy-makers have been keen to sustain young entrepreneurs’ innovative efforts with a view to injecting new driving forces into the economy and fostering job creation and technological advancements (1–4). Investors have been lured by the opportunity of disproportionally high returns

typically associated with radical new developments and technological discontinuities. Large

corporations have relied on various forms of external collaborations with newly established

ﬁrms to outsource innovation processes and stay abreast of technological breakthroughs (5).

Undoubtedly, knowledge-intensive ventures such as start-ups can have a large positive impact

on the economy and society. Yet they typically suffer from a liability of newness (6), and

cannot avoid the uncertainties and sunk costs resulting from disruptive product developments,

uncharted markets and rapidly changing technological regimes (7). For these reasons, their

long-term beneﬁts are inherently difﬁcult to predict, and their economic net present value can-

not be unambiguously assessed (8).

Indeed, traditional models of business evaluation, based on historical trends of data (e.g., on sales, production capacity, internal growth, and market size), are mostly inapplicable to start-ups, chiefly because their limited history does not provide sufficient data. Venture capitalists

and private investors often evaluate start-ups primarily based on the qualiﬁcations and dexterity

of the entrepreneurs, on their potential to create new markets or niches and to unleash the “gales

of creative destruction” (9). The process of screening and evaluating companies in their early

stages is therefore a subjective and labor-intensive task, and is inevitably fraught with biases

and uncertainty.

To overcome these limitations, we propose a novel and data-driven framework for assessing

the long-term economic potential of newly established start-ups. Our study draws upon the con-

struction and analysis of the worldwide network of professional relationships among start-ups.

Such a network provides the backbone and the channels through which knowledge can be gained,

transferred, shared, and recombined. For instance, skilled employees moving across ﬁrms in

search of novel opportunities can bring with them know-how on cutting-edge technologies,

advisors who gained experience in one ﬁrm can help identify the most effective strategies in

another, whilst well connected investors, lenders and board members can rely on the knowledge

gained in one ﬁrm to tap business and funding opportunities in another.

Previous work has investigated how knowledge transfer impacts upon the performance of start-

ups; yet information ﬂows have been simply inferred mainly through data on patents (10),

interorganizational collaborations (11), co-location of ﬁrms and their proximity to universi-

ties (12). Other studies have analyzed social networks (e.g., inventor collaboration networks,

interlocking directorates) to unveil the microscopic level of interactions among individuals; yet

their scope has been limited mostly to speciﬁc industries or small geographic areas, and to a

fairly small observation period (11, 13, 14). Owing to lack of data, what still remains to be

studied is the global network underpinning knowledge exchange in the worldwide innovation

ecosystem. Equally, the competitive advantage of differential information-rich network po-

sitions and their role in opening up, expediting, or obstructing pathways to ﬁrms’ long-term

success have been left largely unexplored.

The world-wide network of start-ups. Here we study the complex time-varying network

(15,16) of interactions among all start-ups in the worldwide innovation ecosystem over a period

of 26 years (1990-2015). To this end, we collected all data on ﬁrms and people (i.e., founders,

employees, advisors, investors, and board members) available from the www.crunchbase.com

website. Drawing on the data, we ﬁrst constructed a bipartite graph in which people are con-

nected to start-ups according to their professional role. We then obtained the projected one-

mode time-varying graph in which start-ups are the nodes and two companies are connected

when they share at least one individual that plays or has played a professional role in both com-

panies (see Supplementary Material (SM) for details). At the micro scale, employees working

Figure 1: The time-varying network of professional relationships among start-ups. (A)

Countries that, over time, joined the largest connected component (LCC) of the worldwide

start-up (WWS) network are highlighted in blue. (B) Evolution over time of the number of

ﬁrms and links in the WWS network. (C) Evolution over time of the fraction of nodes in the

LCC. (D) Evolution over time of the closeness centrality rank of ﬁve popular ﬁrms. (E) Airbnb’s

ego-centered network.

in a company can perceive the intrinsic value of new appealing opportunities and switch compa-

nies accordingly. This mobility creates an intel ﬂow between companies, where those receiving

employees increase their ﬁtness by capitalizing on the know-how the employee is bringing with

her. Such microscopic dynamics is thus captured and modelled by the creation of new edges at

the level of the network of start-ups. As a consequence, companies which are perceived at the

micro scale as appealing opportunities by mobile employees will likely boost their connectivity

and therefore will acquire a more central position in the overall time-varying network.

The resulting time-varying World Wide Start-up (WWS) network comprises 41,830 companies

distributed across 117 countries around the globe, and 135,099 links among them (see Fig S3

and S4 in the SM). Fig 1A highlights the countries in which start-ups have joined, over time,

the largest connected component of the network (15, 16). Fig 1B indicates that the number of

nodes and links in the WWS network has grown exponentially over the last 26 years. In the

same period, various communities of start-ups around the globe have joined together to form

the largest connected component including about 80% of the nodes of the network (Fig 1C).

Currently, an average of 4.74 “degrees of separation” between any two companies characterizes

the WWS network.

At the micro scale, Fig 1E shows a snapshot of the network of interactions between Airbnb

and other companies based on shared individuals. As an illustration, in 2013 Airbnb hired Mr

Thomas Arend (highlighted in the red square), who had previously acted as a senior product

manager in Google, as an international product leader in Twitter, and as a product manager in

Mozilla. As previously pointed out, the professional network thus reveals the potential ﬂow

of knowledge between Airbnb and the three other companies in which Mr Arend had played a

role. Moreover, as new links were forged over time, the topological distances from Airbnb to

all other ﬁrms in the WWS network were reduced, which in turn enabled Airbnb to gain new

knowledge and tap business opportunities beyond its immediate local neighborhood.

The mechanistic interpretation of employees’ mobility inducing link creation discussed

above and illustrated in Fig 1E suggests that the potential exposure to knowledge of a start-

up in the WWS network, and its subsequent likelihood to excel in the future, should be well

captured by its network centrality over time. To test this hypothesis we have considered dif-

ferent measures of node centrality (32). For parsimony here we focus on the results obtained

by closeness centrality as it assesses the centrality of a node in the network from its average

distance from all the other nodes, although similar results have also been found with some other

centrality measures, such as betweenness or degree (see SM). In each month of the observation

period, we ranked companies according to their values of closeness centrality (i.e., top nodes

are ﬁrms with the highest closeness). Fig 1D is an example of the large variety of observed

trajectories as companies moved towards higher or lower ranks, i.e., they obtained a larger or

smaller proximity to all other companies in the network. Notice that Apple has always been in

the Top 10 ﬁrms over the entire period, while Microsoft exhibited an initial decline followed by

a constant rise towards the central region of the network. The trajectories of start-ups that were young at the time, such as Facebook, Airbnb, and Uber, are instead characterized by an abrupt

and swift move to the highest positions of the ranking soon after their foundation, possibly as a

result of the boost in activity that has characterized the venture capital industry in recent years.

Early-stage prediction of high performance. To investigate the interplay between the po-

sition of a given ﬁrm in the WWS network and its long-term economic performance, from

www.crunchbase.com we collected additional data on funding rounds, acquisitions, and initial

public offerings (IPOs). For each month t, we obtained the list of N(t) firms, ranked in terms

of closeness, that can be classiﬁed as “open deals” for investors, namely: (i) they have not yet

received funding; (ii) they have not yet been acquired; and (iii) they have not yet been listed in

the stock exchange market (see Fig S5 in SM). As an example, the company WhatsApp, which

ranked 1,060th in June 2009 in the full list, occupied the 15th position in the open-deals list

in the same month. Notice that, by assessing a ﬁrm’s network position prior to any ﬁnancial

acquisition or IPO, our analysis is not subject to possible biases arising from the effects that the

capital market might have upon the ﬁrm’s expected performance. Furthermore, predicting the

long-term economic performance of ﬁrms in the open-deal list is arguably a challenging task,

as illustrated by the fact that the average success of venture funds’ early-stage investments in

similar open deals is only around 10-15% (see section S4.2 in SM). Over the range of 26 years

of the dataset, a total of 5305 different start-ups were identiﬁed as open-deals.
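The open-deal filter described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Firm` record, its field names, and the integer month indices are hypothetical and do not reflect the Crunchbase schema, and the closeness value is assumed to have been computed beforehand for month t.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Firm:
    """Hypothetical record; field names are illustrative, not Crunchbase fields."""
    name: str
    closeness: float                     # closeness centrality at month t (precomputed)
    first_funding: Optional[int] = None  # month index of first funding round, None if never
    acquired: Optional[int] = None       # month index at which the firm was acquired
    ipo: Optional[int] = None            # month index of the IPO


def is_open_deal(firm: Firm, t: int) -> bool:
    """True if the firm has not been funded, acquired, or listed by month t."""
    events = (firm.first_funding, firm.acquired, firm.ipo)
    return all(month is None or month > t for month in events)


def open_deal_ranking(firms: List[Firm], t: int) -> List[Firm]:
    """Open deals at month t, ordered from highest to lowest closeness."""
    candidates = [f for f in firms if is_open_deal(f, t)]
    return sorted(candidates, key=lambda f: f.closeness, reverse=True)


# Toy data (invented for illustration only).
firms = [
    Firm("A", 0.31),
    Firm("B", 0.42, first_funding=5),   # already funded at t = 10, so excluded
    Firm("C", 0.27, ipo=30),            # IPO is still in the future at t = 10
    Firm("D", 0.35, acquired=12),       # acquisition is still in the future at t = 10
]
ranking = [f.name for f in open_deal_ranking(firms, t=10)]
print(ranking)  # ['D', 'A', 'C']
```

Firm B has the highest closeness in the toy data but drops out because it has already received funding, which mirrors how a firm's position in the open-deals list can differ from its position in the full ranking.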

Our recommendation method is based on the hypothesis that start-ups with higher values of

closeness centrality at an early stage are more likely to show signs of positive long-term economic performance. Accordingly, we counted the total number M(t) of firms inside the open-deal list that, within a time window ∆t = 7 years starting at month t, succeeded in securing

at least one of the following positive outcomes: (i) they took over one or more ﬁrms; (ii) they

were acquired by one or more ﬁrms; or (iii) they underwent an IPO. To assess the accuracy of

our recommendation method in early identifying successful companies, we checked how many

of the Top n = 20 companies in the closeness-based ranking of open-deals obtained a positive

outcome (see Fig S6 in SM).

Fig 2A reports the “success rate” S (blue curve) of the recommendation method, defined as S(t) = m(t)/n, where m(t) is the number of firms with a positive outcome included in the Top n = 20 firms, and S_rand(t) = M(t)/N(t) (black curve) is the success rate expected in the case of random ordering of companies, i.e. the expected success of a null model of random sampling without replacement which complies with a hypergeometric distribution (see Section S4 of the SM for details). The p-value in the top panel of Fig 2A measures the probability of obtaining, by chance, a success rate larger than S(t), with low values of p (highlighted regions) indicating the time periods where the prediction is statistically significant (p-value < 0.05). From mid 2001

to mid 2004, the success rate of our recommendation method (blue curve) is remarkably larger

than the one based on random expectations (black curve), and the p-value is always smaller than

0.01. S(t) exhibits an exceptional peak of 50% in June 2003 (p-value = 0.0001). From 2004 to

2007, the blue curve decreases, reaching a local minimum at a time when a global ﬁnancial cri-

sis was triggered by the US housing bubble. In this period (as well as during the collapse of the

dot-com bubble in 1999-2001), even though the success rate still exceeds random expectations,

the high p-values indicate that the predictions are not statistically signiﬁcant. Finally, after mid

2007, the performance of the prediction increases, and it stabilizes around 35% (p-value = 0.01).

For completeness, Fig S7 in SM reports results based on different lengths of the recommenda-

tion list and on different time windows. The observed dependence of the performance of our

network-based recommendation method on the level of external ﬁnancial market stress should

be studied in more depth.
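The success rate S(t) = m(t)/n, the random baseline S_rand(t) = M(t)/N(t), and the hypergeometric p-value discussed above can be sketched as follows. The function names and the numbers in the example are hypothetical. The tail is computed here as P(X ≥ m), which is an assumption: the text speaks of a success rate "larger than" S(t), so the exact convention used by the authors may be the strict inequality.

```python
from math import comb


def success_rate(m: int, n: int) -> float:
    """S(t) = m(t) / n: fraction of the Top-n firms with a positive outcome."""
    return m / n


def random_success_rate(M: int, N: int) -> float:
    """S_rand(t) = M(t) / N(t): expected success rate under random ordering."""
    return M / N


def hypergeom_tail(m: int, n: int, M: int, N: int) -> float:
    """P(X >= m) when n firms are drawn without replacement from N open deals,
    M of which eventually achieve a positive outcome (assumed tail convention)."""
    total = comb(N, n)
    upper = min(n, M)
    # math.comb returns 0 when the lower argument exceeds the upper one,
    # so impossible configurations contribute nothing to the sum.
    hits = sum(comb(M, k) * comb(N - M, n - k) for k in range(m, upper + 1))
    return hits / total


# Invented numbers for illustration: 200 open deals, 20 of which succeed,
# and 6 of the Top 20 recommended firms turn out to be successful.
N, M, n, m = 200, 20, 20, 6
S = success_rate(m, n)            # 0.3
S_rand = random_success_rate(M, N)  # 0.1
p_value = hypergeom_tail(m, n, M, N)
print(S, S_rand, p_value < 0.05)
```

Only the standard library is used here; a statistics package with a hypergeometric survival function would give the same tail probability.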

In Fig 2B, we characterize the overall performance of the recommendation method over the

entire period of observation. Results indicate that about 30% of the ﬁrms appearing in the Top

20 in any month from 2000 to 2009 have indeed achieved a positive economic outcome within 7

years since the time of our recommendation. The black error bars indicate the expected success

rates and standard deviations in the case of random ordering of companies (p-values in this case

are all below $10^{-5}$). Interestingly, the random null model provides an expected success rate

which is indeed comparable to the actual performance that venture funds focusing on early-

stage start-ups reach through costly and labour-intensive screening processes (see section 4.2 in

SM for details), while the performance of our recommendation method is considerably superior.

We further checked the robustness of our methodology by replicating the analysis based on the

Top 50 and Top 100 (reported in Fig 2B), for two additional time windows ∆t = 6 and ∆t = 8

years (see Fig S8 in SM) and an alternative method of aggregation of the success rate across

the entire observation period (see Fig S9 in SM). We also controlled for different confounding

factors such as start-up size, geographical location or structural role of venture capital funds in

the start-up network, ﬁnding that our conclusions hold (see section 5 of the SM).

Finally, notice that the method presented here only provides a simple heuristic recommendation,

i.e. it does not quantify the probability of each start-up in the open-deal list to show economic

success in the future. In Section 6 of the SM we further studied this possibility by using a suite

of logistic regression methods to predict success of each and every start-up in the open-deal list.

We indeed found that a snapshot of the closeness centrality ranking of a given start-up could

predict its future economic outcome (F1 score = 0.6), in qualitative agreement with findings

in Fig 2.

Implications. As lack of data and subjective biases inevitably impede a proper and rigorous

evaluation of risky and newly established innovative activities, our study has indicated that the

network of professional relationships among start-ups can unlock the long-term potential of

risky ventures whose economic net present value would otherwise be difﬁcult to measure. Our

recommendation method can help stakeholders devise and ﬁne-tune a number of effective strate-

gies, simply based on the underlying network. Employees, business consultants, board mem-

bers, bankers and lenders can identify the opportunities with the highest long-term economic

potential. Individual and institutional investors can discern ﬁnancial deals and build appropriate

portfolios that most suit their investment preferences. Entrepreneurs can hone their networking

prowess and strategies for sustaining professional inter-ﬁrm partnering and securing a winning

streak over the long run. Finally, governmental bodies and policy-makers can concentrate their

attention and efforts on the economic activities and geographic areas with the most promising

value-generating potential (e.g., activities with the capacity of job creation, youth employment

and skill development, educational and technological enhancement) for both the national and

local communities. Sociological and economic research has extensively investigated the impact of

Figure 2: Closeness-based ranking of open-deals and predicting long-term success. (A) The

performance of our recommendation method in predicting companies’ success on a monthly

basis compared to the expected performance of a null model (random ordering of companies).

The top panel reports the probability (p-value) of obtaining, by chance, a success rate larger than

the one observed in the corresponding month. The gray-shaded region indicates the time periods

where the prediction is statistically signiﬁcant (p-value <0.05). (B) The overall performance

of our method over the entire period of observation based on the Top 20, 50 and 100 firms with

the highest closeness centrality. The black error bars indicate the expected success rates and

standard deviations in the case of random ordering of companies.

knowledge spillovers (18), involvement in inter-ﬁrm alliances (19) and network position (20)

on ﬁrms’ performance, innovation capacity, propensity to collaborate, and growth rates. Yet,

whether the centrality in the professional network of newly established knowledge-intensive

ﬁrms can help predict their long-term economic success has largely remained a moot question.

Our work is the ﬁrst attempt to pave the way in this direction, and represents a contribution,

from a different angle, to the ongoing discussion on the science of success (21), complementing

recent ﬁndings in different ﬁelds such as science (22–24) and arts (25, 26).

Acknowledgements: LL acknowledges funding from EPSRC grant EP/P01660X/1. VL acknowledges

funding from EPSRC grant EP/N013492/1. Authors express their gratitude to StartupNetwork s.r.l for

providing data and computational infrastructure.

References

1. European Commission, “Towards a job-rich recovery” (COM(2012) 173 final, 18 April 2012).

2. The White House, “Economic report of the President” (US Government Printing Office, Washington, DC, 2016).

3. J. Haltiwanger, R. S. Jarmin, J. Miranda, Who creates jobs? Small versus large versus young. Review of Economics and Statistics, 95(2), 347-361 (2013).

4. M. Mazzucato, The Entrepreneurial State: Debunking the Public vs. Private Myth in Risk and Innovation (Anthem, London, UK, 2013).

5. H. W. Chesbrough, The era of open innovation. MIT Sloan Management Review, 44(3), 35-41 (2003).

6. J. Freeman, G. R. Carroll, M. T. Hannan, Liability of newness: Age dependence in organizational death rates. American Sociological Review, 48(5), 692-710 (1983).

7. W. W. Powell, D. R. White, K. W. Koput, J. O. Smith, Network dynamics and field evolution: The growth of interorganizational collaboration in the life sciences. American Journal of Sociology, 110(4), 1132-1205 (2005).

8. S. A. Shane, The Illusions of Entrepreneurship: The Costly Myths that Entrepreneurs, Investors, and Policy Makers Live by (Yale Univ. Press, New Haven, CT, 2008).

9. J. A. Schumpeter, The Theory of Economic Development (Harvard University Press, Cambridge, MA, 1934).

10. J. Guzman, S. Stern, Where is Silicon Valley? Science, 347(6222), 606-609 (2015).

11. W. W. Powell, K. W. Koput, L. Smith-Doerr, Interorganizational collaboration and the locus of innovation: Networks of learning in biotechnology. Administrative Science Quarterly, 41(1), 116-145 (1996).

12. A. Saxenian, Regional Advantage (Harvard University Press, Cambridge, MA, 1996).

13. O. Sorenson, J. W. Rivkin, L. Fleming, Complexity, networks and knowledge flow. Research Policy, 35(7), 994-1017 (2006).

14. M. Ferrary, M. Granovetter, The role of venture capital firms in Silicon Valley’s complex innovation network. Economy and Society, 38(2), 326-359 (2009).

15. V. Latora, V. Nicosia, G. Russo, Complex Networks: Principles, Methods and Applications (Cambridge University Press, 2017).

16. N. Masuda, R. Lambiotte, A Guide to Temporal Networks (World Scientific, Singapore, 2016).

17. S. Wasserman, K. Faust, Social Network Analysis: Methods and Applications (Cambridge University Press, Cambridge, 1994).

18. T. E. Stuart, O. Sorenson, Liquidity events and the geographic distribution of entrepreneurial activity. Administrative Science Quarterly, 48(2), 175-201 (2003).

19. R. C. Sampson, R&D alliances and firm performance: The impact of technological diversity and alliance organization on innovation. Academy of Management Journal, 50(2), 364-386 (2007).

20. B. Uzzi, Embeddedness in the making of financial capital: How social relations and networks benefit firms seeking financing. American Sociological Review, 64(4), 481-505 (1999).

21. A.-L. Barabási, The Formula: The Universal Laws of Success (Little Brown and Company, New York, 2018).

22. O. Penner, R. K. Pan, A. M. Petersen, K. Kaski, S. Fortunato, On the predictability of future impact in science. Scientific Reports, 3:3052 (2013).

23. A. Ma, R. J. Mondragón, V. Latora, Anatomy of funded research in science. Proceedings of the National Academy of Sciences, 112(48), 14760–14765 (2015).

24. R. Sinatra, D. Wang, P. Deville, C. Song, A.-L. Barabási, Quantifying the evolution of individual scientific impact. Science, 354, 3612 (2016).

25. O. E. Williams, L. Lacasa, V. Latora, Quantifying and predicting success in show business, arXiv:1901.01392.

26. S. P. Fraiberger, R. Sinatra, M. Resch, C. Riedl, A.-L. Barabási, Quantifying reputation and success in art. Science, 362(6416), 825-829 (2018).

Supplementary Materials include:

Supplementary section S1) Data set: additional details

Supplementary section S2) Construction of the World Wide Start-up (WWS) network

Supplementary section S3) Analysis of the WWS network

Supplementary section S4) Open-deals recommendation method

Supplementary section S5) Additional analysis

Supplementary section S6) From recommendation to prediction of start-up success: supervised

learning approaches

Supplementary Tables S1–S4

Supplementary ﬁgures Fig S3–S19

Supplementary References (27–37)

SUPPLEMENTARY MATERIAL

S1 Data set: additional details

Data were collected from the crunchbase.com Web API and were updated until December

2015. The data provided by the Crunchbase website are manually recorded and managed by

several contributors (e.g., incubators, venture funds, individuals) afﬁliated with the Crunchbase

platform. Moreover, the data are further enriched by Web crawlers that scrape the Web, on a

daily basis, in search for news about IPOs, acquisitions, and funding rounds. To date Crunch-

base is widely regarded as the world’s most comprehensive open data set about start-up com-

panies. It contains detailed information on organizations from all over the world and belonging

to four categories, namely companies, investors, schools, and groups. Among schools there are

383 universities, including top-tier institutions such as Stanford University, the Massachusetts

Institute of Technology (MIT), and many others. In addition to people’s business activity, the

data track information about their educational paths, and consequently their access to academic

knowledge.

The total number of organizations listed at the date of data collection amounted to 530,604.

However, a large number of entries contained very limited information, no proﬁle pictures, and

no employees’ records. Accordingly, we needed to clean the data keeping only the organiza-

tions for which enough information was provided, and for which such information was reliable

(see Section S2 for more details). This ﬁnally limited the number of organizations to 41,830.

For this work, all these organizations were included in the construction of the network. Notice

however, that only organizations belonging to the category “companies” and, at the same time,

younger than two years, have been included in the recommendation list. For each organization we extracted all the people included in the team (e.g., founders, advisors, board members, employees, alumni) and additional information such as details on firms’ foundation dates, locations of the firms’ headquarters, funding rounds, acquisitions, and IPOs. Organizations and

people are uniquely identiﬁed by alphanumeric IDs. All data are time-stamped, and an accurate

reconstruction of historical events was made possible by the use of trust codes, i.e., numerical

codes provided by Crunchbase to indicate the reliability of a certain timestamp. The timestamps

indicate the dates of foundation, funding rounds, acquisitions, and IPOs, as well as the start and

the end times of job roles.

S2 Construction of the World Wide Start-up (WWS) network

We constructed a bipartite time-varying graph with $N_1 = 41{,}830$ nodes representing organizations distributed across 117 countries around the globe, $N_2 = 36{,}278$ nodes representing people, and $K_{12} = 284{,}460$ links between people and organizations. The graph is time-varying

because each node and each link have an associated timestamp, representing, respectively, the

time an organization was founded and the time a person was afﬁliated (and held a variety of

roles) with a given organization. Notice that in the construction of the time-varying graph we

retained only the timestamps whose trust code guarantees the reliability of the year and month.

Additionally, we cleaned the data by solving and removing inconsistencies such as an em-

ployee’s role starting at a date prior to the company’s foundation. In these cases, we retained

the most reliable information according to the trust code value. Inconsistencies were removed

by adopting a strong self-penalising data cleaning strategy. In particular, we did not make any

assumption on dates, nor did we attempt to infer timestamps. As a result, we do not retain in the

graph links whose timestamps cannot be determined in a reliable way. This approach to data

cleaning strengthens the validity of our results because it ensures that companies do not gain

higher positions in the closeness centrality rank score as a result of connections that were forged

at subsequent dates to those incorrectly or only partially reported in the data set. In this way we

avoid biases that could artiﬁcially inﬂate the success rate of the method, and accordingly our

results can safely be seen as conservative lower bounds.

We then projected the bipartite time-varying graph onto a one-mode graph in which two

companies are connected when they share at least one individual that plays or has played a

professional role in both companies. Such a graph comprises $N_1 = 41{,}830$ companies and $K = 135{,}099$ links among them, and is here referred to as the World Wide Start-up (WWS)

network. The projected graph is time-varying like the original bipartite graph: a link between

any two companies is forged as soon as one individual with a professional role in one company

takes on a role in the other company. Since the creation of these links denotes intel transfer

between companies, we realistically assume that such intel ﬂow generates considerable know-

how for the nodes receiving new links. Once created, the links are then maintained, since the

know-how of a given company is not destroyed or removed.
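The one-mode projection described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the input format (person, company, start month), the function names, and the toy data are all assumptions. A link between two companies is created at the month in which a shared individual starts the later of the two roles, and it is never removed afterwards.

```python
from collections import defaultdict
from itertools import combinations


def project_affiliations(affiliations):
    """Project person-company affiliations onto company-company links.

    affiliations: iterable of (person, company, start_month) tuples (assumed format).
    Returns a dict {frozenset({c1, c2}): month at which the link first appears}.
    """
    # Earliest month at which each person took a role in each company.
    first_role = defaultdict(dict)
    for person, company, start in affiliations:
        previous = first_role[person].get(company)
        if previous is None or start < previous:
            first_role[person][company] = start

    links = {}
    for roles in first_role.values():
        for (c1, t1), (c2, t2) in combinations(roles.items(), 2):
            created = max(t1, t2)  # the link exists once the later role has started
            pair = frozenset((c1, c2))
            if pair not in links or created < links[pair]:
                links[pair] = created
    return links


def snapshot(links, t):
    """Links present at month t; once created, a link persists."""
    return {pair for pair, created in links.items() if created <= t}


# Toy data (invented): p1 moves from Alpha to Beta, p2 from Beta to Gamma.
affiliations = [
    ("p1", "Alpha", 0),
    ("p1", "Beta", 24),
    ("p2", "Beta", 10),
    ("p2", "Gamma", 36),
]
links = project_affiliations(affiliations)
print(snapshot(links, 30))  # only the Alpha-Beta link exists at month 30
```

Keeping the earliest creation month per pair means that a second shared individual cannot postpone a link that already exists, which is consistent with links being maintained once created.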

Figure S3: Assortativity and degree distribution of the WWS network. (A) Average degree

of nearest neighbours $k_{nn}$ for classes of nodes with degree $k$ in the WWS. (B) The power-law

degree distribution of the WWS network. (C) Similar to panel (A), but where all venture capital

ﬁrms (see Section S5.5 for a list) have been removed.

S3 Analysis of the WWS network

For completeness, we have calculated a variety of quantities for measuring the characteristics

of the structure of the WWS network. In particular, from 1990 to 2015, for every month, we

have computed the number of companies (nodes) and links, and examined the partition of the

WWS network into distinct connected components. A connected component of a network is a

subgraph in which any two nodes are connected to each other by at least one path [15, 16]. If

the network has more than one component, one can identify the largest connected component

(LCC), namely the component with the largest number of nodes. The countries highlighted in

blue in Fig 1A (main text) are those that have at least one start-up that is part of the LCC of

the WWS network. Fig 1C (main text) shows a rapid growth in the fraction of start-ups in the

LCC, thus highlighting the tendency that companies have to establish new connections with

one another and move toward the core of the network. Like many other real-world complex

networks, the WWS network is characterised by a rich topological structure, a small average shortest path length ($\ell = 4.74$), and a high value of the average clustering coefficient, $C = 0.6$, as expected from the one-mode projection of a bipartite network [15, 16]. The value of the average shortest path length is similar to the one obtained for an equivalent Erdős-Rényi random graph (29) with the same number of nodes and edges ($\ell_{\mathrm{random}} = 4.17$). However, the statistical features of the WWS differ from those characterising random graphs: the degree distribution approaches a power-law with an exponent greater than 2 (see Fig. S3, panel B), the assortativity coefficient (28) is positive, namely $\gamma = 0.11$ (see Fig. S3, panel A), and this result holds even if all venture capital firms are removed from the network (panel C). The clustering coefficient is significantly larger than the one obtained for a corresponding random network, $C_{\mathrm{rand}} = 0.00013$.
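Two of the quantities discussed in this section, the fraction of nodes in the largest connected component and the average shortest path length, can be computed with a breadth-first search. The sketch below assumes the network is stored as a dictionary mapping each node to the set of its neighbours; the function names and the toy graph are illustrative only, and the average path length is taken over the nodes of a single component with at least two nodes.

```python
from collections import deque


def bfs_distances(adj, source):
    """Shortest-path distances (in links) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def connected_components(adj):
    """List of node sets, one per connected component."""
    seen, components = set(), []
    for node in adj:
        if node not in seen:
            component = set(bfs_distances(adj, node))
            seen |= component
            components.append(component)
    return components


def average_path_length(adj, nodes):
    """Mean distance over ordered pairs of distinct nodes in one component."""
    nodes = set(nodes)
    total = 0
    for s in nodes:
        dist = bfs_distances(adj, s)
        total += sum(d for v, d in dist.items() if v in nodes)
    return total / (len(nodes) * (len(nodes) - 1))


# Toy graph (invented): a path a-b-c and a separate pair d-e.
adj = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}, "d": {"e"}, "e": {"d"}}
largest = max(connected_components(adj), key=len)
fraction = len(largest) / len(adj)        # 3 of 5 nodes are in the LCC
ell = average_path_length(adj, largest)   # (1 + 1 + 2) / 3 for the path a-b-c
print(sorted(largest), fraction, ell)
```

Running one search per node is quadratic in the worst case, so for a network with tens of thousands of nodes a dedicated graph library would be the more practical choice.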

To offer a glimpse of the structure of the WWS network, in Fig. S4 we show the subgraph

obtained by using the k-core decomposition technique and including only the nodes that belong to the 10th shell. The k-core decomposition of a graph (30–32) is a technique that iteratively deletes nodes, starting from the most peripheral ones (i.e., nodes with degree equal to 1), and progressively unveils the most central and interconnected core of the network. Each node is assigned a core value equal to k according to the k-core subgraph to which it belongs.
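The peeling procedure described above can be sketched in a few lines of Python. This is only a minimal illustration on a toy graph, not the code used for the WWS network; the function name `core_numbers`, the adjacency-dictionary representation and the toy graph are assumptions introduced here.

```python
def core_numbers(adj):
    """Return the core number of every node by iterative peeling.

    adj: dict mapping each node to the set of its neighbours (undirected graph).
    """
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        # The next shell is set by the smallest degree still present.
        k = max(k, min(degree[v] for v in remaining))
        # Peel every node whose current degree does not exceed k.
        stack = [v for v in remaining if degree[v] <= k]
        while stack:
            v = stack.pop()
            if v not in remaining:
                continue
            remaining.remove(v)
            core[v] = k
            for u in adj[v]:
                if u in remaining:
                    degree[u] -= 1
                    if degree[u] <= k:
                        stack.append(u)
    return core


# Toy graph (hypothetical): a 4-clique a, b, c, d with a pendant chain d - e - f.
toy = {
    "a": {"b", "c", "d"},
    "b": {"a", "c", "d"},
    "c": {"a", "b", "d"},
    "d": {"a", "b", "c", "e"},
    "e": {"d", "f"},
    "f": {"e"},
}
cores = core_numbers(toy)
# Nodes of the innermost shell (core value 3 in this toy example).
shell_3 = sorted(v for v, k in cores.items() if k == 3)
```

Selecting the nodes with a given core value, as in the last line, is the analogue of keeping only the 10th shell for the visualisation.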

Figure S4: Visualisation of the WWS network. Owing to visualisation constraints, only the

10th shell of the k-core decomposition is displayed in the image. The graph shown here includes

8% of the nodes and 31% of the links in the complete WWS network.

S4 Open-deals recommendation method

Our working hypothesis is that companies with a central position in the network have higher

exposure to knowledge and easier access to resources than companies with peripheral positions.

If this is the case, centrally positioned companies will be better equipped to compete and have

higher chances to survive, grow and ﬂourish than peripheral ones. We have therefore used

network centrality measures [15, 16] that capture the structural centrality of a node in a graph,

with a view to identifying companies with a large long-term economic potential.

The concept of centrality and the related measures were first introduced in the context of social network analysis (32). To quantify the centrality of a company we have computed, on a monthly basis, its closeness centrality in the WWS network. Several other centrality measures have also been considered, and the results are reported in Section S5. The closeness centrality quantifies the importance of a node in the graph by measuring its mean distance from all other nodes. The closeness centrality $C_i(t)$ of a node $i$, $i = 1, 2, \ldots, N(t)$, is defined as:

$$C_i(t) = \frac{N(t) - 1}{\sum_{j} d_{ij}(t)}, \tag{S1}$$

where $N(t)$ is the number of nodes in the graph at time $t$, while $d_{ij}(t)$ is the graph distance between the two nodes $i$ and $j$, measured as the number of links in the shortest path between the two companies. To account for disconnected components we used the generalisation of closeness centrality proposed in (33).
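As an illustration of Eq. (S1), the sketch below computes closeness from breadth-first-search distances on an unweighted, undirected graph. The generalisation of (33) is not reproduced here: as a stand-in assumption, the code rescales by the fraction of reachable nodes (a Wasserman-Faust-style correction, as offered for example by NetworkX's `wf_improved` option), which reduces to Eq. (S1) when the graph is connected. The function names and the toy graph are hypothetical.

```python
from collections import deque


def bfs_distances(adj, source):
    """Hop distances from source to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for u in adj[v]:
            if u not in dist:
                dist[u] = dist[v] + 1
                queue.append(u)
    return dist


def closeness(adj, i):
    """Closeness of node i, rescaled for disconnected graphs (assumed variant)."""
    n = len(adj)
    dist = bfs_distances(adj, i)
    reachable = len(dist) - 1      # nodes other than i that i can reach
    total = sum(dist.values())     # sum of d_ij over the reachable nodes j
    if reachable == 0 or n < 2:
        return 0.0
    # reachable / total is Eq. (S1) restricted to i's component;
    # the factor reachable / (n - 1) penalises small components.
    return (reachable / total) * (reachable / (n - 1))


# Toy graph (hypothetical): a path a - b - c and a separate pair d - e.
toy = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}, "d": {"e"}, "e": {"d"}}
ranking = sorted(toy, key=lambda v: closeness(toy, v), reverse=True)
```

In this toy example node b, which sits in the middle of the larger component, comes first in `ranking`.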

Our claim is that young start-ups with proportionally higher values of closeness centrality

will have a higher likelihood to become successful in later years. This can be readily translated

into several possible heuristics for recommending investment in a given start-up. Among other possibilities, we have considered the following recommendation method. For each month $t$, we ranked all the $N(t)$ companies according to their values of closeness centrality $C_i(t)$, such that the top nodes are those with the highest closeness. From the ranked lists we then removed the companies that can reasonably be regarded as irrelevant deals to investors, i.e., those companies that had already been acquired, had already been listed in a stock market, or had received funding from other investors. The companies retained in the analysis, whose number we again denote by $N(t)$, belong to the so-called open-deals ranked list at month $t$. Notice that, by definition, the open-deals list considers newly established start-ups. Indeed, incubators such as 500 Startups, Y Combinator, Techstars or Wayra target early-stage companies, i.e. they make risky investments in ideas and small teams without much previous history. Their investment targets

are therefore similar to the ones captured by our definition of ‘open-deal list’, and it is easy to see that predicting future positive outcomes of firms in such a set is more challenging than predicting future positive outcomes of more established firms.

Fig S5 shows an example of the procedure adopted. The companies highlighted in grey are

those which, prior to December 2008, had not yet received funding, had not yet been acquired,

and had not yet been listed in any stock market. These companies could thus be seen as investment

opportunities at month t. Since we want to focus on early-stage companies, we also removed

any company that was more than two years old.
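The filtering and ranking steps described in this section can be sketched as follows. The record layout (`Company` and its fields), the use of a running month index for dates, and the treatment of events occurring exactly at month t as already past are all assumptions made for illustration; they are not taken from the original data pipeline.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Company:
    name: str
    closeness: float                # C_i(t) at the month of interest
    founded: int                    # founding date as a running month index
    funded: Optional[int] = None    # month of first funding, if any
    acquired: Optional[int] = None  # month of acquisition, if any
    listed: Optional[int] = None    # month of stock-market listing, if any


def open_deals(companies, t, max_age_months=24):
    """Ranked open-deals list at month t (highest closeness first)."""
    def happened(month):
        return month is not None and month <= t

    keep = [
        c for c in companies
        if c.founded <= t
        and t - c.founded <= max_age_months   # drop firms older than two years
        and not (happened(c.funded) or happened(c.acquired) or happened(c.listed))
    ]
    return sorted(keep, key=lambda c: c.closeness, reverse=True)


# Hypothetical records: B is already funded and C is too old at t = 100.
companies = [
    Company("A", 0.30, founded=90),
    Company("B", 0.45, founded=95, funded=98),
    Company("C", 0.40, founded=60),
    Company("D", 0.35, founded=85, acquired=120),
    Company("E", 0.20, founded=99),
]
ranked_names = [c.name for c in open_deals(companies, t=100)]
```

Company D stays in the list because its acquisition lies in the future relative to month t; that later event is what would make it a positive outcome.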

S4.1 Success rate in open-deals lists

Each open-deals list in month $t$ contains $M(t)$ successful companies ($0 \le M(t) \le N(t)$), i.e., those companies that have obtained, within a time window $\Delta t = 6$, 7 or 8 years since month $t$, a positive outcome. A positive outcome is here defined in terms of the occurrence of at least one of the following events: (i) the company makes an acquisition; (ii) the company is acquired; or (iii) the company undergoes an IPO. To each company in the open-deals list we then assigned the value of a binary variable, namely 1 if the company has achieved a positive outcome within the chosen time window $\Delta t$, or 0 otherwise. Fig S6 shows an example of the monthly open-deals lists, in which names are replaced by their associated binary values. Notice that the higher the number of ones in the top regions of the rankings, the better the performance of the recommendation method in predicting positive outcomes.

We focus on companies in the top positions of our open-deals recommendation list, and we indicate by $m(t)$ the number of companies in the Top 20 in month $t$ that have obtained a positive outcome, i.e., the number of ones in the first $n = 20$ entries of the list. Notice that the same procedure has been repeated for the Top 50 companies ($n = 50$) to check for the robustness of results. The accuracy of the recommendation method is assessed by computing

Figure S5: Example of the construction of the open-deals ranked list. For each month of

the observation period, all companies in the network were ranked according to their values of

closeness centrality. Top-ranked nodes are those with the highest closeness centrality in the

WWS network in the corresponding month. Only those companies (highlighted in grey) that

had not yet received funding, had not yet been acquired, and had not yet been listed in the stock

exchange market, were retained in the open-deals ranked list.

Figure S6: Illustrative example of a monthly open-deals ranking. Companies’ names are

replaced by the values of their associated binary variable, with a value equal to 1 indicating the achievement of a positive outcome. The last line reports the success rate $S_{\mathrm{Top10}}(t)$ of companies in the Top 10 of our recommendation list.

the success rate $S(t)$, defined as the ratio $m(t)/n$. How does this compare to a null model where network properties are not taken into account? If the open-deals lists were randomly ordered, the expected number of successful companies $m_{\mathrm{rand}}(t)$ in e.g. the Top 20 ($n = 20$) would be given by the expected value of the hypergeometric distribution $H(N(t), M(t), n)$. In particular, the expected value of $m_{\mathrm{rand}}(t)$ is $n M(t)/N(t)$ and thus the expected success rate is $S_{\mathrm{rand}}(t) = M(t)/N(t)$. Similarly, it follows that $\mathrm{var}(S_{\mathrm{rand}}(t)) = \mathrm{var}(m_{\mathrm{rand}}(t))/n^2$.
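For concreteness, the mean and variance of the null-model success rate can be evaluated with the standard closed-form moments of the hypergeometric distribution. The sketch below is illustrative only; the function name and the numbers in the example are hypothetical, and the variance formula is the textbook one with the finite-population correction $(N - n)/(N - 1)$.

```python
def random_success_rate(N, M, n):
    """Mean and variance of S_rand = m_rand / n under random ordering.

    N: length of the open-deals list, M: successful companies in it,
    n: number of top positions considered (e.g. 20).
    """
    mean_m = n * M / N
    # Variance of a hypergeometric count, with finite-population correction.
    var_m = n * (M / N) * (1 - M / N) * (N - n) / (N - 1) if N > 1 else 0.0
    return mean_m / n, var_m / n**2


# Hypothetical month: 1000 open deals, 150 of which turn out to be successful.
mean_s, var_s = random_success_rate(N=1000, M=150, n=20)
```

The mean equals $M(t)/N(t)$ regardless of $n$, while the variance shrinks as the recommendation list gets longer.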

Fig 2 (main text) and Fig S7 show that $S(t)$ (blue curve) is systematically much higher than $S_{\mathrm{rand}}(t)$ (black curve), except during two short periods corresponding, respectively, to the dot-com bubble (1999-2001) and to the 2008 financial crisis. In both cases, the difference between $S(t)$ and $S_{\mathrm{rand}}(t)$ becomes narrower, yet $S(t)$ always remains higher than $S_{\mathrm{rand}}(t)$. Moreover, Fig S7 shows that these findings are robust against variations in the length of the time window (i.e., $\Delta t = 6, 7, 8$) and in the number of companies considered in the recommendation (i.e., Top 20 and Top 50).

The statistical significance of the results is assessed by computing the hypergeometric p-values, which give the probability of obtaining, by chance, a success rate equal to or greater than the one obtained with real data. Denoting as $P(\cdot)$ the probability mass function of $m_{\mathrm{rand}}(t)$, we can compute the p-value at time $t$ as:

$$p(t) = \sum_{k=m(t)}^{n} P\left(m_{\mathrm{rand}}(t) = k\right).$$
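The tail sum above can be evaluated exactly with binomial coefficients. The following is a minimal sketch using only the Python standard library; the function names and the example numbers are hypothetical.

```python
from math import comb


def hypergeom_pmf(k, N, M, n):
    """P(m_rand = k): k successes in n draws without replacement."""
    if k < max(0, n - (N - M)) or k > min(n, M):
        return 0.0
    return comb(M, k) * comb(N - M, n - k) / comb(N, n)


def p_value(m, N, M, n):
    """Probability of at least m successes among the top n by chance."""
    return sum(hypergeom_pmf(k, N, M, n) for k in range(m, n + 1))


# Hypothetical month: 8 successes in the Top 20 of a list of 1000 open deals,
# 150 of which are successful overall.
p = p_value(m=8, N=1000, M=150, n=20)
```

A value of `p` below 0.05 would correspond to the low p-values discussed in the text.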

The top charts in Fig 2 (main text) and in each panel of Fig S7 report the evolution of the p-

values over time. Low p-values (<0.05) are observed in most parts of the observation period.

This suggests that the discrepancy between the success rate of the 20 top-ranked companies

selected according to our recommendation method and the success rate of the same number of

companies selected at random from the open-deals list is statistically signiﬁcant. Conversely,

high p-values are observed during the downturns, thus indicating that in such

periods the success rates achieved by our recommendation method could also have been obtained

by chance.

S4.2 Real investors' performance is similar to random expectation

It is important to highlight that, although the random expectation null model has mainly been introduced to assess whether our results are statistically significant, the performance of real investments is remarkably similar to the expected success rate in the null model. To illustrate this, summary statistics of the Top 15 investors, ranked by number of investments, are reported in Table S1 (data extracted from crunchbase.com). Notice that there is great variability in investors' performance, which reflects the variability in the type of investments. Highlighted in pink are those investors whose target complies with our definition of open-deal list. Incubators such as 500 Startups, Y Combinator, Techstars or Wayra indeed focus their interest on very early-stage companies, i.e. they invest in ideas and small teams of entrepreneurs without much history. They make the riskiest bets in the landscape of start-up investments and their performance lies around 15%. Their investment target is very close to the type of companies that we have isolated in our definition of “open deals”. At the other end, large venture firms such as Intel Capital, Accel Partners, or Goldman Sachs invest in companies at a later stage of maturity. They are interested in organizations with larger teams that have already received funding, and they typically inject funds to boost a business that has already found a market fit and has a history of revenues, customers, and other indicators of growth. The presence of quantitative indicators of growth allows large venture firms to perform a more objective evaluation of the company and its success potential, which in turn is reflected in higher investment performance.

In summary, while investors decide which start-ups to invest in through costly and labour-intensive screening processes, results confirm that the percentage of real investments that were

Figure S7: The success rate of our recommendation method. The success rate $S(t)$ of our method (blue curve) is compared to the expected success rate $S_{\mathrm{rand}}(t)$ associated with the recommendation of randomly selected companies (black curve). Different lengths, namely $n = 20, 50$, of the recommendation lists, and different time windows, i.e., $\Delta t = 6, 7, 8$ years, to assess the performance of a company have been considered. The statistical significance of the discrepancy between $S(t)$ and $S_{\mathrm{rand}}(t)$ is quantified through the associated p-values, shown in the top charts of each panel.

| Investor | # investments | # successful investments | Success rate |
|---|---|---|---|
| 500 Startups | 1022 | 153 | 15% |
| Y Combinator | 953 | 154 | 16% |
| Intel Capital | 744 | 313 | 40% |
| Start-up Chile | 710 | 10 | 1.4% |
| Sequoia Capital | 700 | 267 | 38% |
| New Enterprise Associates (NEA) | 672 | 272 | 40% |
| SV Angel | 600 | 258 | 43% |
| Techstars | 549 | 95 | 17% |
| Brand Capital | 537 | 80 | 14% |
| Accel Partners (Accel) | 536 | 270 | 50% |
| Sos Ventures (SOSV) | 493 | 17 | 3% |
| Wayra | 476 | 11 | 2% |
| Kleiner Perkins Caufield & Byers (KPCB) | 457 | 203 | 43% |
| Right Side Capital Management (RSCM) | 449 | 44 | 10% |
| Goldman Sachs | 410 | 209 | 50% |

Table S1: Top 15 investment companies according to the number of investments made, along

with the percentage of successful investments. Highlighted in pink are investors focused on

very early-stage companies as those considered in our open-deal lists. The success rates of such

investors are comparable to the random expectation null model, and much below the success

rate obtained using our recommendation method.

deemed ‘successful’ is consistently similar to the success rate given by our random expectation

model. In other words, state-of-the-art success rate is not much better than a random expectation

null model. This means that any improvement upon the null model provides valuable informa-

tion. We conclude that our recommendation method based on centrality –whose success rate

consistently exceeds random expectation over several periods– is a considerable improvement

with respect to the state of the art.

S4.3 Details on overall success rate

To obtain an overall measure of the performance of our method, the success rate can be aggregated across the entire observation period. This can be carried out in two complementary ways leading to two different measures of the overall success, namely $\tilde{S}_{\mathrm{I}}$ and $\tilde{S}_{\mathrm{II}}$. Here we discuss and provide some details with regard to both measures.

The first measure of overall success rate, $\tilde{S}_{\mathrm{I}}$, which is used in the main text, takes into account the total number of positive entries in the top positions in all open-deals lists, regardless of the specific companies that occupy those positions. In this way $\tilde{S}_{\mathrm{I}}$ provides a measure of the overall goodness of the ranking across months, but it does not provide information about the number of unique companies correctly or wrongly identified as successful. As an example of the computation of $\tilde{S}_{\mathrm{I}}$, let us consider the period starting in January 2000 and ending in December 2007, and the Top 20 companies (bottom-left charts in Fig S7). Such a period includes $\delta = 96$ months. The overall success rate $\tilde{S}_{\mathrm{I}}$ is defined as:

$$\tilde{S}_{\mathrm{I}} = \frac{\tilde{m}_{\mathrm{I}}}{\tilde{n}_{\mathrm{I}}},$$

where $\tilde{n}_{\mathrm{I}} = 20\,\delta$ is the total number of entries in the Top 20 list across the $\delta$ months, and $\tilde{m}_{\mathrm{I}} = \sum_t m(t)$, where the sum runs over all months in the observation period. To construct a null model to which we can compare these measures, we then randomly shuffle the entries in each open-deal list independently for each month and apply the same procedure (i.e., the null model makes a random sampling of the list without replacement). Accordingly, at month $t$ we count the number of successful companies within the Top 20 and label it $m_{\mathrm{rand}}(t)$. The expected total number of successful companies within all the Top 20 lists in this null model is thus given by:

$$\tilde{m}^{\mathrm{rand}}_{\mathrm{I}} = \sum_t m_{\mathrm{rand}}(t),$$

and the corresponding variance is given by the sum of the variances in each month

$$\mathrm{var}\left(\tilde{m}^{\mathrm{rand}}_{\mathrm{I}}\right) = \sum_t \mathrm{var}\left(m_{\mathrm{rand}}(t)\right),$$

where $\mathrm{var}(m_{\mathrm{rand}}(t))$ denotes the variance associated with the random null model, i.e. the variance of the hypergeometric distribution. The expected overall success rate in the case of random ordering is then given by

$$\tilde{S}^{\mathrm{rand}}_{\mathrm{I}} = \frac{\tilde{m}^{\mathrm{rand}}_{\mathrm{I}}}{\tilde{n}_{\mathrm{I}}},$$

Figure S8: Observed and randomly expected success rates. The overall success rate empirically found $\tilde{S}_{\mathrm{I}}$ (blue bars), the overall success rate $\tilde{S}^{\mathrm{rand}}_{\mathrm{I}}$ expected by chance (black dots), and its standard deviation (black error bars), for various values of $\Delta t$ and lengths of the list of top-ranked companies (i.e., Top 20, 50, and 100).

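and its standard deviation $\sigma_{\mathrm{I}}$ is

$$\sigma_{\mathrm{I}} = \frac{\sqrt{\mathrm{var}\left(\tilde{m}^{\mathrm{rand}}_{\mathrm{I}}\right)}}{\tilde{n}_{\mathrm{I}}}.$$

Figure S8 reports the overall success rate empirically found $\tilde{S}_{\mathrm{I}}$ (blue bars), the overall success rate $\tilde{S}^{\mathrm{rand}}_{\mathrm{I}}$ expected by chance (black dots), and its standard deviation (black error bars) for various values of $\Delta t$, and for different numbers of recommended companies (i.e., Top 20, 50, and 100).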
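The first aggregation can be sketched as follows, assuming that for every month we already know the triple (m(t), N(t), M(t)). The function name, the input layout and the three-month example are hypothetical, and the null-model terms use the closed-form mean and variance of the hypergeometric distribution rather than explicit shuffling.

```python
from math import sqrt


def overall_success_rate(months, n=20):
    """Aggregate monthly Top-n results into the first overall measure.

    months: iterable of (m, N, M) tuples, one per month, where m is the
    number of successes in the Top n, N the length of the open-deals list
    and M the number of successful companies in it (assumes N >= n, N > 1).
    Returns the observed rate, the rate expected by chance and its
    standard deviation.
    """
    months = list(months)
    n_total = n * len(months)                      # total Top-n entries
    m_total = sum(m for m, _, _ in months)         # observed successes
    m_rand = sum(n * M / N for _, N, M in months)  # sum of monthly means
    var_rand = sum(
        n * (M / N) * (1 - M / N) * (N - n) / (N - 1)
        for _, N, M in months
    )                                              # sum of monthly variances
    return m_total / n_total, m_rand / n_total, sqrt(var_rand) / n_total


# Three hypothetical months of Top 20 results.
s_obs, s_rand, sigma = overall_success_rate(
    [(8, 500, 50), (6, 400, 60), (10, 600, 60)], n=20
)
```

Summing the monthly variances relies on the shuffles being independent from month to month, as stated in the text.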

The second measure of the overall success rate, $\tilde{S}_{\mathrm{II}}$, does not simply capture the overall

performance of the ranking-based recommendation method, but compares the number of unique companies in the Top 20s correctly predicted as successful by our method, across the entire observed period, against the number of successful companies that would be expected under random selection. In particular, this second measure of overall success is based on: (i) the total number $\tilde{N}_{\mathrm{II}}$ of unique companies available in any month; (ii) the total number $\tilde{M}_{\mathrm{II}}$ of unique companies that have achieved a positive outcome at any time since their foundation up to 2015; (iii) the number $\tilde{n}_{\mathrm{II}}$ of unique companies included in all Top 20 rankings in any month; and (iv) the number $\tilde{m}_{\mathrm{II}}$ of unique companies, listed in all Top 20 rankings, that have achieved a positive outcome at any time since their foundation up to 2015.

Notice that, in this way, each company contributes only once to the evaluation of the success rate. Therefore, the probability of finding exactly $\tilde{m}_{\mathrm{II}}$ successful companies in any ranking of Top 20 (50, or 100) is given by the hypergeometric function $H(\tilde{N}_{\mathrm{II}}, \tilde{M}_{\mathrm{II}}, \tilde{n}_{\mathrm{II}}, \tilde{m}_{\mathrm{II}})$. The success rate shown in Fig S9 is computed as $\tilde{S}_{\mathrm{II}} = \tilde{m}_{\mathrm{II}}/\tilde{n}_{\mathrm{II}}$, while the success rate $\tilde{S}^{\mathrm{rand}}_{\mathrm{II}}$ in the case of the null model is given by $\tilde{S}^{\mathrm{rand}}_{\mathrm{II}} = \tilde{M}_{\mathrm{II}}/\tilde{N}_{\mathrm{II}}$. Fig S9 also reports the error bars of the success rate, computed as the standard deviation of the hypergeometric distribution.
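The second aggregation counts each company once, which maps naturally onto set operations. The sketch below is illustrative: the function name, the representation of monthly lists as collections of company identifiers and the toy data are assumptions.

```python
def unique_company_success(top_lists, open_lists, successful):
    """Second overall measure, counting each company only once.

    top_lists:  monthly Top-n lists (collections of company identifiers)
    open_lists: monthly open-deals lists
    successful: set of companies with a positive outcome
    Returns the observed rate and the rate expected under random selection.
    """
    all_open = set().union(*open_lists)     # unique companies available
    all_top = set().union(*top_lists)       # unique companies ever in a Top n
    M_II = len(all_open & successful)
    m_II = len(all_top & successful)
    return m_II / len(all_top), M_II / len(all_open)


# Two hypothetical months with Top 2 lists.
s_II, s_II_rand = unique_company_success(
    top_lists=[["a", "b"], ["b", "c"]],
    open_lists=[["a", "b", "x", "y"], ["b", "c", "y", "z"]],
    successful={"a", "c", "y"},
)
```

Company b appears in both Top 2 lists but contributes only once to the denominator.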

While the ﬁrst index of overall performance assesses the average goodness of the ranking,

the second index measures only the number of companies correctly identiﬁed as successful

across the entire observation period. The two aggregation methods produce comparable results,

and achieve a substantial success rate of about 40% in the case of $\Delta t = \infty$. Moreover, in both cases, the success rate found in reality and the one expected by random chance are very different, and their discrepancy is always statistically significant, with p-values smaller than $10^{-5}$.

Figure S9: Observed and randomly expected success rates. The overall success rate $\tilde{S}_{\mathrm{II}}$ (blue bars) obtained through the second method of aggregation assessed against the overall success rate $\tilde{S}^{\mathrm{rand}}_{\mathrm{II}}$ expected by chance (black dots), and its standard deviation (black error bars), for various lengths of the list of top-ranked companies (i.e., Top 20, 50, and 100).

S5 Additional analysis

S5.1 Closeness centrality in successful vs non-successful start-ups

To better understand how closeness centrality is distributed among start-ups, in Fig S10 we compare the estimated frequency histograms of the rescaled closeness-centrality rankings of successful and non-successful start-ups. To obtain the rescaled ranking, in each calendar month we calculated the closeness centrality of each firm in the global network and ranked all firms in terms of their centrality, which gives an ‘absolute rank’ for each firm. We then extracted those firms which belong to the open-deal list, and re-ranked them accordingly (so that the firm with the top ranking acquires rank 0 in the open-deal ranking, the second acquires rank 1, and so on). The rescaled ranking is then defined as the ratio between the open-deal rank and the maximum absolute rank of open-deal companies in a given month. Thus, the firm with the highest position (i.e., zero ranking) maintained the same value (i.e., zero) in the rescaled ranking. By contrast, firms at lower positions were

Figure S10: Closeness centrality distributions. Histograms of closeness centrality rescaled

rankings (in bins of 0.005, see the text) for start-ups which will have a positive outcome (blue)

and for which no positive outcome occurs (orange). Successful start-ups have statistically lower

rescaled rankings (i.e. higher centralities) than non-successful ones. For every start-up, only

the value of closeness centrality collected in the last month of observation has been used.

assigned rescaled values approaching 1 as their ranking approached the highest value (i.e., the lowest position). Such a rescaling thus enables an appropriate comparison of firms characterised by different values of centrality, obtained in different networks and at different calendar times. In order to smooth out the data, the x axis has been binned (bin size of 0.005). We notice that the histograms are non-overlapping, and that there is a net overabundance of start-ups with a positive outcome (successful) closer to the top of the ranking. In other words, start-ups which are higher in the centrality rankings (i.e. with small values of closeness centrality rank) statistically have a higher chance of positive economic outcomes. This confirms that rankings based on closeness centrality are indeed informative of a start-up's long-term success and can therefore be used to inform recommendation.

S5.2 Different centrality measures are correlated

We have considered closeness centrality as our primary measure of network centrality. Closeness centrality is based on the lengths of shortest paths in the network. However, the structural centrality of a node in a network can be quantified by different network metrics, either global, such as closeness and betweenness, or local, such as degree centrality (32). The closeness centrality of a node (Eq. S1) characterises the overall distance between that node and the rest of the nodes in the network, such that the lower that overall distance, the higher this measure, and hence the more central the node is.

On the other hand, the betweenness centrality $b_i(t)$ of a node $i$ when the network is observed at a given time $t$ is given by

$$b_i(t) = \frac{1}{(N-1)(N-2)} \sum_{\substack{j=1 \\ j \neq i}}^{N} \; \sum_{\substack{k=1 \\ k \neq i,j}}^{N} \frac{n_{jk}(i;t)}{n_{jk}(t)}, \tag{S2}$$

where $n_{jk}(t)$ is the total number of shortest paths between nodes $j$ and $k$, whereas $n_{jk}(i;t)$ is the number of shortest paths between $j$ and $k$ that actually go through $i$. This measure was introduced by Freeman to quantify the fact that communication travels just along shortest paths, and so a node $i$ is more ‘central’ the more shortest paths among pairs of nodes in the network go through it.
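Eq. (S2) can be evaluated without enumerating all shortest paths by using Brandes' accumulation scheme, which is sketched below for an unweighted, undirected graph. This is an illustrative implementation under those assumptions, not the code used for the WWS network; the function name and the adjacency-dictionary representation are hypothetical.

```python
from collections import deque


def betweenness(adj):
    """Normalised betweenness of every node, following Eq. (S2).

    adj: dict mapping each node to the set of its neighbours.
    """
    n = len(adj)
    score = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma).
        dist = {s: 0}
        sigma = {v: 0 for v in adj}
        sigma[s] = 1
        preds = {v: [] for v in adj}
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for u in adj[v]:
                if u not in dist:
                    dist[u] = dist[v] + 1
                    queue.append(u)
                if dist[u] == dist[v] + 1:
                    sigma[u] += sigma[v]
                    preds[u].append(v)
        # Back-propagate pair dependencies, farthest nodes first.
        delta = {v: 0.0 for v in adj}
        for u in reversed(order):
            for v in preds[u]:
                delta[v] += sigma[v] / sigma[u] * (1 + delta[u])
            if u != s:
                score[u] += delta[u]
    # Summing over all sources counts ordered pairs (j, k), as in Eq. (S2).
    norm = (n - 1) * (n - 2)
    return {v: (score[v] / norm if norm > 0 else 0.0) for v in adj}


# Toy star graph (hypothetical): every shortest path between leaves crosses "hub".
star = {"hub": {"x", "y", "z"}, "x": {"hub"}, "y": {"hub"}, "z": {"hub"}}
b = betweenness(star)
```

In the star example the hub lies on every shortest path between the leaves, so it attains the maximum normalised value, while the leaves score zero.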

While both closeness and betweenness are measures of centrality based on shortest paths, one can also think of a node as being central if it acquires many edges over time, i.e. if it acquires intel from several other companies. To account for this we may use the (normalised) degree centrality $d_i(t)$, defined as

$$d_i(t) = \frac{k_i(t)}{k_{\max}(t)}, \tag{S3}$$

where $k_i(t)$ is the degree (number of links) of node $i$ and $k_{\max}(t)$ is the largest degree in the network at that particular time snapshot.

Consequently, the centrality of a start-up in the WWS network can be measured in many alternative ways. In this section we will show that the choice of using closeness centrality is not only supported by theoretical arguments based on employees' mobility and intel flows among companies, but is also a robust choice, as alternative measures produce similar results.

To validate robustness, for each start-up in the open-deal list across time we have computed additional centrality measures, namely degree and betweenness centrality (27), and assessed to what extent the three measures of centrality are correlated. More concretely, we consider all start-ups in the open-deal list for which (i) we have data on the three centralities over at least 3 of the 24 months forming the observation window, and for which (ii) closeness and betweenness centralities are defined. For each firm, we then compute the Pearson correlation coefficient between the monthly sequences of each pair of (rescaled) centrality measures. We do this for all firms and we then construct the frequency histogram of the Pearson correlation coefficients. Results are reported in Fig. S11. Interestingly, we find that the three measures are in general well correlated pairwise. We conclude that focusing on a particular global centrality measure, such as closeness, is a robust choice: other global structural indicators based on a different use of shortest paths and, to a lesser extent, local measures such as the degree are correlated with closeness in the WWS network analysed in this work. In the next subsection we round off this validation by exploring the results of our recommendation method using either betweenness or degree centrality as the key network indicator, and will show that the success rates of the recommendation method are similar in all three cases.
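The per-firm correlation step can be illustrated with a plain implementation of the Pearson coefficient applied to two monthly sequences of one firm. The function name and the example sequences are hypothetical; the handling of constant sequences (returning NaN) is an assumption, since the text does not say how such cases were treated.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return float("nan")   # undefined for a constant sequence
    return cov / (sx * sy)


# Hypothetical rescaled closeness and degree values of one firm over 4 months.
closeness_seq = [0.40, 0.35, 0.30, 0.22]
degree_seq = [0.50, 0.42, 0.41, 0.30]
r = pearson(closeness_seq, degree_seq)
```

Collecting `r` over all firms and pairs of measures would give the values that enter the histograms of Fig. S11.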

S5.3 Recommendation methods based on other centrality measures

To further complement the correlation analysis of the previous subsection, here we focus on recommendation methods based on centrality measures other than closeness. Results are summarised in Fig. S12 for averages over the entire period, and in Figs. S13 and S14 for the monthly analysis. In every case we find that the results are qualitatively similar whether we use closeness, betweenness or degree centrality, with success rates systematically larger than random expectations (and therefore larger than the actual performance of accelerators and investors focusing on early-stage start-ups).

Figure S11: Centrality measures are correlated. Frequency histograms (density versus correlation coefficient, ranging from −1 to 1) of the Pearson correlation coefficients between (a) rescaled closeness and rescaled betweenness centrality (panel "Correlations CB"), (b) rescaled betweenness and rescaled degree centrality (panel "Correlations BD"), and (c) rescaled closeness and rescaled degree centrality (panel "Correlations CD"). We find that centrality measures are systematically correlated with each other, so the choice of using closeness centrality as the centrality measure under analysis is robust.

Figure S12: Overall success rate of recommendation methods based on different centrality measures. Comparison of recommendation methods focusing on the Top 20, Top 50 and Top 100 rankings, for $\Delta t = 5, 6, 7$ and $\infty$ years, using different centrality measures. Closeness, degree and betweenness centralities perform similarly, and recommendations based on any of these measures are systematically superior to a random expectation model, with overall success rates that are systematically larger than in the null model.

Figure S13: Monthly success rate of recommendation methods based on betweenness centrality. Recommendation methods focusing on the Top 20 and Top 50 ranked start-ups in the open-deal list, for $\Delta t = 5, 6, 7$ and $\infty$ years, using betweenness centrality instead of closeness centrality. Results are qualitatively similar and the monthly success rate of recommendations based on betweenness is systematically superior to that of a random expectation model.

Figure S14: Monthly success rate of recommendation methods based on degree centrality. Recommendation methods focusing on the Top 20 and Top 50 ranked start-ups in the open-deal list, for $\Delta t = 5, 6, 7$ and $\infty$ years, using degree centrality instead of closeness centrality. Results are qualitatively similar and the monthly success rate of recommendations based on degree centrality is systematically superior to that of a random expectation model.

S5.4 The effect of fading links

The mobility of workers from one company to another creates an intel flow between companies. Our working hypothesis is that companies receiving employees increase their fitness by capitalising on the know-how the employees bring with them. Such microscopic dynamics are thus captured and modelled by the creation of new edges at the level of the network of start-ups. As a consequence, companies which are perceived at the micro scale as appealing opportunities by mobile employees will likely boost their connectivity and therefore will acquire

a more central position in the WWS network. An important underlying assumption is that, once a link is created, it will remain in the network indefinitely, so that the company that has received the intel keeps it and builds on it forever. Conversely, considering the possibility of removing links (or actually fading their strength) some time after their creation would be equivalent to assuming that companies can lose the know-how they have acquired, something which is less likely to occur. Accordingly, allowing links to fade or be removed with time in the construction of the time-varying WWS network should lead to recommendations on the positive economic outcome with much lower success rates than those obtained from a network where know-how is not artificially removed. To check this, we first built the WWS network (for each month) from January 1990 to December 1999. Then, from January 2000 onwards, for each month all connections older than 10 years are removed from the network. Closeness is then evaluated each month as described in the recommendation method. A similar analysis is also performed for 5-year fading instead of 10-year fading, with very similar results.
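The link-removal step of this experiment can be sketched as a filter on link creation dates. The edge representation (a dictionary from node pairs to creation months), the running month index and the function name are assumptions introduced for illustration.

```python
def prune_old_links(edges, t, max_age_months=120):
    """Links that exist at month t and are at most max_age_months old.

    edges: dict mapping an (i, j) node pair to the month the link was created.
    """
    return {
        pair: created
        for pair, created in edges.items()
        if created <= t and t - created <= max_age_months
    }


# Hypothetical links, with months counted from the start of the data set.
edges = {("a", "b"): 0, ("b", "c"): 100, ("c", "d"): 130}
snapshot_10y = prune_old_links(edges, t=125)                     # 10-year fading
snapshot_5y = prune_old_links(edges, t=125, max_age_months=60)   # 5-year fading
```

Centrality would then be recomputed on each pruned snapshot, exactly as for the network without fading.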

In Fig.S15 we compare the overall success rate for the 5-year fading case (red bars) to our

standard recommendation method based on a WWS network that does not allow link fading.

Results show that a recommendation method with fading links systematically fails. In fact

it works even worse than a random null model, in good logical agreement with our previous

discussion. For completeness, a comparison of the two methods is also considered for the

monthly success rates in Fig.S16. Results are consistent with those obtained for the overall

success rate.

All these results strengthen our working hypothesis that the intel ﬂow across start-ups is

well captured by node centrality in the WWS network.

Figure S15: Effects of fading links and removal of venture capital funds. Success rate of

recommendation methods focusing on the Top 20, Top 50 and Top 100 rankings, for $\Delta t = 5, 6, 7$

and $\infty$ years. The standard case of closeness centrality from the original network (blue

bars) is compared to closeness centrality in a case where the links of the WWS are allowed

to fade over time (red bars), and to closeness centrality in a situation where all venture capital

funds (VC) have been removed from the WWS network (green bars). Results from the null

model are plotted in black bars.

Figure S16: Effect of fading links on monthly success rates. Systematic comparison of the

monthly success rate of a recommendation based on the closeness centrality from the original

network and in a case where the links of the WWS are allowed to fade over time. In the latter

case success rates drop below the random null model.

Figure S17: Comparison of monthly success rate of recommendation method based on a

worldwide start-up network with and without venture capital funds. Monthly success

rate of the recommendation method focusing on the Top 20 and Top 50, for $\Delta t = 5, 6, 7$ and $\infty$

years, comparing the original case to the case where all venture capital funds have been removed

from the worldwide start-up network. Results are essentially identical to the ones obtained in

the original case, hence conﬁrming that the topological presence of venture capital funds is not

a confounding factor.

S5.5 Possible confounding factor 1: the effect of venture capital funds

A ﬁrst possible confounding factor is the presence of venture capital funds, i.e. the fact that the

presence of these nodes in the network might enhance the closeness centrality of start-ups. In

order to assess the role played by venture capital funds in the effective centrality of different start-ups, we have performed an experiment where we removed all venture capital funds from the worldwide start-up network, and subsequently recomputed closeness centrality values for each start-up in the open-deal list. Concretely, we extracted from CrunchBase.com a list of 101 companies that are labelled as venture capital firms (see Table S5 for details). Accordingly, in this experiment we create the WWS network but do not include those nodes in the network (nor any of the connections they bring with them). Closeness centrality is then evaluated

each month as described in the recommendation method. Results of overall success rate are

shown (green bars) in Fig.S15 while monthly success rates are compared in Fig.S17. The

success rate of the recommendation method based on this quantity is consistently similar to the

one found in the case where venture capital funds are not removed from the original network,

hence conﬁrming that the topological presence of venture capital funds is not a confounding

factor.
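Removing a set of nodes together with all their connections is a simple operation on an adjacency dictionary, sketched below. The function name, the graph representation and the toy node labels are hypothetical; in the experiment described above the excluded set would be the list of venture capital firms.

```python
def remove_nodes(adj, excluded):
    """Copy of the graph without the excluded nodes and their links.

    adj: dict mapping each node to the set of its neighbours.
    """
    excluded = set(excluded)
    return {
        v: {u for u in nbrs if u not in excluded}
        for v, nbrs in adj.items()
        if v not in excluded
    }


# Toy network (hypothetical): start-up s3 is connected only through a VC node.
network = {
    "s1": {"vc", "s2"},
    "s2": {"s1"},
    "vc": {"s1", "s3"},
    "s3": {"vc"},
}
without_vc = remove_nodes(network, {"vc"})
```

After the removal s3 becomes isolated, which is the kind of change in centrality that this robustness check is designed to detect.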

S5.6 Possible confounding factor 2: number of employees

A second possible confounding factor or hidden predictor is the start-up size (e.g., number of

employees). To assess this possibility, we have conducted a number of experiments. Initially,

we explored start-up size (number of employees) instead of topological network centrality as

the informative predictor, and built a recommendation method based on that metric. Results are

shown in the left panel of Fig.S18, conﬁrming that number of employees is not informative of

the start-up success likelihood.
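A size-based recommendation list of this kind can be sketched as follows. The snippet assumes that a recommendation list is simply the Top k start-ups by the chosen score, and that the random expectation is the overall fraction of successful start-ups in the open-deal list; both are illustrative assumptions, and the start-up names and values are made up.

```python
def top_k_success_rate(scores, succeeded, k):
    """Fraction of successful start-ups among the k highest-scoring ones."""
    ranked = sorted(scores, key=lambda name: (-scores[name], name))
    top = ranked[:k]
    return sum(succeeded[name] for name in top) / len(top)


def random_expectation(succeeded):
    """Expected success rate when the k start-ups are drawn uniformly at random."""
    return sum(succeeded.values()) / len(succeeded)


# Hypothetical open-deal list for one month: number of employees and final outcome.
employees = {"s1": 1, "s2": 12, "s3": 3, "s4": 40, "s5": 1, "s6": 7}
succeeded = {"s1": True, "s2": False, "s3": True,
             "s4": False, "s5": False, "s6": False}

rate = top_k_success_rate(employees, succeeded, k=2)
baseline = random_expectation(succeeded)
```

Replacing `employees` with closeness centrality values gives the corresponding centrality-based list, so the same two functions can be used to compare both predictors against the same baseline.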

Additionally, we have also checked the recommendation method (based on closeness centrality)

when only the subset of open-deal start-ups with a fixed number of employees is considered.

Since the most frequent size is a start-up with a single employee, we extract the subset of all

start-ups with only one employee. Monthly success rates of the recommendation method are

shown in the right panel of Fig.S18. These results confirm that start-up size is not a confounding

factor and that number of employees is indeed not an informative variable that determines

future success.

Figure S18: Accounting for start-up size. (Left panel) Monthly success rate of a hypothetical

recommendation method based only on start-up size (number of employees), focusing on the

Top 20 start-ups in the open-deal list. Results fluctuate widely above and below a random

expectation model, and the p-values indicate that the number of employees is not informative

of the start-up success likelihood. (Right panel) Monthly success rate of the recommendation

method based on closeness centrality, focusing on the Top 20 start-ups from a subset of the

open-deal list gathering only start-ups with a single employee. Results are qualitatively similar

to the ones obtained without conditioning on start-up size, and suggest that start-up size is not

a confounding factor.

S5.7 Possible confounding factor 3: start-up geographical location

A third possible confounding factor is the geographical location of each of the start-ups. To

account for this, we have replicated our analysis (originally performed at a worldwide scale)

in five geographically separated regions, by dividing open-deal start-ups into five subsets:

California, United Kingdom, New York, Texas and Israel. Results for the monthly success rate

of our recommendation method are plotted in Fig.S19. While results are more noisy than for

the worldwide setup, we can conﬁrm that for every case the recommendation method based on

closeness centrality is above the random expectation.

S6 From recommendation to prediction of start-up success:

supervised learning approaches

The recommendation method proposed in the main text is based on the working hypothesis that

start-ups with higher closeness centrality rankings are more likely to experience an economically

successful outcome in the future. We have provided a theoretical foundation for our research

hypothesis at the microscopic level, and then heuristically validated our recommendation lists on

a monthly basis, obtaining results that are signiﬁcantly better than those obtained by a random

expectation model.

However, strictly speaking, a recommendation method is not a true prediction method, as we

are not assigning each and every start-up in the open-deal list to either the successful or the

non-successful category. To bridge this gap, in this section we consider

different types of prediction models which can indeed truly “predict” the positive outcome of a

start-up, i.e. they can classify whether a given start-up will have a positive outcome or not.

All models are initially based on a sample including 5,305 ﬁrms. These are the ﬁrms that have

been in the open-deal list for at least one month, and can therefore be suggested as potential

investment opportunities. Each firm is then observed over a period of 24 months, or until

it experiences a “successful” event (positive economic outcome) if this event occurs before

the end of the 24-month period. Notice however that in the vast majority of cases firms were

observed for 24 months. Note also that in this experiment we aggregate all the firms in the

open-deal list in our database: not all of them are observed at the same time, e.g. one firm can

be observed for the 24 months starting in January 2000, another firm can be observed for the

24 months starting in June 2004, etc. In other words, month t = 1 for a given firm does not

necessarily match the actual date of month t = 1 for another firm: we are simply recording

the temporal evolution of different start-ups which appear in open-deal lists at different times in

the period ranging from 1990 to 2008.

Figure S19: Accounting for spatial location. Monthly success rate of the recommendation method

in five geographically separated regions: California, United Kingdom, New York (US), Texas

(US) and Israel. For the California ecosystem we considered the Top 100 ranking whereas for

the other four (smaller) ecosystems we only considered the Top 10. Results are noisier

but qualitatively similar to the ones obtained in the original case, hence confirming that spatial

location is not a confounding factor.

Then, for each of the 5,305 firms, we conclude that a firm has experienced a “successful” event

(a positive economic outcome) at month t if, within a time window of ∆t = 6, 7 or 8 years

since month t, one of the following events takes place: (i) the firm makes an acquisition; (ii) the

firm is acquired; or (iii) the firm undergoes an IPO. Accordingly, each firm receives a unique

class label (either successful with class label ‘1’, if at any time up to t + ∆t the start-up experiences

a successful event, or not-successful with class label ‘0’ otherwise).
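The labelling rule above can be written as a small function. This is a sketch under assumptions that are not stated in the text: months are represented as integer indices, the window is taken to be inclusive at both ends, and the function and argument names are hypothetical.

```python
def class_label(listing_month, event_months, delta_t_years):
    """Return 1 if a success event falls within the window, else 0.

    listing_month  -- integer index of month t
    event_months   -- integer month indices of acquisitions made, acquisition
                      of the firm, or IPO (any of the three counts as success)
    delta_t_years  -- window length in years (6, 7 or 8 in the text)
    """
    horizon = listing_month + 12 * delta_t_years
    return int(any(listing_month <= m <= horizon for m in event_months))


# A firm listed at month 0 that is acquired 80 months later.
label_7y = class_label(0, [80], delta_t_years=7)  # horizon is month 84
label_6y = class_label(0, [80], delta_t_years=6)  # horizon is month 72
```

The same firm can therefore receive different labels for different values of ∆t, which is why results are reported separately for each window length.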

Overall, this data set enables supervised learning (classiﬁcation), as it consists of a large number

of samples (the ﬁrms in the open-deal list), each of them described by a set of features (a vector

of centrality measures over the whole observation window), and each of them being labelled by

a class label.

We will use logistic regression as our supervised learning model. A logistic regression model

links the probability of success of a start-up to a linear combination of predictors. More pre-

cisely, a logistic regression model is traditionally given by:

$$\log \frac{p_i}{1 - p_i} = \beta^T X_i + \varepsilon_i \qquad \text{(S4)}$$

where $p_i$ is the probability of success of the i-th start-up, $X_i$ is the vector of predictors, $\beta^T$ is

the (transposed) vector of parameters, which are estimated when the logistic regression model

is fitted, and $\varepsilon_i$ are the errors, which are assumed to be independent, identically-distributed

Normal random variables. Rearranging terms, we have

$$p_i = \sigma(\beta^T X_i + \varepsilon_i),$$

where $\beta^T X_i + \varepsilon_i$ is a linear combination of predictors with an additive noise term and

$\sigma(x) = 1/[1 + \exp(-x)]$ is the so-called logistic function. In essence, the term $\beta^T X_i + \varepsilon_i$ is akin to a

linear regression on the predictors, and the logistic function maps it to a probability $p_i$ between

0 and 1, which is then thresholded to obtain a binary outcome: if $p_i < c$, the class 0 is assigned,

and for $p_i > c$ the class 1 is assigned, where the threshold $c$ is another parameter that can be

trained by the algorithm. Once the parameters are estimated, logistic regression can be used to

predict the probability of success of new start-ups.
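The prediction step can be illustrated with a few lines of code. This is only a sketch of the deterministic part of the model (the noise term is dropped), with made-up parameter values rather than fitted ones; the first entry of each predictor vector is set to 1 so that the first parameter acts as an intercept, which is an assumption of this example.

```python
import math


def logistic(x):
    """sigma(x) = 1 / (1 + exp(-x)), written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def predict_proba(beta, x):
    """Success probability for predictor vector x under parameters beta."""
    return logistic(sum(b * xi for b, xi in zip(beta, x)))


def classify(p, c=0.5):
    """Assign class 1 if the probability exceeds the threshold c, else class 0."""
    return 1 if p > c else 0


beta = [-1.0, 2.0]  # hypothetical intercept and closeness coefficient
p_low = predict_proba(beta, [1.0, 0.1])   # start-up with low rescaled closeness
p_high = predict_proba(beta, [1.0, 0.9])  # start-up with high rescaled closeness
```

Lowering the threshold `c` labels more start-ups as successful, which raises sensitivity at the cost of precision; this is the trade-off that the ROC curve makes explicit.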

In what follows we consider two scenarios. In the ﬁrst case, we deﬁne prediction models that do

not consider the time evolution of centrality measures for each ﬁrm and only use instantaneous

values of the ﬁrm’s closeness centrality: these models will be closer in spirit to the recommen-

dation method. In a second case, we enrich the predictor set by adding predictors summarising

the time evolution of the ﬁrm’s centrality over the observation period (to assess whether this

factor is informative) as well as similar quantities extracted from different centrality measures.

S6.1 Logistic regression: the unbalanced case.

Here we use the ROC (receiver operating characteristic) curve to assess the efficacy of this

binary classification algorithm and to choose the optimal threshold based on our tolerance for false

negatives and desire for true positives. We initially used only the last value of the (rescaled)

closeness centrality of a start-up over the observed period as the single predictor, in order to

match the conditions of our recommendation method, where only instantaneous information

is used. The estimation and prediction steps above were repeated 1,000 times, each time leaving out

10% of the data set (Monte Carlo cross-validation). Averaging over the prediction results, we

obtain the confusion matrix reported in Table S2 (left panel), together with the confusion matrix

expected for a random classifier operating on the same data set (right panel).

ACTUAL            predicted failure    predicted success
true failure            0.43                 0.34
true success            0.11                 0.13

RANDOM            predicted failure    predicted success
true failure            0.593                0.177
true success            0.177                0.053

Table S2: (Left) Confusion matrix for a logistic regression based on the last value of close-

ness and on the mean closeness over the entire period in the unbalanced case. Averages over

1,000 repetitions of the Monte Carlo cross-validation leaving out 10% of the data. (Right) Cor-

responding confusion matrix expected for a random classiﬁer in the same unbalanced case.
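The Monte Carlo cross-validation loop used to obtain the averaged confusion matrix can be sketched as follows. The snippet only shows the resampling and averaging logic; the model fitting step is left out, and the function names and the fixed random seed are choices made for this illustration.

```python
import random


def monte_carlo_splits(n_samples, n_repeats=1000, test_fraction=0.1, seed=0):
    """Yield (train_idx, test_idx) pairs from independent random splits."""
    rng = random.Random(seed)
    n_test = max(1, round(n_samples * test_fraction))
    indices = list(range(n_samples))
    for _ in range(n_repeats):
        shuffled = indices[:]
        rng.shuffle(shuffled)
        # The first n_test shuffled indices are held out, the rest are used for fitting.
        yield shuffled[n_test:], shuffled[:n_test]


def average_confusion(matrices):
    """Element-wise mean of 2x2 confusion matrices given as nested lists."""
    n = len(matrices)
    return [[sum(m[i][j] for m in matrices) / n for j in range(2)]
            for i in range(2)]


splits = list(monte_carlo_splits(n_samples=50, n_repeats=20))
mean_matrix = average_confusion([[[1, 0], [0, 1]], [[0, 1], [1, 0]]])
```

Unlike k-fold cross-validation, the held-out sets of different repetitions may overlap, so a given firm can be tested several times or not at all.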

Classical ways to assess the prediction performance include the evaluation of accuracy, de-

ﬁned as the total percentage of correctly identiﬁed samples, sensitivity, deﬁned as the percentage

of successful start-ups correctly predicted by the classiﬁer over the percentage of true success-

ful start-ups, and precision, deﬁned as the percentage of successful start-ups correctly predicted

by the classiﬁer over the total percentage of start-ups which are predicted as successful by the

classifier. The F1 score is the harmonic mean of precision and sensitivity. Depending on the

context, it might be desirable for a classiﬁer to have high sensitivity or precision, and when both

quantities are relevant then the F1 score is typically used to assess model selection. In our case

the sensitivity is the relevant quantity to look at if we want to maximise the detection of success-

ful start-ups, whereas the precision is important if we want to make sure that all the start-ups

classiﬁed as successful will be successful. In other words, the ﬁrst performance indicator can

be the one of relevance for an investment company with unlimited budget, while the precision

can be of interest to an investment company with limited budget.
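The four indicators defined above follow directly from the entries of a confusion matrix. As a quick check, the sketch below applies the definitions to the left panel of Table S2; the function name and the dictionary layout are choices made for this example.

```python
def classification_metrics(tn, fp, fn, tp):
    """Accuracy, sensitivity, precision and F1 score from confusion-matrix entries."""
    total = tn + fp + fn + tp
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)   # correctly predicted successes over true successes
    precision = tp / (tp + fp)     # correctly predicted successes over predicted successes
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {"accuracy": accuracy, "sensitivity": sensitivity,
            "precision": precision, "f1": f1}


# Left panel of Table S2 (rows: true class, columns: predicted class).
metrics = classification_metrics(tn=0.43, fp=0.34, fn=0.11, tp=0.13)
```

The resulting values agree with the "Unbalanced" row of Table S3 to within about 0.01; the small differences come from the rounding of the tabulated entries, which sum to 1.01 rather than 1.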

The values obtained for the different indicators are reported in Table S3. The F1 score

–which trades off sensitivity and precision– shows that the predictions on whether any start-

up in the open-deal list will have a positive outcome are systematically better than those of a

benchmark given by a random classifier. Note that the accuracy, by contrast, falls below that of the

random classifier: this is due to the fact that our two classes are unbalanced, which limits the

usefulness of this indicator. We will come back to this point in the next subsection.

We have also experimented by including additional features of the evolution over time of the

closeness centrality as predictors in the logistic regression model. Interestingly, our results

did not improve signiﬁcantly, suggesting that it is not necessary to use temporal evolution of

centrality measures, and thus conﬁrming the validity of the recommendation method. This

observation will be further explored in the next subsection.

                                        Accuracy   Sensitivity   Precision   F1 score
Unbalanced                                0.56        0.54          0.28       0.37
Unbalanced (random classifier)            0.65        0.31          0.31       0.31
Balanced (single predictor)               0.58        0.61          0.57       0.59
Balanced (with temporal information)      0.59        0.62          0.58       0.60
Balanced (random classifier)              0.5         0.5           0.5        0.5

Table S3: Summary of the performance indicators obtained for a logistic regression model to

predict the success of start-ups in the open-deal list based on the last and on the mean value of

closeness over time. Both unbalanced and balanced cases are considered.

S6.2 Logistic regression: the balanced case

It is well known that many binary classiﬁcation algorithms suffer if the two classes are un-

balanced, i.e. if the number of samples in each class is not similar. A classiﬁer would then

systematically try to ﬁt the over-represented class and, as an outcome, the classiﬁcation would

be biased. Consider, e.g., the extreme case where the classiﬁer assigns each sample to the over-

represented class. In this extreme situation, the classiﬁer would not be predicting anything, but

the classification accuracy would still be very high due to class imbalance. For this reason

most classiﬁers do not perform well for unbalanced classes, and in unbalanced classiﬁcation,

accuracy can be a misleading metric. This is indeed our case, as in our data set the majority

of start-ups do not end up being successful. Here, we show that, when we correct for class

imbalance, the prediction performance substantially improves. In order to solve the issue

of unbalanced classes, we downsample the over-represented class, so that the successful/non-successful

classification problem now has perfectly balanced (50%-50%) classes.
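Downsampling the over-represented class can be sketched as follows. The text does not specify how the retained samples are chosen, so uniform random sampling without replacement is assumed here, and the 23/77 class split is a made-up example.

```python
import random


def downsample_majority(labels, seed=0):
    """Return sorted indices of a class-balanced subset of a 0/1 label list."""
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    # Keep every minority sample and an equally sized random draw from the majority.
    kept = rng.sample(majority, len(minority))
    return sorted(minority + kept)


labels = [1] * 23 + [0] * 77          # hypothetical unbalanced data set
balanced_idx = downsample_majority(labels)
```

The price of balancing in this way is that part of the over-represented class is discarded, so the balanced models are fitted on fewer firms than the unbalanced ones.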

Throughout this section we use 5-fold cross-validation. First, we consider the case where

we only use the value of the closeness centrality of each start-up in the last month of our

observation window, this being closer in spirit to the analysis performed in the main part of the

manuscript. Again, the descriptor used is the rescaled closeness centrality ranking. The performance

indicators of this logistic regression model are reported in Table S3, while the confusion

matrix is shown in Table S4. Results confirm that prediction is indeed possible, and the performance

indicators are consistently above the random benchmarks.

ACTUAL            predicted failure    predicted success
true failure            0.275                0.227
true success            0.193                0.305

RANDOM            predicted failure    predicted success
true failure            0.25                 0.25
true success            0.25                 0.25

Table S4: (Left) Confusion matrix for a logistic regression with a single predictor in the balanced

case, using 5-fold cross-validation. (Right) Equivalent confusion matrix expected for a

random classiﬁer in the same balanced case.
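The random-classifier panels of Tables S2 and S4 are consistent with a simple model, written out below as a worked derivation. The assumption, which is not stated explicitly in the text, is that the random classifier predicts "success" independently of the true class, with probability q equal to the fraction of truly successful start-ups.

```latex
% Assumption: the prediction is independent of the true class, and
% P(predicted success) = P(true success) = q.
\begin{aligned}
P(\text{true failure},\ \text{predicted failure}) &= (1-q)^2, \\
P(\text{true failure},\ \text{predicted success}) &= (1-q)\,q, \\
P(\text{true success},\ \text{predicted failure}) &= q\,(1-q), \\
P(\text{true success},\ \text{predicted success}) &= q^2 .
\end{aligned}
% Balanced case, q = 1/2: every entry equals 0.25 (right panel of Table S4).
% Unbalanced case, q \approx 0.23 (a value inferred here from the table):
% (1-q)^2 \approx 0.593, \quad q(1-q) \approx 0.177, \quad q^2 \approx 0.053
% (right panel of Table S2).
```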

We have also considered a second logistic regression model with predictors including various

statistics of the temporal sequence of closeness centralities in the observation window. We have

used the following 9 predictors based on closeness centrality: the maximum value, minimum

value, slope of a linear interpolation and last value of both the ranking and the rescaled

ranking, together with the number of months in the observation window. The model provides an accuracy

of 0.59, sensitivity 0.62 and precision 0.58, indicating that temporal information leads to only a

marginal improvement over the previous case.
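The nine temporal predictors can be computed from the two monthly series of a firm as sketched below. The "slope of a linear interpolation" is interpreted here as the slope of a least-squares line fitted against the month index, which is an assumption; the feature names and the toy series are also made up for the example.

```python
def least_squares_slope(values):
    """Slope of the least-squares line through (0, v0), (1, v1), ...; 0.0 for one point."""
    n = len(values)
    if n < 2:
        return 0.0
    mean_t = (n - 1) / 2
    mean_v = sum(values) / n
    num = sum((t - mean_t) * (v - mean_v) for t, v in enumerate(values))
    den = sum((t - mean_t) ** 2 for t in range(n))
    return num / den


def temporal_predictors(ranking, rescaled_ranking):
    """Max, min, slope and last value of both series, plus the number of months."""
    feats = {}
    for name, series in (("rank", ranking), ("rescaled", rescaled_ranking)):
        feats[f"{name}_max"] = max(series)
        feats[f"{name}_min"] = min(series)
        feats[f"{name}_slope"] = least_squares_slope(series)
        feats[f"{name}_last"] = series[-1]
    feats["n_months"] = len(ranking)
    return feats


# Hypothetical firm observed for four months while climbing the ranking.
features = temporal_predictors([40, 30, 20, 10], [0.4, 0.3, 0.2, 0.1])
```

The single-predictor model of the previous paragraph corresponds to keeping only the last rescaled value from this dictionary.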

Finally, we have investigated other logistic regression models by further adding predictors re-

lated to other centrality measures. We ﬁnd that the performance is not boosted, in agreement

with the fact that in our case most of the other centrality measures tend to be correlated to the

closeness, according to Fig.S11.

3i group advanced technology ventures accel partners

andreessen horowitz atlas venture atomico

august capital austin ventures avalon ventures

azure capital partners bain capital ventures balderton capital

battery ventures benchmark bessemer venture partners

binary venture partners canvas venture fund carmel ventures

charles river ventures clearstone venture partners columbus nova

costanoa venture capital crosslink capital crunchfund

data collective digital sky technologies fo draper ﬁsher jurvetson

elevate ventures ff venture capital ﬁdelity ventures

ﬁrstmark capital ﬁrst round capital ﬂybridge capital

foundation capital founders fund general catalyst partners

genesis partners golden gate capital ggv capital

google ventures granite ventures greylock partners israel

harris harris group highland capital partners idg ventures europe

idg ventures india idg ventures vietnam initial capital 2

in q tel index ventures innovacom

insight venture partners intel capital intellectual ventures

institutional venture partners inventus capital partners jerusalem venture partners

jmi equity kapor capital kleiner perkins cauﬁeld byers

khosla ventures lightspeed venture partners lux capital

matrix partners maveron mayﬁeld fund

menlo ventures meritech capital partners merus capital

morgenthaler ventures new enterprise associates norwest venture partners

oak investment partners oregon angel fund openview venture partners

polaris partners radius ventures redpoint ventures

revolution capital partners rho ventures ﬁnisterre capital

rre ventures rothenberg ventures sante ventures

scale venture partners scottish investment bank scottish equity partners

sequoia capital seventure partners sevin rosen funds

the social capital partnership soﬁnnova partners spark capital

tenaya capital third rock ventures tribeca global investments

union square ventures us venture partners vantagepoint capital partners

venrock wellington partners

Table S5: List of 101 venture capital funds extracted from crunchbase.com. Also available

at https://en.wikipedia.org/wiki/List_of_venture_capital_firms.
