Nassim Nicholas Taleb’s research while affiliated with American University of Beirut and other places


Publications (110)


Informational Rescaling of PCA Maps with Application to Genetic Distance
  • Article

December 2024 · 38 Reads · 3 Citations

Computational and Structural Biotechnology Journal

Nassim Nicholas Taleb · Khaled Elbassioni · [...]

Principal Component Analysis (PCA) is a powerful multivariate tool allowing the projection of data in low-dimensional representations. Nevertheless, datapoint distances on these low-dimensional projections are challenging to interpret. Here, we propose a computationally simple heuristic to transform a map based on standard PCA (when the variables are asymptotically Gaussian) into an entropy-based map where distances are based on mutual information (MI). Moreover, we show that in certain instances our proposed scaled PCA can improve cluster identification. Rescaling principal component-based distances using MI results in a representation of relative statistical associations when, as in genetics, it is applied on bit measurements between individuals' genomic mutual information. This entropy-rescaled PCA, while preserving order relationships (along a dimension), quantifies relative distances into information units, such as “bits”. We illustrate the effect of this rescaling using genomics data derived from world populations and describe how the interpretation of results is impacted.


PCA computed and displayed for the full list of populations listed in Supplementary Tables S1 and S2. (a) Principal component analysis, PCs 1 and 2. (b) Principal component analysis, PCs 3 and 4. (c) Rescaled principal component analysis, PCs 1 and 2. (d) Rescaled principal component analysis, PCs 3 and 4.
ADMIXTURE plot for admixtures computed on Koura and surrounding populations listed in Supplementary Tables S1 and S2.
$F_4$ Forest Plots. The $F_4$ plots display the overlap of the differences between Mbuti with population X compared to the differences between Ancient Lebanon (a), Ancient Anatolia (b), Ancient Greece and Koura (c).
qpGraph topologies. Testing ancient Greek and Anatolian contribution to modern Koura Greek Orthodox. (a) Topology without Greek admixture, z = −31.18. (b) Topology with Greek admixture, z = −41.41.
Anatolian genetic ancestry in North Lebanese populations
  • Article
  • Full-text available

July 2024 · 138 Reads

Lebanon’s rich history as a cultural crossroad spanning millennia has significantly impacted the genetic composition of its population through successive waves of migration and conquests from surrounding regions. Within modern-day Lebanon, the Koura district stands out with its unique cultural foundations, primarily characterized by a notably high concentration of Greek Orthodox Christians compared to the rest of the country. This study investigates whether the prevalence of Greek Orthodoxy in Koura can be attributed to modern Greek heritage or continuous blending resulting from the ongoing influx of refugees and trade interactions with Greece and Anatolia. We analyzed both ancient and modern DNA data from various populations in the region which could have played a role in shaping the current population of Koura using our own and published data. Our findings indicate that the genetic influence stemming directly from modern Greek immigration into the area appears to be limited. While the historical presence of Greek colonies has left its mark on the region’s past, the distinctive character of Koura seems to have been primarily shaped by cultural and political factors, displaying a stronger genetic connection mostly with Anatolia, with affinity to ancient but not modern Greeks.


Figure 4: The expected fractions of infections $p_1$ and $p_2$ in each arm of a study are shown. If in each arm a fraction $c$ were infected due to exposures outside of the study microenvironment, then the true risk ratio is given by $r = (p_2 - c)/(p_1 - c)$. See Appendix B for more details.
Figure 5: The value of the risk ratio generating the observed data ($p_2/p_1$) is shown for various potential values of the actual risk ratio $r$ and the fraction of unmasked exposures $b$. Note that as the fraction of unmasked exposures increases, the observed risk ratio will approach 1, regardless of actual efficacy. Thus, if the fraction of risk from unprotected exposure is relatively high, a small uncertainty in the observed risk ratio corresponds to a much larger uncertainty in the actual risk ratio. For small or large risk ratios (corresponding to masks having a large effect on transmission), changing the fraction of unmasked infections has a substantial effect, as compared with risk ratios close to 1 where there is very little difference. This sensitivity analysis shows that such studies underestimate the effectiveness of high quality masks, systematically shifting the results toward a conclusion of ineffective masking.
Figure 7: Lognormal distributions obtained from Cochrane review data (see Appendix G) comparing (A) masks and no masks indicate that the likelihood ratios for benefit from masks are 2.9 to 1, 0.9 to 1, and 9.7 to 1 (inset shows benefit as unshaded and detriment as shaded); two out of three indicate benefit from masking by a high probability ratio; (B1) N95 respirators compared with non-respirator masks also show high likelihood ratios for the benefit of N95s in 2 out of 3 cases: 15.2, 21.4 and 0.2. The numbers of studies included in (B1) are 3, 5, and 5 for the end points clinical respiratory illness, influenza-like illness, and laboratory-confirmed influenza, respectively. When one of the 5 studies for the last two end points is omitted (B2), the ratios are 9.6 and 0.6. Details of the meta-analyses (B1.1-B1.3) and (B2.1-B2.3) show study and meta-analysis lognormal distributions and the relative probability of benefit. One of the studies, Radonovich 2019 (ref. [10]), shown in light blue, dominates the meta-analyses but has protocols to put on masks or N95s only within 6 ft of patients, despite such protocols being counter to the principles of airborne transmission.
Figure 8: A schematic diagram for possible values of the infection probabilities in the control and intervention groups $p_1$, $p_2$, $c$, $s_1 \equiv \frac{1}{N}\sum_{j=1}^{N} s_j$, and $s_2 \equiv \frac{1}{N}\sum_{j=1}^{N} r s_j = r s_1$.
Quantitative errors in the Cochrane review on "Physical interventions to interrupt or reduce the spread of respiratory viruses"

October 2023 · 241 Reads · 1 Citation

The COVID-19 pandemic has brought a heightened sense of urgency in the scientific community regarding the need to advance understanding and prevention of pathogen transmission, particularly concerning infectious airborne particles and the utility of various preventive strategies in reducing the risk of infection. There are extensive studies validating scientific understanding about the behavior of larger (droplets) and smaller (aerosols) particles in disease transmission and the dosimetry of particles in the respiratory tract. Similarly, modalities for respiratory protection against particles in the size range spanned by infectious particles, such as N95 respirators, are available and known to be efficacious with tested standards for harm reduction across environments including physical, chemical and biological hazards. Even though multiple studies also confirm their protective effect when adopted in healthcare and public settings for infection prevention, overall, studies of protocols of their adoption over the last several decades in both clinical trials and observational studies have not provided as clear an understanding. Here we demonstrate that these studies are strongly biased towards the null by infections resulting from transmission outside of the investigated environments and study participants. Such study limitations are frequently mis-stated as not influencing the conclusions of research on respiratory protection. The reason for the failure to properly analyze the studies is that the standard analytical equations used do not correctly represent the random variables that play a role in the study results. By correcting the mathematical representation and the equations that result from them, we demonstrate that conclusions drawn from these studies are strongly biased and much more uncertain than is acknowledged, providing almost no useful information.
Even with all these limitations, we show that existing results, when outcome measures are properly analyzed, consistently point to the benefit of precautionary measures such as N95 respirators over medical masks, and masking over its absence. We also show that correcting manifest errors of widely reported meta-analyses also leads to statistically significant estimates. Our results have implications for the design of studies and analyses on the effectiveness of respiratory protection and on using existing evidence for policy guidelines for infection control.
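
The bias toward the null described in this abstract can be illustrated with the relation quoted in the Figure 4 caption, r = (p2 − c)/(p1 − c). The following is a minimal sketch, not the authors' analysis code: the function names are hypothetical, arm 1 is assumed to be the control arm, and the same outside-infection fraction c is assumed in both arms.

```python
def observed_risk_ratio(p1, r, c):
    """Observed ratio p2/p1 when a fraction c of each arm is infected outside the study.

    p1: infection fraction in arm 1 (assumed here to be the control arm)
    r:  true risk ratio inside the studied setting
    c:  fraction infected by outside exposures (assumed equal in both arms)
    """
    if not 0.0 <= c < p1 <= 1.0:
        raise ValueError("need 0 <= c < p1 <= 1")
    if r < 0.0:
        raise ValueError("risk ratio must be non-negative")
    p2 = c + r * (p1 - c)  # rearranged from r = (p2 - c) / (p1 - c)
    return p2 / p1


def true_risk_ratio(p1, p2, c):
    """Invert the relation: recover r from the two observed fractions and c."""
    if not (0.0 <= c < p1 and c <= p2):
        raise ValueError("need 0 <= c < p1 and c <= p2")
    return (p2 - c) / (p1 - c)


if __name__ == "__main__":
    # Illustrative numbers only: a true risk ratio of 0.5 looks weaker as c grows.
    for c in (0.0, 0.05, 0.08):
        print(c, observed_risk_ratio(p1=0.10, r=0.5, c=c))
```

With these illustrative inputs, the observed ratio moves toward 1 as c approaches p1, which is the qualitative effect the abstract describes.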


Informational Rescaling of PCA Maps with Application to Genetic Distance

March 2023 · 120 Reads

We discuss the inadequacy of covariances/correlations and other measures in L2 as relative distance metrics under some conditions. We propose a computationally simple heuristic to transform a map based on standard principal component analysis (PCA) (when the variables are asymptotically Gaussian) into an entropy-based map where distances are based on mutual information (MI). Rescaling Principal Component based distances using MI allows a representation of relative statistical associations when, as in genetics, it is applied on bit measurements between individuals' genomic mutual information. This entropy rescaled PCA, while preserving order relationships (along a dimension), changes the relative distances to make them linear to information. We show the effect on the entire world population and some subsamples, which leads to significant differences with the results of current research.
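
To make the point about correlation as a distance metric concrete, the sketch below uses the standard closed form for the mutual information of a bivariate Gaussian pair, I = −½ log2(1 − ρ²) bits. This is a textbook identity, not necessarily the exact rescaling procedure used in the paper, and the function name is hypothetical.

```python
import math


def gaussian_mutual_information_bits(rho):
    """Mutual information (in bits) of a bivariate Gaussian with correlation rho."""
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    return -0.5 * math.log2(1.0 - rho * rho)


if __name__ == "__main__":
    # Equal steps in correlation are not equal steps in information.
    for rho in (0.1, 0.5, 0.9, 0.99):
        print(rho, gaussian_mutual_information_bits(rho))
```

Because the mapping is strongly nonlinear near |ρ| = 1, doubling a correlation does not double the information shared, which is one way to read the abstract's claim that L2-based measures can mislead as relative distances.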


Working with Convex Responses: Antifragility from Finance to Oncology

February 2023 · 267 Reads · 18 Citations

We extend techniques and learnings about the stochastic properties of nonlinear responses from finance to medicine, particularly oncology, where it can inform dosing and intervention. We define antifragility. We propose uses of risk analysis for medical problems, through the properties of nonlinear responses (convex or concave). We (1) link the convexity/concavity of the dose-response function to the statistical properties of the results; (2) define “antifragility” as a mathematical property for local beneficial convex responses and the generalization of “fragility” as its opposite, locally concave in the tails of the statistical distribution; (3) propose mathematically tractable relations between dosage, severity of conditions, and iatrogenics. In short, we propose a framework to integrate the necessary consequences of nonlinearities in evidence-based oncology and more general clinical risk management.
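
The link between convexity of the dose-response function and the statistics of outcomes is an instance of Jensen's inequality. The sketch below is illustrative only: the quadratic and square-root response functions are hypothetical stand-ins, not the dose-response curves used in the paper.

```python
import math


def mean_response(doses, response):
    """Average response over a schedule of doses."""
    return sum(response(d) for d in doses) / len(doses)


def convex_response(dose):
    """Hypothetical convex dose-response (second derivative > 0)."""
    return dose ** 2


def concave_response(dose):
    """Hypothetical concave dose-response (second derivative < 0)."""
    return math.sqrt(dose)


constant_schedule = [10.0, 10.0]  # same dose every time
variable_schedule = [5.0, 15.0]   # same average dose, more variance

if __name__ == "__main__":
    for name, f in (("convex", convex_response), ("concave", concave_response)):
        print(name,
              mean_response(constant_schedule, f),
              mean_response(variable_schedule, f))
```

Under a convex response the variable schedule yields the higher average response at the same mean dose; under a concave response the ordering reverses.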



The Probability Conflation: A Reply

January 2023 · 320 Reads

We respond to Tetlock et al. (2022) showing 1) how expert judgment fails to reflect tail risk, 2) the lack of compatibility between forecasting tournaments and tail risk assessment methods (such as extreme value theory). More importantly, we communicate a new result showing a greater gap between the properties of tail expectation and those of the corresponding probability.


Fig. 1. Fragility below level K as indicative of survival. It is not quite symmetric because global antifragility is conditioned on tail robustness ("to do well one must first survive"). The Taleb and Douady (2013) [68] paper shows that the gap between $\int_{-\infty}^{K} f(x, \sigma)\,dx$ and $\int_{-\infty}^{K} f(x, \sigma + \Delta)\,dx$, where $\sigma$ is the scale of the distribution, is proportional to the concavity of $f(x)$. Hence without knowing the distribution (PDF above), one can gauge such effect by looking at the nonlinearity of $f(\cdot)$ below the threshold K.
Fig. 6. The second-derivative is an approximation for fragility for low values of $h$. (A) Hill function, $H(x)$ (eqn. 14) shown for $n = 10$, $E_0 = 0$, $E_1 = 100$, $C = 10$. Analytically derived second-derivative (eqn. 15) is shown in the bottom panel. (B) Difference between fragility and second-derivative at various dose values (red to blue) corresponding to panel A. As $h \to 0$, the error approaches zero: $F(x, h) - h^2 \frac{d^2 H}{dx^2} \to 0$.
Fig. 7. (A) How a fractional intervention is more effective to surpass a threshold than a constant dosage of the same average. This is akin to stochastic resonance (in physics) by which the presence of noise causes the signal to rise above the detection threshold. For instance, genetically modified BT crops produce a constant level of pesticide, which appears to be much less effective than occasional manual interventions to add doses to conventional plants. The same may apply to antibiotics, chemotherapy and radiation therapy. (B) How more variance impacts the exceedance over the threshold. If threshold ≥ mean, we have convexity and the variance increases the payoff more than variations in the mean. Such an effect is proportional to the remoteness of such threshold. Note that the harm function is defined as positive.
Fig. 12. Relationship between convexity and mixed, heterogeneous populations (A) Dose response shown for sensitive (green) and resistant (red) cell lines. When mixed, dose response is a weighted average of each (eqn. 20; black) (B) Fragility shown for sensitive (green) and resistant (red) cell lines. When mixed, fragility (black) switches from locally convex to locally concave multiple times.
Fig. 14. Unseen risks and mild gains: translation of Fig. 13 into a probabilistic representation, showing the skewness of a decision involving iatrogenics when the condition is mild. This also gives the intuition of the Taleb and Douady [68] translation theorems from concavity for S(x) into probabilistic attributes.
Working With Convex Responses: Antifragility From Finance to Oncology

September 2022 · 295 Reads

We extend techniques and learnings about the stochastic properties of nonlinear responses from finance to medicine, particularly oncology where it can inform dosing and intervention. We define antifragility. We propose uses of risk analysis to medical problems, through the properties of nonlinear responses (convex or concave). We 1) link the convexity/concavity of the dose-response function to the statistical properties of the results; 2) define "antifragility" as a mathematical property for local beneficial convex responses and the generalization of "fragility" as its opposite, locally concave in the tails of the statistical distribution; 3) propose mathematically tractable relations between dosage, severity of conditions, and iatrogenics. In short we propose a framework to integrate the necessary consequences of nonlinearities in evidence-based oncology and more general clinical risk management.



FIG. 3. Left: A representative function for a susceptible individual's probability of infection $p$ as a function of viral dose $v$ for a single exposure event, together with the effective exposure $\tilde{v} \equiv f(v) \equiv -\ln(1 - p(v))$. $f(v)$ is convex for all $v$, while $p(v)$ is convex for sufficiently small $v$. The convexity of $f(v)$ (which is demonstrated in the Appendix) yields an S-curve for $p(v)$. Note that for any particular viral dose $v$, the effective exposure $\tilde{v} = f(v)$ can vary from individual to individual. Right: A depiction of how the total effective exposure $\tilde{v}_T$ and the probability of eventually becoming infected scale with the number of exposure events. The total effective exposure is the sum of the effective exposures from each exposure event; see Appendix for details.
Unmasking the mask studies: Why the effectiveness of surgical masks in preventing respiratory infections has been underestimated

September 2021 · 731 Reads · 12 Citations

Journal of Travel Medicine

Background: Pre-pandemic empirical studies have produced mixed statistical results on the effectiveness of masks against respiratory viruses, leading to confusion that may have contributed to organizations such as the WHO and CDC initially not recommending that the general public wear masks during the COVID-19 pandemic. Methods: A threshold-based dose–response curve framework is used to analyse the effects of interventions on infection probabilities for both single and repeated exposure events. Empirical studies on mask effectiveness are evaluated with a statistical power analysis that includes the effect of adherence to mask usage protocols. Results: When the adherence to mask-usage guidelines is taken into account, the empirical evidence indicates that masks prevent disease transmission: all studies we analysed that did not find surgical masks to be effective were under-powered to such an extent that even if masks were 100% effective, the studies in question would still have been unlikely to find a statistically significant effect. We also provide a framework for understanding the effect of masks on the probability of infection for single and repeated exposures. The framework demonstrates that masks can have a disproportionately large protective effect and that more frequently wearing a mask provides super-linearly compounding protection. Conclusions: This work shows (1) that both theoretical and empirical evidence is consistent with masks protecting against respiratory infections and (2) that nonlinear effects and statistical considerations regarding the percentage of exposures for which masks are worn must be taken into account when designing empirical studies and interpreting their results.
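
The treatment of repeated exposures can be sketched from the definition given in the FIG. 3 caption, where the effective exposure is −ln(1 − p) and the total effective exposure is the sum over exposure events. The code below is a minimal illustration, not the paper's model: the function names are hypothetical, and it assumes that the probability of eventual infection is recovered from the total effective exposure by inverting the same relation.

```python
import math


def effective_exposure(p):
    """Effective exposure -ln(1 - p) for a single-event infection probability p."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must satisfy 0 <= p < 1")
    return -math.log1p(-p)


def infection_probability(total_effective_exposure):
    """Invert the definition: p = 1 - exp(-total effective exposure)."""
    return -math.expm1(-total_effective_exposure)


def cumulative_infection_probability(per_event_probabilities):
    """Probability of eventual infection after several exposure events.

    Assumption: effective exposures add across events, as stated in the caption.
    """
    total = sum(effective_exposure(p) for p in per_event_probabilities)
    return infection_probability(total)


if __name__ == "__main__":
    # Illustrative values: twenty exposures at 10% versus twenty at 2% per event.
    print(cumulative_infection_probability([0.10] * 20))
    print(cumulative_infection_probability([0.02] * 20))
```

Summing effective exposures is equivalent to multiplying the per-event probabilities of escaping infection, so a reduction in per-event risk compounds over many exposures.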


Citations (61)


... We also performed a rescaled form of PCA based on bit measurements between individuals' genomic mutual information using the methods previously described by our group 15,37 (Supplementary Material). ...

Reference:

Human migration from the Levant and Arabia into Yemen since Last Glacial Maximum
Informational Rescaling of PCA Maps with Application to Genetic Distance
  • Citing Article
  • December 2024

Computational and Structural Biotechnology Journal

... function g(x) is defined as convex if, for any two points a and b, the function value at the average of a and b is less than or equal to the average of the function values at a and b [18] ...

Working with Convex Responses: Antifragility from Finance to Oncology

... Quite the opposite, they are accountable as a profession for not having any code of conduct until very recently. They are also accountable for not keeping an open eye on the discussions on ethics that other social disciplines or other disciplines in general have been developing for many years or centuries now (Daly, 2014;Heilig and Weijer, 2005;Reynolds, 2009;Sindzingre, 2019). Moreover, they cannot use metaphors from biology, medicine, and anthropology when they try to make their economic theories or advice acceptable to the public without following the ethics of the discipline, the expertise and the reputation of which they use to "market" their economics. ...

The Oxford Handbook of Professional Economic Ethics
  • Citing Article
  • March 2014

... Taleb's ideas have been further explored in different contexts not only in risk analysis and financial systems [2][3][4], but also as a means for strategic design and planning [5][6][7]. The original formal definition [8] (see also [9,10]), however, is difficult to use in practice and with real data. Therefore, we use in this paper an alternative definition of antifragility that can be implemented easily for a quantitative analysis and is general enough to be applied in a broad range of domains. ...

A New Heuristic Measure of Fragility and Tail Risks: Application to Stress Testing
  • Citing Article
  • January 2012

SSRN Electronic Journal

... In randomized controlled trials investigating the effectiveness of interventions, the effectiveness of masks and respirators has been found to be weak (Jefferson et al., 2023). This is for the most part due to poor adherence, i.e., relatively low numbers of people followed the guidance about wearing masks in these studies (Kollepara et al., 2021). ...

Unmasking the mask studies: Why the effectiveness of surgical masks in preventing respiratory infections has been underestimated

Journal of Travel Medicine

... al., (2022) provides an interesting study of factors determining the adaptation of cryptocurrencies in 137 countries. Stimulating discussions on advantages and disadvantages of cryptocurrencies are offered by Taleb (2021) and Lipton (2021). For a detailed review of the literature on cryptocurrencies see Bariviera and Merediz-Solà (2021). ...

Bitcoin, currencies, and fragility
  • Citing Article
  • August 2021

... Pandemics are extremely fat-tailed events, with potentially destructive tail risk. Any model ignoring this is necessarily flawed (Taleb et al., 2022). The pandemic's effects on financial markets (see e.g., Husnain et al., 2024;De Crescenzio & Lepers, 2024;Alba et al. 2023;Wu et al., 2022;Sugandi, 2022;Chang et al., 2021;Seven & Yılmaz, 2021;So et al., 2021), investors' sentiments and investment decisions have been extensively studied (Beloskar & Rao, 2023;Jin & Zhang, 2023;Murashima, 2023). ...

On single point forecasts for fat-tailed variables
  • Citing Article
  • October 2020

International Journal of Forecasting

... This was clearly visible during the COVID-19 pandemic when societies worldwide relied on scientific information about the nature and impact of the virus. An example of this role of science journalism was the widespread use of complex epidemiological models in news messages and on social media (Siegenfeld et al., 2020). Traditional media such as television, radio, and newspapers remained an important source of scientific information for many people during the pandemic (Metcalfe et al., 2020). ...

Opinion: What models can and cannot tell us about COVID-19

Proceedings of the National Academy of Sciences

... Thus, predictions of stochastic events might be better captured by estimating the cumulative probability of the event using extreme value theory. 45 Extreme value theory can likewise be useful for forecasting future pandemics. For example, extreme value theory has estimated that the annual probability of another pandemic of at least the magnitude of the COVID-19 pandemic is 2-3%. ...

Tail risk of contagious diseases
  • Citing Article
  • May 2020

Nature Physics

... In addition, reduction of optimizer's curse with more informative prior selection was predicted by Smith and Winkler (2006). Taleb (2020) compared differences in forecasting accuracy of a binary event and actual returns, claiming that the models that generate accurate forecasts do not necessarily generate high returns. Our simulation study, a similar phenomenon was observed with a continuous target variable (stock returns). ...

On the statistical differences between binary forecasts and real-world payoffs
  • Citing Article
  • April 2020

International Journal of Forecasting