Gael Varoquaux

National Institute for Research in Computer Science and Control | INRIA · Parietal

PhD

About

314
Publications
192,340
Reads
113,581
Citations
Additional affiliations
September 2005 - September 2008
French National Centre for Scientific Research
Position
  • PhD Student


Publications (314)
Article
Full-text available
Causal inference enables machine learning methods to estimate treatment effects of medical interventions from electronic health records (EHRs). The prevalence of such observational data and the difficulty for randomized controlled trials (RCT) to cover all population/treatment relationships make these methods increasingly attractive for studying ca...
Preprint
Full-text available
The first International AI Safety Report comprehensively synthesizes the current evidence on the capabilities, risks, and safety of advanced AI systems. The report was mandated by the nations attending the AI Safety Summit in Bletchley, UK. Thirty nations, the UN, the OECD, and the EU each nominated a representative to the report's Expert Advisory...
Preprint
Full-text available
A myriad of measures to illustrate performance of predictive artificial intelligence (AI) models have been proposed in the literature. Selecting appropriate performance measures is essential for predictive AI models that are developed to be used in medical practice, because poorly performing models may harm patients and lead to increased costs. We...
Article
Objective: Integrating electronic health record (EHR) data with other resources is essential in rare disease research due to low disease prevalence. Such integration is dependent on the alignment of ontologies used for data annotation. The International Classification of Diseases (ICD) is used to annotate clinical diagnoses, while the human phenotyp...
Preprint
Full-text available
This is the interim publication of the first International Scientific Report on the Safety of Advanced AI. The report synthesises the scientific understanding of general-purpose AI -- AI that can perform a wide variety of tasks -- with a focus on understanding and managing its risks. A diverse group of 75 AI experts contributed to this report, incl...
Preprint
When dealing with right-censored data, where some outcomes are missing due to a limited observation period, survival analysis -- known as time-to-event analysis -- focuses on predicting the time until an event of interest occurs. Multiple classes of outcomes lead to a classification variant: predicting the most likely event, a less explored area kn...
Preprint
Full-text available
Medical imaging is spearheading the AI transformation of healthcare. Performance reporting is key to determine which methods should be translated into clinical practice. Frequently, broad conclusions are simply derived from mean performance values. In this paper, we argue that this common practice is often a misleading simplification as it ignores...
Article
Full-text available
In many application settings, data have missing entries, which makes subsequent analyses challenging. An abundant literature addresses missing values in an inferential framework, aiming at estimating parameters and their variance from incomplete tables. Here, we consider supervised-learning settings: predicting a target when missing values appear i...
Preprint
Full-text available
Large Language Models (LLMs) have made significant progress in advancing artificial general intelligence (AGI), leading to the development of increasingly large models such as GPT-4 and LLaMA-405B. However, scaling up model sizes results in exponentially higher computational costs and energy consumption, making these models impractical for academic...
Preprint
Missing values are prevalent across various fields, posing challenges for training and deploying predictive models. In this context, imputation is a common practice, driven by the hope that accurate imputations will enhance predictions. However, recent theoretical and empirical studies indicate that simple constant imputation can be consistent and...
Preprint
When data are right-censored, i.e. some outcomes are missing due to a limited period of observation, survival analysis can compute the "time to event". Multiple classes of outcomes lead to a classification variant: predicting the most likely event, known as competing risks, which has been less studied. To build a loss that estimates outcome probabi...
Article
Full-text available
The Individual Brain Charting (IBC) is a multi-task functional Magnetic Resonance Imaging dataset acquired at high spatial-resolution and dedicated to the cognitive mapping of the human brain. It consists in the deep phenotyping of twelve individuals, covering a broad range of psychological domains suitable for functional-atlasing applications. Her...
Article
Randomized controlled trials (RCTs) may suffer from limited scope. In particular, samples may be unrepresentative: some RCTs over- or under-sample individuals with certain characteristics compared to the target population, for which one wants conclusions on treatment effectiveness. Re-weighting trial individuals to match the target population can i...
Preprint
Accurate predictions, as with machine learning, may not suffice to provide optimal healthcare for every patient. Indeed, prediction can be driven by shortcuts in the data, such as racial biases. Causal thinking is needed for data-driven decisions. Here, we give an introduction to the key elements, focusing on routinely-collected data, electronic he...
Preprint
Full-text available
Accurate predictions, as with machine learning, may not suffice to provide optimal healthcare for every patient. Indeed, prediction can be driven by shortcuts in the data, such as racial biases. Causal thinking is needed for data-driven decisions. Here, we give an introduction to the key elements, focusing on routinely-collected data, electronic he...
Preprint
Full-text available
There are many measures to report so-called treatment or causal effect: absolute difference, ratio, odds ratio, number needed to treat, and so on. The choice of a measure, e.g. absolute versus relative, is often debated because it leads to different appreciations of the same phenomenon; but it also implies different heterogeneity of treatment effec...
Article
Full-text available
Validation metrics are key for the reliable tracking of scientific progress and for bridging the current chasm between artificial intelligence (AI) research and its translation into practice. However, increasing evidence shows that particularly in image analysis, metrics are often chosen inadequately in relation to the underlying research problem....
Preprint
Full-text available
Validation metrics are key for the reliable tracking of scientific progress and for bridging the current chasm between artificial intelligence (AI) research and its translation into practice. However, increasing evidence shows that particularly in image analysis, metrics are often chosen inadequately in relation to the underlying research problem....
Preprint
Full-text available
Acronym Disambiguation (AD) is crucial for natural language understanding on various sources, including biomedical reports, scientific papers, and search engine queries. However, existing acronym disambiguation benchmarks and tools are limited to specific domains, and the size of prior benchmarks is rather small. To accelerate the research on acron...
Article
Full-text available
Randomized controlled trials (RCTs) are often considered the gold standard for estimating causal effect, but they may lack external validity when the population eligible to the RCT is substantially different from the target population. Having at hand a sample of the target population of interest allows us to generalize the causal effect. Identifyin...
Article
Full-text available
Functional magnetic resonance imaging (fMRI) captures information on brain function beyond the anatomical alterations that are traditionally visually examined by neuroradiologists. However, the fMRI signals are complex in addition to being noisy, so fMRI still faces limitations for clinical applications. Here we review methods that have been propos...
Preprint
Full-text available
The ability to ensure that a classifier gives reliable confidence scores is essential to ensure informed decision-making. To this end, recent work has focused on miscalibration, i.e., the over or under confidence of model scores. Yet calibration is not enough: even a perfectly calibrated classifier with the best possible accuracy can have confidenc...
Article
Full-text available
Previous literature has focused on predicting a diagnostic label from structural brain imaging. Since subtle changes in the brain precede cognitive decline in healthy and pathological aging, our study predicts future decline as a continuous trajectory instead. Here, we tested whether baseline multimodal neuroimaging data improve the prediction of f...
Preprint
Full-text available
The limited scope of Randomized Controlled Trials (RCT) is increasingly under scrutiny, in particular when samples are unrepresentative. Indeed, some RCTs over- or under-sample individuals with certain characteristics compared to the target population, for which one wants to draw conclusions on treatment effectiveness. Re-weighting trial individual...
Preprint
While deep learning has enabled tremendous progress on text and image datasets, its superiority on tabular data is not clear. We contribute extensive benchmarks of standard and novel deep learning methods as well as tree-based models such as XGBoost and Random Forests, across a large number of datasets and hyperparameter combinations. We define a s...
Preprint
Full-text available
The field of automatic biomedical image analysis crucially depends on robust and meaningful performance metrics for algorithm validation. Current metric usage, however, is often ill-informed and does not reflect the underlying domain interest. Here, we present a comprehensive framework that guides researchers towards choosing performance metrics in...
Article
Full-text available
Associating brain systems with mental processes requires statistical analysis of brain activity across many cognitive processes. These analyses typically face a difficult compromise between scope—from domain-specific to system-level analysis—and accuracy. Using all the functional Magnetic Resonance Imaging (fMRI) statistical maps of the largest dat...
Article
Full-text available
Background: As databases grow larger, it becomes harder to fully control their collection, and they frequently come with missing values. These large databases are well suited to train machine learning models, e.g., for forecasting or to extract biomarkers in biomedical settings. Such predictive approaches can use discriminative, rather than generativ...
Article
Full-text available
MRI has been extensively used to identify anatomical and functional differences in Autism Spectrum Disorder (ASD). Yet, many of these findings have proven difficult to replicate because studies rely on small cohorts and are built on many complex, undisclosed, analytic choices. We conducted an international challenge to predict ASD diagnosis from MR...
Article
Full-text available
Research in computer analysis of medical images bears many promises to improve patients’ health. However, a number of systematic challenges are slowing down the progress of the field, from limitations of the data, such as biases, to research incentives, such as optimizing for publication. In this paper we review roadblocks to developing and assessi...
Preprint
Full-text available
State-of-the-art NLP systems represent inputs with word embeddings, but these are brittle when faced with Out-of-Vocabulary (OOV) words. To address this issue, we follow the principle of mimick-like models to generate vectors for unseen words, by learning the behavior of pre-trained embeddings using only the surface form of words. We present a simp...
Article
Full-text available
Background: With increasing data sizes and more easily available computational methods, neurosciences rely more and more on predictive modeling with machine learning, e.g., to extract disease biomarkers. Yet, a successful prediction may capture a confounding effect correlated with the outcome instead of brain features specific to the outcome of inte...
Preprint
BACKGROUND: As databases grow larger, it becomes harder to fully control their collection, and they frequently come with missing values: incomplete observations. These large databases are well suited to train machine-learning models, for instance for forecasting or to extract biomarkers in biomedical settings. Such predictive approaches can use dis...
Preprint
Full-text available
BACKGROUND: As databases grow larger, it becomes harder to fully control their collection, and they frequently come with missing values: incomplete observations. These large databases are well suited to train machine-learning models, for instance for forecasting or to extract biomarkers in biomedical settings. Such predictive approaches can use disc...
Article
Full-text available
The analysis of brain-imaging data requires complex processing pipelines to support findings on brain function or pathologies. Recent work has shown that variability in analytical decisions, small amounts of noise, or computational environments can lead to substantial differences in the results, endangering the trust in conclusions. We explored the...
Article
Full-text available
Background: Biological aging is revealed by physical measures, e.g., DNA probes or brain scans. In contrast, individual differences in mental function are explained by psychological constructs, e.g., intelligence or neuroticism. These constructs are typically assessed by tailored neuropsychological tests that build on expert judgement and require ca...
Preprint
Full-text available
High-quality data accumulation is now becoming ubiquitous in the health domain. There is increasing opportunity to exploit rich data from normal subjects to improve supervised estimators in specific diseases with notorious data scarcity. We demonstrate that low-dimensional embedding spaces can be derived from the UK Biobank population dataset and u...
Article
Purpose: The Coronavirus disease 2019 (COVID-19) has led to an unparalleled influx of patients. Prognostic scores could help optimize healthcare delivery, but most of them have not been comprehensively validated. We aim to externally validate existing prognostic scores for COVID-19. Methods: We used “COVID-19 Evidence Alerts” (McMaster University) to...
Article
Full-text available
Machine learning brings the hope of finding new biomarkers extracted from cohorts with rich biomedical measurements. A good biomarker is one that gives reliable detection of the corresponding condition. However, biomarkers are often extracted from a cohort that differs from the target population. Such a mismatch, known as a dataset shift, can under...
Article
Full-text available
As the global health crisis unfolded, many academic conferences moved online in 2020. This move has been hailed as a positive step towards inclusivity in its attenuation of economic, physical, and legal barriers and effectively enabled many individuals from groups that have traditionally been underrepresented to join and participate. A number of st...
Preprint
Machine learning brings the hope of finding new biomarkers extracted from cohorts with rich biomedical measurements. A good biomarker is one that gives reliable detection of the corresponding condition. However, biomarkers are often extracted from a cohort that differs from the target population. Such a mismatch, known as a dataset shift, can under...
Preprint
How to learn a good predictor on data with missing values? Most efforts focus on first imputing as well as possible and second learning on the completed data to predict the outcome. Yet, this widespread practice has no theoretical grounding. Here we show that for almost all imputation functions, an impute-then-regress procedure with a powerful lear...
Article
Biomedical entity linking aims to map biomedical mentions, such as diseases and drugs, to standard entities in a given knowledge base. The specific challenge in this context is that the same biomedical entity can have a wide range of names, including synonyms, morphological variations, and names with different word orderings. Recently, BERT-based m...
Preprint
Full-text available
While a randomized controlled trial (RCT) readily measures the average treatment effect (ATE), this measure may need to be shifted to generalize to a different population. Standard estimators of the target population treatment effect are based on the distributional shift in covariates, using inverse propensity sampling weighting (IPSW) or modeling...
Article
Full-text available
Cognitive brain imaging is accumulating datasets about the neural substrate of many different mental processes. Yet, most studies are based on few subjects and have low statistical power. Analyzing data across studies could bring more statistical power; yet the current brain-imaging analytic framework cannot be used at scale as it requires casting...
Preprint
Full-text available
Medical imaging is an important research field with many opportunities for improving patients' health. However, there are a number of challenges that are slowing down the progress of the field as a whole, such as optimizing for publication. In this paper we reviewed several problems related to choosing datasets, methods, evaluation metrics, and public...
Article
Full-text available
In brain imaging, decoding is widely used to infer relationships between brain and cognition, or to craft brain-imaging biomarkers of pathologies. Yet, standard decoding procedures do not come with statistical guarantees, and thus do not give confidence bounds to interpret the pattern maps that they produce. Indeed, in whole-brain decoding settings...
Article
Full-text available
Functional magnetic resonance imaging (fMRI) has opened the possibility to investigate how brain activity is modulated by behavior. Most studies so far are bound to one single task, in which functional responses to a handful of contrasts are analyzed and reported as a group average brain map. Contrariwise, recent data‐collection efforts have starte...
Preprint
Full-text available
Biomedical entity linking aims to map biomedical mentions, such as diseases and drugs, to standard entities in a given knowledge base. The specific challenge in this context is that the same biomedical entity can have a wide range of names, including synonyms, morphological variations, and names with different word orderings. Recently, BERT-based m...
Preprint
Full-text available
With increasing data availability, treatment causal effects can be evaluated across different datasets, both randomized trials and observational studies. Randomized trials isolate the effect of the treatment from that of unwanted (confounding) co-occurring effects. But they may be applied to limited populations, and thus lack external validity. On th...
Article
Full-text available
We present an extension of the Individual Brain Charting dataset, a high spatial-resolution, multi-task, functional Magnetic Resonance Imaging dataset, intended to support the investigation of the functional principles governing cognition in the human brain. The concomitant data acquisition from the same 12 participants, in the same environment, al...
Preprint
Full-text available
Background: Biological aging is revealed by physical measures, e.g., DNA probes or brain scans. Instead, individual differences in mental function are explained by psychological constructs, e.g., intelligence or neuroticism. These constructs are typically assessed by tailored neuropsychological tests that build on expert judgement and require car...
Article
Full-text available
We leveraged the largely untapped resource of electronic health record data to address critical clinical and epidemiological questions about Coronavirus Disease 2019 (COVID-19). To do this, we formed an international consortium (4CE) of 96 hospitals across five countries (www.covidclinical.net). Contributors utilized the Informatics for Integrating...
Article
Full-text available
We simultaneously revisited the ADI-R and ADOS with a comprehensive data-analytics strategy. Here, the combination of pattern analysis algorithms and an extensive data resource (n=266 patients aged 7 to 49 years) allowed identifying coherent clinical constellations in and across ADI-R and ADOS assessments widespread in clinical practice. Our clust...
Preprint
Full-text available
The presence of missing values makes supervised learning much more challenging. Indeed, previous work has shown that even when the response is a linear function of the complete data, the optimal predictor is a complex function of the observed entries and the missingness indicator. As a result, the computational or sample complexities of consistent...
Article
Full-text available
Population imaging markedly increased the size of functional-imaging datasets, shedding new light on the neural basis of inter-individual differences. Analyzing these large data entails new scalability challenges, computational and statistical. For this reason, brain images are typically summarized in a few signals, for instance reducing voxel-leve...
Preprint
Full-text available
Objective: To assess the clinical effectiveness of oral hydroxychloroquine (HCQ) with or without azithromycin (AZI) in preventing death or leading to hospital discharge. Design: Retrospective cohort study. Setting: An analysis of data from electronic medical records and administrative claim data from the French Assistance Publique - Hôpitaux de Paris...
Preprint
Full-text available
Cognitive decline occurs in healthy and pathological aging, and both may be preceded by subtle changes in the brain - offering a basis for cognitive predictions. Previous work has largely focused on predicting a diagnostic label from structural brain imaging. Our study broadens the scope of applications to cognitive decline in healthy aging by pred...
Article
Full-text available
In the 20th century, evidence-based medicine has put clinical practice on much more solid ground. For instance, randomized clinical trials have provided strong evidence on useful interventions, thanks to double-blind treatment application and tests for treatment associations with clinical outcomes. However, precision medicine in the 21st century st...
Article
Full-text available
Electrophysiological methods, that is M/EEG, provide unique views into brain health. Yet, when building predictive models from brain data, it is often unclear how electrophysiology should be combined with other neuroimaging methods. Information can be redundant, useful common representations of multimodal data may not be obvious and multimodal data...