Russ B Altman

Stanford University | SU · Departments of Bioengineering, Genetics, & Medicine

MD, PhD

About

853
Publications
111,640
Reads
45,508
Citations
Citations since 2017
192 Research Items
21,911 Citations
[Chart: citations per year, 2017 to 2023; vertical axis 0 to 4,000]
Introduction
Russ Biagio Altman is the Kenneth Fong Professor of Bioengineering, Genetics, Medicine, Biomedical Data Science and (by courtesy) Computer Science, and past chairman of the Bioengineering Department at Stanford University. His primary research interests are in the application of computing and informatics technologies to problems relevant to medicine. He is particularly interested in methods for understanding drug action at molecular, cellular, organism and population levels. His lab studies how human genetic variation impacts drug response (e.g., http://www.pharmgkb.org/). Other work focuses on the analysis of biological molecules to understand the actions, interactions and adverse events of drugs (e.g., http://feature.stanford.edu/). He helps lead an FDA-supported Center of Excellence in
Additional affiliations
September 1992 - present
Stanford University
Position
  • Professor


Publications (853)
Preprint
Full-text available
The three-dimensional structures of proteins are crucial for understanding their molecular mechanisms and interactions. Machine learning algorithms that are able to learn accurate representations of protein structures are therefore poised to play a key role in protein engineering and drug development. The accuracy of such models in deployment is di...
Preprint
Full-text available
Smoking greatly reduces life expectancy in both men and women, but with different patterns of morbidity. After adjusting for smoking history, women have higher risk of respiratory effects and diabetes from smoking, while men show greater mortality from smoking-related cancers. While many smoking-related sex differences have been documented, the und...
Article
Full-text available
Single cell technologies are rapidly generating large amounts of data that enable us to understand biological systems at single-cell resolution. However, joint analysis of datasets generated by independent labs remains challenging due to a lack of consistent terminology to describe cell types. Here, we present OnClass, an algorithm and accompanyin...
Preprint
The pathogenesis of many inflammatory diseases is a coordinated process involving metabolic dysfunctions and immune response, usually modulated by the production of cytokines and associated inflammatory molecules. In this work, we seek to understand how genes involved in pathogenesis which are often not associated with the immune system in an obvio...
Preprint
Full-text available
AI is undergoing a paradigm shift with the rise of models (e.g., BERT, DALL-E, GPT-3) that are trained on broad data at scale and are adaptable to a wide range of downstream tasks. We call these models foundation models to underscore their critically central yet incomplete character. This report provides a thorough account of the opportunities and...
Article
Full-text available
Objectives We sought to cluster biological phenotypes using semantic similarity and create an easy-to-install, stable, and reproducible tool. Materials and Methods We generated Phenotype Clustering (PhenClust)—a novel application of semantic similarity for interpreting biological phenotype associations—using the Unified Medical Language System (UM...
Preprint
Although vaccines for severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) have been successful, there are no good treatments for those who are actively infected and potentially suffer from diverse neurological symptoms. While SARS-CoV-2 primarily infects the respiratory tract, clinical evidence indicates that cells from sensory organs, bra...
Article
Full-text available
The SARS-CoV-2 pandemic has caused a surge in research exploring all aspects of the virus and its effects on human health. The overwhelming publication rate means that researchers are unable to keep abreast of the literature. To ameliorate this, we present the CoronaCentral resource that uses machine learning to process the research literature on S...
Chapter
Translational bioinformatics refers to the subfield of informatics that extracts new knowledge from voluminous amounts of biological data, particularly molecular data, to enable precision medicine. Though the types of molecular data span a broad spectrum including genomic, proteomic, metabolomic, epigenomic, and other data, much of the focus to...
Chapter
This chapter introduces bioinformatics and the impacts of data generated by advanced biotechnology applications. A brief introduction regarding the underlying biology of these technologies is described. The types of data being generated and how they are important clinically are introduced. An overview of notable methods that operate on those data t...
Article
The increasing availability of genotype data linked with information about drug‐response phenotypes has enabled genome‐wide association studies (GWAS) that uncover genetic determinants of drug response. GWAS have discovered associations between genetic variants and both drug efficacy and adverse drug reactions. Despite these successes, the design o...
Article
Full-text available
Background Defining clinical phenotypes provides opportunities for new diagnostics and may provide insights into early intervention and disease prevention. There is increasing evidence that patient-derived health data may contain information that complements traditional methods of clinical phenotyping. The utility of these data for defining meaning...
Preprint
Full-text available
The opioid epidemic persists in the United States; in 2019, annual drug overdose deaths increased by 4.6% to 70,980, including 50,042 opioid-related deaths. The widespread abuse of opioids across geographies and demographics and the rapidly changing dynamics of abuse require reliable and timely information to monitor and address the crisis. Social...
Article
Full-text available
Genome sequencing is enabling precision medicine—tailoring treatment to the unique constellation of variants in an individual’s genome. The impact of recurrent pathogenic variants is often understood, however there is a long tail of rare genetic variants that are uncharacterized. The problem of uncharacterized rare variation is especially acute whe...
Article
Full-text available
Background Women are at more than 1.5-fold higher risk for clinically relevant adverse drug events. While this higher prevalence is partially due to gender-related effects, biological sex differences likely also impact drug response. Publicly available gene expression databases provide a unique opportunity for examining drug response at a cellular...
Article
Full-text available
Background Understanding the relationships between genes, drugs, and disease states is at the core of pharmacogenomics. Two leading approaches for identifying these relationships in medical literature are: human expert led manual curation efforts, and modern data mining based automated approaches. The former generates small amounts of high-quality...
Article
Full-text available
For many prevalent complex diseases, treatment regimens are frequently ineffective. For example, despite multiple available immunomodulators and immunosuppressants, inflammatory bowel disease (IBD) remains difficult to treat. Heterogeneity in the disease across patients makes it challenging to select the optimal treatment regimens, and some patient...
Preprint
Full-text available
Adverse drug reactions (ADRs) impact the health of 100,000s of individuals annually in the United States with associated costs in the hundreds of billions. The monitoring and analysis of the severity of adverse drug reactions is limited by the current qualitative and categorical system of severity classifications. Previous efforts have generated qu...
Article
Adverse drug reactions (ADRs) are a major concern for patients, clinicians, and regulatory agencies. The discovery of serious ADRs leading to substantial morbidity and mortality has resulted in mandatory Phase IV clinical trials, black box warnings, and withdrawal of drugs from the market. Real World Data, data collected during routine clinical car...
Article
The COVID-19 pandemic is an unprecedented challenge to the biomedical research community at the intersection of great uncertainty due to the novelty of the virus and extremely high stakes due to the large global death count. The global quarantine shut-downs complicated scientific matters because many laboratories were closed down unless they were a...
Article
Pharmacogenetics studies how genetic variation leads to variability in drug response. Guidelines for selecting the right drug and right dose for patients based on their genetics are clinically effective, but are widely unused. For some drugs, the normal clinical decision making process may lead to the optimal dose of a drug that minimizes side effe...
Preprint
Full-text available
The global SARS-CoV-2 pandemic has caused a surge in research exploring all aspects of the virus and its effects on human health. The overwhelming rate of publications means that human researchers are unable to keep abreast of the research. To ameliorate this, we present the CoronaCentral resource which uses machine learning to process the research...
Preprint
Full-text available
The scale and speed of the COVID-19 pandemic has strained many parts of the national healthcare infrastructure, including communicable disease monitoring and prevention. Many local health departments now receive hundreds or thousands of COVID-19 case reports a day. Many arrive via faxed handwritten forms, often intermingled with other faxes sent to...
Poster
Full-text available
We present ATOM3D, a collection of both novel and existing datasets spanning several key classes of biomolecules, to systematically assess learning methods that work on 3D molecular structure. We implement prototypical three-dimensional models for each of these tasks, finding that they consistently improve performance relative to one- and two-dimen...
Preprint
Full-text available
Computational methods that operate directly on three-dimensional molecular structure hold large potential to solve important questions in biology and chemistry. In particular deep neural networks have recently gained significant attention. In this work we present ATOM3D, a collection of both novel and existing datasets spanning several key classes...
Article
Full-text available
Although tremendous effort has been put into cell-type annotation, identification of previously uncharacterized cell types in heterogeneous single-cell RNA-seq data remains a challenge. Here we present MARS, a meta-learning approach for identifying and annotating known as well as new cell types. MARS overcomes the heterogeneity of cell types by tra...
Preprint
Full-text available
Millions of Americans suffer from illnesses with non-existent or ineffective drug treatment. Identifying plausible drug candidates is a major barrier to drug development due to the large amount of time and resources required; approval can take years when people are suffering now. While computational tools can expedite drug candidate discovery, thes...
Article
Pharmacogenetics (PGx) studies the influence of genetic variation on drug response. Clinically actionable associations inform guidelines created by the Clinical Pharmacogenetics Implementation Consortium (CPIC), but the broad impact of genetic variation on entire populations is not well‐understood. We analyzed PGx allele and phenotype frequencies f...
Preprint
Full-text available
Massively accumulated pharmacogenomics, chemogenomics, and side effect datasets offer an unprecedented opportunity for drug response prediction, drug target identification and drug side effect prediction. Existing computational approaches limit their scope to only one of these three tasks, inevitably overlooking the rich connection among them. Here...
Preprint
Genome sequencing is enabling precision medicine—tailoring treatment to the unique constellation of variants in an individual’s genome. The impact of recurrent pathogenic variants is often understood, leaving a long tail of rare genetic variants that are uncharacterized. The problem of uncharacterized rare variation is especially acute when it occu...
Article
Full-text available
Cytochrome P450 2D6 (CYP2D6) is a highly polymorphic gene whose protein product metabolizes more than 20% of clinically used drugs. Genetic variations in CYP2D6 are responsible for interindividual heterogeneity in drug response that can lead to drug toxicity and ineffective treatment, making CYP2D6 one of the most important pharmacogenes. Predict...
Article
Full-text available
In the current marketplace, there are now more than a dozen commercial companies providing pharmacogenetic tests. Each company varies in the panel of genes they test and the variants they are able to screen for. The reports generated by these companies provide phenotypic interpretations of pharmacogenes and clinically actionable gene–drug interacti...
Preprint
Full-text available
Women are at more than 1.5-fold higher risk for clinically relevant adverse drug events. While this higher prevalence is partially due to gender-related effects, biological sex differences likely also impact drug response. Publicly available gene expression databases provide a unique opportunity for examining drug response at a cellular level. Howe...
Article
Full-text available
Objective: To assess usability and usefulness of a machine learning-based order recommender system applied to simulated clinical cases. Materials and methods: 43 physicians entered orders for 5 simulated clinical cases using a clinical order entry interface with or without access to a previously developed automated order recommender system. Case...
Article
Pharmacogenomics is a key area of precision medicine which is already being implemented in some health systems and may help guide clinicians towards effective therapies for individual patients. Over the last two decades, the Pharmacogenomics Knowledgebase (PharmGKB) has built a unique repository of pharmacogenomic knowledge, including annotations o...
Article
Advances in machine learning, specifically the subfield of deep learning, have produced algorithms that perform image-based diagnostic tasks with accuracy approaching or exceeding that of trained physicians. Despite their well-documented successes, these machine learning algorithms are vulnerable to cognitive and technical bias,¹ including bias int...
Article
Sex differences have been shown in laboratory biomarkers; however, the extent to which this is due to genetics is unknown. In this study, we infer sex-specific genetic parameters (heritability and genetic correlation) across 33 quantitative biomarker traits in 181,064 females and 156,135 males from the UK Biobank study. We apply a Bayesian Mixture...
Preprint
Full-text available
Genetics plays a key role in drug response, affecting efficacy and toxicity. Pharmacogenomics aims to understand how genetic variation influences drug response and develop clinical guidelines to aid clinicians in personalized treatment decisions informed by genetics. Although pharmacogenomics has not been broadly adopted into clinical practice, gen...
Preprint
Full-text available
Pharmacogenetics studies how genetic variation leads to variability in drug response. Guidelines for selecting the right drug and right dose to patients based on their genetics are clinically effective, but are still widely unused. For some drugs, the normal clinical decision making process may lead to the optimal dose of a drug that minimizes side...
Article
Full-text available
Requiring regional or in-country confirmatory clinical trials before approval of drugs already approved elsewhere delays access to medicines in low- and middle-income countries and raises drug costs. Here, we discuss the scientific and technological advances that may reduce the need for in-country or in-region clinical trials for drugs approved in...
Article
Full-text available
Gene sets, including protein complexes and signalling pathways, have proliferated greatly, in large part as a result of high-throughput biological data. Leveraging gene sets to gain insight into biological discovery requires computational methods for converting them into a useful form for available machine learning models. Here, we study the proble...
Article
Full-text available
Named entity recognition tools are used to identify mentions of biomedical entities in free text and are essential components of high-quality information retrieval and extraction systems. Without good entity recognition, methods will mislabel searched text and will miss important information or identify spurious text that will frustrate users. Most...
Preprint
Full-text available
Pharmacogenetics (PGx) studies the influence of genetic variation on drug response. Clinically actionable associations inform guidelines created by the Clinical Pharmacogenetics Implementation Consortium (CPIC), but the broad impact of genetic variation on entire populations is not well-understood. We analyzed PGx allele and phenotype frequencies...
Article
Full-text available
Antiplatelet response to clopidogrel shows wide variation, and poor response is correlated with adverse clinical outcomes. CYP2C19 loss‐of‐function alleles play an important role in this response, but account for only a small proportion of variability in response to clopidogrel. An aim of the International Clopidogrel Pharmacogenomics Consortium (I...
Article
Full-text available
Background: Enzymatic and chemical reactions are key for understanding biological processes in cells. Curated databases of chemical reactions exist but these databases struggle to keep up with the exponential growth of the biomedical literature. Conventional text mining pipelines provide tools to automatically extract entities and relationships fr...
Article
Objective: Non-small cell lung cancer is a leading cause of cancer death worldwide, and histopathological evaluation plays the primary role in its diagnosis. However, the morphological patterns associated with the molecular subtypes have not been systematically studied. To bridge this gap, we developed a quantitative histopathology analytic framew...
Preprint
The most rapid path to discovering treatment options for the novel coronavirus SARS-CoV-2 is to find existing medications that are active against the virus. We have focused on identifying repurposing candidates for the transmembrane serine protease family member II (TMPRSS2), which is critical for entry of coronaviruses into cells. Using known 3D s...
Article
Full-text available
Asians as a group comprise over 60% of the world's population. There is an incredible amount of diversity in Asian and admixed populations that has not been studied in a pharmacogenetic context. The known pharmacogenetic differences in Asian subgroups generally represent previously known variants that are present at much lower or higher frequencie...
Preprint
Full-text available
Although tremendous efforts have been put into cell type classification, the identification of previously uncharacterized cell types in heterogeneous single-cell RNA-seq data remains a challenge. We present MARS, a meta-learning algorithm for annotating known and novel cell types that reconciles the heterogeneity by transferring latent cell represe...
Preprint
Full-text available
Objective: To determine whether clinicians will use machine learned clinical order recommender systems for electronic order entry for simulated inpatient cases, and whether such recommendations impact the clinical appropriateness of the orders being placed. Materials and Methods: 43 physicians used a clinical order entry interface for five simulate...
Preprint
Full-text available
Abstract The primary challenge of fixed-backbone protein design is to find a distribution of sequences that fold to the backbone of interest. This task is central to nearly all protein engineering problems, as achieving a particular backbone conformation is often a prerequisite for hosting specific functions. In this study, we investigate the capa...
Article
One in five Americans experience mental illness, and roughly 75% of psychiatric prescriptions do not successfully treat the patient's condition. Extensive evidence implicates genetic factors and signaling disruption in the pathophysiology of these diseases. Changes in transcription often underlie this molecular pathway dysregulation; individual pat...
Article
Full-text available
Precision medicine tailors treatment to individuals' personal data including differences in their genome. The Pharmacogenomics Knowledgebase (PharmGKB) provides highly curated information on the effect of genetic variation on drug response and side effects for a wide range of drugs. PharmGKB's scientific curators triage, review and annotate a large...
Article
Millions of Americans are affected by rare diseases, many of which have poor survival rates. However, the small market size of individual rare diseases, combined with the time and capital requirements of pharmaceutical R&D, have hindered the development of new drugs for these cases. A promising alternative is drug repurposing, whereby existing FDA-...
Article
Successfully applying AI to biomedicine requires innovators trained in contrasting cultures.
Preprint
Full-text available
Gene functional enrichment is a mainstay of genomics, but it relies on manually curated databases of gene functions that are incomplete and unaware of the biological context. Here we present an alternative machine learning approach, Deep Functional Synthesis (DeepSyn), which moves beyond gene function databases to dynamically infer the functions of...
Article
Background: Therapeutic cancer vaccines targeting neoantigens have shown promise in early phase clinical trials for inducing tumor-specific T-cell responses in diverse tumor types. Identification of the small number of optimal cancer vaccine targets is essential to limit cost and improve efficacy of patient-specific vaccine design. Therefore, there...
Preprint
Full-text available
Sex differences have been shown in laboratory biomarkers; however, the extent to which this is due to genetics is unknown. In this study, we infer sex-specific genetic parameters (heritability and genetic correlation) across 33 quantitative biomarker traits in 181,064 females and 156,135 males from the UK Biobank study. We apply a Bayesian mixture...
Article
Full-text available
Accurate prediction of antigen presentation by human leukocyte antigen (HLA) class II molecules would be valuable for vaccine development and cancer immunotherapies. Current computational methods trained on in vitro binding data are limited by insufficient training data and algorithmic constraints. Here we describe MARIA (major histocompatibility c...
Article
Full-text available
The small molecule Retro-2 prevents ricin toxicity through a poorly-defined mechanism of action (MOA), which involves halting retrograde vesicle transport to the endoplasmic reticulum (ER). CRISPRi genetic interaction analysis revealed Retro-2 activity resembles disruption of the transmembrane domain recognition complex (TRC) pathway, which mediate...
Article
Full-text available
Protein-RNA interaction plays important roles in post-transcriptional regulation. However, the task of predicting these interactions given a protein structure is difficult. Here we show that, by leveraging a deep learning model NucleicNet, attributes such as binding preference of RNA backbone constituents and different bases can be predicted from l...
Preprint
Full-text available
Single cell technologies have rapidly generated an unprecedented amount of data that enables us to understand biological systems at single-cell resolution. However, analyzing datasets generated by independent labs remains challenging due to a lack of consistent terminology to describe cell types. Here, we present OnClass, an algorithm and accompany...