Andreas Bender
University of Cambridge | Cam · Department of Chemistry
PhD
About
453
Publications
96,357
Reads
16,130
Citations
Introduction
Committed to developing new life science data analysis methods and their application in experimental, prospective settings, primarily related to chemical biology and drug discovery.
Feel free to contact me, but please do so by direct email (I don't read my messages here in the system): https://www.ch.cam.ac.uk/person/ab454
Cheers, Andreas
Additional affiliations
January 2008 - April 2010
January 2006 - December 2007
May 2010 - present
Education
January 2003 - December 2005
Publications (453)
Pathway analysis is an informative method for comparing and contrasting drug-induced gene expression in cellular systems. Here, we define the effects of the marine natural product fucoxanthin, separately and in combination with the prototypic phosphatidylinositol 3-kinase (PI3K) inhibitor LY-294002, on gene expression in a well-established human gl...
Genome-wide association studies have pinpointed numerous susceptibility loci in complex diseases like chronic immune-mediated inflammatory disorders (IMIDs), yet their impact on pathomechanisms remains poorly understood. Genetic epistasis, low effect sizes, and predominance within non-coding genomic regions remain major challenges to the functional...
Neural processes (NPs) are models for meta-learning which output uncertainty estimates. So far, most studies of NPs have focused on low-dimensional datasets of highly-correlated tasks. While these homogeneous datasets are useful for benchmarking, they may not be representative of realistic transfer learning. In particular, applications in scientifi...
Generative chemical language models have demonstrated success in learning language-based molecular representations for de novo drug design. Here, we integrate structure-based drug design (SBDD) principles with chemical language models to present a modern hit-finding workflow to go from protein structure to novel small-molecule ligands, without a pr...
Generative chemical language models have demonstrated success in learning language-based molecular representations for de novo drug design. Here, we integrate structure-based design principles with chemical language models to present a modern hit-finding workflow to go from protein structure to novel small-molecule ligands, without a priori knowled...
Renal secretion plays an important role in the excretion of drugs by the kidney. Two major transporters known to be highly involved in renal secretion are MATE1/2-K and OCT2, the former of which is highly related to drug–drug interactions. Among published in silico models for MATE inhibitors, a previous model obtained a ROC-AUC value of 0.78 using hig...
Drug-induced liver injury (DILI) has been a significant challenge in drug discovery, often leading to clinical trial failures and necessitating drug withdrawals. Over the last decade, the existing suite of in vitro proxy-DILI assays has generally improved at identifying compounds with hepatotoxicity. However, there is considerable interest in enhan...
Three-dimensional (3D) deep molecular generative models offer the advantage of goal-directed generation based on 3D-dependent properties, such as binding affinity for structure-based design within binding pockets. Traditional benchmarks created to evaluate SMILES or molecular graphs generators, such as GuacaMol or MOSES, are limited to evaluate 3D...
Recent advances in machine learning methods for materials science have significantly enhanced accurate predictions of the properties of novel materials. Here, we explore whether these advances can be adapted to drug discovery by addressing the problem of prospective validation - the assessment of the performance of a method on out-of-distribution d...
Adverse drug reactions (ADRs) are a major source of concern in the development of novel pharmaceuticals. ADRs may be identified in the late stages of development or even after commercialization, which may lead to failure or discontinuation after spending enormous resources on candidate molecules. Thus, predicting ADRs early in the process could hel...
Generative models are undergoing rapid research and application to de novo drug design. To facilitate their application and evaluation, we present MolScore. MolScore already contains many drug-design-relevant scoring functions commonly used in benchmarks, such as molecular similarity, molecular docking, predictive models, synthesizability, and more...
High-content image-based assays have fueled significant discoveries in the life sciences in the past decade (2013-2023), including novel insights into disease etiology, mechanism of action, new therapeutics, and toxicology predictions. Here, we systematically review the substantial methodological advancements and applications of Cell Painting. Adva...
Recent findings show that drug combination therapy can increase efficacy, decrease drug resistance, and reduce drug side effects. Due to the enormous number of possibilities in the selection of drugs, it is clinically impossible to screen all available combinations. Fortunately, artificial intelligence has opened up new perspectives for solving thi...
Drug exposure is a key contributor to the safety and efficacy of drugs. It can be defined using human pharmacokinetics (PK) parameters that affect the blood concentration profile of a drug, such as steady-state volume of distribution (VDss), total body clearance (CL), half-life (t½), fraction unbound in plasma (fu) and mean residence time (MRT). In...
Drug-induced cardiotoxicity (DICT) is a major concern in drug development, accounting for 10–14% of postmarket withdrawals. In this study, we explored the capabilities of chemical and biological data to predict cardiotoxicity, using the recently released DICTrank data set from the United States FDA. We found that such data, including protein target...
Drug-induced liver injury (DILI) presents a significant challenge in drug discovery, often leading to clinical trial failures and necessitating drug withdrawals. In this study, we introduce a novel method for DILI prediction that first predicts eleven proxy-DILI labels and then uses them as features in addition to chemical structural features to pr...
Cell Painting assays generate morphological profiles that are versatile descriptors of biological systems and have been used to predict in vitro and in vivo drug effects. However, Cell Painting features extracted from classical software such as CellProfiler are based on statistical calculations and often not readily biologically interpretable. In t...
Identifying bioactive conformations of small molecules is an essential process for virtual screening applications relying on three-dimensional structure such as molecular docking. For most small molecules, conformer generators retrieve at least one bioactive-like conformation, with an atomic root-mean-square deviation (ARMSD) lower than 1 Å, among...
While a multitude of deep generative models have recently emerged, there exists no best practice for their practically relevant validation. On the one hand, novel de novo-generated molecules cannot be refuted by retrospective validation (so that this type of validation is biased); but on the other hand prospective validation is expensive and then of...
Background: Understanding the Mechanism of Action (MoA) of a compound is an often challenging but equally crucial aspect of drug discovery that can help improve both its efficacy and safety. Computational methods to aid MoA elucidation usually either aim to predict direct drug targets, or attempt to understand modulated downstream pathways or signal...
Cell Painting assays generate morphological profiles that are versatile descriptors of biological systems and have been used to predict in vitro and in vivo drug effects. However, Cell Painting features are based on image statistics, and are, therefore, often not readily biologically interpretable. In this study, we introduce an approach that maps...
MolScore is an open-source Python framework for scoring and evaluating molecules in the context of goal-directed generative models as used in de novo drug design. MolScore includes many relevant scoring functions for de novo drug design such as molecular similarity, docking software, predictive models, and synthesizability, as well as commonly used...
Environmental factors such as exposure to ionizing radiation, certain environmental pollutants, and toxic chemicals are considered risk factors in the development of breast cancer. Triple-negative breast cancer (TNBC) is a molecular variant of breast cancer that lacks therapeutic targets such as progesterone receptor, estrogen receptor, and hum...
The applicability domain of machine learning models trained on structural fingerprints for the prediction of biological endpoints is often limited by the lack of diversity of chemical space of the training data. In this work, we developed similarity-based merger models which combined the outputs of individual models trained on cell morphology (base...
Pharmacokinetic (PK) parameters such as clearance (CL) and volume of distribution (Vd) have been the subject of previous in silico predictive models. However, having explicit information on the concentration-time profile can provide additional value, such as time above MIC or AUC, for understanding both the efficacy and safety-related aspects...
Background: Elucidating compound mechanism of action (MoA) is beneficial to drug discovery, but in practice often represents a significant challenge. Causal Reasoning approaches aim to address this situation by inferring dysregulated signalling proteins using transcriptomics data and biological networks; however, a comprehensive benchmarking of such...
Uncontrolled angiogenesis is a common denominator underlying many deadly and debilitating diseases such as myocardial infarction, chronic wounds, cancer, and age-related macular degeneration. As the current range of FDA-approved angiogenesis-based medicines is far from meeting clinical demands, the vast reserve of natural products from traditional...
Various sources of information can be used to better understand and predict compound activity and safety-related endpoints, including biological data such as gene expression and cell morphology. In this review, we first introduce types of chemical, in vitro and in vivo information that can be used to describe compounds and adverse effects. We then...
Gene expression and cell morphology data are high-dimensional biological readouts of much recent interest for drug discovery. They are able to describe biological systems in different states (e.g., healthy and diseased), as well as biological systems before and after compound treatment, and they are hence useful for matching both spaces (e.g., for...
Preclinical inter-species concordance can increase the predictivity of observations to the clinic, potentially reducing drug attrition caused by unforeseen adverse events. We quantified inter-species concordance of histopathological findings and target organ toxicities across four preclinical species in the eTOX database using likelihood ratios (LR...
Virtual Control Groups (VCGs) based on Historical Control Data (HCD) in preclinical toxicity testing have the potential to reduce animal usage. As a case study we retrospectively analyzed the impact of replacing Concurrent Control Groups (CCGs) with VCGs on the treatment-relatedness of 28 selected histopathological findings reported in either rat o...
Functional changes to cardiomyocytes are undesirable during drug discovery and identifying the inotropic effects of compounds is hence necessary to decrease the risk of cardiovascular adverse effects in the clinic. Recently, approaches leveraging calcium transients in human induced pluripotent stem cell-derived cardiomyocytes (hiPSC-CMs) have been...
Drug discovery and development is a complex and costly process. Machine learning approaches are being investigated to help improve the effectiveness and speed of multiple stages of the drug discovery pipeline. Of these, those that use Knowledge Graphs (KG) have promise in many tasks, including drug repurposing, drug toxicity prediction and target g...
Idiopathic pulmonary fibrosis (IPF) is a chronic lung disease, which affects around three million people worldwide and is characterized by impaired regeneration from recurrent injury to the alveolar epithelium resulting in progressive lung scarring. In this work, we target a cell transition in the differentiation of AT2 cells into mature AT1 cells...
Mitochondrial toxicity is an important safety endpoint in drug discovery. Models based solely on chemical structure for predicting mitochondrial toxicity are currently limited in accuracy and applicability domain to the chemical space of the training compounds. In this work, we aimed to utilize both -omics and chemical data to push beyond the state...
Trefoil factor 3 (TFF3) is a secreted protein with an established oncogenic function and a highly significant association with clinical progression of various human malignancies. Herein, a novel small molecule that specifically targets TFF3 homodimeric functions was identified. Utilizing the concept of reversible covalent interaction, 2-amino-4-(4-...
The applicability domain of machine learning models trained on structural fingerprints for the prediction of biological endpoints is often limited by the diversity of chemical space of the training data. In this work, we developed “similarity-based merger models” which combined the output of individual models trained on cell morphology (based on Ce...
PROteolysis TArgeting Chimeras (PROTACs) use the ubiquitin-proteasome system to degrade a protein of interest for therapeutic benefit. Advances made in targeted protein degradation technology have been remarkable, with several molecules having moved into clinical studies. However, robust routes to assess and better understand the safety risks of PR...
Machine learning (ML) promises to tackle the grand challenges in chemistry and speed up the generation, improvement and/or ordering of research hypotheses. Despite the overarching applicability of ML workflows, one usually finds diverse evaluation study designs. The current heterogeneity in evaluation techniques and metrics leads to difficulty in (...
We describe a precision medicine workflow, the integrated single nucleotide polymorphism network platform (iSNP), designed to determine the mechanisms by which SNPs affect cellular regulatory networks, and how SNP co-occurrences contribute to disease pathogenesis in ulcerative colitis (UC). Using SNP profiles of 378 UC patients we map the regulator...
Triple negative breast cancer (TNBC) is currently associated with a lack of treatment options. Arsenic derivatives have shown antitumoral activity both in vitro and in vivo; however, their mode of action is not completely understood. In this work we evaluate the response to arsenate of the double positive MCF-7 breast cancer cell line as well as of...
Estimation of points of departure (PoDs) from high-throughput transcriptomic data (HTTr) represents a key step in the development of next-generation risk assessment (NGRA). Current approaches mainly rely on single key gene targets, which are constrained by the information currently available in the knowledge base and make interpretation challenging...
The elucidation of a compound's Mechanism of Action (MoA) is a challenging task in the drug discovery process, but it is important in order to rationalise phenotypic findings and to anticipate potential side-effects. Bioinformatic approaches, advances in machine learning techniques and the increasing deposition of high-throughput data in public dat...
Human induced pluripotent stem cell-derived cardiomyocytes have been established to detect dynamic calcium transients by fast kinetic fluorescence assays that provide insights into specific aspects of clinical cardiac activity. However, the precise derivation and use of waveform parameters to predict cardiac activity merit deeper investigation. In...
PROTACs (PROteolysis TArgeting Chimeras) use the ubiquitin-proteasome system to degrade a protein of interest for therapeutic benefit. Advances in targeted protein degradation technology have been remarkable with several molecules moving into clinical studies. However, robust routes to assess and better understand the safety risks of PROTACs need t...
Resistance to current therapies is common for pancreatic cancer and hence novel treatment options are urgently needed. In this work, we developed and validated a computational method to select synergistic compound combinations based on transcriptomic profiles from both the disease and compound side, combined with a pathway scoring system, which was...
Measurements of protein–ligand interactions have reproducibility limits due to experimental errors. Any model based on such assays will consequently have such unavoidable errors influencing its performance, which should ideally be factored into modelling and output predictions, such as the actual standard deviation of experimental measurements...
Differentiation therapy is attracting increasing interest in cancer as it can be more specific than conventional chemotherapy approaches, and it has offered new treatment options for some cancer types, such as treating acute promyelocytic leukaemia (APL) by retinoic acid. However, there is a pressing need to identify additional molecules which act...
Background: Ulcerative colitis (UC)-associated single nucleotide polymorphisms (SNPs) are mostly in non-coding regions of the genome. Because of that, it has been challenging to determine their role in the disease onset and severity. We have previously developed an integrative workflow (termed iSNP) to better understand how these SNPs are involved in...
In the context of small molecule property prediction, experimental errors are usually a neglected aspect during model generation. The main caveat to binary classification approaches is that they weight minority cases close to the threshold boundary equivalently in distinguishing between activity classes. For example, a pXC50 activity value of 5.1...
The understanding of the mechanism-of-action (MoA) of compounds and the prediction of potential drug targets play an important role in small-molecule drug discovery. The aim of this work was to compare chemical and cell morphology information for bioactivity prediction. The comparison was performed using bioactivity data from the ExCAPE database, i...
Enhanced/prolonged cAMP signalling has been suggested as a suppressor of cancer proliferation. Interestingly, two key modulators that elevate cAMP, the A2A receptor (A2AR) and phosphodiesterase 10A (PDE10A), are differentially co-expressed in various types of non-small cell lung cancer (NSCLC) cell lines. Thus, finding dual-target compounds, which ar...
Shexiang Baoxin Pill (SBP) is an oral formulation of Chinese materia medica for the treatment of angina pectoris. It displays pleiotropic roles in protecting the cardiovascular system. However, the mode of action of SBP in promoting angiogenesis, and in particular the synergy between its constituents is currently not fully understood. The combinati...
The Traditional Chinese Medicine (TCM) formulation Shexiang Baoxin Pill (SBP) is commonly used in the treatment of coronary heart disease (CHD) in East Asia and is regarded as promoting the regulation of angiogenesis and improving endothelial function. SBP comprises seven TCM materials; however, the interactions of their effects in a biological syst...
Adverse drug reactions (ADRs) are undesired effects of medicines that can harm patients and are a significant source of attrition in drug development. ADRs are anticipated by routinely screening drugs against secondary pharmacology protein panels. However, there is still a lack of quantitative information on the links between these off-target prote...
To improve our ability to extrapolate preclinical toxicity to humans, there is a need to understand and quantify the concordance of adverse events (AEs) between animal models and clinical studies. In the present work, we discovered 3011 statistically significant associations between preclinical and clinical AEs caused by drugs reported in the Pharm...
In the context of bioactivity prediction, the question of how to calibrate a score produced by a machine learning method into a probability of binding to a protein target is not yet satisfactorily addressed. In this study, we compared the performance of three such methods, namely Platt Scaling (PS), Isotonic Regression (IR) and Venn-ABERS Predictor...