Andreas Bender
University of Cambridge | Cam · Department of Chemistry

PhD

About

453
Publications
96,357
Reads
16,130
Citations
Introduction
Committed to developing new life science data analysis methods and their application in experimental, prospective settings, primarily related to chemical biology and drug discovery. Feel free to contact me, but please do so by direct email (I don't read my messages here in the system): https://www.ch.cam.ac.uk/person/ab454 Cheers, Andreas
Additional affiliations
January 2008 - April 2010
Leiden University
Position
  • Professor (Assistant)
January 2006 - December 2007
Novartis Institutes for BioMedical Research
Position
  • PostDoc Position
May 2010 - present
University of Cambridge
Position
  • Lecturer for Drug Design
Education
January 2003 - December 2005
University of Cambridge
Field of study
  • Cheminformatics / Molecular Similarity / Virtual Screening

Publications (453)
Article
Full-text available
Pathway analysis is an informative method for comparing and contrasting drug-induced gene expression in cellular systems. Here, we define the effects of the marine natural product fucoxanthin, separately and in combination with the prototypic phosphatidylinositol 3-kinase (PI3K) inhibitor LY-294002, on gene expression in a well-established human gl...
Preprint
Full-text available
Genome-wide association studies have pinpointed numerous susceptibility loci in complex diseases like chronic immune-mediated inflammatory disorders (IMIDs), yet their impact on pathomechanisms remains poorly understood. Genetic epistasis, low effect sizes, and predominance within non-coding genomic regions remain major challenges to the functional...
Article
Full-text available
Neural processes (NPs) are models for meta-learning which output uncertainty estimates. So far, most studies of NPs have focused on low-dimensional datasets of highly-correlated tasks. While these homogeneous datasets are useful for benchmarking, they may not be representative of realistic transfer learning. In particular, applications in scientifi...
Preprint
Generative chemical language models have demonstrated success in learning language-based molecular representations for de novo drug design. Here, we integrate structure-based drug design (SBDD) principles with chemical language models to present a modern hit-finding workflow to go from protein structure to novel small-molecule ligands, without a pr...
Preprint
Generative chemical language models have demonstrated success in learning language-based molecular representations for de novo drug design. Here, we integrate structure-based design principles with chemical language models to present a modern hit-finding workflow to go from protein structure to novel small-molecule ligands, without a priori knowled...
Article
Renal secretion plays an important role in excretion of drug from the kidney. Two major transporters known to be highly involved in renal secretion are MATE1/2-K and OCT2, the former of which is highly related to drug–drug interactions. Among published in silico models for MATE inhibitors, a previous model obtained a ROC-AUC value of 0.78 using hig...
Article
Drug-induced liver injury (DILI) has been a significant challenge in drug discovery, often leading to clinical trial failures and necessitating drug withdrawals. Over the last decade, the existing suite of in vitro proxy-DILI assays has generally improved at identifying compounds with hepatotoxicity. However, there is considerable interest in enhan...
Preprint
Full-text available
Three-dimensional (3D) deep molecular generative models offer the advantage of goal-directed generation based on 3D-dependent properties, such as binding affinity for structure-based design within binding pockets. Traditional benchmarks created to evaluate SMILES or molecular graph generators, such as GuacaMol or MOSES, are limited to evaluate 3D...
Preprint
Full-text available
Recent advances in machine learning methods for materials science have significantly enhanced accurate predictions of the properties of novel materials. Here, we explore whether these advances can be adapted to drug discovery by addressing the problem of prospective validation - the assessment of the performance of a method on out-of-distribution d...
Preprint
Full-text available
Adverse drug reactions (ADRs) are a major source of concern in the development of novel pharmaceuticals. ADRs may be identified in the late stages of development or even after commercialization, which may lead to failure or discontinuation after spending enormous resources on candidate molecules. Thus, predicting ADRs early in the process could hel...
Article
Full-text available
Generative models are undergoing rapid research and application to de novo drug design. To facilitate their application and evaluation, we present MolScore. MolScore already contains many drug-design-relevant scoring functions commonly used in benchmarks such as molecular similarity, molecular docking, predictive models, synthesizability, and more...
Preprint
Full-text available
High-content image-based assays have fueled significant discoveries in the life sciences in the past decade (2013-2023), including novel insights into disease etiology, mechanism of action, new therapeutics, and toxicology predictions. Here, we systematically review the substantial methodological advancements and applications of Cell Painting. Adva...
Article
High-content image-based assays have fueled significant discoveries in the life sciences in the past decade (2013-2023), including novel insights into disease etiology, mechanism of action, new therapeutics, and toxicology predictions. Here, we systematically review the substantial methodological advancements and applications of Cell Painting. Adva...
Article
Recent findings show that drug combination therapy can increase efficacy, decrease drug resistance, and reduce drug side effects. Due to the enormous number of possibilities in the selection of drugs, it is clinically impossible to screen all available combinations. Fortunately, artificial intelligence has opened up new perspectives for solving thi...
Preprint
Full-text available
Drug exposure is a key contributor to the safety and efficacy of drugs. It can be defined using human pharmacokinetics (PK) parameters that affect the blood concentration profile of a drug, such as steady-state volume of distribution (VDss), total body clearance (CL), half-life (t½), fraction unbound in plasma (fu) and mean residence time (MRT). In...
Article
Drug-induced cardiotoxicity (DICT) is a major concern in drug development, accounting for 10–14% of postmarket withdrawals. In this study, we explored the capabilities of chemical and biological data to predict cardiotoxicity, using the recently released DICTrank data set from the United States FDA. We found that such data, including protein target...
Preprint
Full-text available
Drug-induced liver injury (DILI) presents a significant challenge in drug discovery, often leading to clinical trial failures and necessitating drug withdrawals. In this study, we introduce a novel method for DILI prediction that first predicts eleven proxy-DILI labels and then uses them as features in addition to chemical structural features to pr...
Article
Cell Painting assays generate morphological profiles that are versatile descriptors of biological systems and have been used to predict in vitro and in vivo drug effects. However, Cell Painting features extracted from classical software such as CellProfiler are based on statistical calculations and often not readily biologically interpretable. In t...
Article
Full-text available
Identifying bioactive conformations of small molecules is an essential process for virtual screening applications relying on three-dimensional structure such as molecular docking. For most small molecules, conformer generators retrieve at least one bioactive-like conformation, with an atomic root-mean-square deviation (ARMSD) lower than 1 Å, among...
Article
Full-text available
While a multitude of deep generative models have recently emerged, there exists no best practice for their practically relevant validation. On the one hand, novel de novo-generated molecules cannot be refuted by retrospective validation (so that this type of validation is biased); but on the other hand, prospective validation is expensive and then of...
Preprint
Full-text available
While a multitude of deep generative models have recently emerged, there exists no best practice for their practically relevant validation. On the one hand, novel de novo-generated molecules cannot be refuted by retrospective validation (so that this type of validation is biased); but on the other hand, prospective validation is expensive and then of...
Article
Full-text available
Background: Understanding the Mechanism of Action (MoA) of a compound is an often challenging but equally crucial aspect of drug discovery that can help improve both its efficacy and safety. Computational methods to aid MoA elucidation usually either aim to predict direct drug targets, or attempt to understand modulated downstream pathways or signal...
Preprint
Full-text available
Cell Painting assays generate morphological profiles that are versatile descriptors of biological systems and have been used to predict in vitro and in vivo drug effects. However, Cell Painting features are based on image statistics, and are, therefore, often not readily biologically interpretable. In this study, we introduce an approach that maps...
Preprint
Full-text available
MolScore is an open-source Python framework for scoring and evaluating molecules in the context of goal-directed generative models as used in de novo drug design. MolScore includes many relevant scoring functions for de novo drug design such as molecular similarity, docking software, predictive models, and synthesizability, as well as commonly used...
Preprint
Full-text available
While a multitude of deep generative models have recently emerged, there exists no best practice for their practically relevant validation. On the one hand, novel de novo-generated molecules cannot be refuted by retrospective validation (so that this type of validation is biased); but on the other hand, prospective validation is expensive and then of...
Article
Environmental factors such as exposure to ionizing radiation, certain environmental pollutants, and toxic chemicals are considered risk factors in the development of breast cancer. Triple-negative breast cancer (TNBC) is a molecular variant of breast cancer that lacks therapeutic targets such as progesterone receptor, estrogen receptor, and hum...
Article
Full-text available
The applicability domain of machine learning models trained on structural fingerprints for the prediction of biological endpoints is often limited by the lack of diversity of chemical space of the training data. In this work, we developed similarity-based merger models which combined the outputs of individual models trained on cell morphology (base...
Article
Full-text available
Pharmacokinetic (PK) parameters such as clearance (CL) and volume of distribution (Vd) have been the subject of previous in silico predictive models. However, having explicit information on the concentration-over-time profile can provide additional value, such as time above MIC or AUC, to understand both the efficacy and safety-related aspects...
Article
Full-text available
Background: Elucidating compound mechanism of action (MoA) is beneficial to drug discovery, but in practice often represents a significant challenge. Causal Reasoning approaches aim to address this situation by inferring dysregulated signalling proteins using transcriptomics data and biological networks; however, a comprehensive benchmarking of such...
Article
Full-text available
Uncontrolled angiogenesis is a common denominator underlying many deadly and debilitating diseases such as myocardial infarction, chronic wounds, cancer, and age-related macular degeneration. As the current range of FDA-approved angiogenesis-based medicines is far from meeting clinical demands, the vast reserve of natural products from traditional...
Article
Various sources of information can be used to better understand and predict compound activity and safety-related endpoints, including biological data such as gene expression and cell morphology. In this review, we first introduce types of chemical, in vitro and in vivo information that can be used to describe compounds and adverse effects. We then...
Article
Full-text available
Gene expression and cell morphology data are high-dimensional biological readouts of much recent interest for drug discovery. They are able to describe biological systems in different states (e.g., healthy and diseased), as well as biological systems before and after compound treatment, and they are hence useful for matching both spaces (e.g., for...
Article
Preclinical inter-species concordance can increase the predictivity of observations to the clinic, potentially reducing drug attrition caused by unforeseen adverse events. We quantified inter-species concordance of histopathological findings and target organ toxicities across four preclinical species in the eTOX database using likelihood ratios (LR...
Article
Virtual Control Groups (VCGs) based on Historical Control Data (HCD) in preclinical toxicity testing have the potential to reduce animal usage. As a case study we retrospectively analyzed the impact of replacing Concurrent Control Groups (CCGs) with VCGs on the treatment-relatedness of 28 selected histopathological findings reported in either rat o...
Article
Functional changes to cardiomyocytes are undesirable during drug discovery and identifying the inotropic effects of compounds is hence necessary to decrease the risk of cardiovascular adverse effects in the clinic. Recently, approaches leveraging calcium transients in human induced pluripotent stem cell-derived cardiomyocytes (hiPSC-CMs) have been...
Article
Drug discovery and development is a complex and costly process. Machine learning approaches are being investigated to help improve the effectiveness and speed of multiple stages of the drug discovery pipeline. Of these, those that use Knowledge Graphs (KG) have promise in many tasks, including drug repurposing, drug toxicity prediction and target g...
Preprint
Full-text available
Idiopathic pulmonary fibrosis (IPF) is a chronic lung disease, which affects around three million people worldwide and is characterized by impaired regeneration from recurrent injury to the alveolar epithelium resulting in progressive lung scarring. In this work, we target a cell transition in the differentiation of AT2 cells into mature AT1 cells...
Article
Full-text available
Mitochondrial toxicity is an important safety endpoint in drug discovery. Models based solely on chemical structure for predicting mitochondrial toxicity are currently limited in accuracy and applicability domain to the chemical space of the training compounds. In this work, we aimed to utilize both -omics and chemical data to push beyond the state...
Article
Trefoil factor 3 (TFF3) is a secreted protein with an established oncogenic function and a highly significant association with clinical progression of various human malignancies. Herein, a novel small molecule that specifically targets TFF3 homodimeric functions was identified. Utilizing the concept of reversible covalent interaction, 2-amino-4-(4-...
Preprint
Full-text available
The applicability domain of machine learning models trained on structural fingerprints for the prediction of biological endpoints is often limited by the diversity of chemical space of the training data. In this work, we developed “similarity-based merger models” which combined the output of individual models trained on cell morphology (based on Ce...
Preprint
Background: Understanding the Mechanism of Action (MoA) of a compound is an often challenging but equally crucial aspect of drug discovery that can help improve both its efficacy and safety. Computational methods to aid MoA elucidation usually either aim to predict direct drug targets, or attempt to understand modulated downstream pathways or signal...
Article
Full-text available
PROteolysis TArgeting Chimeras (PROTACs) use the ubiquitin-proteasome system to degrade a protein of interest for therapeutic benefit. Advances made in targeted protein degradation technology have been remarkable, with several molecules having moved into clinical studies. However, robust routes to assess and better understand the safety risks of PR...
Article
Machine learning (ML) promises to tackle the grand challenges in chemistry and speed up the generation, improvement and/or ordering of research hypotheses. Despite the overarching applicability of ML workflows, one usually finds diverse evaluation study designs. The current heterogeneity in evaluation techniques and metrics leads to difficulty in (...
Article
Full-text available
We describe a precision medicine workflow, the integrated single nucleotide polymorphism network platform (iSNP), designed to determine the mechanisms by which SNPs affect cellular regulatory networks, and how SNP co-occurrences contribute to disease pathogenesis in ulcerative colitis (UC). Using SNP profiles of 378 UC patients we map the regulator...
Article
Full-text available
Triple negative breast cancer (TNBC) is currently associated with a lack of treatment options. Arsenic derivatives have shown antitumoral activity both in vitro and in vivo; however, their mode of action is not completely understood. In this work we evaluate the response to arsenate of the double positive MCF-7 breast cancer cell line as well as of...
Article
Full-text available
Estimation of points of departure (PoDs) from high-throughput transcriptomic data (HTTr) represents a key step in the development of next-generation risk assessment (NGRA). Current approaches mainly rely on single key gene targets, which are constrained by the information currently available in the knowledge base and make interpretation challenging...
Article
Full-text available
The elucidation of a compound's Mechanism of Action (MoA) is a challenging task in the drug discovery process, but it is important in order to rationalise phenotypic findings and to anticipate potential side-effects. Bioinformatic approaches, advances in machine learning techniques and the increasing deposition of high-throughput data in public dat...
Article
Full-text available
Human induced pluripotent stem cell-derived cardiomyocytes have been established to detect dynamic calcium transients by fast kinetic fluorescence assays that provide insights into specific aspects of clinical cardiac activity. However, the precise derivation and use of waveform parameters to predict cardiac activity merit deeper investigation. In...
Preprint
Full-text available
PROTACs (PROteolysis TArgeting Chimeras) use the ubiquitin-proteasome system to degrade a protein of interest for therapeutic benefit. Advances in targeted protein degradation technology have been remarkable with several molecules moving into clinical studies. However, robust routes to assess and better understand the safety risks of PROTACs need t...
Preprint
Full-text available
Mitochondrial toxicity is an important safety endpoint in drug discovery. Models based solely on chemical structure for predicting mitochondrial toxicity are currently limited in accuracy and applicability domain to the chemical space of the training compounds. In this work, we aimed to utilize both -omics and chemical data to push beyond the state...
Preprint
Full-text available
Background: Elucidating compound mechanism of action (MoA) is beneficial to drug discovery, but in practice often represents a significant challenge. Causal Reasoning approaches aim to address this situation by inferring dysregulated signalling proteins using transcriptomics data and biological networks; however, a comprehensive benchmarking of such...
Article
Full-text available
Resistance to current therapies is common for pancreatic cancer and hence novel treatment options are urgently needed. In this work, we developed and validated a computational method to select synergistic compound combinations based on transcriptomic profiles from both the disease and compound side, combined with a pathway scoring system, which was...
Article
Full-text available
Measurements of protein–ligand interactions have reproducibility limits due to experimental errors. Any model based on such assays will consequently have such unavoidable errors influencing its performance, which should ideally be factored into modelling and output predictions, such as the actual standard deviation of experimental measurements...
Article
Full-text available
Differentiation therapy is attracting increasing interest in cancer as it can be more specific than conventional chemotherapy approaches, and it has offered new treatment options for some cancer types, such as treating acute promyelocytic leukaemia (APL) by retinoic acid. However, there is a pressing need to identify additional molecules which act...
Conference Paper
Background: Ulcerative Colitis (UC) associated single nucleotide polymorphisms (SNPs) are mostly in non-coding regions of the genome. Because of that, it has been challenging to determine their role in the disease onset and severity. We have previously developed an integrative workflow (termed iSNP) to understand better how these SNPs are involved in...
Preprint
In the context of small molecule property prediction, experimental errors are usually a neglected aspect during model generation. The main caveat to binary classification approaches is that they weight minority cases close to the threshold boundary equivalently in distinguishing between activity classes. For example, a pXC50 activity value of 5.1...
Article
The understanding of the mechanism-of-action (MoA) of compounds and the prediction of potential drug targets play an important role in small-molecule drug discovery. The aim of this work was to compare chemical and cell morphology information for bioactivity prediction. The comparison was performed using bioactivity data from the ExCAPE database, i...
Article
Enhanced/prolonged cAMP signalling has been suggested as a suppressor of cancer proliferation. Interestingly, two key modulators that elevate cAMP, the A2A receptor (A2AR) and phosphodiesterase 10A (PDE10A), are differentially co-expressed in various types of non-small cell lung cancer (NSCLC) cell-lines. Thus, finding dual-target compounds, which ar...
Article
Full-text available
Shexiang Baoxin Pill (SBP) is an oral formulation of Chinese materia medica for the treatment of angina pectoris. It displays pleiotropic roles in protecting the cardiovascular system. However, the mode of action of SBP in promoting angiogenesis, and in particular the synergy between its constituents is currently not fully understood. The combinati...
Article
The Traditional Chinese Medicine (TCM) formulation Shexiang Baoxin Pill (SBP) is commonly used in the treatment of coronary heart disease (CHD) in East Asia and is regarded as promoting the regulation of angiogenesis and improving endothelial function. SBP comprises seven TCM materials; however, the interactions of their effects in a biological syst...
Article
Adverse drug reactions (ADRs) are undesired effects of medicines that can harm patients and are a significant source of attrition in drug development. ADRs are anticipated by routinely screening drugs against secondary pharmacology protein panels. However, there is still a lack of quantitative information on the links between these off-target prote...
Article
To improve our ability to extrapolate preclinical toxicity to humans, there is a need to understand and quantify the concordance of adverse events (AEs) between animal models and clinical studies. In the present work, we discovered 3011 statistically significant associations between preclinical and clinical AEs caused by drugs reported in the Pharm...
Article
In the context of bioactivity prediction, the question of how to calibrate a score produced by a machine learning method into a probability of binding to a protein target is not yet satisfactorily addressed. In this study, we compared the performance of three such methods, namely Platt Scaling (PS), Isotonic Regression (IR) and Venn-ABERS Predictor...