Alex Michael Clark
Molecular Materials Informatics · Research & Development

Doctor of Philosophy

About

104
Publications
10,622
Reads
2,052
Citations
Additional affiliations
January 2004 - March 2010
Chemical Computing Group Inc.
Position
  • Researcher
August 2001 - December 2003
IntelliChem
Position
  • Computational Scientist
January 1992 - July 1999
University of Auckland
Position
  • Postgrad & Undergrad

Publications (104)
Article
Full-text available
Substances of unknown or variable composition, complex reaction products, or biological materials (UVCBs) are over 70 000 "complex" chemical mixtures produced and used at significant levels worldwide. Due to their unknown or variable composition, applying chemical assessments originally developed for individual compounds to UVCBs is challenging, wh...
Article
Full-text available
Chemical mixtures have recently come to the attention of open standards and data structures for capturing machine-readable descriptions for informatics uses. At the present time, essentially all transmission of information about mixtures is done using short text descriptions that are readable only by trained scientists, and there are no accessible...
Chapter
In this article, we review the disease, biology, and biochemistry of kinetoplastids, as well as the new drugs and drug candidates that have entered the clinic in the last decade. We also describe examples of the preclinical exploration of small molecules against various protein targets (e.g. cysteine proteases, the proteasome, and tubulin), as well...
Chapter
Drug discovery requires the simultaneous optimization of many properties such as bioactivity, absorption, distribution, metabolism, excretion, toxicity, and the underlying physicochemical properties. The ability to satisfy many requirements at once is termed multiobjective optimization, and we will discuss the importance of research in this area fo...
Article
Full-text available
Background: Humans are exposed to tens of thousands of chemical substances that need to be assessed for their potential toxicity. Acute systemic toxicity testing serves as the basis for regulatory hazard classification, labeling, and risk management. However, it is cost- and time-prohibitive to evaluate all new and existing chemicals using traditi...
Poster
Full-text available
Endocrine disruption is a major focus of toxicology research, and thus human estrogen and androgen receptors are key targets of interest. Downstream effects of receptor activation are difficult to anticipate without expensive, time-consuming in vitro and in vivo testing, so the Environmental Protection Agency (EPA) has prioritized alternative metho...
Presentation
Full-text available
Presentation delivered at the 2019 Society of Environmental Toxicology and Chemistry meeting in Toronto on November 4th.
Article
Full-text available
We describe a file format that is designed to represent mixtures of compounds in a way that is fully machine readable. This Mixfile format is intended to fill the same role for substances that are composed of multiple components as the venerable Molfile does for specifying individual structures. This much-needed data structure is intended to replace...
Article
A variety of machine learning methods such as naive Bayesian, support vector machines and more recently deep neural networks are demonstrating their utility for drug discovery and development. These leverage the generally bigger datasets created from high-throughput screening data and allow prediction of bioactivities for targets and molecular prop...
Article
One potential source of new antibacterials is through probing existing chemical libraries for copper-dependent inhibitors (CDIs), i.e., molecules with antibiotic activity only in the presence of copper. Recently, we demonstrated that previously unknown staphylococcal CDIs were frequently present in a small pilot screen. Here, we report the outcome...
Article
Full-text available
The human immunodeficiency virus (HIV) causes over a million deaths every year and has a huge economic impact in many countries. The first class of drugs approved was nucleoside reverse transcriptase inhibitors. A newer generation of reverse transcriptase inhibitors has become susceptible to drug-resistant strains of HIV, and hence alternatives a...
Article
We have recently stressed the need to scale and “industrialize” rare disease drug discovery. Finding information on compounds relevant to rare diseases across the hundreds of available databases is a complex challenge, even for experts, and the linkage between targets and data is often non-existent. Disease-associated targets are only identified aft...
Poster
We have recently stressed the need to scale and "industrialize" rare disease drug discovery. Finding information on compounds relevant to rare diseases across the hundreds of available databases is a complex challenge, even for experts, and the linkage between targets and data is often non-existent. Disease-associated targets are only identified aft...
Article
Full-text available
We have previously described the first Bayesian machine learning models from FDA-approved drug screens, for identifying compounds active against the Ebola virus (EBOV). These models led to the identification of three active molecules in vitro: tilorone, pyronaridine, and quinacrine. A follow-up study demonstrated that one of these compounds, tiloro...
Article
Organic cation transporter (OCT) 2 mediates the entry step for organic cation secretion by renal proximal tubule cells and is a site of unwanted drug-drug interactions (DDIs). But reliance on decision tree-based predictions of DDIs at OCT2 that depend on IC50 values can be suspect because they can be influenced by choice of transported substrate; f...
Poster
Full-text available
An overview of MegaTox, a subset of Assay Central focused on ADME/Tox models, presented at the Chemical Toxicology Division poster session on 08/21/2018.
Poster
Full-text available
An overview of the work done for schistosomiasis with Assay Central, presented at the Medicinal Chemistry Division poster session on 08/19/2018.
Poster
Full-text available
An overview of the work done for Mycobacterium abscessus with Assay Central, presented at the Medicinal Chemistry Division poster session on 08/19/2018.
Presentation
Full-text available
A PDF of the oral presentation given at Computers in Chemistry Division session on 08/23/2018, detailing the efforts to use machine learning for HIV whole cell and targets with an NIAID database.
Poster
Full-text available
An overview of the published work done for tuberculosis with Assay Central and other machine learning methods (DOI 10.1021/acs.molpharmaceut.8b00083), presented at the Medicinal Chemistry Division poster session on 08/19/2018.
Article
Many chemicals that disrupt endocrine function have been linked to a variety of adverse biological outcomes. However, screening for endocrine disruption using in vitro or in vivo approaches is costly and time-consuming. Computational methods, e.g. Quantitative Structure-Activity Relationship models, have become more reliable due to bigger training...
Preprint
The organic cation transporter OCT2 mediates the entry step for organic cation secretion by renal proximal tubule cells and is a site of unwanted drug-drug interactions (DDIs). But reliance on decision tree-based predictions of DDIs at OCT2 that depend on IC50 values can be suspect because they can be influenced by choice of transported substrate;...
Article
Tuberculosis is a global health dilemma. In 2016, the WHO reported 10.4 million incidences and 1.7 million deaths. The need to develop new treatments for those infected with Mycobacterium tuberculosis (Mtb) has led to many large-scale phenotypic screens and many thousands of new active compounds identified in vitro. However, with limited funding, e...
Chapter
We are now seeing the benefit of investments made over the last decade in high-throughput screening (HTS) that is resulting in large structure-activity datasets entering public and open databases such as ChEMBL and PubChem. The growth of academic HTS centers and the increasing move to academia for early stage drug discovery suggests a gre...
Poster
Full-text available
NICEATM curated a large training dataset for acute rat oral toxicity and released it for international collaboration to build a consensus model from all participants. This poster outlines my submission.
Chapter
This chapter explores machine learning algorithms and makes them accessible for seeding drug discovery projects. It explores the authors' novel data pruning strategy when constructing Bayesian models to predict other types of properties. The chapter summarizes the application of the machine learning methods to toxicology datasets and transporters....
Chapter
This chapter outlines some of the efforts to make software accessible for cheminformatics as well as the most recent efforts, which aim to address the lack of accessibility of public screening datasets for machine learning models. Chemistry apps for mobile phones and tablets are available for a variety of workflows that involve chemical structures....
Presentation
Full-text available
Public sources of open data from repositories like ChEMBL or PubChem represent Big Data and an ideal starting point for drug discovery efforts. However, this manually curated data is not in a form that is immediately accessible for computational model building. The research presented herein details efforts to streamline the neglected disease drug d...
Presentation
Full-text available
Over 7 million people in Latin America are infected with Trypanosoma cruzi (T. cruzi), the eukaryotic parasite that gives rise to Chagas disease. Chagas disease has also begun to gain a foothold as an expanding infection in the United States where an estimated 300,000 people may be infected. The cost of treatment in the United States alone is estim...
Article
Full-text available
The search for small molecule inhibitors of Ebola virus (EBOV) has led to several high-throughput screens over the past 3 years. These have identified a range of FDA-approved active pharmaceutical ingredients (APIs) with anti-EBOV activity in vitro, several of which are also active in a mouse infection model. There are millions of additional com...
Chapter
Full-text available
The past decade has seen an increased availability of databases with both molecular structures and bioactivity (e.g. binding affinity data) curated from publications or submitted by high throughput screening centers. Early collections of curated data, such as BindingDB or ChemBank, were initially eclipsed by PubChem, ChEMBL and commercial literatur...
Article
Neglected disease drug discovery is generally poorly funded compared with major diseases and hence there is an increasing focus on collaboration and precompetitive efforts such as public–private partnerships (PPPs). The More Medicines for Tuberculosis (MM4TB) project is one such collaboration funded by the EU with the goal of discovering new drugs...
Chapter
Mobile apps are being used in many areas of business productivity as well as in science in general and are normally focused on one task or provide data in a narrow area (e.g., periodic tables). Mobile apps also represent an alternative approach to providing green chemistry information as well as concepts to scientists. We describe thre...
Article
Full-text available
This project will develop a way to connect, in real time, globally disparate researchers who are doing similar science so that they can work better and faster towards the development of new medicines. The scientific literature already fulfills the role of notifying researchers about work that has been done, and social media has recently evolved to...
Article
The renewed urgency to develop new treatments for Mycobacterium tuberculosis (Mtb) infection has resulted in large-scale phenotypic screening and thousands of new active compounds in vitro. The next challenge is to identify candidates to pursue in a mouse in vivo efficacy model as a step to predicting clinical efficacy. Previously, we have analyzed...
Article
Full-text available
Annotation of bioassay protocols using semantic web vocabulary is a way to make experiment descriptions machine-readable. Protocols are communicated using concise scientific English, which precludes most kinds of analysis by software algorithms. Given the availability of a sufficiently expressive ontology, some or all of the pertinent information c...
Preprint
Full-text available
Annotation of bioassay protocols using semantic web vocabulary is a way to make experiment descriptions machine-readable. Protocols are communicated using concise scientific English, which precludes most kinds of analysis by software algorithms. Given the availability of a sufficiently expressive ontology, some or all of the pertinent information c...
Preprint
Full-text available
Annotation of bioassay protocols using semantic web vocabulary is a way to make experiment descriptions machine-readable. Protocols are communicated using concise scientific English, which precludes most kinds of analysis by software algorithms. Given the availability of a sufficiently expressive ontology, some or all of the pertinent information c...
Article
Bayesian models constructed from structure-derived fingerprints have been a popular and useful method for drug discovery research when applied to bioactivity measurements that can be effectively classified as active or inactive. The results can be used to rank candidate structures according to their probability of activity, and this ranking benefit...
Article
The search for small molecule inhibitors of Ebola virus (EBOV) has led to several high-throughput screens over the past 3 years. These have identified a range of FDA-approved active pharmaceutical ingredients (APIs) with anti-EBOV activity in vitro, several of which are also active in a mouse infection model. There are millions of additional com...
Article
Full-text available
The current rise in the use of open lab notebook techniques means that there are an increasing number of scientists who make chemical information freely and openly available to the entire community as a series of micropublications that are released shortly after the conclusion of each experiment. We propose that this trend be accompanied by a thoro...
Article
Full-text available
The search for small molecule inhibitors of Ebola virus (EBOV) has led to several high-throughput screens over the past 3 years. These have identified a range of FDA-approved active pharmaceutical ingredients (APIs) with anti-EBOV activity in vitro, several of which are also active in a mouse infection model. There are millions of additional com...
Article
Full-text available
The past decade has seen increased numbers of studies publishing ligand-based computational models for drug transporters. Although generally using small experimental datasets, these models can provide insights into structure activity relationships for the transporter. In addition, such models have helped to identify new compounds as substrates or i...
Article
In an associated paper, we described a reference implementation of Laplacian-corrected naïve Bayesian model building using ECFP and FCFP-type fingerprints. As a follow-up to this work, we have now undertaken a large scale validation study in order to ensure that the technique generalizes to a broad variety of drug discovery datasets. To a...
Article
Hundreds of ADME/Tox models have been described in the literature in the last decade, most of which are inaccessible to anyone but the authors. Public accessibility is also an issue with computational models for bioactivity, and a major challenge limiting drug discovery still remains the ability to share such models. We d...
Article
The availability of structures and linked bioactivity data in databases is powerfully enabling for drug discovery and chemical biology. However, we now review some confounding issues with the divergent expansions of public and commercial sources of chemical structures. These are not only associated with expanding patent extraction but also increasi...
Article
Full-text available
Bioinformatics and computer aided drug design rely on the curation of a large number of protocols for biological assays that measure the ability of potential drugs to achieve a therapeutic effect. These assay protocols are generally published by scientists in the form of plain text, which needs to be more precisely annotated in order to be useful t...
Article
This chapter explores some of the ways that cheminformatics software is adapting to the overall industry transition toward consumer oriented mobile devices and cloud computing. While scientific software in general lags the trend due to high complexity and narrowly defined market segments, a significant amount of technical progress has been made. Mo...
Article
Full-text available
Background: We recently developed a freely available mobile app (TB Mobile) for both iOS and Android platforms that displays Mycobacterium tuberculosis (Mtb) active molecule structures and their targets with links to associated data. The app was developed to make target information available to as large an audience as possible. Results: We now repor...
Article
Full-text available
Over the past decade we have seen a growth in the provision of chemistry data and cheminformatics tools as either free websites or software as a service commercial offerings. These have transformed how we find molecule-related data and use such tools in our research. There have also been efforts to improve collaboration between researchers either o...
Conference Paper
Background / Purpose: Here is an idea for a mobile app for rare diseases based on the previously created Open Drug Discovery Teams (ODDT) app. Main conclusion: The benefit of having open source rare disease drug discovery is that it will increase visibility for researchers, provide less repetition of work and thus much faster progress towards...
Article
Selecting and translating in vitro leads for a disease into molecules with in vivo activity in an animal model of the disease is a challenge that takes considerable time and money. As an example, recent years have seen whole-cell phenotypic screens of millions of compounds yielding over 1500 inhibitors of Mycobacterium tuberculosis (Mtb). These mus...
Article
Full-text available
Background: An increasing number of researchers are focused on strategies for developing inhibitors of Mycobacterium tuberculosis (Mtb) as tuberculosis (TB) drugs. Results: In order to learn from prior work we have collated information on molecules screened versus Mtb and their targets which has been made available in the Collaborative Drug Discover...
Article
The creation of 2D molecular structure diagrams that make full use of the capabilities of modern display systems, using only input data expressed in file formats used for cheminformatics, is a complex task that requires a number of additional algorithms. Assuming that atom positions have been well chosen, the rendering engine is required to microma...
Article
We are perhaps at a turning point for making cheminformatics accessible to scientists who are not computational chemists. The proliferation of mobile devices has seen the development of software or 'apps' that can be used for sophisticated chemistry workflows. These apps can offer capabilities to the practicing chemist that are approaching those of...
Article
Green Chemistry related information is generally proprietary, and papers on the topic are commonly behind pay walls that limit their accessibility. Several new mobile applications (apps) have been recently released for the Apple iOS platform, which incorporate green chemistry concepts. Because of the large number of people who now own a mobile devi...
Article
Drug discovery is shifting focus from industry to outside partners and, in the process, creating new bottlenecks. Technologies like high throughput screening (HTS) have moved to a larger number of academic and institutional laboratories in the USA, with little coordination or consideration of the outputs and creating a translational gap. Although t...