Predictive Modeling of Chemical Hazard by Integrating Numerical Descriptors of Chemical Structures and Short-term Toxicity Assay Data

Department of Environmental Sciences and Engineering, University of North Carolina, Chapel Hill, North Carolina 27599, USA.
Toxicological Sciences (Impact Factor: 3.85). 03/2012; 127(1):1-9. DOI: 10.1093/toxsci/kfs095
Source: PubMed


Quantitative structure-activity relationship (QSAR) models are widely used for in silico prediction of the in vivo toxicity of drug candidates or environmental chemicals, adding value to candidate selection in drug development or to the search for less hazardous and more sustainable alternatives to chemicals in commerce. The development of traditional QSAR models is enabled by numerical descriptors that represent inherent chemical properties and can be easily computed for any number of molecules; however, traditional QSAR models often have limited predictive power because of the lack of data and the complexity of in vivo endpoints. Although it has indeed been difficult in the past to obtain experimentally derived toxicity data on a large number of chemicals, the results of quantitative in vitro screening of thousands of environmental chemicals in hundreds of experimental systems are now available and continue to accumulate. In addition, publicly accessible toxicogenomics data collected on hundreds of chemicals provide another dimension of molecular information that is potentially useful for predictive toxicity modeling. These new characteristics of molecular bioactivity arising from short-term biological assays, i.e., in vitro screening and/or in vivo toxicogenomics data, can now be exploited in combination with chemical structural information to generate hybrid QSAR-like quantitative models to predict human toxicity and carcinogenicity. Using several case studies, we illustrate the benefits of a hybrid modeling approach, namely improvements in the accuracy of models, enhanced interpretation of the most predictive features, and an expanded applicability domain for wider chemical space coverage.
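As a rough illustration of what "combining" chemical structural information with short-term assay data can look like in practice, the sketch below builds a hybrid descriptor vector for each compound by scaling each descriptor block and concatenating them. The abstract does not specify how the blocks are merged, so simple concatenation after z-scoring is an assumption made here; the function names and all numeric values are hypothetical toy data.

```python
from statistics import mean, pstdev


def zscore_columns(rows):
    """Scale every column to zero mean and unit variance.

    Constant columns carry no information and are mapped to 0.0.
    """
    scaled_columns = []
    for column in zip(*rows):
        mu, sigma = mean(column), pstdev(column)
        scaled_columns.append(
            [(value - mu) / sigma if sigma > 0 else 0.0 for value in column]
        )
    # Transpose back to one row per compound.
    return [list(row) for row in zip(*scaled_columns)]


def hybrid_descriptors(chemical, biological):
    """Concatenate scaled chemical and biological blocks, compound by compound.

    Assumption: both blocks list the same compounds in the same order.
    """
    if len(chemical) != len(biological):
        raise ValueError("both blocks must describe the same compounds")
    chem_scaled = zscore_columns(chemical)
    bio_scaled = zscore_columns(biological)
    return [c + b for c, b in zip(chem_scaled, bio_scaled)]


# Toy data (made up): two chemical descriptors and one assay readout
# for three hypothetical compounds.
chemical = [[180.2, 1.2], [46.1, -0.3], [94.1, 1.5]]
biological = [[0.10], [0.90], [0.40]]

hybrid = hybrid_descriptors(chemical, biological)
```

Scaling each block before concatenation keeps descriptors with large numeric ranges from dominating distance-based or regularized learners; the resulting rows can then be passed to any standard classifier or regressor.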



Available from: Alexander Tropsha, Jun 10, 2015
    • "The combination of catalogued data along with HTS allows for the analysis of short-term effects and addresses the question as to which oil dispersant(s) would be most eco-friendly in this environment (Judson et al. 2010). Additional examples of Tier 1 profiling include the use of HTS assays for screening endocrine disruptors (Reif et al. 2010), and the use of in silico methods to screen and prioritize large numbers of chemicals (Rusyn et al. 2012; Wang N et al. 2011, 2012). "
    ABSTRACT: In 2011, the U.S. Environmental Protection Agency initiated the NexGen project to develop a new paradigm for the next generation of risk science. The NexGen framework was built on three cornerstones: the availability of new data on toxicity pathways made possible by fundamental advances in basic biology and toxicological science; the incorporation of a population health perspective that recognizes that most adverse health outcomes involve multiple determinants; and a renewed focus on new risk assessment methodologies designed to better inform risk management decision making. The NexGen framework has three phases. Phase I (objectives) focuses on problem formulation and scoping, taking into account the risk context and the range of available risk management decision making options. Phase II (risk assessment) seeks to identify critical toxicity pathway perturbations using new toxicity testing tools and technologies, and to better characterize risks and uncertainties using advanced risk assessment methodologies. A blueprint for pathway-based toxicity testing was provided by the 2007 U.S. National Research Council (NRC) report, Toxicity Testing in the 21st Century: A Vision and a Strategy; guidance on new risk assessment methods is provided by the 2009 NRC report, Science and Decisions, Advancing Risk Assessment. Phase III (risk management) involves the development of evidence-based population health risk management strategies of a regulatory, economic, advisory, community-based, or technological nature, using sound principles of risk management decision making. Analysis of a series of case-study prototypes indicated that many aspects of the NexGen framework are already beginning to be adopted in practice.
    Full-text · Article · Apr 2014 · Environmental Health Perspectives
    • "QSAR) or purely biological (e.g. biomarkers) modeling strategies, the hybrid modeling approach showed advantages in performance, prediction coverage and model interpretation (Rusyn et al., 2012). "
    ABSTRACT: Drug-induced liver injury (DILI) is a major adverse drug reaction that accounts for one-third of post-marketing drug withdrawals. Several classifiers for human hepatotoxicity using chemical descriptors with limited prediction accuracies have been published. In this study, we developed predictive in silico models for DILI based on a set of 156 DILI-positive and 136 DILI-negative compounds. First, models based on chemical descriptors (CDK, Dragon and MOE) and on in vitro cell-imaging endpoints [human hepatocyte imaging assay technology (HIAT) descriptors] were built using random forest (RF) and a five-fold cross-validation procedure. Then three hybrid models were built, each using HIAT descriptors and a single type of chemical descriptor. Generally, the models based only on chemical descriptors were poor, with a correct classification rate (CCR) around 0.60 when the default threshold value (i.e. threshold = 0.50) was used. The hybrid models afforded a CCR of 0.73 with a specificity of 0.74 and a better true positive rate (sensitivity of 0.71), which is crucial in drug toxicity screening for the purpose of patient safety. The benefit of hybrid models was even more pronounced when stricter classification thresholds were employed (e.g. the CCR would be 0.83 when double thresholds (non-toxic < 0.40 and toxic > 0.60) were used for the hybrid model). We have developed rigorously validated hybrid models which can be used in virtual screening of lead compounds with potential hepatotoxicity. Our study also showed that chemical structure and in vitro biological data can be complementary in enhancing the prediction accuracy of human hepatotoxicity and can afford rational mechanistic interpretation. Copyright © 2013 John Wiley & Sons, Ltd.
    Full-text · Article · Mar 2014 · Journal of Applied Toxicology
    • "Therefore, screening a large number of chemicals and understanding their complex molecular mechanism of toxicity on experimental bacterial cell line models is required. To assist such multidimensional experimental work, reliable simulation/theoretical analysis should be performed (Rusyn et al., 2012; Kavlock et al., 2008; Nigsch et al., 2009; Rusyn and Daston, 2010). A few in silico models based on quantitative structure toxicity/activity/property (QSTR/QSAR/QSPR) were developed to predict toxicity of chemicals such as ocular toxicity, cytotoxicity, and fish toxicity (Tugcu et al., 2012; Solimeo et al., 2012; Kar and Roy, 2013). "
    ABSTRACT: Biodiversity deprivation can affect functions and services of the ecosystem. Changes in biodiversity alter ecosystem processes and change the resilience of ecosystems to ecological changes. Bacterial communities are the main form of biomass in the ecosystem and one of the largest populations on the planet. Bacterial communities provide important services to biodiversity. They break down pollutants, municipal waste and ingested food, and they are the primary route for recycling organic matter to plants and other autotrophs, converting inorganic matter into new biological tissue using sunlight, and managing the energy crisis through the use of biofuel. In the present study, computational chemistry and statistical modeling have been used to develop mathematical equations which can be applied to calculate the toxicity of new or unknown chemicals, biofuels and metabolites in Escherichia coli. 2D and 3D descriptors were generated from the molecular structures of the compounds, and mathematical models were developed using the genetic function approximation followed by multiple linear regression (GFA-MLR) method. Model validity was checked through defined internal (R² = 0.751 and Q² = 0.711) and external (R²pred = 0.773) statistical parameters. Molecular features responsible for toxicity were also assessed through a 3D toxicophore study. The toxicophore-based model was validated (R = 0.785) using qualitative statistical metrics and a randomization test (Fischer validation).
    Full-text · Article · Nov 2013 · Toxicology in Vitro
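The hepatotoxicity abstract quoted above reports a correct classification rate (CCR) under a single threshold of 0.50 and under double thresholds (non-toxic < 0.40, toxic > 0.60). A minimal sketch of that calculation follows, assuming CCR is the mean of sensitivity and specificity (consistent with the reported 0.73 from 0.71 and 0.74) and that compounds scored between the two thresholds are left unclassified. The function names, labels and scores are hypothetical toy values, not data from the cited study.

```python
def classify(score, low, high):
    """Return 1 (toxic), 0 (non-toxic), or None (no call)."""
    # With a single threshold (low == high) a score exactly on the
    # threshold is called toxic so that every compound gets a label.
    if score > high or (low == high and score == high):
        return 1
    if score < low:
        return 0
    return None  # inside the uncertainty band


def ccr_with_coverage(y_true, scores, low=0.5, high=0.5):
    """Compute (CCR, coverage) for predicted toxicity scores.

    CCR is taken as (sensitivity + specificity) / 2 over the compounds
    that received a call; coverage is the fraction that received one.
    """
    tp = tn = fp = fn = 0
    for label, score in zip(y_true, scores):
        call = classify(score, low, high)
        if call is None:
            continue
        if label == 1:
            tp += call == 1
            fn += call == 0
        else:
            tn += call == 0
            fp += call == 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("each class needs at least one classified compound")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    coverage = (tp + tn + fp + fn) / len(y_true)
    return (sensitivity + specificity) / 2, coverage


# Toy labels (1 = toxic) and model scores, made up for illustration.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.55, 0.45, 0.1, 0.3, 0.45, 0.55]

single = ccr_with_coverage(y_true, scores)               # (0.75, 1.0)
double = ccr_with_coverage(y_true, scores, 0.40, 0.60)   # (1.0, 0.5)
```

In this toy case the double thresholds raise the CCR from 0.75 to 1.0 but classify only half of the compounds, which illustrates the trade-off between accuracy and coverage that stricter thresholds imply.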