ChemSpider: An Online Chemical Information Resource

Journal of Chemical Education (Impact Factor: 1). 08/2010; 87(11). DOI: 10.1021/ed100697w

ABSTRACT: ChemSpider is a free, online chemical database offering access to physical and chemical properties, molecular structure, spectral data, synthetic methods, safety information, and nomenclature for almost 25 million unique chemical compounds sourced from and linked to almost 400 separate data sources on the Web. ChemSpider is quickly becoming the primary chemistry Internet portal, and it can be very useful for both chemical teaching and research.

  • ABSTRACT: Chemical compounds and drugs (together called chemical entities) embedded in scientific articles are crucial for many information extraction tasks in the biomedical domain. However, only a very limited number of chemical entity recognition systems are publicly available, probably due to the lack of large manually annotated corpora. To accelerate the development of chemical entity recognition systems, the Spanish National Cancer Research Center (CNIO) and The University of Navarra organized a challenge on Chemical and Drug Named Entity Recognition (CHEMDNER). The CHEMDNER challenge contains two individual subtasks: 1) Chemical Entity Mention recognition (CEM); and 2) Chemical Document Indexing (CDI). Our study proposes machine learning-based systems for the CEM task. The 2013 CHEMDNER challenge organizers provided 10,000 UTF8-encoded PubMed abstracts, manually annotated according to a predefined annotation guideline: a training set of 3,500 abstracts, a development set of 3,500 abstracts and a test set of 3,000 abstracts. We developed machine learning-based systems, based on conditional random fields (CRF) and structured support vector machines (SSVM) respectively, for the CEM task for this data set. The effects of three types of word representation (WR) features, generated by Brown clustering, random indexing and skip-gram, on both machine learning-based systems were also investigated. The performance of our system was evaluated on the test set using scripts provided by the CHEMDNER challenge organizers. Primary evaluation measures were micro Precision, Recall, and F-measure. Our best system was among the top ranked systems with an official micro F-measure of 85.05%. Fixing a bug caused by inconsistent features marginally improved the performance of the system (micro F-measure of 85.20%). The SSVM-based CEM systems outperformed the CRF-based CEM systems when using the same features. Each type of WR feature was beneficial to the CEM task. Both the CRF-based and SSVM-based systems using all three types of WR features showed better performance than the systems using only one type of WR feature.
    Journal of Cheminformatics 01/2015; 7(Suppl 1 Text mining for chemistry and the CHEMDNER track):S8. DOI:10.1186/1758-2946-7-S1-S8 · 4.54 Impact Factor
  • ABSTRACT: Internet databases serve as an important source of information on chemical compounds that students can readily investigate, including those studying food science and food technology. This Activity provides a brief introduction to the application of chemical information resources with a focus on conducting structure-based searches.
    Journal of Chemical Education 03/2015; 92(5):874-876. DOI:10.1021/ed5006739 · 1.00 Impact Factor
  • ABSTRACT: Hepatitis C virus (HCV) is one of the major viruses affecting the world today. It is a highly variable virus, having a rapid reproduction and evolution rate. The variability of genomes is due to hasty replication catalyzed by nonstructural protein 5B (NS5B), which is also a potential target site for the development of anti-HCV agents. Recently, the US Food and Drug Administration approved sofosbuvir as a novel oral NS5B inhibitor for the treatment of HCV. Unfortunately, it has drawn much attention for its pricing issues. Hence, there is an urgent need to scrutinize alternate therapies against HCV that are available at an affordable price and do not have associated side effects. Such a need is crucial, especially in underdeveloped countries. The search for various new bioactive compounds from plants is a key part of pharmaceutical research. In the current study, we applied a pharmacoinformatics-based approach for the identification of active plant-derived compounds against NS5B. The results were compared to docking results of sofosbuvir. The lead compounds with high-binding ligands were further analyzed for pharmacokinetic and pharmacodynamic parameters based on in silico absorption, distribution, metabolism, excretion, and toxicity (ADMET) profile. The results showed potential alternative lead compounds that can be developed into commercial drugs having high binding energy and promising ADMET properties.
    Drug Design, Development and Therapy 03/2015; 2015:9:1825–1841. DOI:10.2147/DDDT.S75886 · 3.03 Impact Factor

