
Christoph Steinbeck
European Molecular Biology Laboratory | EMBL · EMBL Hinxton (EBI)
Professor
About
272
Publications
88,770
Reads
11,215
Citations
Introduction
Christoph Steinbeck is Professor of Analytical Chemistry, Cheminformatics and Chemometrics at the Friedrich-Schiller-University in Jena, Germany.
His research interests are the computer-assisted structure elucidation of natural products and computational metabolomics.
Over the course of his career, Christoph Steinbeck has served as founding editor-in-chief of the Journal of Cheminformatics, as a director of the Metabolomics Society, and as chairman of the Computers-Information-Chemistry (CIC) division of the German Chemical Society, and he established the German Conference on Cheminformatics. Christoph is a lifetime member of the World Association of Theoretically Oriented Chemists (WATOC) and a member of the Metabolomics Society, the German Chemical Society, and various editorial boards and committees.
Additional affiliations
January 2008 - present
European Bioinformatics Institute (EMBL-EBI)
Position
- EMBL-EBI
January 2008 - present
January 2004 - December 2007
Universität Köln
Education
November 1992 - December 1995
October 1986 - October 1992
Publications (272)
Bottom-up variants of Dissipative Particle Dynamics (DPD), where particles can be defined as small molecules with a molecular weight in the order of 100 Daltons, allow the study of large (bio)molecular systems and supramolecular phenomena on the nanometre length and microsecond time scale. The conservative interaction between two DPD particles i an...
The concept of molecular scaffolds as defining core structures of organic molecules is utilised in many areas of chemistry and cheminformatics, e.g. drug design, chemical classification, or the analysis of high-throughput screening data. Here, we present Scaffold Generator, a comprehensive open library for the generation, handling, and display of m...
Contemporary bioinformatic and chemoinformatic capabilities hold promise to reshape knowledge management, analysis and interpretation of data in natural products research. Currently, reliance on a disparate set of non-standardized, insular, and specialized databases presents a series of challenges for data access, both within the discipline and for...
Glycosidic moieties are a common feature in natural product (NP) structures. They have been detected in 12 % of structures in the open NP database COCONUT [1]. While sugar units can be important for NP pharmacokinetic activities in some cases, they can also obstruct the analysis of the aglycone (molecule core without the glycoside) in cheminformati...
Chemical structure generators are used in cheminformatics to produce or enumerate virtual molecules based on a set of boundary conditions. The result can then be tested for properties of interest, such as adherence to measured data or for their suitability as drugs. The starting point can be a potentially fuzzy set of fragments or a molecular formu...
The translation of images of chemical structures into machine-readable representations of the depicted molecules is known as optical chemical structure recognition (OCSR). There has been a lot of progress over the last three decades in this field, but the development of systems for the recognition of complex hand-drawn structure depictions is still...
The development of deep learning-based optical chemical structure recognition (OCSR) systems has led to a need for datasets of chemical structure depictions. The diversity of the features in the training data is an important factor for the generation of deep learning systems that generalise well and are not overfit to a specific type of input. In t...
Diatoms (Bacillariophyceae) are a major constituent of the phytoplankton and have a universally recognized ecological importance. Between 1,000 and 1,300 diatom genera have been described in the literature, but only 10 nuclear genomes have been published and made publicly available to date. Skeletonema costatum is a cosmopolitan marine diat...
The use of molecular string representations for deep learning in chemistry has been steadily increasing in recent years. The complexity of existing string representations, and the difficulty in creating meaningful tokens from them, have led to the development of new string representations for chemical structures. In this study, the translation of chemi...
The open rich-client Molecule Set Comparator (MSC) application enables a versatile and fast comparison of large molecule sets with a unique inter-set molecule-to-molecule mapping obtained e.g. by molecular-recognition-oriented machine learning approaches. The molecule-to-molecule comparison is based on chemical descriptors obtained with the Chemist...
With the recent explosion of information, Natural Products (NP) research critically needs efficient ways to access and share knowledge, and to keep precious knowledge from being lost [1]. The reporting and sharing of NP occurrences in biological organisms are relevant to numerous scientific fields ranging from drug discovery to chemical ecology or chem...
Chemical graph theory is a subfield of mathematical chemistry that applies classic graph theory to chemical entities and phenomena. Chemical graphs are the main data structures used to represent chemical structures in cheminformatics. Computable properties of graphs lay the foundation for (quantitative) structure activity and structure property predict...
Background
The Investigation/Study/Assay (ISA) Metadata Framework is an established and widely used set of open source community specifications and software tools for enabling discovery, exchange, and publication of metadata from experiments in the life sciences. The original ISA software suite provided a set of user-facing Java tools for creating...
Sweet dessert watermelon (Citrullus lanatus) is one of the most important vegetable crops consumed throughout the world. The chemical composition of watermelon provides both high nutritional value and various health benefits. The present manuscript introduces a catalog of 1,679 small molecules occurring in the watermelon and their cheminformatics a...
The amount of data available on chemical structures and their properties has increased steadily over the past decades. In particular, articles published before the mid-1990s are available only in printed or scanned form. The extraction and storage of data from those articles in a publicly accessible database are desirable, but doing this manually is...
NFDI4Chem has formed within the National Research Data Infrastructure (NFDI) as the specialist consortium for chemistry. In this contribution, the consortium briefly introduces itself and sets out its central goals and most important improvements for research data management (RDM) in chemistry, as well as the practical challenges. The visio...
The generation of constitutional isomer chemical spaces has been a subject of cheminformatics since the early 1960s, with applications in structure elucidation and elsewhere. In order to perform such a generation efficiently, exhaustively and isomorphism-free, the structure generator needs to ensure the building of canonical graphs already during t...
The amount of data available on chemical structures and their properties has increased exponentially over the past decades. In particular, articles published before the mid-1990s are available only in printed or scanned form. The extraction and storage of data from those articles in a publicly accessible database are desirable, but doing this manu...
The generation of constitutional isomer chemical spaces has been a subject of cheminformatics since the early 1960s, with applications in structure elucidation and elsewhere. In order to perform such a generation efficiently, exhaustively and isomorphism-free, the structure generator needs to ensure the building of canonical graphs already during...
Chemical compounds can be identified through a graphical depiction, a suitable string representation, or a chemical name. A universally accepted naming scheme for chemistry was established by the International Union of Pure and Applied Chemistry (IUPAC) based on a set of rules. Due to the complexity of this ruleset a correct chemical name assignmen...
Natural products (NPs), biomolecules produced by living organisms, inspire the pharmaceutical industry and research due to their structural characteristics and the substituents from which they derive their activities. Glycosidic residues are frequently present in NP structures and have particular pharmacokinetic and pharmacodynamic importance as th...
Chemistry looks back at many decades of publications on chemical compounds, their structures and properties, in scientific articles. Liberating this knowledge (semi-)automatically and making it available to the world in open-access databases is a current challenge. Apart from mining textual information, Optical Chemical Structure Recognition (OCSR)...
Contemporary bioinformatic and chemoinformatic capabilities hold promise to reshape knowledge management, analysis and interpretation of data in natural products research. Currently, reliance on a disparate set of non-standardized, insular, and specialized databases presents a series of challenges to data access, either within the discipline or to...
The vision of NFDI4Chem is the digitalisation of all key steps in chemical research in order to give scientists the best possible support in collecting, storing, processing, analysing, publishing and reusing research data. NFDI4Chem aims to represent all disciplines of chemistry in academia. In the initial...
Natural products (NP), biomolecules produced by living organisms, inspire the pharmaceutical industry and research due to their structural characteristics and the substituents from which they derive their activities. Glycosidic residues are frequently present in NP structures and have particular pharmacokinetic and pharmacodynamic importance as the...
Natural products (NPs) are small molecules produced by living organisms with potential applications in pharmacology and other industries, as many of them are bioactive. This potential has raised great interest in NP research around the world and in different application fields. As a result, over the years a multiplication of generalistic and thematic NP d...
Chemistry looks back at many decades of publications on chemical compounds, their structures and properties, in scientific articles. Liberating this knowledge (semi-)automatically and making it available to the world in open-access databases is a current challenge. Apart from mining textual information, Optical Chemical Structure Recognition (OCS...
Chemical compounds can be identified through a graphical depiction, a suitable string representation, or a chemical name. A universally accepted naming scheme for chemistry was established by the International Union of Pure and Applied Chemistry (IUPAC) based on a set of rules. Due to the complexity of this rule set a correct chemical name assignme...
Metabolomics offers systematic identification and quantification of all metabolic products from the human body. This field could provide clinicians with new sets of diagnostic biomarkers for disease states in addition to quantifying treatment response to medications at an individualised level. This literature review aims to highlight the technology...
Background
The Investigation/Study/Assay (ISA) Metadata Framework is an established and widely used set of open-source community specifications and software tools for enabling discovery, exchange and publication of metadata from experiments in the life sciences. The original ISA software suite provided a set of user-facing Java tools for creating a...
Sugar units in natural products are pharmacokinetically important but often redundant, and they therefore obstruct the study of the structure and function of the aglycon. It is therefore recommended to remove the sugars before a theoretical or experimental study of a molecule. Deglycogenases, enzymes that specialize in sugar removal from small molec...
The automatic recognition of chemical structure diagrams from the literature is an indispensable component of workflows to rediscover information about chemicals and to make it available in open-access databases. Here we report preliminary findings in our development of Deep lEarning for Chemical ImagE Recognition (DECIMER), a deep learning method...
Structural information about chemical compounds is typically conveyed as 2D images of molecular structures in scientific documents. Unfortunately, these depictions are not a machine-readable representation of the molecules. With a backlog of decades of chemical literature in printed form not properly represented in open-access databases, t...
The vision of NFDI4Chem is the digitalisation of all key steps in chemical research to support scientists in their efforts to collect, store, process, analyse, disclose and re-use research data. Measures to promote Open Science and Research Data Management (RDM) in agreement with the FAIR data principles are fundamental aims of NFDI4Chem to serve t...
The automatic recognition of chemical structure diagrams from the literature is an indispensable component of workflows to re-discover information about chemicals and to make it available in open-access databases. Here we report preliminary findings in our development of DECIMER (Deep lEarning for Chemical ImagE Recognition), a deep learning method...
Natural products (NPs) have been the centre of attention of the scientific community in the last decades and the interest around them continues to grow incessantly. As a consequence, in the last 20 years, there has been a rapid multiplication of various databases and collections as generalistic or thematic resources for NP information. In thi...
Growth from spores activated a biosynthetic gene cluster in Actinomadura sp. RB29, resulting in the identification of two novel groups of halogenated polyketide natural products, named maduralactomycins and actinospirols. The unique tetracyclic and spirocyclic structures were assigned based on a combination of NMR analysis, chemoinformatic calculat...
Metabolomics, the youngest of the major omics technologies, is supported by an active community of researchers and infrastructure developers across Europe. To coordinate and focus efforts around infrastructure building for metabolomics within Europe, a workshop on the "Future of metabolomics in ELIXIR" was organised at Frankfurt Airport in...
This project aims at the development of an algorithm for extracting characteristic substructures (fragments) from natural product structures in-silico to study their chemical space in order to create better computer-assisted structure elucidation systems. The initial idea is to adapt the Ertl algorithm for automatic functional group identification...
Particle-based mesoscopic simulation with Dissipative Particle Dynamics (DPD) enables the study of supramolecular phenomena at the nanometer length and microsecond time scale for large interacting chemical ensembles representing millions of atoms. The conservative interaction between two DPD particles i and j is characterized by an isotropic repuls...
Natural products (NPs) have been the centre of attention of the scientific community in the last decades and the interest around them continues to grow incessantly. As a consequence, in the last 20 years, there has been a rapid multiplication of various databases and collections as generalistic or thematic resources for NP information. In this review,...
The Ertl algorithm for automated functional groups (FG) detection and extraction of organic molecules is implemented on the basis of the Chemistry Development Kit (CDK). A distinct impact of the chosen CDK aromaticity model is demonstrated by an FG analysis of the ChEMBL database compounds. The average performance of less than a millisecond for a si...
Particle-based mesoscopic simulation with Dissipative Particle Dynamics (DPD) enables the study of supramolecular phenomena at the nanometer length and microsecond time scale for large interacting chemical ensembles representing millions of atoms. A series of already published and ongoing open projects aims to achieve a comprehensive computational...
Natural products (NPs), often also referred to as secondary metabolites, are small molecules synthesised by living organisms. Natural products are of interest due to their bioactivity and in this context as starting points for the development of drugs and other bioactive synthetic products. In order to select compounds from virtual librari...
“The creation of a National Research Data Infrastructure for Chemistry (NFDI4Chem), embedded in a National Research Data Infrastructure for all scientific disciplines, is a great opportunity for our field. Good research data management is the basis for good scientific practice and, in the long term, opens up new research...
“The formation of a National Research Data Infrastructure for Chemistry (NFDI4Chem), integrated into a National Research Data Infrastructure for all scientific disciplines, is a great opportunity for our discipline. Proper research data management is the basis for good scientific practice and opens up new fields of research …” Read more in the Gues...
The Ertl algorithm for automated functional groups (FG) detection and extraction of organic molecules is implemented on the basis of the Chemistry Development Kit (CDK). A distinct impact of the chosen CDK aromaticity model is demonstrated by an FG analysis of the ChEMBL database compounds. The average performance of less than a millisecond for a s...
Motivation:
Developing a robust and performant data analysis workflow that integrates all necessary components whilst still being able to scale over multiple compute nodes is a challenging task. We introduce a generic method based on the microservice architecture, where software tools are encapsulated as Docker containers that can be connected int...
Correction for ‘The value of universally available raw NMR data for transparency, reproducibility, and integrity in natural product research’ by James B. McAlpine et al., Nat. Prod. Rep., 2018, DOI: 10.1039/c7np00064b.
Background
Metabolomics is the comprehensive study of a multitude of small molecules to gain insight into an organism's metabolism. The research field is dynamic and expanding with applications across biomedical, biotechnological and many other applied biological domains. Its computationally-intensive nature has driven requirements for open data fo...