Conference Paper

A Network model of the Chemical Space provides similarity structure to the system of chemical elements


Abstract

The collection of every species reported to date constitutes the so-called Chemical Space (CS). This space currently comprises well over 30 million substances and is growing exponentially [2]. To characterize this ever-growing space, chemists look for similarities among substances in the CS based on the way they combine [3]. Mendeleev’s work on the chemical elements, which was based upon his knowledge of the CS by 1869, is perhaps the most famous example of how the CS determines similarity relations [4]. From a contemporary point of view, Network Theory serves as a natural framework to identify these kinds of relational patterns in the CS [5]. Nowadays, databases such as Reaxys [6] have grown to a point where they can be taken as proxies for the whole CS, opening the possibility of analyzing it from a data-driven perspective.
In this work we propose to study the similarity of chemical elements according to the compounds they form. From each compound, we deleted each element in turn to obtain a formula that is connected to the deleted element; e.g., S_{1/2}O_{4/2}, Na_{2/1}O_{4/1} and Na_{2/4}S_{1/4} are the formulae coming from Na_2SO_4 (sodium sulfate) when Na, S and O, respectively, have been deleted. This yields a bipartite graph formed by the elements and the formulae from which they have been deleted. We build our network using 26,206,663 compounds recorded in Reaxys up to 2015. Similarity among chemical elements is constructed analogously to Social Network Analysis, where actors are declared similar whenever they are connected to the same set of other actors: the more formulae two elements share, the more similar they are. We introduce a new notion of in-betweenness for elements that act as mediators in the similarity relations of others. We analyze the structural features of this network and how they are affected by node removal. We show that the network is both highly dense and redundant. Even though it is heavily centralized, similarity relations are spread across a wide range of formulae, which grants the network extraordinary structural resilience, even against directed attacks. We discuss some implications of these results for chemistry.
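The construction described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the dictionary representation of compounds, the helper names (deleted_formulae, build_bipartite, shared_formulae), the toy compound list, and the use of a raw count of shared formulae as the similarity score are all assumptions made for illustration. The sketch also reduces fractional subscripts (so 4/2 becomes 2), which the abstract does not specify.

```python
from collections import defaultdict
from fractions import Fraction


def deleted_formulae(compound):
    """Yield (deleted element, formula) pairs for one compound.

    `compound` maps element symbols to integer counts, e.g. Na2SO4 is
    {"Na": 2, "S": 1, "O": 4}. Each remaining count is divided by the
    count of the deleted element (assumption: fractions are reduced).
    """
    for deleted, n_deleted in compound.items():
        formula = frozenset(
            (element, Fraction(count, n_deleted))
            for element, count in compound.items()
            if element != deleted
        )
        yield deleted, formula


def build_bipartite(compounds):
    """Map each element to the set of formulae it was deleted from."""
    graph = defaultdict(set)
    for compound in compounds:
        for element, formula in deleted_formulae(compound):
            graph[element].add(formula)
    return graph


def shared_formulae(graph, a, b):
    """Number of formulae connected to both elements (hypothetical score)."""
    return len(graph[a] & graph[b])


# Toy data, not taken from Reaxys.
compounds = [
    {"Na": 2, "S": 1, "O": 4},  # Na2SO4
    {"K": 2, "S": 1, "O": 4},   # K2SO4
    {"Na": 1, "Cl": 1},         # NaCl
    {"K": 1, "Cl": 1},          # KCl
    {"Mg": 1, "S": 1, "O": 4},  # MgSO4
]
graph = build_bipartite(compounds)
print(shared_formulae(graph, "Na", "K"))   # Na and K share two formulae
print(shared_formulae(graph, "Na", "Mg"))  # Na and Mg share none
```

In this toy set, Na and K are connected to the same two formulae (one from the sulfates, one from the chlorides), while Mg shares none with Na because its sulfate has a different stoichiometry.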
Meyer and Mendeleev arrived at their periodic systems by classifying and ordering the elements known by about 1869. Order and similarity were based on knowledge of chemical compounds, which, gathered together, constitute the chemical space by 1869. Despite its importance, very little is known about the size and diversity of this space and even less is known about its influence upon Meyer's and Mendeleev's periodic systems. Here we show, by analysing 11,484 substances reported in the scientific literature up to 1869 and stored in the Reaxys database, that 80% of the space was accounted for by 12 elements, oxygen and hydrogen being those with the most compounds. We found that the space included more than 2,000 combinations of elements, of which 5%, made of organogenic elements, gathered half of the substances of the space. By exploring the temporal report of compounds containing typical molecular fragments, we found that Meyer's and Mendeleev's available chemical space had a balance of organic, inorganic and organometallic compounds, which was, after 1830, drastically overpopulated by organic substances. The size and diversity of the space show that knowledge of organogenic elements sufficed to have a panoramic idea of the space. We determined similarities among the 60 elements known by 1869 taking into account the resemblance of their combinations and we found that Meyer's and Mendeleev's similarities for the chemical elements agree to a large extent with the similarities allowed by the chemical space.
It has been claimed that relational properties among chemical substances are at the core of chemistry. Here we show that chemical elements and a wealth of their trends can be found by the study of a relational property: the formation of binary compounds. We say that two chemical elements A and B are similar if they form binary compounds AC and BC, C being another chemical element. To allow for the richness of chemical combinations, we also included the different stoichiometric ratios of binary compounds. Hence, the more combinations with different chemical elements, and with similar stoichiometry, the more similar two chemical elements are. We studied 4,700 binary compounds using network theory and point set topology and obtained well-known chemical families of elements, such as alkali metals, alkaline earth metals, halogens, lanthanides, actinides and some transition metal groups, as well as chemical patterns such as the singularity principle, the knight's move, and secondary periodicity. The methodology applied here can be extended to the study of ternary, quaternary and other compounds, as well as other chemical sets where a relational property can be defined.
Chemical research unveils the structure of chemical space, spanned by all chemical species, as documented in more than 200 y of scientific literature, now available in electronic databases. Very little is known, however, about the large-scale patterns of this exploration. Here we show, by analyzing millions of reactions stored in the Reaxys database, that chemists have reported new compounds in an exponential fashion from 1800 to 2015 with a stable 4.4% annual growth rate, in the long run neither affected by World Wars nor affected by the introduction of new theories. Contrary to general belief, synthesis has been the means to provide new compounds since the early 19th century, well before Wöhler’s synthesis of urea. The exploration of chemical space has followed three statistically distinguishable regimes. The first one included uncertain year-to-year output of organic and inorganic compounds and ended about 1860, when structural theory gave way to a century of more regular and guided production, the organic regime. The current organometallic regime is the most regular one. Analyzing the details of the synthesis process, we found that chemists have had preferences in the selection of substrates and we identified the workings of such a selection. Regarding reaction products, the discovery of new compounds has been dominated by very few elemental compositions. We anticipate that the present work serves as a starting point for more sophisticated and detailed studies of the history of chemistry.
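To give a feel for what a stable 4.4% annual growth rate implies, the short Python sketch below computes the corresponding doubling time under the simplifying assumption of constant compound growth, N(t) = N(0) * (1 + r)^t. The function names are hypothetical and the calculation is an illustration only, not part of the cited analysis.

```python
import math

ANNUAL_GROWTH_RATE = 0.044  # 4.4% per year, as stated in the abstract


def doubling_time(rate):
    """Years needed to double under constant compound growth at `rate`."""
    return math.log(2) / math.log(1 + rate)


def growth_factor(rate, years):
    """Multiplicative increase after `years` of compound growth at `rate`."""
    return (1 + rate) ** years


print(f"doubling time: {doubling_time(ANNUAL_GROWTH_RATE):.1f} years")
```

Under this assumption, the number of reported compounds doubles roughly every 16 years.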
Chemistry, as today's most active science, has increased its substances exponentially during the past 200 years without saturation. To gain more insight into why and how chemists produce new substances, a content analysis of 300 communications to Angewandte Chemie from the years 1980, 1990, and 1995 is carried out regarding the aims and methods of preparative research. In the most productive field, organic chemistry, production mainly occurs to improve abilities for further production, while the less productive field of inorganic chemistry has more diverse aims. Methodological differences between organic and inorganic chemistry are discussed in detail, as well as the relationship between pure and applied science.
Schummer, J.: The chemical core of chemistry I: a conceptual approach. HYLE-International Journal for Philosophy of Chemistry 4 (2), 129-162 (1998)
Leal, W.; Llanos, E.; Stadler, P.F.; Jost, J.; Restrepo, G.: The Chemical Space from Which the Periodic System Arose. ChemRxiv 10.26434/chemrxiv.9698888.v1 (2019)