A Network model of the Chemical Space provides similarity structure to the system of chemical elements
Abstract and Figures
The collection of every species reported to date constitutes the so-called Chemical Space (CS). This space currently comprises well over 30 million substances and is growing exponentially. To characterize this ever-growing space, chemists look for similarities among substances in the CS based on the way they combine. Mendeleev's work on the chemical elements, which was based upon his knowledge of the CS as of 1869, is perhaps the most famous example of how the CS determines similarity relations. From a contemporary point of view, Network Theory serves as a natural framework for identifying these kinds of relational patterns in the CS. Nowadays, databases such as Reaxys have grown to the point where they can be taken as proxies for the whole CS, which opens the possibility of analyzing it from a data-driven perspective. In this work we propose to study the similarity of chemical elements according to the compounds they form. From each compound, we delete each element in turn to obtain a formula that is connected to the deleted element; e.g. $\mathrm{S_{1/2}O_{4/2}}$, $\mathrm{Na_{2/1}O_{4/1}}$ and $\mathrm{Na_{2/4}S_{1/4}}$ are the formulae obtained from $\mathrm{Na_2SO_4}$ (sodium sulfate) when Na, S and O, respectively, are deleted. The result is a bipartite graph whose two node sets are the elements and the formulae from which they have been deleted. We build our network using 26,206,663 compounds recorded in Reaxys up to 2015. Similarity among chemical elements is constructed analogously to Social Network Analysis, where actors are declared similar whenever they are connected to the same set of other actors: the more formulae two elements share, the more similar they are. We introduce a new notion of in-betweenness for elements that act as mediators in the similarity relations of others. We analyze the structural features of this network and how they are affected by node removal. We show that the network is both highly dense and redundant. Even though it is heavily centralized, similarity relations are spread across a wide range of formulae, which grants the network extraordinary structural resilience, even against directed attack. We discuss some implications of these results for chemistry.
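The construction described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names (`deleted_formulae`, `build_bipartite`, `shared_formulae`), the dictionary encoding of compounds, and the four-compound toy data set are assumptions made for illustration. The sketch also assumes that similarity can be scored as the raw number of shared formulae, whereas the abstract only states that elements sharing more formulae are more similar. Subscripts are divided by the subscript of the deleted element, as in the sodium sulfate example, and `Fraction` stores them in reduced form (4/2 becomes 2).

```python
from collections import defaultdict
from fractions import Fraction
from itertools import combinations


def deleted_formulae(compound):
    """Yield (element, formula) pairs for one compound.

    `compound` maps element symbols to integer subscripts. For each element,
    the formula is what remains after deleting it, with the remaining
    subscripts divided by the deleted element's subscript.
    """
    for element, count in compound.items():
        rest = frozenset(
            (other, Fraction(n, count))
            for other, n in compound.items()
            if other != element
        )
        yield element, rest


def build_bipartite(compounds):
    """Map each element to the set of formulae it is connected to."""
    neighbours = defaultdict(set)
    for compound in compounds:
        for element, formula in deleted_formulae(compound):
            neighbours[element].add(formula)
    return neighbours


def shared_formulae(neighbours):
    """Count the formulae shared by every pair of elements (assumed score)."""
    return {
        (a, b): len(neighbours[a] & neighbours[b])
        for a, b in combinations(sorted(neighbours), 2)
    }


# Toy data, chosen only for illustration (not taken from Reaxys).
compounds = [
    {"Na": 2, "S": 1, "O": 4},  # Na2SO4
    {"K": 2, "S": 1, "O": 4},   # K2SO4
    {"Na": 1, "Cl": 1},         # NaCl
    {"K": 1, "Cl": 1},          # KCl
]

neighbours = build_bipartite(compounds)
similarity = shared_formulae(neighbours)
print(similarity[("K", "Na")])
```

In this toy data set, Na and K are each connected to the same two formulae (the sulfate remainder and the chloride remainder), so they come out as the most similar pair, which matches the intuition that elements forming analogous compounds are similar.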
Figures - uploaded by Eugenio J. Llanos