Jure Zupan
National Institute of Chemistry · Laboratory of Chemometrics

PhD

Publications: 182
Reads: 29,518
Citations: 6,890

Publications (182)
Chapter
This chapter presents an overview of artificial neural networks (ANNs). The choice of which particular ANN method to use depends on the goal of the data handling and, of course, on the properties and size of the database. Unless the behaviors of properties of multivariate data are well known, it is always worthwhile to consider the nonlinear pro...
Article
This paper first briefly describes the two main learning strategies of artificial neural networks (ANNs): error back-propagation (EBP) and Kohonen self-organizing maps (SOM). Next, two nonstandard ANN layouts (bottle-neck and pyramidal decision tree), one for each learning strategy, are suggested. In the last p...
Article
The problem of incomplete data matrices is repeatedly found in large databases, posing a significant obstacle for an effective treatment of data. This paper examines a self-organizing-map (SOM) based method of data imputation under the concept of distance object per one weight, to predict physicochemical parameters of water samples in a data set wh...
Article
The underlying structural and physicochemical interpretation of the recently defined information indices (denominated as GT-STAF indices) is examined, with the aim of gaining greater insight into the codified chemical information. It is found that these indices are related to molecular symmetry in the context of the defined molecular “fragment” mod...
Chapter
Structure elucidation in this article is focused on databases, expert systems and modelling techniques based on different spectroscopic methods. Databases of spectroscopic data include information on, e.g., chemical or physical properties of compounds, structures. The input of new data is usually standardised. Databases are connected with efficient s...
Article
A randomly selected set of 558 chemicals from the Cosmetic inventory was studied with the internet-accessible program package CAESAR. Four toxic endpoints were considered: mutagenicity, carcinogenicity, developmental toxicity and skin sensitization. Besides the predictions, the CAESAR program provides comprehensive information on applicability domain and t...
Article
A brief outline of various data handling methods, from linear learning machines, principal component analysis, experimental design, and modeling to visualization, optimization, and validation together with a personal view on the historical development of the use of these methods, is given. Some future trends in handling chemical data are proposed a...
Article
On the set of 53 trypsin inhibitors the affinity to the covalent bound ligands is modeled using linear (MLR) and non-linear (ANN) methods. Each compound is represented by 343 chemical descriptors. The hypothesis was that linear models are not sufficiently flexible to yield the best model, because in MLR (multiple regression analysis) the number of...
Article
The irradiation dose in tumor and healthy tissue of a boron neutron capture therapy (BNCT) patient depends on the boron concentration in blood. In most treatments, this concentration is experimentally determined before and after irradiation but not while irradiation is being carried out because it is troublesome to take the blood samples when the p...
Article
For the prediction of decay concentration profiles of the p-boronophenylalanine (BPA) in blood during BNCT treatment, a method is suggested based on Kohonen neural networks. The results of a model trained with the concentration profiles from the literature are described. The prediction of the model was validated by the leave-one-out method. Its rob...
Article
A chemometrical study of a 10-year water quality monitoring plan at 15 sampling points along a section of the Reconquista River and its stream channels, which embraces 21 campaigns, is presented. The original data were pre-treated in order to eliminate missing data and outliers, obtaining a final data matrix of 270 samples containing 26 phy...
Article
The motive for the introduction of graphical representations of DNA was to facilitate visual inspection of similarities and differences among lengthy DNA sequences, which is almost impossible without some kind of preprocessing. The pioneers on graphical representation of DNA were Eugene Hamori and John Ruskin, who introduced a geometrical illustrat...
Article
An algorithm for the evaluation of the extended connectivity in directed graphs is described and discussed. The algorithm is a general purpose one for finding the number of all paths from any given node Vi in a directed graph toward all leaves that can be reached from that particular node Vi in the graph.
Article
In this paper we design neural network consumer credit scoring models for financial institutions where the data usually used in previous research are not available. We use an extensive, primarily accounting, data set on transactions and account balances of clients available in each financial institution. As many of these numerous variables are correlate...
Article
This paper demonstrates the possibility of using counter-propagation neural networks to identify the combinations of dyes in textile printing paste formulations. An existing collection of 1430 printed samples produced with 10 dyes was used for neural network training. The reflectance values served as input data and the known concentrations of singl...
Chapter
History; Introduction; Basic Concepts of ANN; Learning by ANN; Standard Applications; Non-standard Applications; Conclusions
Article
Recently Line Distance (LD) matrix has been introduced as a novel route for characterization of DNA sequences. The approach was based on construction of four separate submatrices for the four nucleotides, the first row of each of which records the separation between the selected nucleotide and the remaining nucleotides of the same kind. In this art...
Article
The aim of the optimization is to find the optimal parameters for complex systems such as the synthesis of compounds, chemical reactions, analytical methods, properties of products or chemical processes. The parameters that we want to determine are the values which describe the system. The SIMPLEX is one of the most simple and general opti-...
Article
To arrive at graphical representations of proteins one is confronted with a number of arbitrary decisions on how to assign the 20 natural amino acids to equivalent or non-equivalent sites of the underlying geometrical objects used for construction of their graphical representation. Here we consider representation of proteins based on generalized star graphs...
Article
Thrombin plays a central role in thrombosis and hemostasis. Inhibition of thrombin is a prime target for therapeutic intervention of thrombosis. A considerable number of experimental structures of both thrombin and trypsin complexes with their non-covalently bound inhibitors is available and they offer an excellent database for development of chemo...
Article
We outline an unexpected use of a particular graphical representation of DNA sequences, the ‘four line’ graphical representation [M. Randić, M. Vračko, N. Lerš, D. Plavšić, Chem. Phys. Lett. 368 (2003) 1], and how it can facilitate solving the problem of DNA sequence alignment. The approach is exemplified by two shorter DNA sequences.
Article
Air pollutant concentrations from a monitoring campaign in Buenos Aires City, Argentina, are used to investigate the relationships between ambient levels of ozone (O3), nitric oxide (NO) and nitrogen dioxide (NO2) as a function of NOx (=NO + NO2). This campaign undertaken by the electricity sector was aimed at elucidating the apportionment of therm...
Article
We have outlined a novel, highly compact graphical representation for proteins, which offers graphical and numerical characterization of individual proteins. Protein representations are constructed in the interior of a unit ‘magic’ circle, on the circumference of which at equal distances are positioned 20 amino acids. Graphical representation of prote...
Article
Grinding with pearl mill capable of solid particles grinding in emulsions or greases from granulation of approximately 30–1 μm was studied on the basis of statistically planned experiments. The fractional factorial design for five factors was implemented. The data were used for modelling to develop back-propagation neural network and incomplete hig...
Article
In this paper, a class-modeling technique based on Kohonen artificial neural networks is presented. In particular, in order for the Kohonen self-organizing map to operate as a class-modeling device, two main issues are identified: integrating the training set (composed of samples from a single category) with a set of uniformly distributed random ve...
Article
An algorithm for encoding long strings of building blocks, like 4 DNA bases (adenine-A, cytosine-C, thymine-T, and guanine-G), 20 natural amino acids (from Alanine-Ala to Valine-Val, plus the stop triplet), or all 64 possible base triplets (from AAA to TTT), into "zigzag" or "spectrum-like" representations is suggested. The new encoding scheme ca...
Article
Paul Lewi, Frits Daeyaert – Center for Molecular Design, Janssen Pharmaceutica N.V., Vosselaar, Belgium. Natural computing algorithms can give answers to outstanding hard problems encountered in structure-based drug design. In this context, Solmajer and Zupan describe the application of genetic algorithms and simulated annealing to optimization prob...
Article
We consider a graphical representation of proteins as an alternative to the usual representation of proteins as a sequence listing the natural amino acids. The approach is based on a graphical representation of triplets of DNA in which the interior of a square or the interior of a tetrahedron is used to accommodate 64 sites for the 64 codons. By as...
Article
The present study focuses on fish antibiotics which are an important group of pharmaceuticals used in fish farming to treat infections and, until recently, most of them have been exposed to the environment with very little attention. Information about the environmental behaviour and the description of the environmental fate of medical substances ar...
Article
Most 2D graphical representations of primary DNA sequences, while offering visual geometrical patterns for depicting sequences, do require considerable space if enough details of such representations are to be visible. In this contribution, we consider a highly compact graphical representation of DNA, which allows visual inspection and numerical ch...
Article
A quantitative structure-selectivity relationship study of a series of structurally diverse alpha1-adrenergic antagonists was performed by using a counter-propagation neural network (CP-ANN). The theoretical molecular descriptors have been calculated and selected using the CODESSA program. The results obtained for a highly non-congeneric set of molecules have c...
Article
A counterpropagation artificial neural network (CP-ANN) approach was used to classify 1779 Italian rice samples according to their variety, using physical measurements which are routinely determined for the commercial classification of the product. If compared to the classical Principal Component Analysis, the mapping based on the Kohonen network s...
Article
A quantitative structure–activity relationship study with respect to selectivity for α1 adrenoreceptor subtypes (α1a, α1b and α1d) of a wide series of structurally heterogeneous α1 adrenoreceptor antagonists has been performed. A large variety of molecular descriptors have been calculated and then analyzed by a heuristic method. The orthogonalizati...
Article
The 15-variable environmental data (7 concentrations: CO, SO2, O3, NOx, NO, NO2, particulate matter smaller than 10 micron (PM10), and 8 weather data: cloudiness, rainfall, insolation factor (Isfi), temperature, pressure at two locations, and wind intensity with direction) in a period of 45 days with 1-h intervals were extracted from a larger datab...
Article
The Kohonen neural networks were chosen to prepare a relevant model for fast selection of the most suitable phase equilibrium method(s) to be used in efficient vapor-liquid chemical process design and simulation. They were trained to classify the objects of the study (the known physical properties and parameters of samples) into none, one or more po...
Article
The Kubelka-Munk theory, which is the basis of dye formulation, has found no application in textile printing. Recipe formulation with dye-stuffs in viscose printing pastes is carried out manually by matching the recipe of the printed sample to the standard. In order to find the desired shade this procedure is time-consuming. The consequence is that...
Article
This work concerns the classification of multidimensional objects and Kohonen artificial neural networks. A new concept is introduced, called the mean angular distance among objects (MADO). Its value can be calculated as the cosine of the mean-centered vectors between objects. It can be expressed in matrix form for any number of objects. The MADO allow...
Article
We present a novel 2-D graphical representation for DNA sequences which has an important advantage over the existing graphical representations of DNA in being very compact. It is based on: (1) use of binary labels for the four nucleic acid bases, and (2) use of the ‘worm’ curve as template on which binary codes are placed. The approach is illustrat...
Article
In order to evaluate the influence of the choice of the data for the training set on the prediction ability of linear and nonlinear models, various methods for sample selection were tested. The study is carried out for the modelling of five colour properties: whiteness (W10), lightness (L* and Lp*), and hue (b* and bp*) of a titanium dioxide white pigme...
Article
The optimization of glow discharge lamp control parameters (voltage, current) for the determination of Cu by glow discharge optical emission spectrometry (GD-OES) at 327.3 nm in copper–titanium–zinc alloy was performed using Simplex optimization. This approach substantially reduces the required number of experiments. The efficiency of the optimizat...
Article
Previous studies on mathematical characterization of proteomics maps by sets of map invariants were based on the construction of a set of distance-related matrices obtained by matrix multiplication of a single matrix by itself. Here we consider an alternative characterization of proteomics maps based on a set of matrices characterizing local featur...
Chapter
We review the problems encountered with the interpretation of topological indices and consider the partition of a selection of topological indices into bond contributions. For indices that are defined in terms of bond additive contributions, as is the case with the Wiener index and the molecular connectivity index, such a partition is straightforwa...
Article
para-Xylene is widely used in chemical industry. It can be synthesized by alkylation of toluene with methanol using zeolite ZSM-5 as catalyst. The proportion of para-xylene, among its other isomers and other reaction byproducts, depends on the reaction conditions. As this process still remains largely empirical, we attempted to build a theoretical...
Article
Within the period from autumn 1990 to spring 1999 (from October to April in each period) 207 samples were collected and the measurement of 19 physical and chemical variables of the Mura river, Slovenia, were carried out. These variables are: river flow, water temperature, air temperature, dissolved oxygen, deficit of oxygen, oxygen saturation index...
Article
Resistance to antibiotics in bacterial population has widened the interest of scientific community for development of novel therapeutic compounds. Penicillins and cephalosporins which share the β-lactam structural moiety form the most abundant group of antibiotics on the market. Their recently developed tricyclic analogues have shown remarkable bio...
Article
A new method for "intelligent" or "content-dependent" retrieval of objects from among large quantities of multivariate data is devised and explained. The method is based on the combination of two different approaches. One is the multi-branching decision tree and the second is the Kohonen neural network. The new method allows a retrieval of similar o...
Article
We consider numerical characterization of proteomics maps by representing a map as a three-dimensional graphical object based on x, y coordinates of the spots and using their relative abundance as the z coordinate. In our representation the protein spots are first ordered based on their relative abundance and labeled accordingly. In the next step a...
Article
A computer algorithm for the calculation of ion chromatography separation is presented. It is based on the calculation of equilibrium concentrations of present analyte in discrete column segments. The continuous column is treated as a number of discrete cells or segments where the equilibration process between the stationary phase and the eluent is...
Article
Many topological indices lack an interpretation in terms of simple physicochemical quantities. We have reexamined the structural interpretation of well-known topological indices: the connectivity index ¹χ, the Wiener index W, and the Hosoya topological index Z. We relate the success of various indices in structure-property studies to the degree...
Article
The energy bands of graphite and boron nitride have been calculated using the tight-binding approximation, with atomic potentials obtained by the SCF CNDO method. On the basis of the calculated bands some optical properties of the above materials are discussed. The optical absorption of the hot-pressed pyrolitic boron nitride was measured in the wa...
Article
In the paper the Kohonen neural network is described as an alternative tool for fast selection of the most suitable physical property estimation method to be used in efficient chemical process design and simulation. Kohonen neural networks are trained to suggest the appropriate method of phase equilibrium estimation on the basis of known physical pr...
Article
This paper describes the Kohonen neural network as an alternative tool for fast selection of a suitable physical property estimation method, which is very important for efficient chemical process design and simulation. Neural networks should advise appropriate methods of phase equilibrium estimation on the basis of known physical properties. In other...
Article
A study of the influence of the training set selection, the modelling technique, and the number of objects in the training set was performed on a data set of 2 years' daily measurements of atmospheric precipitation and river flowrates. Twenty-five different data sets were prepared by the following selection methods: random selection (RND), Kohonen...
Article
In atomic absorption spectrometric measurements calibration lines are measured daily. These lines are not always acceptable. They can, for instance, contain outliers, have a bad precision or can be curved. To evaluate the quality of those lines a method which gives a fast diagnosis is recommended. In this study the use of Kohonen neural networks wa...
Article
Previously published data (Gašperlin et al., 1998) on viscoelastic behaviour of lipophilic semisolid emulsion systems and the prediction of their physical stability by neural network modelling are analysed in further detail. Most attention has been paid to viscosity, which with storage (G′) and loss modulus (G′′), is one of the most important rheol...
Article
New uniform and reversible spectrum-like representation of 3D chemical structures is explained. On a simple example of 3D structure of ethane, both the coding and decoding procedures are explained in detail. The spectrum-like representation is based on the projection of atoms specified by co-ordinate triplets [xi, yi, zi] on an arbitrarily large sp...
Chapter
In the present study the correlation between chemical structures and inhibiting properties of 256 5-phenyl-3,4-diamino-6,6-dimethyldihydrotriazine derivatives, inhibitors of dihydrofolate reductase (DHFR), is investigated. The data set has been studied by several researchers in many different laboratories [1–3]. In the first studies [1], the linear regres...
Chapter
In all kinds of QSAR studies it is very important how the chemical structure is represented. Usually a set of structural properties, calculated or extracted experimentally, is considered as a structure representation vector when compared and correlated to a biological property. Numerous attempts to suggest different structure representations reflec...
Article
Water-based acrylic polymers are frequently used as binders in ceramic materials that contain ZnO as a major component. Thin flexible ceramic films used in semiconductor elements are prepared from ceramic powder, polymer binder, dispersant and plasticizer. In the present work, the chemical reaction of acrylic acid, a part of the polymer, and cerami...
Article
ICP-AES was used for the quantitative determination of copper and copper accompanying elements (As, Ag, Bi, Co, Fe, Mn, Ni, Pb, Sb, Sn, Zn) that are present as impurities in plano-convex ingots from seven late bronze age hoards in Slovenia. Principal component analysis (PCA) was used for multi-dimensional data evaluation and for visualisation of in...
Article
A brief introduction into artificial neural networks (ANNs) is given, with emphasis on counter-propagation learning strategy, as well as their use for the purpose of modeling and optimization of H2O2/UV decoloration process. The use of Plackett–Burman partial factorial design for seven variables on three different levels, for the selection of exper...
Article
Optimisation of a spectrum-like structure representation via genetic algorithm (GA) is described. The final optimised structure representation of 28 molecules (flavonoid derivatives, inhibitors of the enzyme p56lck protein tyrosine kinase) contains only 15 variables compared with the 120 ones of the initial spectrum-like representation. The fitness...
Article
The investigation presented here is an attempt to establish a model for the prediction of toxicity of molecules using artificial neural networks (ANN) with a counterpropagation learning strategy. Molecules have been described as 3D geometrical structures, i.e. by the (x, y, z)-coordinates of all atoms. Each structure has been encoded into a `spectr...
Article
The determination of concentrations of sulphate in different samples of river and drinking waters and of concentrations of calcium in different wine samples using Kohonen and counterpropagation artificial neural network (ANN) is described. Kohonen ANN has been used to define the training and the test sets. All the samples are represented as sets of...
Article
ICP-AES was used for the quantitative determination of copper and copper accompanying elements (As, Ag, Bi, Co, Fe, Mn, Ni, Pb, Sb, Sn, Zn) that are present as impurities in plano-convex ingots from seven late bronze age hoards in Slovenia. Principal component analysis (PCA) was used for multi-dimensional data evaluation and for visualisation of in...
Chapter
Several problems associated with the role of statistics in the field of linguistics are outlined. Descriptors of sentences which allow classification of written material according to the style (not to the content!) are described, first in a more general view and next with respect to the Slovenian language. The computer requirements, algorithms and...
Article
The chemometrics approach was applied for the optimization of the ion chromatographic analysis for transition metal cations in order to obtain optimal operating conditions for routine work. In order to achieve this goal the usefulness of the sigmoidal functions for the evaluation of the two different chromatographic performance goals (resolution an...
Article
The article presents the ability to use a feed-forward neural network as a mapping tool. The objects are fed to the artificial neural network with two neurons in a hidden layer, and the result is compared to the targets which are in our case equal to the inputs themselves. After training one can plot the objects' labels to the map whose coordinates...
Article
In any type of modelling (be it classical or by artificial neural networks) involving chemical structures and their corresponding properties, the first problem encountered is the representation of chemical structures. A good structure representation should have a different code for each 3-D structure (uniqueness), it should have the same number of varia...
Article
A mathematical model for the description of the detector signal obtained in the flow injection asynchronous merging zone technique (FIA-AMZ) is proposed. FIA-AMZ is based on the separate injection of a sample and an appropriate reagent in such a way that both injected solutions are covered only partly. The resulting detector signal consists of two conse...
Article
The principles of the Kohonen and counterpropagation artificial neural network (K-ANN and CP-ANN) learning strategies are described. The use of both methods (with the emphasis on CP-ANNs) is explained with several examples from analytical chemistry. The problems discussed in this presentation are: selection of a set of representative objects from a la...
Article
This paper describes an automated analytical system able to diagnose multivariate spectrophotometric responses, with the aim of detecting faulty responses and assigning causes to the symptoms detected. Not only does this system detect faulty spectra, but it is also capable of modifying, by means of a ‘feed-back response’, the entire analytical syst...
Article
In order to establish an adequate analytical system for the quality control of industrially produced titanium dioxide white pigments, two multivariate linear calibration techniques, principal component regression (PCR) and partial least squares (PLS), are used to model the relationship between the important pigment property, change of colour, and i...
Article
The use of experimental designs to obtain information about the factors and their interactions that affect the experimental system under study is described. For the classification of factors and their interactions according to their main and interactive effects a new parameter Ri is proposed. The computer program called EFFECTS has been made to si...
Article
Automatic classification of different mineralogical samples into 12 prespecified classes using Kohonen artificial neural networks (ANNs) is studied in comparison with standard chemometric techniques: hierarchical clustering and principal component analysis. The mineral types into one of which the unknown samples should be classified are pyrrhotite...
Article
Information about geographical and chronological origin is often required of archaeological samples. In order to obtain such information, pattern recognition techniques are now used as valid tools for processing series of data from morphological and chemical analyses. This article reviews the advantages and disadvantages of artificial neural networ...
Article
Two different artificial neural network (ANN) strategies for building a model for the quantitative prediction of the property called "total color difference" are described. The models in the study are based on eight different complex oxide concentration measurements. The models obtained by the ANNs are compared with the multivariate linear regres...
Article
Two different artificial neural networks (ANNs) for infrared spectra analysis are presented: the self-organizing Kohonen ANN for mapping of the infrared spectra into a 2-D plane and the counterpropagation ANN for determination of the structural features of organic compounds based on their infrared spectra. The preliminary learning in the Kohonen AN...
Article
The apparent metabolic energy (EMA) of barley is modelled as a function of 12 easily obtainable analytical parameters by applying neural networks with the error back-propagation learning strategy. Kohonen maps and Ward's clustering technique have been used to define the objects for the training and test sets. The architecture of the neural network...
Article
All problems that in some way are linked to handling of multi-variate experiments versus multi-variate responses can be approached by the group of methods that has recently become known as the artificial neural network (ANN) techniques. In this lecture, the types of the problems that can be solved by ANN techniques rather than the ANN techniques th...
Article
The neural networks employing the counter-propagation learning strategy are described and their use for making complex models and inverse models is explained. Two examples show how such modelling strategy can yield satisfactory results for the investigated systems. The first example describes building a model for quantitative prediction of the so-c...
Article
Two different artificial neural networks (ANNs) for infrared spectra analysis are presented: the self-organizing Kohonen ANN for mapping of the infrared spectra into a 2-D plane and the counterpropagation ANN for determination of the structural features of organic compounds based on their infrared spectra. The preliminary learning in the Kohonen AN...
Article
A comparison of classification abilities of two different neural network methods, namely, back-propagation of errors and Kohonen learning is made and discussed. The classification is performed on a set of 572 Italian olive oils on the basis of an analysis of eight fatty acids. The comparison of methods is carried out by different neural network arc...