
Gordon Crippen
- Ph. D.
- University of Michigan
160 Publications
28,722 Reads
10,092 Citations
Introduction
Current institution
Additional affiliations
September 1980 - September 1985
September 1985 - September 2010
Publications (160)
A standard task in distance geometry is to calculate one or more sets of Cartesian coordinates for a set of points that satisfy given geometric constraints, such as bounds on some of the L2 distances. Using L∞ distances instead is attractive because distance constraints can be expressed as simple linear bounds on coordinates. Likewise, a given matrix of...
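The abstract above notes that distance constraints can become simple linear bounds on coordinates. That holds for the max-norm (L∞, Chebyshev) distance, and the minimal sketch below assumes that norm; the abstract is truncated, so this is an illustration rather than the paper's method. The function names `linf_distance` and `satisfies_upper_bound_linear` are hypothetical.

```python
# Illustrative sketch only: function names are hypothetical, and the choice of
# the max-norm (L-infinity) distance is an assumption made for this example.

def linf_distance(p, q):
    """Max-norm (Chebyshev) distance between two points of equal dimension."""
    return max(abs(a - b) for a, b in zip(p, q))


def satisfies_upper_bound_linear(p, q, u):
    """Check d(p, q) <= u as a set of linear inequalities on coordinates:
    -u <= p[c] - q[c] <= u for every coordinate c."""
    return all(-u <= a - b <= u for a, b in zip(p, q))


p = (0.0, 1.0, 2.0)
q = (1.5, 1.0, 0.5)
u = 2.0

# The norm-based test and the coordinate-wise linear test agree.
assert (linf_distance(p, q) <= u) == satisfies_upper_bound_linear(p, q, u)
```

An upper bound is a conjunction of linear inequalities, as shown. A lower bound d(p, q) >= l is different: it only requires that at least one coordinate difference has magnitude at least l, which is a disjunction of linear inequalities.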
In order to better understand the many different distance geometry numerical algorithms, it is necessary to relate them to real-world problems in computational chemistry. Here we consider small molecule applications, determination of protein conformation from nuclear magnetic resonance experiments (NMR), protein homology modeling, and more abstract...
BCS classification is a vital tool in the development of both generic and innovative drug products. The purpose of this work was to provisionally classify the world's top selling oral drugs according to the BCS, using in-silico methods. Three different in-silico methods were examined: the well-established group contribution (CLogP) and atom contrib...
The searching and characterization of large chemical databases has recently provoked much interest, particularly with respect to the question of whether any of the compounds in the database could serve as new leads to a compound of pharmacological interest. This paper introduces a fast and novel method of determining whether any of a given series o...
The structure of the AMBER potential energy surface of the cyclic tetrapeptide cyclotetrasarcosyl is analyzed as a function of the dimensionality of coordinate space. It is found that the number of local energy minima decreases as the dimensionality of the space increases until some limit at which point equipotential subspaces appear. The applicabi...
Here we describe a new algorithm for automatically determining the mainchain sequential assignment of NMR spectra for proteins. Using only the customary triple resonance experiments, assignments can be quickly found for not only small proteins having rather complete data, but also for large proteins, even when only half the residues can be assigned...
Among the most important physicochemical properties of small molecules and macromolecules are the dissociation constants for any weakly acidic or basic groups, generally expressed as the pK(a) of each group. This is a major factor in the pharmacokinetics of drugs and in the interactions of proteins with other molecules. For both the protein and sm...
A simple, easily calculated, nonparametric statistic is described that can detect the presence of a functional relationship in bivariate data. Given a sample of data points (x,y), the statistic's value is nearly 1 if y is a linear function of x with little noise; it is greater than 1 if y is a nonlinear function of x; and it is close to 2 if x and...
Traditionally, quantitative structure-activity relations (QSAR) have been formulated as a linear combination of chemical compound properties that should fit the observed activity. Properties may be quantitative, such as logP, or qualitative, such as an indicator variable for the presence of a particular substituent. The advantages are that least sq...
Chirality is an important concept in medicinal chemistry, since many biochemical reactions and processes are stereospecific, including the recognition of some drugs by their receptors. Quantitative structure-activity relations (QSAR) are mathematical models that relate chemical structure and/or molecular properties to biological activity, and incor...
Realizing favorable absorption, distribution, metabolism, elimination, and toxicity profiles is a necessity due to the high attrition rate of lead compounds in drug development today. The ability to accurately predict bioavailability can help save time and money during the screening and optimization processes. As several robust programs already exi...
Elimination of cytotoxic compounds in the early and later stages of drug discovery can help reduce the costs of research and development. Through the application of principal components analysis (PCA), we were able to data mine and prove that approximately 89% of the total log GI50 variance is due to the nonspecific cytotoxic nature of substances....
The NCI Developmental Therapeutics Program Human Tumor cell line data set is a publicly available database that contains cellular assay screening data for over 40 000 compounds tested in 60 human tumor cell lines. The database also contains microarray assay gene expression data for the cell lines, and so it provides an excellent information resourc...
Cheminformatics can be broadly defined to encompass any activity related to the application of information technology to the study of properties, effects and uses of chemical agents. One of the most important current challenges in cheminformatics is to allow researchers to search databases of biomedical knowledge, using chemical structures as input...
Recently, we developed a pairwise structural alignment algorithm using realistic structural and environmental information (SAUCE). In this paper, we first present an automatic fold hierarchical classification based on SAUCE alignments. This classification enables us to build a fold tree containing different levels of multiple structural profiles...
Multiple STructural Alignment (MSTA) provides valuable information for solving problems such as fold recognition. The consistency-based approach tries to find conflict-free subsets of alignments from a pre-computed all-to-all Pairwise Alignment Library (PAL). If large proportions of conflicts exist in the library, consistency can be hard to get. On...
Rapid analysis of protein structure, interaction, and dynamics requires fast and automated assignments of 3D protein backbone triple-resonance NMR spectra. We introduce a new depth-first ordered tree search method of automated assignment, CASA, which uses hand-edited peak-pick lists of a flexible number of triple resonance experiments. The computer...
In the era of structural genomics, it is necessary to generate accurate structural alignments in order to build good templates for homology modeling. Although a great number of structural alignment algorithms have been developed, most of them ignore intermolecular interactions during the alignment procedure. Therefore, structures in different oligo...
Cluster distance geometry is a recent generalization of distance geometry whereby protein structures can be described at even lower levels of detail than one point per residue. With improvements in the clustering technique, protein conformations can be summarized in terms of alternative contact patterns between clusters, where each cluster contains...
The purpose of this study is to explore the use of classification and regression trees (CART) in predicting, in the dose-independent range, the fraction dose absorbed in humans. Since the results from clinical formulations in humans were used for training the model, a hypothetical state of drug molecules already dissolved in the intestinal fluid was ad...
Biphenyl hydrolase-like (BPHL) protein is a novel serine hydrolase which has been identified as human valacyclovirase (VACVase), catalyzing the hydrolytic activation of valine ester prodrugs of the antiviral drugs acyclovir and ganciclovir as well as other amino acid ester prodrugs of therapeutic nucleoside analogues. The broad specificity for nucl...
This is our second type of model for protein folding where the configurational parameters and the effective potential energy function are chosen in such a way that all conformations are described and the canonical partition function can be evaluated analytically. Structure is described in terms of distances between pairs of sequentially contiguous...
The distance geometry approach to conformational calculation has been shown to be very effective at producing large molecular structures satisfying many given, long-range constraints on the interatomic distances. I now present a significant extension of the method that handles strictly geometric constraints as well as before while also locating con...
Distance geometry has been a broadly useful tool for dealing with conformational calculations. Customarily each atom is represented as a point, constraints on the distances between some atoms are obtained from experimental or theoretical sources, and then a random sampling of conformations can be calculated that are consistent with the constraints....
We have initiated an entirely new approach to statistical mechanical models of strongly interacting systems where the configurational parameters and the potential energy function are both constructed so that the canonical partition function can be evaluated analytically. For a simplified model of proteins consisting of a single, fairly short polype...
Empirical protein folding potential functions should have a global minimum near the native conformation of globular proteins that fold stably, and they should give the correct free energy of folding. We demonstrate that otherwise very successful potentials fail to have even a local minimum anywhere near the native conformation, and a seemingly well validate...
Given atomic coordinates for a particular conformation of a molecule and some property value assigned to each atom, one can easily calculate a chirality function that distinguishes enantiomers, is zero for an achiral molecule, and is a continuous function of the coordinates and properties. This is useful as a quantitative measure of chirality for m...
Series approximations of the three-dimensional structure of protein conformations can provide insightful ways to detect and manipulate global features and those local to contiguous segments of the chain. Discrete cosine transforms have proven to be very useful in the past, and now wavelet transforms appear to have additional advantages. Here the em...
Adequate conformational searching of small molecules and inclusion of a chirality identifier are necessary features of any current technique for quantitative structure-activity relationships (QSAR). However, implementation of these features can be difficult and computationally expensive, and some techniques can still lead to insufficient treatment...
A novel set of molecular descriptors suitable for use in quantitative structure-activity relationships and related methods is described. These descriptors are a smooth and interpretable representation of atomic physicochemical property values and intramolecular atom pair distances. Distance atomic physicochemical parameter energy relationships (DAP...
We present a simple method to train a potential function for the protein folding problem which, even though trained using a small number of proteins, is able to place a significantly large number of native conformations near a local minimum. The training relies on generating decoys by energy minimization of the native conformations using the curren...
Self-avoiding lattice walks are often used as minimalist models of proteins. Typically, the polypeptide chain is represented as a lattice walk with each amino acid residue lying on a lattice point, and the Hamiltonian being a sum of interactions between pairs of sequentially nonadjacent residues on adjacent points. Interactions depend on the types...
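The contact Hamiltonian described above can be sketched in a few lines. This is a minimal illustration, not the paper's model: the names `contacts` and `energy`, the two-letter residue alphabet, and the interaction table are all assumptions made for the example. Lattice points are integer tuples, and two points count as adjacent when they differ by one unit along a single axis.

```python
# Illustrative sketch only: names and the example interaction table are hypothetical.

def contacts(walk):
    """Return pairs (i, j) of sequentially nonadjacent residues (j >= i + 2)
    that sit on adjacent lattice points (Manhattan distance 1)."""
    pairs = []
    for i in range(len(walk)):
        for j in range(i + 2, len(walk)):
            if sum(abs(a - b) for a, b in zip(walk[i], walk[j])) == 1:
                pairs.append((i, j))
    return pairs


def energy(walk, sequence, interaction):
    """Sum the contact energies of a self-avoiding walk.

    `walk` is a list of integer tuples, `sequence` gives one residue type per
    point, and `interaction` maps a frozenset of residue types to an energy.
    Pairs of types missing from the table contribute zero.
    """
    if len(set(walk)) != len(walk):
        raise ValueError("walk is not self-avoiding")
    return sum(
        interaction.get(frozenset((sequence[i], sequence[j])), 0.0)
        for i, j in contacts(walk)
    )


# Four residues folded around a unit square: only residues 0 and 3 are in contact.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
table = {frozenset(("H", "H")): -1.0}  # assumed example: only H-H contacts are favorable
assert contacts(square) == [(0, 3)]
assert energy(square, "HPPH", table) == -1.0
```

Because the energy depends only on the contact set, all conformations sharing the same contacts fall on the same energy level.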
The overlap of ligand atoms has been analyzed for 32 common enzyme systems. The ligand alignment was determined by superposition of the experimentally determined protein structures. Comparison of the overlapping atoms in terms of atomic contribution to partition coefficient and molar refractivity shows that in most cases ligand atoms overlap with a...
Given an all non-hydrogen-atom potential function that implicitly includes solvation effects, it is possible to adjust its parameters to favor the correct native structure for several proteins over decoys produced by ungapped threading. It is also possible to further train it to reproduce the experimental free energy of unfolding in aqueous solutio...
A protein folding potential function ideally has several properties: it favors the native conformations for a number of protein sequences over a variety of nonnative folds; it can guide the search over conformations for the native state; it reflects changes in stability of the native fold due to changes in sequence; and it is relatively insensitive...
An energy potential is constructed and trained to succeed in fold recognition for the general population of proteins as well as an important class which has previously been problematic: small, disulfide-bearing proteins. The potential is modeled on solvation, with the energy a function of side chain burial and the number of disulfide bonds. An accu...
Self-avoiding walks on a three-dimensional (3D) simple cubic lattice are often used to model polymers, especially proteins. The Hamiltonian is generally taken to be a function of contacts between sequentially nonadjacent residues. The set of all conformations having a particular set of contacts occupies the same energy level, and one would like to...
One of the approaches to protein structure prediction is to obtain energy functions which can recognize the native conformation of a given sequence among a zoo of conformations. The discriminations can be done by assigning the lowest energy to the native conformation, with the guarantee that the native is in the zoo. Well-adjusted functions, then,...
VRI (Variable Resolution Invariants) is a new approach to quantitative structure-activity relations that makes use of three-dimensional features of molecules at different levels of spatial resolution as well as levels of resolution in atomic properties. These descriptors are independent of any numbering of the atoms of a molecule. They are also ind...
We present a new atom type classification system for use in atom-based calculation of partition coefficient (log P) and molar refractivity (MR) designed in part to address published concerns of previous atomic methods. The 68 atomic contributions to log P have been determined by fitting an extensive training set of 9920 molecules, with r² = 0.918...
Protein structural knowledge is essential to understanding the molecular basis of disease. Drug design and discovery are facilitated by understanding the three-dimensional shape of relevant proteins, as will be future modes of disease intervention such as gene therapy. While identification of disease-related protein sequences has dramatically increased...
One of the most challenging problems in computer-aided drug design is deducing one or more models of a binding site solely from the known chemical structures and measured binding affinities of a few small molecules. In particular, the X-ray crystal structure of the receptor is not known. The main difficulty is deciding how the different ligands are...
The assembly of large compound libraries for the purpose of screening against various receptor targets to identify chemical leads for drug discovery programs has created a need for methods to measure the molecular diversity of such libraries. The method described here, for which we propose the acronym RESIS (for Receptor Site Interaction Simulation...
It is hard to construct theories for the folding of globular proteins because they are large and complicated molecules having enormous numbers of nonnative conformations and having native states that are complicated to describe. Statistical mechanical theories of protein folding are constructed around major simplifying assumptions about the energy...
EGSITE2 represents a substantial advance in a long series of methods for calculating receptor site models given only specific binding data. Compared to our most recently reported technique, EGSITE [Schnitker et al. J. Comput.-Aided Mol. Des. 1997, 11, 93-110] the user no longer has to simplify the structures of the molecules in the training set by...
Protein folding and inverse protein folding problems are examined for the extremely simplified model of short self-avoiding square lattice walks involving only two or three residue types. Simple interresidue contact free energy functions are given and are used to determine which sequences fold uniquely to which conformations. Contrary to general th...
We report the application of a recently developed alignment-free 3D QSAR method [Crippen, G.M., J. Comput. Chem., 16 (1995) 486] to a benchmark-type problem. The test system involves the binding of 31 steroid compounds to two kinds of human carrier protein. The method used not only allows for arbitrary binding modes, but also avoids the problems of...
To calculate the tertiary structure of a protein from its amino acid sequence, the thermodynamic approach requires a potential function of sequence and conformation that has its global minimum at the native conformation for many different proteins. Here we study the behavior of such functions for the simplest model system that still has some of the...
In order to calculate the tertiary structure of a protein from its amino acid sequence, the thermodynamic approach requires a potential function of sequence and conformation that has its global minimum at the native conformation for many different proteins. Here we study the behavior of such functions for the simplest model system that still has th...
For decades, a large number of investigators have been sifting the database of experimentally determined three-dimensional protein structures to discover recurring patterns of all types. Now that there are over a thousand such structures available, the natural question is whether we have seen all substantially different protein folds, and if not, h...
As the three-dimensional structures of more and more proteins are determined by experiment, discovering substantially novel folding motifs becomes ever rarer. The natural question is how many motifs are there and how many have already been found? In order to answer this in at least one plausible and well-defined sense, we have chosen a quantitative...
Protein structures are routinely compared by their root-mean-square deviation (RMSD) in atomic coordinates after optimal rigid body superposition. What is not so clear is the significance of different RMSD values, particularly above the customary arbitrary cutoff for obvious similarity of 2-3 Å. Our earlier work argued for an intrinsic cutoff for p...
In the search for new drugs, it often occurs that the binding affinities of several compounds to a common receptor macromolecule are known experimentally, but the structure of the receptor is not known. This article describes an extraordinarily objective computer algorithm for deducing the important geometric and energetic features of the common bi...
There has been a great deal of activity recently on approaches to the calculation of protein folding using specially devised empirical potential functions. We have developed one such function that solves the protein structure recognition problem: given the sequence for a globular protein and a collection of plausible protein conformations, includin...
In the search for new drugs, it often occurs that the binding affinities of several compounds to a common receptor macromolecule are known experimentally, but the structure of the receptor is not known. We describe an extraordinarily objective computer algorithm for deducing the important geometric and energetic features of the common binding site,...
Over the last few years we have developed an empirical potential function that solves the protein structure recognition problem: given the sequence for an n-residue globular protein and a collection of plausible protein conformations, including the native conformation for that sequence, identify the correct, native conformation. Having determined t...
In the study of globular protein conformations, one customarily measures the similarity in three-dimensional structure by the root-mean-square deviation (RMSD) of the Cα atomic coordinates after optimal rigid body superposition. Even when the two protein structures each consist of a single chain having the same number of residues so that the m...
The classical protein folding problem is to predict the three-dimensional (3D) conformation of a protein given only its amino acid sequence. In the thermodynamic approach one attempts to simulate this by choosing some kind of potential function of conformation and then searching for the conformation(s) having the global minimum of this function. Th...
The Voronoi approach has been used to obtain a three-dimensional model for the binding of the cocaine analogues at the cocaine receptor site. The method has been used to determine the geometric details and the physicochemical properties of the binding regions in the receptor site. With only eight compounds in the training set, the Voronoi site mode...
A commonly occurring problem in drug development is that the binding affinities for a few compounds to a particular binding site on some protein have been measured, but the crystal structure for that protein is not available. Quantitative structure-activity methods attempt to empirically correlate the binding data with various features of the chemi...
Vorom is a computer-aided method of drug design which can model a biological receptor given only binding data of known ligands. Using the binding energies of known competitive, reversible ligands of a biological macromolecule, vorom can make predictions about the binding energies and conformations of other small molecules binding to that receptor a...
We have devised a new measure of molecular similarity with respect to given simple partitions of space into regions. The similarity is determined by numerical integration of the difference in the optimal interaction between the two molecules and the regions over a large range of interaction parameter values. Compounds differing in empirical formula...
Linearized embedding is a variant on the usual distance geometry methods for finding atomic Cartesian coordinates given constraints on interatomic distances. Instead of dealing primarily with the matrix of interatomic distances, linearized embedding concentrates on properties of the metric matrix, the matrix of inner products between pairs of vecto...
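To make the term concrete, the metric matrix mentioned above can be built directly from a complete matrix of interatomic distances. The sketch below uses the standard distance-geometry construction with the centroid as origin; it illustrates the metric matrix only, not the linearized embedding algorithm itself, and the function name `metric_matrix` is hypothetical.

```python
import math

# Illustrative sketch only: the function name is hypothetical, and the centroid
# is assumed as the origin for the inner products.

def metric_matrix(dist):
    """Build the metric (Gram) matrix from a full symmetric distance matrix.

    The squared distance of point i to the centroid is
        d_i0^2 = (1/n) * sum_j d_ij^2 - (1/n^2) * sum_{j<k} d_jk^2,
    and the law of cosines then gives
        g_ij = (d_i0^2 + d_j0^2 - d_ij^2) / 2.
    """
    n = len(dist)
    sq = [[d * d for d in row] for row in dist]
    pair_sum = sum(sq[j][k] for j in range(n) for k in range(j + 1, n))
    d0 = [sum(sq[i]) / n - pair_sum / n**2 for i in range(n)]
    return [[(d0[i] + d0[j] - sq[i][j]) / 2 for j in range(n)] for i in range(n)]


# Check against known coordinates: the corners of a unit square.
points = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
dist = [[math.dist(p, q) for q in points] for p in points]
G = metric_matrix(dist)

centroid = tuple(sum(c) / len(points) for c in zip(*points))
centered = [tuple(a - b for a, b in zip(p, centroid)) for p in points]
for i, xi in enumerate(centered):
    for j, xj in enumerate(centered):
        dot = sum(a * b for a, b in zip(xi, xj))
        assert abs(G[i][j] - dot) < 1e-9
```

In conventional embedding, Cartesian coordinates are then recovered from the leading eigenvectors of this matrix, scaled by the square roots of the corresponding eigenvalues.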
We have devised a continuous function of interresidue contacts in globular proteins such that the X-ray crystal structure has a lower function value than that of thousands of protein-like alternative conformations. Although we fit the adjustable parameters of the potential using only 10,000 alternative structures for a selected training set of 37 p...
One of the most difficult problems in computational chemistry is the prediction of the three-dimensional structure of a protein molecule given only its amino acid sequence. Although there are several programs for calculating the empirical or quantum mechanical energies, and there are more programs for either minimizing the energy as a function of c...
Since the 1988 monograph “Distance Geometry and Molecular Conformation” by Crippen and Havel, there have been significant changes in the application of distance geometry to problems of chemical interest. This review attempts to outline what the current state of the art is, in both the underlying mathematical methods and chemical applications, and t...
Predicting the three-dimensional structure of a protein given only its amino acid sequence is a long-standing goal in computational chemistry. In the thermodynamic approach, one needs a potential function of conformation that resembles the free energy of the real protein to the extent that the global minimum of the potential is attained by the nati...
A frequently occurring problem in drug design and enzymology is that the binding constants for several compounds to the same site are known, but the geometry and energetic interactions of the site are not. This paper presents in detail a novel approach to the problem which accurately but compactly represents the allowed conformation space of each l...
Given a sufficiently good empirical potential function for the internal energy of molecules, prediction of the preferred conformations is nearly impossible for large molecules because of the enormous number of local energy minima. Energy embedding has been a promising method for locating extremely good local minima, if not always the global minimum...
A general method is presented for constructing a potential function for approximate conformational calculations on globular proteins. The method involves solving a nonlinear program that seeks to adjust the potential's parameters in such a way that a minimum near the native remains a minimum and does not move far away, while any alternative minima...
A novel computer-aided receptor modeling method, REMOTEDISC [J. Med. Chem. 32:746-756 (1989)], has been used to analyze the inhibition of labeled diazepam binding by 29 benzodiazepine receptor ligands. The method uses the three-dimensional structure, conformational energy, and important atom-based physicochemical properties to model the hypothetica...
A three-dimensional Voronoi binding site model has been formulated from a series of competitors for the binding site on a recently isolated polycyclic aromatic hydrocarbon binding protein (PBP) from mouse liver. The PBP binds polycyclic aromatic hydrocarbons, such as benzo[a]pyrene (B[a]P), with high affinity and shows other characteristics associa...
Comparison of 1H and 13C NMR parameters for the cyclic, conformationally restricted, δ opioid receptor selective enkephalin analogue Tyr-D-Pen-Gly-Phe-D-Pen ([D-Pen2,D-Pen5]enkephalin, DPDPE) in aqueous versus dimethyl sulfoxide (DMSO) solution indicates that this peptide adopts similar conformations in these solvents. This suggestion that the conf...
There are many methods in the literature for calculating conformations of a molecule subject to geometric constraints, such as those derived from two-dimensional NMR experiments. One of the most general ones is the EMBED algorithm, based on distance geometry, where all constraints except chirality are converted into upper and lower bounds on intera...
A new and accurate method for calculating the geometrically allowed modes of binding of a ligand molecule to a Voronoi site model is reported. It is shown that the feasibility of the binding of a group of atoms to a Voronoi site reduces to a simple set of linear and quadratic inequalities and quadratic equalities which can be solved by minimization...
The in vitro antiviral activity of 28 nucleosides against the parainfluenza virus type 3 has been analyzed by using a novel computer aided receptor modeling procedure. The method involves an extensive modification of our earlier work (Ghose, A. K.; Crippen, G. M. J. Med. Chem. 1985, 28, 333). It presents a more straightforward algorithm for the ste...
Distance geometry is a technique widely used to find atomic coordinates that agree with given upper and lower bounds on the interatomic distances. It is successful because it chooses at random some relatively good "trial coordinates" that take into account the whole molecule and all constraints at once. Customarily, these trial coordinates must be...
In an earlier article [8] the need was demonstrated for atomic physicochemical properties for three-dimensional structure-directed quantitative structure-activity relationships, and it was shown how atomic parameters can be developed for successfully evaluating the molecular octanol-water partition coefficient, which is a measure of hydrophobicity. I...
Energy embedding is a numerical technique for locating conformations of molecules corresponding to very good local minima of the potential function used, which generally has involved only atom-pair interactions. Computational experience with various molecular force fields has shown it has a remarkable talent for locating either the global energy mi...