Homology-Based Modeling of Protein Structure
ABSTRACT The human genome project has already discovered millions of proteins (http://www.swissprot.com). The potential of the genome
project can only be fully realized once we can assign, understand, manipulate, and predict the function of these new proteins
(Sanchez and Sali, 1997; Frishman et al., 2000; Domingues et al., 2000). Predicting protein function generally requires knowledge
of protein three-dimensional structure (Blundell et al., 1978; Weber, 1990), which is ultimately determined by protein sequence
(Anfinsen, 1973). Protein structure determination using experimental methods such as X-ray crystallography or NMR spectroscopy
is very time consuming (Johnson et al., 1994). To date, fewer than 2% of the known proteins have had their structures solved
experimentally. In 2004, more than half a million new proteins were sequenced, almost double the number sequenced in the previous
year, but only 5300 structures were solved. Although the rate of experimental structure determination will continue to increase,
the number of newly discovered sequences grows much faster than the number of structures solved (see Fig. 10.1).
- ABSTRACT: A new method for calculating the total conformational free energy of proteins in water solvent is presented. The method consists of a relatively brief simulation by molecular dynamics with explicit solvent (ES) molecules to produce a set of microstates of the macroscopic conformation. Conformational energy and entropy are obtained from the simulation, the latter in the quasi-harmonic approximation by analysis of the covariance matrix. The implicit solvent (IS) dielectric continuum model is used to calculate the average solvation free energy as the sum of the free energies of creating the solute-size hydrophobic cavity, of the van der Waals solute-solvent interactions, and of the polarization of water solvent by the solute's charges. The reliability of the solvation free energy depends on a number of factors: the details of arrangement of the protein's charges, especially those near the surface; the definition of the molecular surface; and the method chosen for solving the Poisson equation. Molecular dynamics simulation in explicit solvent relaxes the protein's conformation and allows polar surface groups to assume conformations compatible with interaction with solvent, while averaging of internal energy and solvation free energy tends to enhance the precision. Two recently developed methods have been employed: SIMS, for calculation of a smooth invariant molecular surface, and FAMBE, for solution of the Poisson equation via a fast adaptive multigrid boundary element method. The SIMS and FAMBE programs scale linearly with the number of atoms. SIMS is superior to Connolly's MS (molecular surface) program: it is faster, more accurate, and more stable, and it smooths singularities of the molecular surface. Solvation free energies calculated with these two programs do not depend on molecular position or orientation and are stable along a molecular dynamics trajectory. We have applied this method to calculate the conformational free energy of native and intentionally misfolded globular conformations of proteins (the EMBL set of deliberately misfolded proteins) and have obtained good discrimination in favor of the native conformations in all instances.
Proteins: Structure, Function, and Bioinformatics 10/1998; 32(4):399-413.
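The bookkeeping described in the abstract above can be illustrated with a short sketch. It assumes the total is assembled as the trajectory-averaged internal energy, minus temperature times the conformational entropy, plus the trajectory-averaged solvation free energy (cavity, van der Waals, and polarization terms). The abstract does not give the explicit formula, so this sign convention, the `Microstate` type, the function name, and all numbers are hypothetical illustrations, not the authors' implementation. The entropy is taken as an input here; computing it from the covariance matrix is outside the scope of the sketch.

```python
from dataclasses import dataclass
from statistics import fmean


@dataclass
class Microstate:
    """One snapshot from the explicit-solvent trajectory (hypothetical container).

    All energies are assumed to be in kcal/mol.
    """
    internal_energy: float  # conformational (intramolecular) energy
    g_cavity: float         # free energy of creating the solute-size cavity
    g_vdw: float            # van der Waals solute-solvent interaction
    g_polar: float          # polarization of the solvent by the solute's charges

    @property
    def solvation(self) -> float:
        # Solvation free energy as the sum of the three terms named in the abstract.
        return self.g_cavity + self.g_vdw + self.g_polar


def conformational_free_energy(microstates, entropy, temperature=300.0):
    """Assumed form: <U> - T*S + <G_solv>, averaged over the microstates.

    entropy is the conformational entropy in kcal/(mol*K), supplied by the caller.
    """
    if not microstates:
        raise ValueError("at least one microstate is required")
    mean_internal = fmean([m.internal_energy for m in microstates])
    mean_solvation = fmean([m.solvation for m in microstates])
    return mean_internal - temperature * entropy + mean_solvation


# Made-up numbers, for illustration only.
native = [
    Microstate(-100.0, 10.0, -20.0, -50.0),
    Microstate(-102.0, 12.0, -22.0, -52.0),
]
g_native = conformational_free_energy(native, entropy=0.1)
```

Under this convention, discriminating a native from a misfolded conformation amounts to comparing the two totals, with the lower value favored.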
- Biopolymers 01/1988; 26(12):2053-85.
- ABSTRACT: Two principal methods of determining the conformation of short pieces of polypeptide backbone in proteins have been developed: using a database of known structures and systematically generating all conformations. In this paper, we compare the effectiveness of these two techniques. The completeness of the database for segments of different lengths is examined and it is found to contain most conformations for segments seven residues long, but to deteriorate rapidly for longer regions. When the database segment is to be incorporated into the rest of a structure, at least seven residues are required to build four new residues, because of the need to position the segment relative to the rest of the structure. It is found that such positioning using flanking residues results in large errors in the inserted region. We conclude that the database method is currently not effective for comparative modeling, even for short segments. The systematic search procedure is found to generate almost all structures of short segments found in proteins. In contrast to the database method, low root mean square error structures are obtained for a set of trial segments embedded in the rest of a protein structure. Thus, it should be considered the method of choice.
Protein Engineering 09/1994; 7(8):953-60.
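The root mean square error used in the abstract above to judge a rebuilt segment can be sketched as follows. This is a minimal illustration, assuming the modeled segment is already expressed in the coordinate frame of the rest of the structure, so no superposition is applied before comparing it with the reference coordinates. The function name and the coordinates are hypothetical, and the abstract does not say which atoms were included in the comparison.

```python
import math


def segment_rms_error(model, reference):
    """Root mean square deviation between matched atoms of two segments.

    model, reference: equal-length sequences of (x, y, z) tuples in angstroms,
    listed in the same atom order and in the same coordinate frame.
    """
    if not model or len(model) != len(reference):
        raise ValueError("segments must be non-empty and of equal length")
    squared = 0.0
    for (xm, ym, zm), (xr, yr, zr) in zip(model, reference):
        squared += (xm - xr) ** 2 + (ym - yr) ** 2 + (zm - zr) ** 2
    return math.sqrt(squared / len(model))


# Made-up four-atom segment; the model is displaced by 1 angstrom along y.
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0), (4.5, 0.0, 0.0)]
model = [(x, y + 1.0, z) for (x, y, z) in reference]
error = segment_rms_error(model, reference)
```

Skipping the superposition step is deliberate in this sketch: a segment positioned by its flanking residues can have a good internal shape and still sit in the wrong place, and only a comparison in the fixed frame of the protein exposes that error.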