ABSTRACT: Theoretical studies and in vitro experiments suggest that protein folding cores form early in the process of folding, and that proteins may have evolved to optimize both folding speed and native-state stability. In our previous work (Chen et al., Structure, 14 (2006) 1401), we developed a set of empirical potential functions and used them to analyze interaction energies among secondary-structure elements in two beta-sandwich proteins. Our work on this group of proteins demonstrated that the predicted folding core also harbors residues that form native-like interactions early in the folding reaction. In the current work, we have tested our empirical potential functions on structurally different proteins for which the folding cores have been revealed by protein hydrogen-deuterium exchange experiments. Using a set of 29 unrelated proteins, which have been extensively studied in the literature, we demonstrate that the average prediction result from our method is significantly better than predictions based on other computational methods. Our study is an important step towards the ultimate goal of understanding the correlation between folding cores and native structures.
Archives of Biochemistry and Biophysics 01/2009; 483(1):16-22. · 3.37 Impact Factor
ABSTRACT: In this article, we present a de novo method for predicting protein domain boundaries, called OPUS-Dom. The core of the method is a novel coarse-grained folding method, VECFOLD, which constructs low-resolution structural models from a target sequence by folding a chain of vectors representing the predicted secondary-structure elements. OPUS-Dom generates a large ensemble of folded structure decoys by VECFOLD and labels the domain boundaries of each decoy by a domain parsing algorithm. Consensus domain boundaries are then derived from the statistical distribution of the putative boundaries and three empirical sequence-based domain profiles. OPUS-Dom generally outperformed several state-of-the-art domain prediction algorithms over various benchmark protein sets. Even though each VECFOLD-generated structure contains large errors, collectively these structures provide a more robust delineation of domain boundaries. The success of OPUS-Dom suggests that the arrangement of protein domains is more a consequence of limited coordination patterns per domain, arising from tertiary packing of secondary-structure segments, than of sequence-specific constraints.
Journal of Molecular Biology 12/2008; 385(4):1314-29. · 3.91 Impact Factor
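The consensus step described in the OPUS-Dom abstract above can be illustrated with a minimal Python sketch. It is not the OPUS-Dom algorithm: it assumes a simple gap-based clustering of boundary positions, ignores the three sequence-based domain profiles, and uses hypothetical names and defaults (`consensus_boundaries`, `max_gap`, `min_fraction`). It only shows how boundaries that are individually noisy can still agree collectively.

```python
from statistics import median

def consensus_boundaries(decoy_boundaries, max_gap=5, min_fraction=0.3):
    """Cluster per-decoy boundary positions and keep well-supported clusters.

    decoy_boundaries: one list of residue positions per decoy (hypothetical input format).
    max_gap: positions closer than this to the previous one join the same cluster.
    min_fraction: minimum fraction of decoys that must contribute to a cluster.
    """
    n_decoys = len(decoy_boundaries)
    # Flatten to (position, decoy index) pairs, sorted by position.
    votes = sorted((pos, i) for i, decoy in enumerate(decoy_boundaries) for pos in decoy)

    clusters, current = [], []
    for pos, idx in votes:
        if current and pos - current[-1][0] > max_gap:
            clusters.append(current)
            current = []
        current.append((pos, idx))
    if current:
        clusters.append(current)

    consensus = []
    for cluster in clusters:
        # Support counts distinct decoys, so one decoy cannot vote twice.
        support = len({idx for _, idx in cluster}) / n_decoys
        if support >= min_fraction:
            consensus.append(round(median(pos for pos, _ in cluster)))
    return consensus

# Six toy decoys: five place a boundary near residue 100, two isolated votes are outliers.
decoys = [[98], [100], [102], [101], [50], [99, 150]]
boundaries = consensus_boundaries(decoys)
```

In this toy input the isolated votes at residues 50 and 150 fall below the support threshold, so only the cluster around residue 100 survives.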
ABSTRACT: In this paper, we report a knowledge-based potential function, named the OPUS-Ca potential, that requires only Cα positions as input. The contributions from other atomic positions were established from pseudo-positions artificially built from a Cα trace for auxiliary purposes. The potential function is built on seven major representative molecular interactions in proteins: distance-dependent pairwise energy with orientational preference, hydrogen bonding energy, short-range energy, packing energy, tri-peptide packing energy, three-body energy, and solvation energy. Decoy-recognition tests on a number of commonly used decoy sets show that the new potential function outperforms all known Cα-based potentials and most other coarse-grained ones that require more information than Cα positions. We hope that this potential function adds a new tool for protein structural modeling.
Protein Science 08/2007; 16(7):1449-63. · 2.74 Impact Factor
ABSTRACT: This paper reports a combined computational and experimental study of the correlation between protein stability cores and folding kinetics. An empirical potential function was developed, and it was used for analyzing interaction energies among secondary-structure elements. Studies on a beta-sandwich protein, Pseudomonas aeruginosa azurin, showed that the computationally identified substructure with the strongest interactions in the native state is identical to the "interlocked pair" of beta strands, an invariant motif found in most sandwich-like proteins. Moreover, previous and new in vitro folding results revealed that the identified substructure harbors most residues that form native-like interactions in the folding transition state. These observations demonstrate that the potential function is effective in revealing the relative strength of interactions among various protein parts; they also strengthen the suggestion that the most stable regions in native proteins favor stable interactions early during folding.
ABSTRACT: This paper reports a computational method for folding small helical proteins. The goal was to determine the overall topology of proteins given secondary structure assignment on sequence. In doing so, a Monte Carlo protocol, which combines coarse-grained normal modes and a Hamiltonian at a different scale, was developed to enhance sampling. In addition to the knowledge-based potential functions, a small-angle X-ray scattering (SAXS) profile was also used as a weak constraint for guiding the folding. The algorithm can deliver structural models with overall correct topology, which makes them similar to those of 5-6 Å cryo-EM density maps. The success could help make the SAXS technique a fast and inexpensive solution-phase experimental method for determining the overall topology of small, soluble, but noncrystallizable, helical proteins.
ABSTRACT: We report a novel computational procedure for determining protein native topology, or fold, by defining loop connectivity based on skeletons of secondary structures that can usually be obtained from low- to intermediate-resolution density maps. The procedure primarily involves a knowledge-based geometry filter followed by an energetics-based evaluation. It was tested on a large set of skeletons covering a wide range of protein architecture, including one modeled from an experimentally determined 7.6 Å cryo-electron microscopy (cryo-EM) density map. The results showed that the new procedure could effectively deduce protein folds without high-resolution structural data, a feature that could also be used to recognize native fold in structure prediction and to interpret data in fields like structural genomics. Most importantly, in the energetics-based evaluation, it was revealed that, despite the inevitable errors in the artificially constructed structures and limited accuracy of knowledge-based potential functions, the average energy of an ensemble of structures with slightly different configurations around the native skeleton is a much more robust parameter for marking native topology than the energy of individual structures in the ensemble. This result implies that, among all the possible topology candidates for a given skeleton, evolution has selected the native topology as the one that can accommodate the largest structural variations, not the one rigidly trapped in a deep, but narrow, conformational energy well.
Journal of Molecular Biology 08/2005; 350(3):571-86. · 3.91 Impact Factor
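The contrast drawn in the abstract above, between the energy of a single structure and the average energy of an ensemble around it, can be illustrated with a toy calculation. The sketch below is an assumption-laden illustration, not the published procedure: the two one-dimensional energy wells, their depths and widths, and the displacement offsets are all hypothetical stand-ins for real topology candidates and knowledge-based potentials.

```python
import math

def narrow_well(x):
    """Deep but narrow toy energy well (hypothetical landscape)."""
    return -10.0 * math.exp(-(x / 0.1) ** 2)

def broad_well(x):
    """Shallower but broad toy energy well (hypothetical landscape)."""
    return -6.0 * math.exp(-(x / 2.0) ** 2)

def ensemble_average(energy, center, offsets):
    """Mean energy over structures displaced from `center` by each offset."""
    return sum(energy(center + d) for d in offsets) / len(offsets)

candidates = {"narrow": narrow_well, "broad": broad_well}
# Fixed displacements standing in for slightly different configurations.
offsets = [-1.0, -0.5, 0.0, 0.5, 1.0]

# Ranking by the energy of the single, unperturbed structure.
best_single = min(candidates, key=lambda name: candidates[name](0.0))
# Ranking by the average energy over the perturbed ensemble.
best_ensemble = min(
    candidates, key=lambda name: ensemble_average(candidates[name], 0.0, offsets)
)
```

With these toy numbers the single-structure ranking favors the deep, narrow well, while the ensemble average favors the broad well, because the narrow well contributes almost nothing once the structure is displaced.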
ABSTRACT: The atomic model of F-actin was refined against fiber diffraction data using long-range normal modes as adjustable parameters to account for the collective long-range filamentous deformations. To determine the effect of long-range deformations on the refinement, each of the four domains of G-actin was treated as a rigid body. It was found that among all modes, the bending modes make the most significant contributions to the improvement of the refinement. Inclusion of only 7-9 bending modes as adjustable parameters yielded a lowest R-factor of 6.3%. These results demonstrate that employing normal modes as refinement parameters has the advantage of using a small number of adjustable parameters to achieve a good fitting efficiency. Such a refinement procedure may therefore prevent the refinement from overfitting the structural model. More importantly, the results of this study demonstrate that, for any fiber diffraction data, a substantial amount of refinement error is due to long-range deformations, especially the bending, of the filaments. The effects of these intrinsic deformations cannot be easily compensated for by adjusting local structural parameters, and must be properly accounted for in the refinement to achieve an improved fit of refined models to experimental diffraction data.
ABSTRACT: This review briefly summarizes the recent development in fibre diffraction refinement of flexible filamentous systems using long-range normal modes as adjustable parameters. Among all the long-range modes, the low-frequency bending modes were found to contribute the most to the improvement of the refinement. The use of several low-frequency modes in the refinement decreased significantly both R- and Rfree-factors, demonstrating the advantage of this procedure in achieving a good refinement without the risk of over-fitting. Moreover, the study provided strong evidence that substantial errors in conventional refinements are due to long-range deformations, especially the bending, of the filaments. These intrinsic deformations must be properly accounted for in order to improve the refinement efficiency. Fibre diffraction is widely employed for studying structures of biologically important filamentous systems (Stubbs, 1999) ranging from simple polypeptides to cytoskeletal filaments and filamentous viruses. In fibre diffraction experiments, the fibre specimens are aligned axially, but not azimuthally, and diffraction patterns are cylindrically averaged, which leads to the characteristic layer lines. Because of this averaging, the number of independent diffractions of fibres is significantly smaller than that from crystals. One usually does not have sufficient data to refine the Cartesian coordinates of each atom in the fibres. This feature imposes a severe challenge in choosing proper refinement parameters (Wang and Stubbs, 1993).
ABSTRACT: Here we report the results of applying the substructure synthesis method to the simulation of F-actin filaments several microns in length. The elastic deformational modes of long F-actin filaments were generated from the vibrational modes of the 13-subunit repeat of F-actin using a hierarchical synthesis scheme. The computationally synthesized deformational modes, in the very low-frequency regime, are in good agreement with theoretical solutions for long homogeneous elastic rods, which confirmed the usefulness of the substructure synthesis method. Other low-frequency modes carry rich local deformational features that are unique to F-actin. All these modes thus provide a theoretical basis set for a description of spontaneously occurring thermal deformations, such as undulations, of the filaments. The results demonstrate that the substructure synthesis method, as a method for computational modal analysis, is capable of scaling up the microscopic dynamic information, obtained from atomistic simulations, to a wide range of macroscopic length scales. Moreover, the combination of the substructure synthesis method and the hierarchical synthesis scheme provides an effective way of dealing with complex systems of periodic repeats that are abundant in cells.
ABSTRACT: Pyruvate dehydrogenase complex (PDC) is one of the largest multienzyme complexes known and consists of a dodecahedral E2 core to which other components are attached. We report the results of applying a new computational method, the quantized elastic deformational model, to simulating the conformational fluctuations of the truncated E2 core, using low-resolution electron cryomicroscopy density maps. The motional features are well reproduced; in particular, the symmetric breathing mode revealed in simulation is nearly identical to what was observed experimentally. Structural details of the motions of the trimeric building blocks, which are critical to facilitating the global expansion and contraction of the complex, were revealed. Using the low-resolution maps from electron cryomicroscopy reconstructions, the simulations showed a picture of the motional mechanism of the PDC core, which is an unprecedented example of thermally activated global dynamics. Moreover, the current results support an earlier suggestion that, at low resolution and without the use of amino acid sequence and atomic coordinates, it is possible for computer simulations to provide an accurate description of protein dynamics.
Journal of Molecular Biology 07/2003; 330(1):129-35. · 3.91 Impact Factor
ABSTRACT: This paper reports a computational method for describing the conformational flexibility of very large biomolecular complexes using a reduced number of degrees of freedom. It is called the substructure synthesis method, and the basic concept is to treat the motions of a given structure as a collection of those of an assemblage of substructures. The choice of substructures is arbitrary and sometimes quite natural, such as domains, subunits, or even large segments of biomolecular complexes. To start, a group of low-frequency substructure modes is determined, for instance by normal mode analysis, to represent the motions of the substructure. Next, a desired number of substructures are joined together by a set of constraints to enforce geometric compatibility at the interface of adjacent substructures, and the modes for the assembled structure can then be synthesized from the substructure modes by applying the Rayleigh-Ritz principle. Such a procedure is computationally much more desirable than solving the full eigenvalue problem for the whole assembled structure. Furthermore, to show the applicability to biomolecular complexes, the method is used to study F-actin, a large filamentous molecular complex involved in many cellular functions. The results demonstrate that the method is capable of studying the motions of very large molecular complexes that are otherwise completely beyond the reach of any conventional methods.
Proceedings of the National Academy of Sciences 02/2003; 100(1):104-9. · 9.81 Impact Factor
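The Rayleigh-Ritz step mentioned in the abstract above can be sketched on a toy system. The example below is a simplified illustration under stated assumptions, not the published substructure synthesis method: the "structure" is a one-dimensional chain of unit masses and unit springs, the two "substructures" are its halves, and the interface constraints of the paper are not modeled (each trial vector is simply a substructure mode padded with zeros). The function names are hypothetical, and NumPy is assumed to be available.

```python
import numpy as np

def chain_stiffness(n):
    """Stiffness matrix of n unit masses joined by unit springs, both ends anchored."""
    return 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)

def substructure_basis(K, blocks, modes_per_block):
    """Stack the lowest modes of each isolated block as zero-padded trial vectors."""
    n = K.shape[0]
    columns = []
    for start, stop in blocks:
        # Modes of the isolated substructure (the diagonal block of K).
        _, vecs = np.linalg.eigh(K[start:stop, start:stop])
        for j in range(modes_per_block):
            col = np.zeros(n)
            col[start:stop] = vecs[:, j]
            columns.append(col)
    return np.column_stack(columns)

def rayleigh_ritz(K, B):
    """Eigenvalues of K restricted to the span of B's orthonormal columns.

    With unit masses and orthonormal trial vectors the reduced problem is a
    standard symmetric eigenproblem of size B.shape[1].
    """
    return np.linalg.eigvalsh(B.T @ K @ B)

n = 40
K = chain_stiffness(n)
B = substructure_basis(K, blocks=[(0, 20), (20, 40)], modes_per_block=4)

approx = rayleigh_ritz(K, B)        # 8 eigenvalues from an 8x8 problem
exact = np.linalg.eigvalsh(K)[:8]   # lowest 8 eigenvalues of the full 40x40 problem
```

The reduced problem here is 8x8 instead of 40x40, and each Rayleigh-Ritz eigenvalue is an upper bound on the corresponding exact one. How close the bound is depends on how well the chosen substructure modes span the true low-frequency motions.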