ABSTRACT: Functioning proteins do not remain fixed in a unique structure, but instead they sample a range of conformations facilitated by motions within the protein. Even in the native state, a protein exists as a collection of interconverting conformations driven by thermodynamic fluctuations. Motions on the fast time scale allow a protein to sample conformations in the nearby area of its conformational landscape, while motions on slower time scales give it access to conformations in distal areas of the landscape. Emerging evidence indicates that protein landscapes contain conformational substates with dynamic and structural features that support the designated function of the protein. Nuclear magnetic resonance (NMR) experiments provide information about conformational ensembles of proteins. X-ray crystallography allows researchers to identify the most populated states along the landscape, and computational simulations give atom-level information about the conformational substates of different proteins. This ability to characterize and obtain quantitative information about the conformational substates and the populations of proteins within them is allowing researchers to better understand the relationship between protein structure and dynamics and the mechanisms of protein function. In this Account, we discuss recent developments and challenges in the characterization of functionally relevant conformational populations and substates of proteins. In some enzymes, the sampling of functionally relevant conformational substates is connected to promoting the overall mechanism of catalysis. For example, the conformational landscape of the enzyme dihydrofolate reductase has multiple substates, which facilitate the binding and the release of the cofactor and substrate and catalyze the hydride transfer.
For the enzyme cyclophilin A, computational simulations reveal that long-time-scale conformational fluctuations enable the enzyme to access conformational substates that allow it to attain the transition state, thereby promoting the reaction mechanism. In the long term, this emerging view of proteins with conformational substates has broad implications for understanding enzymes, for enzyme engineering, and for drug design. Researchers have already used photoactivation to modulate protein conformations as a strategy to develop a hypercatalytic enzyme. In addition, the alteration of the conformational substates through binding of ligands at locations other than the active site provides the basis for the design of new medicines through allosteric modulation.
Accounts of Chemical Research 08/2013; · 24.35 Impact Factor
ABSTRACT: Ferroelectrics are multifunctional materials that reversibly change their polarization under an electric field. Recently, the search for new ferroelectrics has focused on organic and bio-organic materials, where polarization switching is used to record/retrieve information in the form of ferroelectric domains. This progress has opened a new avenue for data storage, molecular recognition, and new self-assembly routes. Crystalline glycine is the simplest amino acid and is widely used by living organisms to build proteins. Here, it is reported for the first time that γ-glycine, which has been known to be piezoelectric since 1954, is also a ferroelectric, as evidenced by local electromechanical measurements and by the existence of as-grown and switchable ferroelectric domains in microcrystals grown from the solution. The experimental results are rationalized by molecular simulations that establish that the polarization vector in γ-glycine can be switched on the nanoscale level, opening a pathway to novel classes of bioelectronic logic and memory devices.
ABSTRACT: Enzyme engineering for improved catalysis has wide implications. We describe a novel chemical modification of Candida antarctica lipase B that allows modulation of the enzyme conformation to promote catalysis. Computational modeling was used to identify dynamical enzyme regions that impact the catalytic mechanism. Surface loop regions located distal to the active site but showing dynamical coupling to the reaction were connected by a chemical bridge between Lys136 and Pro192, containing a derivative of azobenzene. The conformational modulation of the enzyme was achieved using two sources of light that alternated the azobenzene moiety between cis and trans conformations. The computational model predicted that mechanical energy from the conformational fluctuations facilitates the reaction in the active site. The results were consistent with predictions, as the activity of the engineered enzyme was found to be enhanced with photoactivation. Preliminary estimations indicate that the engineered enzyme achieved 8–52 fold better catalytic activity than the unmodulated enzyme.
ABSTRACT: The molten globule nuclear receptor co-activator binding domain (NCBD) of CREB binding protein (CBP) selectively recruits transcription co-activators (TCAs) during the formation of the transcription preinitiation complex. NCBD:TCA interactions have been implicated in several cancers; however, the mechanisms of NCBD:TCA recognition remain uncharacterized. NCBD:TCA intermolecular recognition has challenged traditional investigation as both NCBD and several of its corresponding TCAs are intrinsically disordered. Using 40 μs of explicit solvent molecular dynamics simulations, we relate the conformational diversity of ligand-free NCBD to its bound configurations. We introduce two novel techniques to quantify the conformational heterogeneity of ligand-free NCBD: dihedral quasi-anharmonic analysis (dQAA) and hierarchical graph-based diffusive clustering. With this integrated approach we find that three of four ligand-bound states are natively accessible to the ligand-free NCBD simulations with root-mean-square deviation (RMSD) less than 2 Å. These conformations are accessible via diverse pathways while a rate-limiting barrier must be crossed in order to access the fourth bound state.
Pacific Symposium on Biocomputing 01/2012
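The RMSD criterion mentioned in the NCBD abstract above (a bound state counts as accessible when some simulated conformation lies within 2 Å of it) can be illustrated with a minimal sketch. The function names `rmsd` and `is_accessible` are hypothetical, and the sketch assumes the structures are already superimposed and given as equal-length lists of (x, y, z) coordinates in Å. It is not the authors' analysis pipeline, which the abstract does not describe at this level.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two coordinate sets.

    Assumption: both sets list the same atoms in the same order and
    have already been superimposed (no alignment is done here).
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate sets must be non-empty and equal in length")
    total = 0.0
    for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b):
        total += (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
    return math.sqrt(total / len(coords_a))

def is_accessible(frames, bound_state, cutoff=2.0):
    """True if any trajectory frame is within `cutoff` (in Å) of the bound state."""
    return any(rmsd(frame, bound_state) < cutoff for frame in frames)

# Toy data: one bound-state structure and two simulated frames.
bound = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
frames = [
    [(5.0, 0.0, 0.0), (6.5, 0.0, 0.0)],   # shifted by 5 Å, too far
    [(0.5, 0.0, 0.0), (2.0, 0.0, 0.0)],   # shifted by 0.5 Å, within cutoff
]
print(is_accessible(frames, bound))
```

In practice a trajectory frame would first be superimposed on the reference (for example with a Kabsch fit) before the deviation is computed; that step is omitted here to keep the sketch short.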
ABSTRACT: Proteins are intrinsically flexible molecules. The role of internal motions in a protein's designated function is widely debated. The role of protein structure in enzyme catalysis is well established, and conservation of structural features provides vital clues to their role in function. Recently, it has been proposed that the protein function may involve multiple conformations: the observed deviations are not random thermodynamic fluctuations; rather, flexibility may be closely linked to protein function, including enzyme catalysis. We hypothesize that the argument of conservation of important structural features can also be extended to identification of protein flexibility in interconnection with enzyme function. Three classes of enzymes (prolyl-peptidyl isomerase, oxidoreductase, and nuclease) that catalyze diverse chemical reactions have been examined using detailed computational modeling. For each class, the identification and characterization of the internal protein motions coupled to the chemical step in enzyme mechanisms in multiple species show identical enzyme conformational fluctuations. In addition to the active-site residues, motions of protein surface loop regions (>10 Å away) are observed to be identical across species, and networks of conserved interactions/residues connect these highly flexible surface regions to the active-site residues that make direct contact with substrates. More interestingly, examination of reaction-coupled motions in non-homologous enzyme systems (with no structural or sequence similarity) that catalyze the same biochemical reaction shows motions that induce remarkably similar changes in the enzyme-substrate interactions during catalysis. The results indicate that the reaction-coupled flexibility is a conserved aspect of the enzyme molecular architecture.
Protein motions in distal areas of homologous and non-homologous enzyme systems mediate similar changes in the active-site enzyme-substrate interactions, thereby impacting the mechanism of catalyzed chemistry. These results have implications for understanding the mechanism of allostery, and for protein engineering and drug design.
ABSTRACT: Many modern scientific applications, which are designed to utilize high performance parallel computers, occupy hundreds of thousands of computational cores running for days or even weeks. Since many scientists compete for resources, most supercomputing centers practice strict scheduling policies and perform meticulous accounting on their usage. Thus computing resources and time assigned to a user are considered invaluable. However, most applications are not well prepared for unforeseeable faults, still relying on primitive fault tolerance techniques. Considering that ever-plunging mean time to interrupt (MTTI) is making scientific applications more vulnerable to faults, it is increasingly important to provide users not only an improved fault tolerant environment, but also a framework to support their own fault tolerance policies so that their allocation times can be best utilized. This paper addresses user-level fault tolerance policy management based on a holistic approach to digest and correlate fault related information. It introduces simple semantics with which users express their policies on faults, and illustrates how event correlation techniques can be applied to manage and determine the most preferable user policies. The paper also discusses an implementation of the framework using open source software, and demonstrates, as an example, how a molecular dynamics simulation application running on the institutional cluster at Oak Ridge National Laboratory benefits from it.
Policies for Distributed Systems and Networks (POLICY), 2011 IEEE International Symposium on; 07/2011
ABSTRACT: Molecular dynamics (MD) simulations have dramatically improved the atomistic understanding of protein motions, energetics and function. These growing datasets have necessitated a corresponding emphasis on trajectory analysis methods for characterizing simulation data, particularly since functional protein motions and transitions are often rare and/or intricate events. Observing that such events give rise to long-tailed spatial distributions, we recently developed a higher-order statistics based dimensionality reduction method, called quasi-anharmonic analysis (QAA), for identifying biophysically-relevant reaction coordinates and substates within MD simulations. Further characterization of conformation space should consider the temporal dynamics specific to each identified substate.
Our model uses hierarchical clustering to learn energetically coherent substates and dynamic modes of motion from a 0.5 μs ubiquitin simulation. Autoregressive (AR) modeling within and between states enables a compact and generative description of the conformational landscape as it relates to functional transitions between binding poses. Lacking a predictive component, QAA is extended here within a general AR model appreciative of the trajectory's temporal dependencies and the specific, local dynamics accessible to a protein within identified energy wells. These metastable states and their transition rates are extracted within a QAA-derived subspace using hierarchical Markov clustering to provide parameter sets for the second-order AR model. We show the learned model can be extrapolated to synthesize trajectories of arbitrary length.
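As a rough illustration of what a second-order AR model is, the sketch below fits x[t] = a1·x[t-1] + a2·x[t-2] + noise to a single scalar coordinate by least squares and then synthesizes a new trajectory of arbitrary length from the fitted coefficients. The function names `fit_ar2` and `synthesize` are hypothetical, and the one-dimensional, single-state setting is an assumption made for brevity. The abstract's model works in a QAA-derived subspace with separate parameter sets per metastable state, which this sketch does not attempt to reproduce.

```python
import random

def fit_ar2(series):
    """Least-squares estimate of (a1, a2) in x[t] = a1*x[t-1] + a2*x[t-2] + noise."""
    if len(series) < 4:
        raise ValueError("need at least 4 samples to fit two coefficients")
    s11 = s12 = s22 = b1 = b2 = 0.0
    for t in range(2, len(series)):
        x1, x2, y = series[t - 1], series[t - 2], series[t]
        s11 += x1 * x1
        s12 += x1 * x2
        s22 += x2 * x2
        b1 += x1 * y
        b2 += x2 * y
    # Solve the 2x2 normal equations by Cramer's rule.
    det = s11 * s22 - s12 * s12
    if abs(det) < 1e-12:
        raise ValueError("degenerate series: coefficients are not identifiable")
    a1 = (b1 * s22 - b2 * s12) / det
    a2 = (b2 * s11 - b1 * s12) / det
    return a1, a2

def synthesize(a1, a2, x0, x1, n_steps, noise_std=0.0, seed=0):
    """Generate n_steps new samples after the two starting values x0 and x1."""
    rng = random.Random(seed)
    traj = [x0, x1]
    for _ in range(n_steps):
        traj.append(a1 * traj[-1] + a2 * traj[-2] + rng.gauss(0.0, noise_std))
    return traj

# Round trip on synthetic data: generate with known coefficients, then refit.
observed = synthesize(0.5, -0.3, 1.0, 2.0, n_steps=200, noise_std=0.1, seed=1)
a1, a2 = fit_ar2(observed)
longer = synthesize(a1, a2, observed[-2], observed[-1], n_steps=1000, noise_std=0.1)
print(len(longer))
```

Because the fitted model is generative, the length of the synthesized trajectory is set only by `n_steps`, not by the length of the training data.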
ABSTRACT: Proteins are dynamic objects, constantly undergoing conformational fluctuations, yet the linkage between internal protein motion and function is widely debated. This study reports on the characterization of temperature-activated collective and individual atomic motions of oxidized rubredoxin, a small 53-residue protein from thermophilic Pyrococcus furiosus (RdPf). Computational modeling allows detailed investigations of protein motions as a function of temperature, and neutron scattering experiments are used to compare to computational results. Just above the dynamical transition temperature which marks the onset of significant anharmonic motions of the protein, the computational simulations show both a significant reorientation of the average electrostatic force experienced by the coordinated Fe³⁺ ion and a dramatic rise in its strength. At higher temperatures, additional anharmonic modes become activated and dominate the electrostatic fluctuations experienced by the ion. At 360 K, close to the optimal growth temperature of P. furiosus, simulations show that three anharmonic modes including motions of two conserved residues located at the protein active site (Ile7 and Ile40) give rise to the majority of the electrostatic fluctuations experienced by the Fe³⁺ ion. The motions of these residues undergo displacements which may facilitate solvent access to the ion.
The Journal of Physical Chemistry B 05/2011; 115(28):8925-36. · 3.38 Impact Factor
ABSTRACT: Internal motions enable proteins to explore a range of conformations, even in the vicinity of the native state. The role of conformational fluctuations in the designated function of a protein is widely debated. Emerging evidence suggests that sub-groups within the range of conformations (or sub-states) contain properties that may be functionally relevant. However, low populations in these sub-states and the transient nature of conformational transitions between these sub-states present significant challenges for their identification and characterization.
To overcome these challenges we have developed a new computational technique, quasi-anharmonic analysis (QAA). QAA utilizes higher-order statistics of protein motions to identify sub-states in the conformational landscape. Further, the focus on anharmonicity allows identification of conformational fluctuations that enable transitions between sub-states. QAA applied to equilibrium simulations of human ubiquitin and T4 lysozyme reveals functionally relevant sub-states and protein motions involved in molecular recognition. In combination with a reaction pathway sampling method, QAA characterizes conformational sub-states associated with cis/trans peptidyl-prolyl isomerization catalyzed by the enzyme cyclophilin A. In these three proteins, QAA allows identification of conformational sub-states, with critical structural and dynamical features relevant to protein function.
Overall, QAA provides a novel framework to intuitively understand the biophysical basis of conformational diversity and its relevance to protein function.
PLoS ONE 02/2011; 6(1):e15827. · 3.53 Impact Factor
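The QAA abstract above relies on higher-order statistics to detect anharmonic motion. One simple fourth-order statistic is excess kurtosis, which is zero for a Gaussian (harmonic) distribution and departs from zero for two-state or long-tailed distributions. The sketch below, with the hypothetical function name `excess_kurtosis`, shows only this single-coordinate diagnostic on toy data. It is an illustration of the underlying idea, not the QAA algorithm itself, whose details the abstract does not give.

```python
def excess_kurtosis(samples):
    """Fourth standardized moment minus 3 (0 for a Gaussian distribution)."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean = sum(samples) / n
    m2 = sum((x - mean) ** 2 for x in samples) / n
    m4 = sum((x - mean) ** 4 for x in samples) / n
    if m2 == 0.0:
        raise ValueError("zero variance: kurtosis is undefined")
    return m4 / (m2 * m2) - 3.0

# Toy coordinate hopping between two equally populated sub-states.
two_state = [-1.0, 1.0] * 50

# Toy coordinate that mostly sits still but makes rare large excursions.
long_tailed = [0.0] * 98 + [10.0, -10.0]

print(excess_kurtosis(two_state))    # negative: flatter than Gaussian
print(excess_kurtosis(long_tailed))  # positive: heavier tails than Gaussian
```

Second-order methods such as principal component analysis see only the variance of each coordinate, so both toy coordinates above would look like ordinary fluctuations to them; the fourth moment is what separates them from harmonic motion.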
ABSTRACT: Plasmid-encoded R67 dihydrofolate reductase (DHFR) catalyzes a hydride transfer reaction between substrate dihydrofolate (DHF) and its cofactor, nicotinamide adenine dinucleotide phosphate (NADPH). R67 DHFR is a homotetramer that exhibits numerous characteristics of a primitive enzyme, including promiscuity in binding of substrate and cofactor, formation of nonproductive complexes, and the absence of a conserved acid in its active site. Furthermore, R67's active site is a pore, which is mostly accessible by bulk solvent. This study uses a computational approach to characterize the mechanism of hydride transfer. Not surprisingly, NADPH remains fixed in one-half of the active site pore using numerous interactions with R67. Also, stacking between the nicotinamide ring of the cofactor and the pteridine ring of the substrate, DHF, at the hourglass center of the pore, holds the reactants in place. However, large movements of the p-aminobenzoylglutamate tail of DHF occur in the other half of the pore because of ion pair switching between symmetry-related K32 residues from two subunits. This computational result is supported by experimental results that the loss of these ion pair interactions (located >13 Å from the center of the pore) by addition of salt or in asymmetric K32M mutants leads to altered enzyme kinetics [Hicks, S. N., et al. (2003) Biochemistry 42, 10569-10578; Hicks, S. N., et al. (2004) J. Biol. Chem. 279, 46995-47002]. The tail movement at the edge of the active site, coupled with the fixed position of the pteridine ring in the center of the pore, leads to puckering of the pteridine ring and promotes formation of the transition state. Flexibility coupled to R67 function is unusual as it contrasts with the paradigm that enzymes use increased rigidity to facilitate attainment of their transition states.
A comparison with chromosomal DHFR indicates a number of similarities, including puckering of the nicotinamide ring and changes in the DHF tail angle, accomplished by different elements of the dissimilar protein folds.
ABSTRACT: Collective behavior involving distally separate regions in a protein is known to widely affect its function. In this article, we present an online approach to study and characterize collective behavior in proteins as molecular dynamics (MD) simulations progress. Our representation of MD simulations as a stream of continuously evolving data allows us to succinctly capture spatial and temporal dependencies that may exist and analyze them efficiently using data mining techniques. By using tensor analysis we identify (a) collective motions (i.e., dynamic couplings) and (b) time-points during the simulation where the collective motions suddenly change. We demonstrate the applicability of this method on two different protein simulations for barnase and cyclophilin A. We characterize the collective motions in these proteins using our method and analyze sudden changes in these motions. Taken together, our results indicate that tensor analysis is well suited to extracting information from MD trajectories in an online fashion.
Journal of computational biology: a journal of computational molecular cell biology 03/2010; 17(3):309-24. · 1.69 Impact Factor
ABSTRACT: Reconfigurable computing (RC) is being investigated as a hardware solution for improving time-to-solution for biomolecular simulations. A number of popular molecular dynamics (MD) codes are used to study various aspects of biomolecules. These codes are now capable of simulating nanosecond time-scale trajectories per day on conventional microprocessor-based hardware, but biomolecular processes often occur at the microsecond time-scale or longer. A wide gap exists between the desired and achievable simulation capability; therefore, there is considerable interest in alternative algorithms and hardware for improving the time-to-solution of MD codes. The fine-grain parallelism provided by Field Programmable Gate Arrays (FPGAs), combined with their low power consumption, makes them an attractive solution for improving the performance of MD simulations. In this work, we use an FPGA-based coprocessor to accelerate the compute-intensive calculations of LAMMPS, a popular MD code, achieving up to 5.5 fold speed-up on the non-bonded force computations of the particle mesh Ewald method and up to 2.2 fold speed-up in overall time-to-solution, and potentially an increase by a factor of 9 in power-performance efficiencies for the pair-wise computations. The results presented here provide an example of the multi-faceted benefits to an application in a heterogeneous computing environment.
Proceedings of the 7th Conference on Computing Frontiers, 2010, Bertinoro, Italy, May 17-19, 2010; 01/2010
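The gap between the kernel speed-up and the overall speed-up reported in the FPGA abstract above is the pattern Amdahl's law describes: accelerating only part of a run caps the total gain. The sketch below uses the hypothetical function names `overall_speedup` and `implied_fraction`. It assumes, purely for illustration, that the two "up to" figures come from the same run and that the non-bonded kernel is the only accelerated part; the abstract does not state either.

```python
def overall_speedup(fraction, kernel_speedup):
    """Amdahl's law: total speed-up when `fraction` of the original
    runtime is accelerated by `kernel_speedup` and the rest is unchanged."""
    if not 0.0 <= fraction <= 1.0 or kernel_speedup <= 0.0:
        raise ValueError("fraction must be in [0, 1] and kernel_speedup positive")
    return 1.0 / ((1.0 - fraction) + fraction / kernel_speedup)

def implied_fraction(overall, kernel_speedup):
    """Invert Amdahl's law: the share of original runtime spent in the
    accelerated kernel, given both speed-ups (kernel_speedup must not be 1)."""
    return (1.0 - 1.0 / overall) / (1.0 - 1.0 / kernel_speedup)

# Figures quoted in the abstract, under the assumptions stated above.
share = implied_fraction(overall=2.2, kernel_speedup=5.5)
print(round(share, 3))

# Upper bound on total speed-up if that kernel became infinitely fast.
print(round(1.0 / (1.0 - share), 2))
```

Under these assumptions the kernel would account for about two thirds of the original runtime, so even an infinitely fast accelerator could deliver at most roughly a 3-fold overall gain without also speeding up the remaining third.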
ABSTRACT: Biomolecular simulations continue to become an increasingly important component of molecular biochemistry and biophysics investigations. Performance improvements in the simulations based on molecular dynamics (MD) codes are widely desired. This is particularly driven by the rapid growth of biological data due to improvements in experimental techniques. Unfortunately, the factors that allowed past performance improvements of MD simulations, particularly the increase in microprocessor clock frequencies, are no longer improving. Hence, novel software and hardware solutions are being explored for accelerating the performance of popular MD codes. In this paper, we describe our efforts to port and optimize LAMMPS, a popular MD framework, on hybrid processors: multi-core processors accelerated by graphical processing units (GPUs). Our implementation is based on porting the computationally expensive, non-bonded interaction terms to the GPUs, and overlapping the computation on the CPU and GPUs. This functionality is built on top of the message passing interface (MPI), which allows multi-level parallelism to be extracted even at the workstation level with multi-core CPUs and allows the implementation to be extended to GPU clusters. The results from a number of typically sized biomolecular systems are provided and analysis is performed on 3 generations of GPUs from NVIDIA. Our implementation allows up to 30–40 ns/day throughput on a single workstation as well as significant speedup over Cray XT5, a high-end supercomputing platform. Moreover, detailed analysis of the implementation indicates that further code optimization and improvements in GPUs will allow ∼100 ns/day throughput on workstations and inexpensive GPU clusters, putting the widely-desired microsecond simulation time-scale within reach of a large user community.
High Performance Computing and Simulation (HPCS), 2010 International Conference on; 01/2010
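To make the throughput figures in the GPU abstract above concrete, the short sketch below converts a simulation rate in ns/day into the wall-clock time needed for a 1 μs trajectory. The function name `days_to_simulate` is hypothetical, and the calculation assumes a constant rate with no queueing, restarts, or I/O overhead.

```python
def days_to_simulate(target_ns, ns_per_day):
    """Wall-clock days needed to reach target_ns at a constant rate."""
    if ns_per_day <= 0:
        raise ValueError("throughput must be positive")
    return target_ns / ns_per_day

ONE_MICROSECOND_NS = 1000.0  # 1 μs expressed in ns

# Rates quoted in the abstract: measured (30 and 40) and projected (100).
for rate in (30.0, 40.0, 100.0):
    days = days_to_simulate(ONE_MICROSECOND_NS, rate)
    print(f"{rate:.0f} ns/day -> {days:.1f} days per microsecond")
```

At the reported 30 to 40 ns/day a microsecond takes roughly a month of continuous running on one workstation, while the projected rate of about 100 ns/day would bring it down to about ten days.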
ABSTRACT: Biomolecular simulations have traditionally benefited from increases in the processor clock speed and coarse-grain inter-node parallelism on large-scale clusters. With stagnating clock frequencies, the evolutionary path for performance of microprocessors is maintained by virtue of core multiplication. Graphical processing units (GPUs) offer revolutionary performance potential at the cost of increased programming complexity. Furthermore, it has been extremely challenging to effectively utilize heterogeneous resources (host processor and GPU cores) for scientific simulations, as underlying systems, programming models and tools are continually evolving. In this paper, we present a parametric study demonstrating approaches to exploit resources of heterogeneous systems to reduce time-to-solution of a production-level application for biological simulations. By overlapping and pipelining computation and communication, we observe up to 10-fold application acceleration in multi-core and multi-GPU environments illustrating significant performance improvements over code acceleration approaches, where the host-to-accelerator ratio is static, and is constrained by a given algorithmic implementation.
Conference on High Performance Computing Networking, Storage and Analysis, SC 2010, New Orleans, LA, USA, November 13-19, 2010; 01/2010