A NAMD tutorial for protein molecular dynamics simulations using the VMD GUI, suitable for beginners. (--> please consider the June 2017 update available as a separate item)
Page 1 of 22
NAMD/VMD tutorial (uses VMD 1.9.2, NAMD 2.10_Win64-multicore)
Molecular dynamics simulation of ‘protein
folding’
“Certainly no subject or field is making more progress on so many fronts at the present
moment, than biology, and if we were to name the most powerful assumption of all, which
leads one on and on in an attempt to understand life it is that all things are made of atoms,
and that everything that living things do can be understood in terms of jigglings and
wigglings of atoms.” (Richard Feynman, 1963)
Andreas Kukol
University of Hertfordshire
School of Life and Medical Sciences
Aims and Objectives
To become familiar with molecular dynamics simulations of biomolecules and protein
folding.
Skills
Prepare protein structure files (pdb-file) for molecular dynamics (MD) simulation
Create a solvated protein in the computer
Add counterions
Perform energy minimisation
Perform position restraint MD simulations
Perform unrestrained MD simulations
Analyse the trajectory visually using molecular graphics programs
Quantitative analysis of the trajectory (root mean square deviations, hydrogen bonds
etc.)
Logbook recording of computer-based experiments
Summary of useful tasks/questions:
1) Write an abstract of the tutorial (max. 250 words).
2) How many water molecules were added ?
3) How many sodium and chloride ions have been added ?
4) Based on those numbers, what net charge of the protein can you infer ?
5) What is the cut-off distance for the calculation of both electrostatic and van der
Waals interactions (in Å) ?
6) What is the time increment of the simulation (with correct units) (refer to the
Introduction if uncertain) ?
7) How many steps are performed in the position constraint simulation ?
8) How often are the coordinates of the system written to the trajectory (.dcd) file (note
both the steps and the actual time interval) ?
9) What is the total simulation time specified (position constraint) ?
10) What is the reason that the ions disappear and re-appear at the edges of the box ?
11) After which time do you observe the start of folding ? At which time does the
folding speed increase substantially ?
12) What trend do you observe in the number of hydrogen bonds during the “protein
folding” process ?
13) Report the average number of hydrogen bonds and the standard deviation (sd) for
this number as nhb ± sd. Take care to report the correct number of significant figures
for nhb.
14) How many salt bridges were identified in your simulation trajectory ?
Conventions for this tutorial:
Text providing background information is marked with an information symbol (i).
Questions, which you should answer in your logbook, are marked with a question mark symbol (?).
Menu commands are shown in italics
Commands typed at the command prompt are shown like this
Input files are shown in boldface.
1. Introduction
Molecular dynamics (MD) simulations are a research method in computational
biochemistry that yields the thermal fluctuations of atoms in a molecule as well as the
relative positions of molecules and atoms as a function of time. Advances in the
price/performance ratio of computer hardware have moved biomolecular simulations from
the realm of supercomputers to desktop workstations. At the same time the accuracy of
simulations compared to experimental observations has increased due to more accurate
forcefields (see below), accelerated sampling methods and the use of explicit solvent in
simulations. Biomolecular simulations are used to gain insight into ligand binding,
enzymatic activities, signalling mechanisms and protein folding (Dodson et al, 2008).
Additionally, simulations are valuable tools for the refinement of electron microscopy,
x-ray, NMR or other spectroscopic data in order to obtain more accurate molecular structures.
The ingredients of MD simulations are the atomic coordinates of the molecule, usually in
the format of a pdb-file, and a definition of the atom types, bonds and angles between
atoms as well as the number of molecules in the simulation system. This definition is
usually called the topology. The interactions between the atoms in large biomolecules are
treated by the principles of classical mechanics, i.e. the molecule is represented in the
computer as a set of spheres connected by springs. The potential energy V is given by
classical physics-based equations, such as:
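$$
V = \sum_{\text{bonds}} k_i^{\text{bond}} \left( r_i - r_{0,i} \right)^2
  + \sum_{\text{bond angles}} k_i^{\text{angle}} \left( \theta_i - \theta_{0,i} \right)^2
  + \sum_{\text{torsion angles}} k_i^{\text{torsion}} \left( 1 + \cos( n_i \phi_i - \phi_{0,i} ) \right)
  + \sum_{\text{pairs } i,j} \left( \frac{A_{ij}}{r_{ij}^{12}} - \frac{B_{ij}}{r_{ij}^{6}} \right)
  + \sum_{\text{pairs } i,j} \frac{1}{4 \pi \varepsilon_0 \varepsilon_r} \frac{Q_i Q_j}{r_{ij}}
  \tag{1}
$$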
Each summation term from left to right describes the bonds, bond angles, torsion angles,
van der Waals interactions and electrostatic interactions as depicted graphically in figure 1
below.
Figure 1: Interactions between atoms included in the forcefield (Lindahl, 2008)
The functional form of equation (1) together with the parameters kbond, kangle, ktorsion, A and B is
called the force field of the simulation. It is this force field and the way in which van der
Waals and electrostatic interactions are treated that determines the accuracy of the
simulation. The most widely used forcefields are AMBER, CHARMM, OPLS and
GROMOS. The GROMOS forcefield differs from the rest in the way hydrogen atoms are
treated. In the GROMOS forcefield non-polar hydrogen atoms are subsumed into their
adjacent carbon atom, e.g. a three particle H-C-H group becomes a single CH2 particle.
This type of forcefield is called a united-atom forcefield and is computationally more
efficient than all-atom force fields, because the number of atoms is reduced with a small
sacrifice in terms of accuracy. In this tutorial you will use an all-atom forcefield of the
CHARMM variant, namely a combination of CHARMM22 for proteins and CHARMM27
for lipids.
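To make the individual terms of equation (1) more concrete, each one can be written as a small Python function. This is only an illustrative sketch: the function names are invented here, the parameter values must come from a real force field file, and the units (kcal/mol, Å, elementary charges) and the value of the Coulomb constant are assumptions chosen to match common CHARMM-style conventions.

```python
import math

# Assumption: 1/(4*pi*eps0) expressed in kcal*Å/(mol*e^2), the value commonly
# used with CHARMM-style units. Not taken from the tutorial.
COULOMB_CONST = 332.0636

def bond_energy(k_bond, r, r0):
    """Harmonic bond term: k (r - r0)^2."""
    return k_bond * (r - r0) ** 2

def angle_energy(k_angle, theta, theta0):
    """Harmonic bond angle term: k (theta - theta0)^2, angles in radians."""
    return k_angle * (theta - theta0) ** 2

def torsion_energy(k_torsion, n, phi, phi0):
    """Periodic torsion term: k (1 + cos(n*phi - phi0)), angles in radians."""
    return k_torsion * (1.0 + math.cos(n * phi - phi0))

def lennard_jones_energy(a, b, r):
    """Van der Waals term: A / r^12 - B / r^6."""
    return a / r ** 12 - b / r ** 6

def coulomb_energy(q_i, q_j, r, eps_r=1.0):
    """Electrostatic term: Q_i Q_j / (4 pi eps0 eps_r r)."""
    return COULOMB_CONST * q_i * q_j / (eps_r * r)
```

The total potential energy is the sum of these functions over all bonds, angles, torsions and atom pairs listed in the topology.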
Once the molecular system is accurately described with coordinates, topology and force
field, the actual simulation consists of a large number of time steps:
Loop
For each particle: calculate force
For each particle: update coordinates
Increment time
Until maximum number of timesteps is reached.
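The loop above can be sketched in Python for a single particle on a harmonic spring. The sketch uses the velocity Verlet scheme, which is one common way to update coordinates; the tutorial does not state which integrator NAMD uses, so this is an assumption for illustration, and all names and reduced units are hypothetical.

```python
def harmonic_force(x, k=1.0, x0=0.0):
    """Force for the potential V = k (x - x0)^2, i.e. F = -dV/dx."""
    return -2.0 * k * (x - x0)

def run_md(x, v, mass, dt, n_steps, force=harmonic_force):
    """Propagate one particle in one dimension with velocity Verlet.

    Returns the list of positions (one per time step, plus the start)
    and the final velocity.
    """
    trajectory = [x]
    f = force(x)                          # calculate force
    for _ in range(n_steps):              # loop until maximum number of steps
        v_half = v + 0.5 * dt * f / mass  # half-step velocity update
        x = x + dt * v_half               # update coordinates
        f = force(x)                      # force at the new position
        v = v_half + 0.5 * dt * f / mass  # complete the velocity update
        trajectory.append(x)
    return trajectory, v

if __name__ == "__main__":
    positions, final_velocity = run_md(x=1.0, v=0.0, mass=1.0, dt=0.01, n_steps=1000)
    print(len(positions), positions[-1], final_velocity)
```

In a real simulation the same loop runs over all atoms in three dimensions, and the force comes from the derivative of equation (1).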
The time is usually incremented in steps of one to three fs (femtoseconds). That means that for
simulations of a reasonable length we need to carry out a large number of time
steps. Early simulations of proteins in vacuo were performed for 100 ps (picoseconds)
on a Cray X-MP supercomputer (Karplus & Petsko, 1990), but nowadays proteins in water
are simulated routinely for 100,000 ps.
The use of solvated systems in MD simulation requires the introduction of another concept:
periodic boundary conditions. When a protein molecule is enclosed in a box of water
molecules, the water molecules at the box edges would experience vacuum as shown in
figure 2a. In order to avoid this problem, the box is repeated indefinitely in space as shown
in figure 2b. Obviously the box needs to be big enough so that the protein cannot see its
periodic image in a neighbouring box.
Figure 2: a) a water box in vacuum, b) with periodic boundary conditions
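As a rough illustration of periodic boundary conditions, the following Python sketch works along a single axis. It shows how a coordinate that leaves the box on one side is mapped back in on the other side, and how the shortest distance between two particles is measured across the box boundary. The function names are hypothetical and a rectangular box is assumed.

```python
def wrap_coordinate(x, box_length, origin=0.0):
    """Map x into the interval [origin - L/2, origin + L/2)."""
    half = 0.5 * box_length
    return (x - origin + half) % box_length - half + origin

def minimum_image_distance(x1, x2, box_length):
    """Shortest distance between x1 and x2, considering periodic copies."""
    d = x2 - x1
    d -= box_length * round(d / box_length)
    return abs(d)

if __name__ == "__main__":
    # A particle at x = 26 Å in a 50 Å box centred at 0 reappears at -24 Å.
    print(wrap_coordinate(26.0, 50.0))
    # Two particles near opposite faces are in fact only 2 Å apart.
    print(minimum_image_distance(-24.0, 24.0, 50.0))
```

The second function also shows why the box must be large enough: if the box is too small, the nearest copy of the protein is its own periodic image.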
Another procedure encountered in molecular simulations is energy minimisation (EM). In
EM the coordinates of the system are varied in order to find a minimum of the potential
energy V or mathematically: find the coordinates xi so that the first derivative of the
potential energy becomes zero:
$$
\frac{\partial V}{\partial x_i} = 0 \tag{2}
$$
This is similar to school maths, when you were asked to find the minimum or maximum of
a function V, e.g. V = (x-3)^2 with respect to the x-coordinate. For biomolecules the
potential energy function is much more complicated and there are thousands of coordinates,
thus the computer applies a numerical algorithm called ‘steepest descent’. A common issue
is that most EM algorithms are unable to find a global minimum but only a local minimum
as illustrated in figure 3. MD simulations followed by EM are better suited to find the
global energy minimum, but EM is required before you can start an MD simulation.
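The steepest descent idea can be tried out on the school maths example V = (x-3)^2. The Python sketch below repeatedly moves the coordinate a small step against the gradient until the gradient is close to zero. The step size, tolerance and function names are illustrative assumptions, not settings used by NAMD.

```python
def steepest_descent(gradient, x_start, step_size=0.1, tolerance=1e-6, max_steps=10000):
    """Follow the negative gradient until it (almost) vanishes.

    Returns the final coordinate and the number of steps taken.
    """
    x = x_start
    for step in range(max_steps):
        g = gradient(x)
        if abs(g) < tolerance:      # first derivative is (almost) zero
            return x, step
        x -= step_size * g          # move downhill
    return x, max_steps

def gradient_example(x):
    """Derivative of V = (x - 3)^2 with respect to x."""
    return 2.0 * (x - 3.0)

if __name__ == "__main__":
    x_min, n_steps = steepest_descent(gradient_example, x_start=0.0)
    print(f"minimum near x = {x_min:.6f} after {n_steps} steps")
```

Because this example function has only one minimum, the start position does not matter. For a function with several minima the same procedure ends in whichever minimum lies downhill from the starting point.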
In this practical you will investigate the ribosomal protein L7/L12. It is a 12 kDa protein
that is part of the so-called stalk of the large ribosomal subunit. The protein L7/L12 consists
of two domains, namely the N-terminal domain responsible for anchoring the protein to the
ribosome and the C-terminal domain that is involved in interaction with the translation
factor. We will focus here on the C-terminal domain, the structure of which is known from
nuclear magnetic resonance (NMR) spectroscopy (PDB identifier: 1RQS).
Figure 3: A schematic plot of energy against a coordinate. Energy minimisation follows the green arrows,
thus depending on the starting position it is not always possible to reach the global energy minimum.
Protein folding is a slow process that can take any time from microseconds to minutes,
which is beyond the scope of MD simulations. Therefore we apply a trick and simulate
protein unfolding at a high temperature, which can be seen as the reverse of protein folding
(Day & Daggett, 2007).
2 Materials and Methods
The protein structure of the C-terminal domain of the ribosomal protein L7/L12 as
determined by NMR-spectroscopy (Bocharov et al, 2004).
The VMD graphical user interface (Humphrey et al, 1996).
STRIDE secondary structure assignment (as distributed with VMD) (Frishman & Argos,
1995).
The molecular dynamics simulation software NAMD, which was developed by the
Theoretical and Computational Biophysics Group in the Beckman Institute for
Advanced Science and Technology at the University of Illinois at Urbana-Champaign
(Phillips et al, 2005).
3 Set-up of the simulation
For the simulation you will use the NAMD software. You should be familiar with VMD,
which provides all the necessary tools for setting-up, running and analysing a NAMD
simulation. The tasks required are summarised in the flow-chart shown in figure 4. It starts
with a pdb-file and ends with the execution of a NAMD simulation.
Figure 4: The steps required to perform a simulation with NAMD starting from a pdb-file. In your case the
pdb-file has only one chain/segment. This image was taken from the NAMD tutorial (Phillips et al, 2012).
All steps can be carried out with the VMD graphical user interface, except that the NAMD
simulation will be started from the command line. The parameter files for various
simulation tasks are provided in a supplementary zip-file. You will need a text editor, such
as NotePad or AkelPad to view and modify these files.
3.1 Generate a Protein Structure File (psf-file)
Obtain the file riboprotein.pdb from the zip-file and save it into the folder where you
will carry out the simulation. The pdb-file contains the atom coordinates, but we need to
tell the computer what type of atoms are present in the pdb-file, whether they carry any charge,
how the atoms are connected and whether there is free or restricted rotation around bonds. All this
information is captured in the psf-file and for proteins it can be generated automatically by
psfgen.
Start VMD and open the ‘TK console’ window (Menu Extensions – TK console). In the TK
console window go to the folder you want to work in (where you saved riboprotein.pdb),
for example:
% cd u:/ab2-3
Note that the forward slash is used to separate folders (this is a LINUX convention,
Windows uses ‘\’). You may check the active folder with ‘pwd’. Open the automatic
psf-builder: Extensions – Modelling – Automatic PSF Builder. Change the output basename to
‘rbp’, click on ‘Load input files’, then ‘Guess and split chains’. The picture on screen after
the last click is shown in figure 5. Then click on ‘Create Chains’. There are many warnings
displayed in the console window, but you may check that the file ‘rbp.psf’ has indeed been
generated. The next steps are required to place the protein into the centre of a water box.
3.2 Solvate the system
Close the ‘AutoPSF’ window and delete all molecules from the ‘VMD Main’ window, by
selecting each entry in the list and using the menu Molecule – Delete Molecule. Then load
rbp.psf first and load rbp.pdb into it (make sure ‘Load files for: rbp.psf’ is selected). Now
we need to place the protein in a box of water. To start with, the protein molecule must be
placed at the centre of the coordinate system; this is done from the ‘VMD TK Console’
window (note the American spelling of ‘center’ vs British ‘centre’):
% set all [atomselect top all]
% measure center $all                           ;# outputs the current x, y, z coordinates of the centre
% $all moveby [vecinvert [measure center $all]]
% measure center $all                           ;# checks whether the move was successful
As you can see from the last command, the coordinates of the centre are close to zero, thus
the protein has been centred. These commands are part of the Tcl/TK programming
language.
Figure 5: The screenshot after performing all steps including ‘Guess and split chains using current
selections’.
We are now ready to solvate the protein using the solvation box tool of VMD. Menu
Extensions – Modelling – Add Solvation Box brings up a new window. In this window
change the output to ‘rbp_water’ and all ‘Box Padding’ values to 10 Å. This value specifies
the distance between the protein and the edge of the box; as the size of a water molecule is
approximately 3 Å there will be about three layers of water molecules between the protein
and the box edge. Finally click on solvate. How many water molecules were added ? The
following Tcl/Tk commands may help you to answer this question. The output is the
number of atoms belonging to water molecules:
% set wat [atomselect top water]
% $wat num
3.3 Neutralise the system
Proteins carry charges and the net charge of a protein is the sum of negative and positive
charges contributed by the side-chains of amino acid residues (Asp, Glu, Arg, Lys, His) and
the N- and C-terminus. The net charge may not be zero for the riboprotein, but any system
of molecules in a test tube has a zero net charge. Leaving a charge in your simulation
system would create unrealistic conditions, thus any charge in your system must be
neutralised by adding counterions. At the same time we will take the opportunity to set a
realistic salt concentration of 0.1 M NaCl. This step is performed with the add ions tool:
Extensions – Modelling – Add Ions. In this window set the output prefix to ‘rbp_ions’, for
the ‘Ion placement mode’ check ‘Neutralise and set NaCl concentration to’ 0.1 mol/L as
shown in figure 6.
Figure 6: The settings of the Autoionize tool.
Finally click on ‘Autoionize’. How many sodium and chloride ions have been added ? You
could check those numbers using Tcl/TK commands. Alternatively, you may use the
Windows Explorer, go to your working folder and open rbp_ions.pdb in Notepad. At the
end of the file you will find the sodium and chloride ions that have been added. Based on
those numbers, what net charge of the protein can you infer ?
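Instead of counting by eye in Notepad, the ion records could also be counted with a few lines of Python. This is only a sketch: it assumes that the ions are written with the residue names SOD (sodium) and CLA (chloride), which is the usual CHARMM naming but should be checked against your own rbp_ions.pdb, and the function names are hypothetical. The second function encodes the reasoning that salt added to reach the target concentration comes in neutral Na+/Cl- pairs, so any surplus of one ion type balances the protein charge.

```python
def count_residues(pdb_lines, residue_name):
    """Count ATOM/HETATM records whose residue name field matches.

    The residue name is read from columns 18-21 of the PDB record.
    """
    count = 0
    for line in pdb_lines:
        if line.startswith(("ATOM", "HETATM")) and line[17:21].strip() == residue_name:
            count += 1
    return count

def inferred_protein_charge(n_sodium, n_chloride):
    """Net charge the protein must carry so that the whole system is neutral."""
    return n_chloride - n_sodium

# Possible use (assumes rbp_ions.pdb is in the current folder):
# with open("rbp_ions.pdb") as handle:
#     lines = handle.readlines()
# n_na = count_residues(lines, "SOD")   # residue name is an assumption
# n_cl = count_residues(lines, "CLA")   # residue name is an assumption
# print(n_na, n_cl, inferred_protein_charge(n_na, n_cl))
```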
4 Energy minimisation and molecular dynamics simulation
Before the start of the actual unfolding simulation over 10,000 ps, also called the
‘production run’, we must perform a number of preparatory simulation steps starting with
an energy minimisation (EM) followed by equilibration of the water molecules around the
protein molecule. Without this equilibration the water molecules would bump into the
protein and distort the structure at the start of the simulation. Yet the structure is based on
NMR experimental data and should not be distorted artificially. In order to equilibrate the
water, we keep all protein atoms fixed in space, while water molecules and ions will be free
to move. This is called position constraint MD, whereby the positions of the protein atoms
are constrained to the positions from the NMR structure. For this purpose you will use
NAMD from the Windows command line, while all options that apply to the EM and MD
simulation are specified in text files with the extension ‘.namd’.
4.1 Energy minimisation (EM)
For the first energy minimisation we need to obtain some information about the size of the
simulation box and the coordinates of the centre. You need to note this information in
your logbook and write it into the namd-file. In VMD load the file rbp_ions.pdb, if not
already loaded, and type the following commands into the TKConsole:
% set all [atomselect top all]
% measure minmax $all
The output should look similar to this:
{-23.841999053955078 -37.52799987792969 -21.398000717163086}
{24.74799919128418 27.742000579833984 21.966999053955078}
These are the coordinates of the bottom left and top right corners of the simulation box. From
these coordinates we can get the dimension (size) of the simulation box along the
x-coordinate as: xdim = 24.75 – (-23.84) = 48.59, and ydim and zdim in the same way. The centre
of the simulation box is obtained with:
% measure center $all
0.5794315338134766 -4.7676897048950195 0.3911421597003937
Note these numbers down as well. Now you are ready to prepare the file ‘minimize.namd’
specifying the energy minimisation of the system. Obtain the file from the supplementary
zip-file, insert the required information as indicated in the file and save it into your
working folder. Then open the Windows command prompt, go to your working folder and
start the energy minimisation (Windows uses the backward slash ‘\’):
> u:
> cd \ab2-3
> c:\NAMD_2.10_Win64-multicore\namd2 +p4 minimize.namd > minimize.log
(Note that the location of the namd2 program may be different on your computer.)
The minimisation should complete in less than one minute. Open the file minimize.log with
Notepad to verify that everything went to completion without errors.
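The box dimensions derived earlier in this section from the ‘measure minmax’ output can be cross-checked with a short Python calculation, using the example numbers printed above. The sketch also formats the result with the NAMD keywords cellBasisVector1 to cellBasisVector3 and cellOrigin. Whether the supplied minimize.namd asks for exactly these lines is an assumption, so follow the instructions inside the file itself; the function names are hypothetical.

```python
def box_dimensions(min_corner, max_corner):
    """Edge lengths of the box from the two corners given by 'measure minmax'."""
    return tuple(hi - lo for lo, hi in zip(min_corner, max_corner))

def namd_cell_lines(dims, centre):
    """Format box size and centre as periodic cell settings (assumed layout)."""
    x, y, z = dims
    cx, cy, cz = centre
    return [
        f"cellBasisVector1 {x:.2f} 0.0 0.0",
        f"cellBasisVector2 0.0 {y:.2f} 0.0",
        f"cellBasisVector3 0.0 0.0 {z:.2f}",
        f"cellOrigin {cx:.2f} {cy:.2f} {cz:.2f}",
    ]

if __name__ == "__main__":
    # Example values copied from the output shown in this section.
    min_corner = (-23.841999053955078, -37.52799987792969, -21.398000717163086)
    max_corner = (24.74799919128418, 27.742000579833984, 21.966999053955078)
    centre = (0.5794315338134766, -4.7676897048950195, 0.3911421597003937)
    for line in namd_cell_lines(box_dimensions(min_corner, max_corner), centre):
        print(line)
```

Your own numbers will differ, because the exact box size depends on how the protein was solvated.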
4.2 Position constraint MD
For the position constraint MD we must tell NAMD, which atoms of the system should be
constrained and to which coordinates they should be constrained. We will constrain all non-
hydrogen atoms of the protein to the positions obtained from energy minimisation (the
output of the previous step, emin.pdb). In order to tell NAMD the atoms that should be
constrained we need to create a pdb-file that contains the number 1.00 in the beta-factor
column. In order to do this restart VMD (or delete all molecules) and open emin.pdb and
the TKConsole. In the TKConsole type:
% set allprotein [atomselect top protein]
% set fix [atomselect top "protein and not hydrogen"]
% $allprotein set beta 0
% $fix set beta 1
% [atomselect top all] writepdb fixsystem.pdb
Open fixsystem.pdb with Notepad and verify that the penultimate column contains 1.00 for
non-hydrogen protein atoms. You are now ready to run the position constraint MD
simulation:
> c:\NAMD_2.10_Win64-multicore\namd2 +p4 sim_fixprot.namd > simfix.log
While the simulation is running (≈ 1.5 hours) answer the following questions with regard
to the position constraint MD simulation in your logbook. The online NAMD user’s guide
may be helpful (http://www.ks.uiuc.edu/Research/namd/2.9/ug/ug.html).
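The check of fixsystem.pdb that was done by eye in Notepad could also be done with a short Python sketch. It reads the beta-factor field (columns 61-66 of each ATOM or HETATM record in the standard PDB layout) and counts how many atoms carry a non-zero value. The function name is hypothetical and the sketch assumes that every atom record is long enough to contain this field.

```python
def count_constrained_atoms(pdb_lines):
    """Return (number of atoms with non-zero beta factor, total number of atoms)."""
    constrained = 0
    total = 0
    for line in pdb_lines:
        if not line.startswith(("ATOM", "HETATM")):
            continue                      # skip REMARK, CRYST1, END, ...
        total += 1
        beta = float(line[60:66])         # beta-factor column of the PDB format
        if beta != 0.0:
            constrained += 1
    return constrained, total

# Possible use (assumes fixsystem.pdb is in the current folder):
# with open("fixsystem.pdb") as handle:
#     n_constrained, n_total = count_constrained_atoms(handle.readlines())
# print(n_constrained, "of", n_total, "atoms are marked as constrained")
```

The number of marked atoms should equal the number of non-hydrogen protein atoms, while all water molecules and ions should remain unmarked.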
What is the value of the cut-off distance for the calculation of both electrostatic and van der
Waals interactions (in Å) ?
What is the time increment of the simulation (with correct units) (refer to the Introduction if
uncertain) ?
How many steps are performed in the position constraint simulation ?
How often are the coordinates of the system written to the trajectory (.dcd) file (note both
the steps and the actual time interval) ?
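The relation between the number of steps, the time increment and the simulated time is simple arithmetic, which the following Python sketch spells out. The function names are hypothetical, and the numbers in the usage example (a 2 fs time step and output every 1000 steps) are illustrative assumptions, not the values from the supplied configuration files; read those from your own namd-file.

```python
FS_PER_PS = 1000.0  # femtoseconds per picosecond

def simulated_time_ps(n_steps, timestep_fs):
    """Total simulated time in ps for a given number of steps."""
    return n_steps * timestep_fs / FS_PER_PS

def frame_interval_ps(output_every_n_steps, timestep_fs):
    """Time between two frames written to the trajectory file, in ps."""
    return output_every_n_steps * timestep_fs / FS_PER_PS

def steps_for_time(total_time_ps, timestep_fs):
    """Number of steps needed to cover a given simulation time."""
    return round(total_time_ps * FS_PER_PS / timestep_fs)

if __name__ == "__main__":
    # Hypothetical settings, for illustration only.
    print(steps_for_time(10000.0, 2.0))     # steps needed for 10,000 ps
    print(frame_interval_ps(1000, 2.0))     # ps between saved frames
```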
4.3 Free MD simulation (production run)
Once the previous simulation has completed you are ready to run the unconstrained MD
simulation at high temperature in order to study the unfolding of the riboprotein. The output
of the previous simulation sim_fixprot.pdb is used to run a long simulation over
10,000 ps at a temperature of 550 K. Save the configuration file sim_free.namd from the
supplementary zip-file into your working directory and start the simulation as previously
described. While the simulation is running you may analyse the content of sim_free.namd.
Note that there is no pressure control, thus the simulation is carried out at fixed volume,
the NVT ensemble (‘canonical ensemble’ in statistical mechanics), in order to avoid the
likely evaporation of water at this high temperature.
What is the total simulation time specified ?
You may notice that only a very short simulation is performed. The full 10,000 ps
simulation takes a couple of days on a typical dual core CPU. For the next data analysis
part you will need to modify the sim_free.namd file in order to obtain the complete
trajectory sim_free.dcd and the log-file simfree.log.
(End of simulation tutorial)
5 Data Analysis
5.1 Visual analysis
Start VMD and use the TKConsole to go to your working directory. Open the file
emin.pdb. Then use Menu Display – Orthographic. With Menu Graphics – Representations
you may change the display to your liking, e.g. as shown in the following figure 7.
Figure 7: The settings of VMD (left) to create the view shown on the right.
Then activate the molecule in the main window and select Menu File – Load data into
molecule. Browse to sim_fixprot.dcd and load it into VMD. About 1000 frames are
loaded. By using the player controls at the bottom of the ‘VMD Main’ window you should
verify that the position constraint simulation was successful, by observing that only the ions
and water molecules change position, while the positions of the protein atoms remain
constant. What is the reason that the ions disappear and re-appear at the edges of the box ?
Now delete the molecule: Molecule – Delete Molecule and load sim_fixprot.pdb, the final
output of the position constraint simulation. Load into this molecule the long 10,000 ps
trajectory. This should contain 5000 frames.
Now you are ready to observe the unfolding of the ribosomal protein L7/L12 by playing the
trajectory forwards or the folding by playing the trajectory backwards. The current display
is not very useful to observe changes of the protein structure, because the molecule
undergoes diffusional and rotational motions. In order to eliminate rotation and diffusion
we will align all frames of the trajectory to the starting position. Open Extensions –
Analysis – RMSD Trajectory Tool and select ‘Backbone’ under ‘Selection Modifiers’, then
click on ‘ALIGN’. After a short wait, close the tool window and play the trajectory again;
now we can focus on the internal motions of the protein.
5.2 Quantitative analysis
5.2.1 Root mean square deviation
The root mean square deviation (RMSD) between two structures is a measure of overall
structural similarity and calculated as a mass weighted average over all N backbone atoms
between molecule 0 and molecule 1:
$$
\mathrm{RMSD} = \sqrt{ \frac{1}{M} \sum_{i=1}^{N} m_i \left( \mathbf{r}_i^{0} - \mathbf{r}_i^{1} \right)^2 }
$$
where ri = (x,y,z) is the vector of the coordinates, mi the mass of each atom and M the sum
of all masses mi. A backbone RMSD around 0.2 nm (2 Å) is indicative of small structural
fluctuations, while an RMSD > 0.3 nm (3 Å) indicates a conformational change. Molecule 0 is
the first frame of the trajectory (the reference frame), while molecule 1 corresponds to all
the successive frames in the trajectory.
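The RMSD definition can be written out as a short Python function. This is a minimal sketch with hypothetical names: it assumes that both structures are given as lists of (x, y, z) tuples in the same atom order and that they have already been superimposed. If no masses are given, every atom gets the same weight, which corresponds to an unweighted RMSD.

```python
import math

def rmsd(coords_ref, coords, masses=None):
    """(Mass-weighted) root mean square deviation between two structures."""
    if len(coords_ref) != len(coords):
        raise ValueError("both structures need the same number of atoms")
    if masses is None:
        masses = [1.0] * len(coords)      # unweighted RMSD
    total_mass = sum(masses)              # M in the formula
    weighted_sum = 0.0
    for m, a, b in zip(masses, coords_ref, coords):
        squared_distance = sum((p - q) ** 2 for p, q in zip(a, b))
        weighted_sum += m * squared_distance
    return math.sqrt(weighted_sum / total_mass)

if __name__ == "__main__":
    reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    moved = [(0.0, 0.0, 2.0), (1.0, 0.0, 0.0)]
    print(rmsd(reference, moved))                      # unweighted
    print(rmsd(reference, moved, masses=[1.0, 3.0]))   # mass-weighted
```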
The RMSD Trajectory tool can be used to plot the RMSD of each frame with reference to
the first frame, by checking ‘Plot’ in the ‘Trajectory’ section. Keep the ‘Backbone’ option
checked. The RMSD plot is shown in figure 8. This is the default unweighted RMSD.
Figure 8: RMSD plot against frame number generated by the RMSD trajectory tool. 5000
frames were generated from a 10000 ps simulation, thus the time difference between two
consecutive frames is 2 ps.
Since we want to study the process of protein folding by reversing the trajectory, we will
import the RMSD data into a spreadsheet program (e.g. Microsoft Excel) and reverse the
data. Close the plot window, remove the checkmark from ‘Plot’, place a checkmark next to
‘Save’ and click on RMSD again. The program has generated a file called ‘trajrmsd.dat’ in
your working folder. In Microsoft Excel open the file trajrmsd.dat. Remember to select
‘all files’ in the file type drop-down box. Automatically the ‘Text Import Wizard’ comes
up; here you select ‘Fixed Width’ and ‘Next – Next – Finish’. Replace NA by zero and then
generate a proper time axis by taking into account that frames occur at an interval of
2 ps/frame (= 10000 ps/5000 frames) as shown in figure 9.
Figure 9: The imported RMSD data and the conversion of frame number into time in column C
Then overwrite the content of column A with column C: copy column C and use ‘Paste
Values’ to paste it into column A.
Because we want to analyse ‘protein folding’ we apply a trick and reverse the data in
column B. To do this copy column A into column C. Then select columns B and C and use
the Data – Sort function choosing ‘Sort by Column C’ and order ‘Largest to Smallest’.
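The same two spreadsheet operations, converting frame numbers into time and reversing the order of the data, can be sketched in Python. The function names are hypothetical, and the sketch assumes that frame numbers start at zero and that the values are already available as a plain list.

```python
def frames_to_time(n_frames, total_time_ps):
    """Time in ps for each frame, assuming equally spaced frames starting at 0."""
    interval = total_time_ps / n_frames          # e.g. 10000 ps / 5000 frames = 2 ps
    return [frame * interval for frame in range(n_frames)]

def reverse_series(times, values):
    """Keep the time axis ascending but reverse the order of the values.

    The value recorded at the end of the 'unfolding' run becomes the first
    value of the 'folding' series.
    """
    return list(times), list(reversed(values))

if __name__ == "__main__":
    times = frames_to_time(5, 10.0)
    rmsd_unfolding = [0.0, 1.2, 2.5, 3.1, 3.8]   # made-up numbers for illustration
    folding_times, rmsd_folding = reverse_series(times, rmsd_unfolding)
    print(folding_times)
    print(rmsd_folding)
```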
In the next step create a plot of the RMSD as a function of time, by selecting columns A
and B and creating a ‘Scatter Plot’ of the type ‘Scatter with Straight Lines’. An example is
shown in figure 10. In this particular example, the RMSD fluctuates between 2 and 4 Å,
which indicates that the protein is never completely unfolded. The results for your
simulation trajectory may be different.
Figure 10: Backbone RMSD (in Å) plotted against time (in ps) for the “folding” trajectory of the
C-terminal domain of the L7/L12 ribosomal protein. The dark line is a moving average over a
window of 25 data points.
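A moving average such as the dark line in figure 10 can be computed with a few lines of Python. This is a sketch under the assumption that a simple unweighted average over consecutive data points is meant; how the window was positioned in the original plot is not stated, and the function name is hypothetical.

```python
def moving_average(values, window=25):
    """Unweighted moving average; returns len(values) - window + 1 points."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and the number of values")
    running_sum = sum(values[:window])
    averages = [running_sum / window]
    for i in range(window, len(values)):
        # Slide the window by one point: add the new value, drop the oldest.
        running_sum += values[i] - values[i - window]
        averages.append(running_sum / window)
    return averages

if __name__ == "__main__":
    print(moving_average([1, 2, 3, 4, 5], window=3))
```

With 5000 frames and a window of 25 points, each averaged value covers 50 ps of the trajectory.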
After which time do you observe the start of folding ? At which time does the folding
speed increase substantially ? You may go back to VMD and play the trajectory in order to
visually confirm the structural changes observed in the RMSD plot.
5.2.2 Formation of hydrogen bonds
In order to calculate the number of hydrogen bonds during the trajectory, use Extensions –
Analysis – Hydrogen bonds.
Change ‘Donor-Acceptor distance’ to 3.5 Å, ‘Angle cutoff’ to 30° and click on ‘Find
hydrogen bonds’. After some time, a plot similar to the one in figure 11 is generated.
Figure 11: The number of hydrogen bonds plotted against frame number.
Export the data from the Multiplot window by selecting File – Export to ASCII vectors.
Then import that data into Excel as described in paragraph 5.2.1 and reverse the time. What
trend do you observe in the number of hydrogen bonds during the “protein folding”
process ? Report the average number of hydrogen bonds and the standard deviation (sd) for
this number as nhb ± sd. Take care to report the correct number of significant figures for nhb;
the standard deviation will inform you what is a reasonable number of significant figures.
Standard deviations are usually reported with one or two significant figures.
Example: 14.123 ± 1.2 is incorrect, 14.1 ± 1.2 or 14 ± 1 are correct.
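The rounding rule from the example can be sketched in Python: the standard deviation is rounded to one (or two) significant figures and the mean is then rounded to the same decimal place. The function names are hypothetical, and the use of the sample standard deviation (dividing by n - 1) is an assumption, since the tutorial does not say which variant to use.

```python
import math
import statistics

def mean_and_sd(values):
    """Mean and sample standard deviation of a list of numbers."""
    return statistics.mean(values), statistics.stdev(values)

def format_mean_sd(mean, sd, sig=1):
    """Format 'mean ± sd' with sd rounded to `sig` significant figures."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    exponent = math.floor(math.log10(sd))   # position of the leading digit of sd
    decimals = sig - 1 - exponent           # decimal places to keep
    sd_rounded = round(sd, decimals)
    mean_rounded = round(mean, decimals)
    shown = max(decimals, 0)                # no negative number of decimals in output
    return f"{mean_rounded:.{shown}f} ± {sd_rounded:.{shown}f}"

if __name__ == "__main__":
    print(format_mean_sd(14.123, 1.2))          # one significant figure for sd
    print(format_mean_sd(14.123, 1.2, sig=2))   # two significant figures for sd
```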
5.2.3 Secondary structure
With the tool Extensions – Timeline we can investigate changes of secondary structure
during the simulation. Since the calculation takes a long time (30+ mins), it is
recommended to plan a break at this point.
From the tool window select Calculate – Calc. Sec. Struct.
Figure 12: The secondary structure of each residue plotted against the frame number (T: turn, E: β-strand, B:
isolated bridge, H: α-helix, G: 3-10 helix, I: π-helix, C: coil).
From the example shown in figure 12, you can see that the three α-helices remain fairly
stable during the simulation, while the N-terminal β-strand secondary structure centred at
Ile57 is reduced due to the formation of turns and coils in particular around frame 1500
(3000 ps, time refers to ‘unfolding’). The second β-strand centred at Lys95 is even more
substantially reduced around the same time. The C-terminal β-strand experiences some
shortening around frame 3200 (6400 ps). The results from your own simulation trajectory
may be completely different.
By clicking with the mouse button on the frame number at the bottom, you show the
corresponding point of the trajectory in the VMD graphical display window. By clicking on
the sequence bar at the left, you can highlight the amino acid in red in the graphical display.
The update of the displays may take a while and show the message ‘VMD not responding’, but
eventually the display will update.
5.2.4 Salt bridges
With the tool Extensions – Analysis – Salt Bridges we can identify salt bridges as well as obtain
for each salt bridge the distance over time.
Figure 13: The settings of Salt Bridges tool. Note the settings in ‘Output Options’; additionally ‘Update
selections every frame’ was unchecked.
By setting the output options as shown in figure 13, the following salt bridges were
identified:
GLU96-LYS95, ASP102-LYS95, GLU116-LYS107, ASP85-LYS95, GLU111-LYS108, GLU104-LYS108,
GLU50-LYS100, GLU112-LYS70, GLU53-LYS51, ASP101-LYS100, GLU82-LYS81, GLU49-LYS100,
GLU104-LYS100, GLU88-LYS65, ASP55-LYS120, ASP85-LYS84, GLU104-LYS107, GLU118-LYS120,
GLU111-LYS107, GLU49-LYS51, ASP55-LYS51, GLU82-LYS95, GLU118-LYS59, GLU88-LYS84,
GLU53-LYS120, GLU96-LYS120, ASP85-LYS81, GLU116-LYS59, GLU112-LYS108, GLU50-LYS51
Note that if at any point in the trajectory the distance was equal to or lower than the
threshold of 3.2 Å, the residue pair would have been identified as a salt bridge. For the salt
bridge Glu112-Lys70 the distance is plotted against ‘unfolding’ time in figure 14.
Figure 14: Distance between Glu112 and Lys70 (in Å) plotted against time (8500 to 10000 ps) for
an extract of the ‘unfolding’ trajectory.
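The criterion described above, that a residue pair counts as a salt bridge if its distance falls to the threshold or below in at least one frame, can be expressed in a few lines of Python. The function names and the example distances are hypothetical; only the threshold of 3.2 Å is taken from the text.

```python
def is_salt_bridge(distances, cutoff=3.2):
    """True if the distance is at or below the cutoff in at least one frame."""
    return min(distances) <= cutoff

def frames_within_cutoff(distances, cutoff=3.2):
    """Indices of all frames in which the pair is within the cutoff."""
    return [frame for frame, d in enumerate(distances) if d <= cutoff]

if __name__ == "__main__":
    # Made-up distances in Å for one residue pair over five frames.
    example = [12.4, 8.1, 3.2, 2.9, 6.5]
    print(is_salt_bridge(example))
    print(frames_within_cutoff(example))
```

The second function shows why a long list of identified pairs does not mean that all of them exist at the same time: many pairs may be within the threshold for only a few frames.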
How many salt bridges were identified in your simulation trajectory ?
5.2.5 Energetics
In the following analysis you will plot the van der Waals and electrostatic internal energy of
the protein as a function of time. This analysis is performed by the NAMDEnergy tool:
Extensions – Analysis – NAMD Energy with the settings shown in figure 14. Note that the
parameter files will be inserted automatically.
Figure 14: Calculation of protein van der Waals and electrostatic internal energy during protein folding as a
function of simulation time.
From the resulting Multiplot window use File – Export to ASCII matrix. Then import the
generated file multiplot.dat into Excel. The first column denotes the trajectory frame
followed by total, electrostatic and van der Waals energy terms in kcal/mol. Reverse the
time axis as described previously to represent the “folding” process and plot the
electrostatic and van der Waals internal energy against time as shown for the example in
figure 15.
Figure 15: Protein internal energy terms plotted against folding time.
References
Bocharov EV, Sobol AG, Pavlov KV, Korzhnev DM, Jaravine VA, Gudkov AT, Arseniev
AS (2004) From structure and dynamics of protein L7/L12 to molecular switching in
ribosome. Journal of Biological Chemistry 279: 17697-17706
Day R, Daggett V (2007) Direct observation of microscopic reversibility in single-molecule
protein folding. J Mol Biol 366: 677-686
Dodson GG, Lane DP, Verma CS (2008) Molecular simulations of protein dynamics: new
windows on mechanisms in biology. EMBO reports 9: 144-150
Frishman D, Argos P (1995) Knowledge-based protein secondary structure assignment.
Proteins-Structure Function and Genetics 23: 566-579
Humphrey W, Dalke A, Schulten K (1996) VMD: Visual molecular dynamics. Journal of
molecular graphics & modelling 14: 33-38
Karplus M, Petsko GA (1990) Molecular dynamics simulations in biology. Nature 347:
631-639
Lindahl E (2008) Molecular dynamics simulations. In Molecular Modeling of Proteins,
Kukol A (ed), Vol. 443, pp 3-23. Totowa: Humana Press
Phillips J, Isgro T, Sotomayor M, Villa E, Yu H, Tanner D, Liu Y. (2012) NAMD tutorial.
University of Illinois at Urbana-Champaign.
Phillips JC, Braun R, Wang W, Gumbart J, Tajkhorshid E, Villa E, Chipot C, Skeel RD,
Kale L, Schulten K (2005) Scalable molecular dynamics with NAMD. Journal of
Computational Chemistry 26: 1781-1802
