Vol. 21 no. 8 2005, pages 1472–1478
M-ZDOCK: a grid-based approach for Cn symmetric multimer docking
Brian Pierce1, Weiwei Tong2 and Zhiping Weng1,2,∗
1Bioinformatics Program and 2Department of Biomedical Engineering, Boston University, Boston, MA, USA
Received on October 13, 2004; revised on November 10, 2004; accepted on December 13, 2004
Advance Access publication December 21, 2004
Summary: Computational protein docking is a useful technique for
gaining insights into protein interactions. We have developed an
algorithm M-ZDOCK for predicting the structure of cyclically sym-
metric (Cn) multimers based on the structure of an unbound (or
partially bound) monomer. Using a grid-based Fast Fourier Transform
approach, a space of exclusively symmetric multimers is searched
for the best structure. This leads to improvements both in accuracy
and running time over the alternative, which is to run a binary docking
thus hits are not as easily overlooked. By searching four instead of six
degrees of freedom, the required amount of computation is reduced.
This program has been tested on several known multimer complexes
from the Protein DataBank, including four unbound multimers: three
trimers and a pentamer. For all of these cases, M-ZDOCK was able
to find at least one hit, whereas only two of the four testcases had hits
when using ZDOCK and a symmetry filter. In addition, the running
times are 30–40% faster for M-ZDOCK.
Availability: M-ZDOCK is freely available to academic users at http://zlab.bu.edu/m-zdock
Supplementary information: http://zlab.bu.edu/m-zdock
Much of the activity of cells is guided by interactions between pro-
teins. In order to better understand the workings of cells and for
rational drug development, it is useful to understand these protein–
protein interactions. One means of revealing information about
protein interactions is the prediction of the structure of a protein
units. This problem is referred to as unbound docking, as opposed to
the simpler (and largely solved) bound docking which is to predict
the structure based on subunit coordinates taken directly from the
bound structure. In order to simplify unbound docking, it is gener-
ally divided into two steps, the initial stage and the refinement stage.
The initial stage is a full search of the six-dimensional (6D) space
(three rotational degrees and three translational degrees) for the pos-
sible relative orientations of the two molecules. In order to make
this search tractable, the proteins are assumed to be rigid during this
stage, with allowance for some clash between the proteins (referred
to as soft docking). The next stage, the refinement stage, performs
slight improvements on a subset of the predictions from the initial
docking stage. In addition to slight movements of the rigid bodies in
6D space, the refinement stage sometimes allows for movements of
side chains and backbones (referred to as flexible in this case).
For initial stage docking, a variety of approaches have been
developed; they are discussed in several reviews (Lengauer and
Rarey, 1996; Halperin et al., 2002). A popular approach, using a
fast Fourier transform (FFT) correlation-based method to test for
surface complementarity, was first proposed by Katchalski-Katzir
et al. (1992). The programs DOT (Mandell et al., 2001), GRAMM
(Vakser, 1995), FTDOCK (Gabb et al., 1997) and ZDOCK (Chen
et al., 2003a) all use (and expand upon) this concept successfully to
predict protein complex structures. ZDOCK, developed by our lab,
uses FFT correlations to find complexes based on desolvation and
electrostatics, in addition to a surface complementarity metric called
PSC (Chen and Weng, 2003).
A subclass of interactions between proteins is the case where two
or more identical proteins interact to form a homomultimer. A common
form of symmetry found in homomultimers is Cn symmetry, or
cyclic symmetry, which describes a ring-shaped complex. For symmetric
dimers, trimers, pentamers and heptamers, this symmetry is
necessarily the case, while it is also found for other
numbers of protein subunits. For instance, membrane channels and
chaperonins often form oligomers with Cn symmetry.
To efficiently and accurately predict Cn multimer complexes,
we have implemented a program called Multimer ZDOCK (M-
ZDOCK). This program takes advantage of the properties of Cn
symmetry to perform a simplified search for the correct complex.
There are many instances where this program can be applied. A
number of proteins have been solved as monomers or in a com-
plex with another protein but exist in a homomultimeric state under
different conditions in vivo (e.g. heat shock, pH changes, viral
The recently solved crystal structure of adeno-associated virus
can be modeled. Another example is the protein Chaperonin-60
(Cpn60), which is expressed under heat shock and other forms of
stress, is a homolog of Escherichia coli GroEL and is typically found
in a double ring structure composed of 14 protomers. However, it has
been found that Mycobacterium tuberculosis has lower order oli-
© The Author 2004. Published by Oxford University Press. All rights reserved. For Permissions, please email: firstname.lastname@example.org
M-ZDOCK multimer docking
this structure can be obtained by multimer docking. Also, Korkhov
transporter-1 (GAT1). By computationally predicting possible struc-
tures of dimeric GAT1, multimer docking would help to support this
model or provide new ones regarding the structure.
Since the interface between two adjacent subunits is the same for
all interfaces of the complex, only one of the n interfaces needs to be
considered, reducing the problem to two monomers for any degree
of Cn symmetry. In addition, since all Cn multimers can be aligned
in a plane (as they are rotated around a single axis), one spatial
degree of freedom can be ignored. Finally, since there is redundancy
when rotating a Cn complex around its rotational axis (the resultant
complex will be the same), this rotational degree of freedom is eliminated.
Thus the problem becomes 4D instead of 6D; this reduces
the amount of searching and the computational time.
Another type of symmetry seen in proteins is dihedral (D2)
symmetry, which is composed of two homodimers interacting sym-
metrically, or a dimer of dimers (four asymmetric units). From an
interaction standpoint, this case differs from Cn symmetry in that
is seen in Cn symmetry. Recently Berchanski and Eisenstein (2003)
filtered and combined the pairwise complexes between monomers
generated with an FFT-based generic docking algorithm (Katchalski-Katzir
et al., 1992) to predict the structures of D2 multimers. They
tested the subunits taken directly from the complex structure, as well
as homology modeled monomers, and reported promising results. A
similar approach was used earlier to construct the helically symmet-
ric protein coat of the tobacco mosaic virus (Eisenstein et al., 1997).
However, due to the discrete nature of the FFT algorithm, the vast
majority of these binary predictions are not symmetric and even the
ones that pass the filter are never truly symmetric.
Here we have developed a new docking algorithm M-ZDOCK,
such that we explore only the part of search space that conforms to
the Cn symmetry. We observe a significant improvement in accuracy,
lower redundancy and fewer false positives, as shown in a direct
comparison with docking and filtering. In addition, since only per-
fectly symmetric multimers are explored in the search space, less
computational time is required.
a set of test cases that exist in both monomeric and multimeric forms
in physiological conditions. Although small, this set represents an
exhaustive search of such test cases in the Protein Data Bank (PDB)
(Berman et al., 2000). It should prove useful for future docking
studies on multimers.
The scoring function used by this program is based on the scoring used in
the latest version of ZDOCK (Chen et al., 2003a). ZDOCK is an initial stage
docking algorithm designed to predict the structure of the complex of two
proteins, referred to as the receptor and the ligand. It takes into account sur-
face complementarity, electrostatics and desolvation to find the optimal fit
between two proteins. Surface complementarity is calculated using pairwise
shape complementarity (PSC), which consists of a favorable term determ-
ined by the number of atom pairs within a distance cutoff, and a penalty term
determined by the number of clashes. Atomic Contact Energy (ACE) (Zhang
et al., 1997) is used to score desolvation, and the electrostatic term is calcu-
lated by applying Coulomb’s equation to the partial charges of the ligand in
the electrostatic field of the receptor.
The search strategy of ZDOCK is to discretize both ligand and receptor
onto a grid and use FFT to determine the best position of the ligand relative
to the receptor. This discretization and FFT is performed for a complete set
of angular orientations of the ligand (relative to a fixed receptor).
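As a rough illustration of why the FFT helps in this kind of search, the sketch below scores every cyclic shift of a toy 1D "ligand" grid against a "receptor" grid in a single pass using the correlation theorem, and compares the result with a brute-force scan. It is a minimal 1D analogue under stated assumptions, not ZDOCK's code: the grids, the function names (`fft`, `correlate_fft`, `correlate_brute`) and the single-term score are hypothetical, whereas ZDOCK works on 3D grids and combines several score terms.

```python
import cmath


def fft(values, inverse=False):
    """Recursive radix-2 FFT; length must be a power of two (unnormalized)."""
    n = len(values)
    if n == 1:
        return [complex(values[0])]
    sign = 1 if inverse else -1
    even = fft(values[0::2], inverse)
    odd = fft(values[1::2], inverse)
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def correlate_fft(receptor, ligand):
    """scores[t] = sum_x receptor[x] * ligand[(x + t) mod n], via the FFT."""
    n = len(receptor)
    fr = fft(receptor)
    fl = fft(ligand)
    product = [a.conjugate() * b for a, b in zip(fr, fl)]
    # The inverse transform above is unnormalized, so divide by n here.
    return [v.real / n for v in fft(product, inverse=True)]


def correlate_brute(receptor, ligand):
    """Same scores computed directly, one shift at a time (O(n^2))."""
    n = len(receptor)
    return [sum(receptor[x] * ligand[(x + t) % n] for x in range(n))
            for t in range(n)]


# Hypothetical toy grids: the ligand is the receptor pattern shifted by 3 cells.
receptor = [0, 1, 2, 1, 0, 0, 0, 0]
ligand = [0, 0, 0, 0, 1, 2, 1, 0]
scores = correlate_fft(receptor, ligand)
best_shift = max(range(len(scores)), key=lambda t: scores[t])
```

Both routines return the same score for every shift; the FFT route simply obtains all of them at once, which is what makes an exhaustive translational scan affordable for each angular orientation.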
Critical Assessment of Prediction of Interactions.
The Euler angle conventions used in this paper refer to these successive
rotations from the initial configuration, described in Goldstein’s Classical
Mechanics (Goldstein, 1980):
(1) Rotation by ψ around the z-axis.
(2) Rotation by θ around the original x-axis.
See Figure 1 for a diagram of these rotations. Typically, Euler angles are sets
of three angles; in this case, the third angle, φ, is not necessary as it would
be redundant in the symmetric search (see the Search space section for an
M-ZDOCK uses the convention that the rotational axis will be parallel to the
z-axis, and searches in the x–y plane for the optimal position of this axis.
To perform the search for the best conformation of a multimer based on the
structure of a monomer, it has been necessary to make modifications to the
search methodology that is used for ZDOCK 2.3. The new search algorithm
is outlined below:
(1) Center the receptor (the input monomer) at the origin.
(2) Rotate the receptor by an angle ψ around the z-axis, and then θ around
(3) Copy the receptor, and rotate it by 360°/n around the z-axis to create
(4) Discretize both the ligand and receptor, with a grid spacing of 1.2 Å
(the same as ZDOCK 2.3).
(5) Perform the 3D FFT and correlation, and search in the x–y plane for
the best scoring multimer position for that rotational orientation.
(6) Repeat the steps 2–5 for various other sets of ψ and θ.
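The geometric part of the steps above (centering, the two rotations, the 360°/n copy and the grid spacing) can be sketched as follows. This is only an illustration under assumptions: the function names are hypothetical, centering uses the unweighted centroid, both rotations are taken about the fixed (original) z- and x-axes, and the scoring and FFT steps are omitted.

```python
import math


def rot_z(p, deg):
    """Rotate point p by deg degrees around the fixed z-axis."""
    a = math.radians(deg)
    x, y, z = p
    return (x * math.cos(a) - y * math.sin(a),
            x * math.sin(a) + y * math.cos(a),
            z)


def rot_x(p, deg):
    """Rotate point p by deg degrees around the fixed x-axis."""
    a = math.radians(deg)
    x, y, z = p
    return (x,
            y * math.cos(a) - z * math.sin(a),
            y * math.sin(a) + z * math.cos(a))


def center(coords):
    """Translate coordinates so that their centroid sits at the origin."""
    n = len(coords)
    cx, cy, cz = (sum(p[i] for p in coords) / n for i in range(3))
    return [(x - cx, y - cy, z - cz) for x, y, z in coords]


def make_pair(coords, psi, theta, n):
    """Steps 1-3: oriented receptor plus a copy rotated by 360/n about z."""
    receptor = [rot_x(rot_z(p, psi), theta) for p in center(coords)]
    ligand = [rot_z(p, 360.0 / n) for p in receptor]
    return receptor, ligand


def grid_index(p, spacing=1.2):
    """Step 4 (assumed nearest-cell rounding): map a point to grid indices."""
    return tuple(round(c / spacing) for c in p)


# Hypothetical two-atom "monomer", docked as a C4 multimer at psi = theta = 0.
receptor, ligand = make_pair([(2.0, 0.0, 0.0), (0.0, 0.0, 0.0)], 0.0, 0.0, 4)
```

Because the ligand is always derived from the receptor by the fixed 360°/n rotation, only ψ and θ need to be sampled in the outer loop.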
In order to fully explore the space of multimers, it is necessary to vary ψ
from 0 to 360°, and θ from 0 to 90°. θ does not need to sample a full 360°
because for a given φ there are redundancies at 180° − θ, 180° + θ, and −θ
due to the symmetric nature of these angles around the z- and x-axes.
as these are symmetric around the z-axis and therefore would be redundant
for the same values of ψ and θ. This corresponds to the loss of a rotational
degree of freedom that is referred to in the Introduction section.
M-ZDOCK uses 1500 angle sets, as this was found to be a good bal-
ance between computational time and predictive performance. In addition,
given that ZDOCK 2.3 uses 54000 angle sets for 3D angular freedom
(6 degree sampling density), the number of angles that M-ZDOCK covers is
mathematically reasonable as it is approximately 54000^(2/3).
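The arithmetic behind this comparison is easy to check: dropping one of three angular degrees of freedom at the same sampling density should scale the count by a power of 2/3. The variable names below are illustrative only.

```python
# Angle-set counts quoted in the text.
zdock_angle_sets = 54000   # three angular degrees of freedom
mzdock_angle_sets = 1500   # two angular degrees of freedom

# Expected count for two angles at the same per-angle density.
estimate = zdock_angle_sets ** (2.0 / 3.0)

# How far the chosen 1500 is from that estimate.
ratio = mzdock_angle_sets / estimate
```

The estimate comes out at roughly 1430, so 1500 angle sets is within about 5% of the same per-angle sampling density.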
Reconstructing the multimer
Based on the optimal relative position of two adjacent monomers in the x–y
plane (output from the FFT), it is possible to reconstruct the full multimer.
The only constraint is that the monomers need to be rotated by 360°/n
with respect to one another around the z-axis. Referring to Figure 2, the
vector representing the displacement between the two adjacent monomers
is L and the vector from the monomer to the symmetry axis (in the x–y
plane) is d. β is the angle around the Cn symmetry axis between two
adjacent monomer centers of mass, 360°/n. The angle between the vectors L and
B.Pierce et al.
Fig. 1. Diagram of successive rotation through Euler angles ψ and θ, the angles used to describe the rotational configuration of the ligand and receptor. In this
case, ψ = 90° and θ = 45°.
d is α, given by (180° − β)/2. The magnitude of d can be computed
Once the rotational axis is found, the monomer needs to be rotated
around this axis n times by β to form the multimer. Thus, given the
vector between two adjacent monomers in the Cn multimer (and the symmetry
number), it is possible to reconstruct the entire multimer. To illustrate
this concept, a Java applet has been written and is publicly available at
In order to compare the results of M-ZDOCK with results from an existing
method of docking, we implemented a symmetry filter that will choose only
near-symmetric complexes. It is designed to process the results from a dock-
ing tool such as ZDOCK which produces many predictions (54000 in the
case of ZDOCK with dense sampling).
The filter determines the angle and axis between the monomers of the pre-
diction, as well as the center of mass translation between the monomers. For
perfect symmetry, the angle between the center of mass translation and the
axis is 90°, and the angle of rotation around the axis is 360°/n, but a certain
range must be allowed as the predictions are not perfectly symmetric. In the
case of Berchanski and Eisenstein (2003) the angular range for the rotation
around the axis was ±6°, and between the axis and translation the angular
range was ±3°. To allow for a comparison with the M-ZDOCK results so
increased to ±18° and ±9°, respectively.
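A filter of this kind can be sketched as below. This is an assumption-laden illustration rather than the filter actually used: it takes the relative orientation of the two monomers as a 3×3 rotation matrix and the center of mass translation as a vector (the real docking output format is not reproduced), and all function names are hypothetical. The default tolerances are the ±18° and ±9° values quoted above.

```python
import math


def rot_z_matrix(deg):
    """Helper for the example: rotation matrix about z by deg degrees."""
    a = math.radians(deg)
    return [[math.cos(a), -math.sin(a), 0.0],
            [math.sin(a), math.cos(a), 0.0],
            [0.0, 0.0, 1.0]]


def rotation_angle_axis(R):
    """Return (angle in degrees, unit axis) of rotation matrix R."""
    trace = R[0][0] + R[1][1] + R[2][2]
    angle = math.acos(max(-1.0, min(1.0, (trace - 1.0) / 2.0)))
    axis = (R[2][1] - R[1][2], R[0][2] - R[2][0], R[1][0] - R[0][1])
    norm = math.sqrt(sum(c * c for c in axis))
    if norm < 1e-8:
        # Near 180 degrees R is symmetric; there R + I = 2 a a^T, so the
        # largest column of R + I points along the axis a.
        cols = [tuple(R[i][j] + (1.0 if i == j else 0.0) for i in range(3))
                for j in range(3)]
        axis = max(cols, key=lambda c: sum(v * v for v in c))
        norm = math.sqrt(sum(c * c for c in axis))
    return math.degrees(angle), tuple(c / norm for c in axis)


def passes_symmetry_filter(R, t, n, angle_tol=18.0, perp_tol=9.0):
    """True if rotation R and translation t are near-symmetric for Cn."""
    angle, axis = rotation_angle_axis(R)
    t_norm = math.sqrt(sum(c * c for c in t))
    if t_norm == 0.0:
        return False
    cos_at = sum(a * b for a, b in zip(axis, t)) / t_norm
    axis_to_t = math.degrees(math.acos(max(-1.0, min(1.0, cos_at))))
    return (abs(angle - 360.0 / n) <= angle_tol
            and abs(axis_to_t - 90.0) <= perp_tol)
```

For example, `passes_symmetry_filter(rot_z_matrix(120.0), (10.0, 0.0, 0.0), 3)` accepts a trimer-like prediction, while the same rotation with a translation along the axis is rejected.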
We tested M-ZDOCK with two categories of testcases: bound/quasi-bound and unbound.
Bound and quasi-bound testcases
To ensure that the search space is covered entirely and that the algorithm is
valid for various types of Cn symmetry, both bound and quasi-bound docking
testcases were used. The bound testcases were generated by extracting
the monomer from the multimeric structure so that the docking algorithm
can attempt to reassemble the multimer. These testcases should be relatively
simple to dock as there is no conformational change to account for. If the
correct structure is not found with these cases, it is possibly due to some
problem with the searching algorithm. Though found in the PDB as both
monomers and multimers, quasi-bound testcases are most likely biological
multimers. Therefore the conformational change involved is of little or no
significance, making these cases similar to (but slightly more difficult than)
Fig. 2. The relative positions of the subunits of a C3 multimer. The vector
L is the relative position between the receptor and the ligand (which is the
receptor rotated by β degrees; in this case β = 120°). The magnitude of
vector d to the axis of symmetry and the angle α between vectors L and d
can be determined algebraically. Thus, once the interface between the ligand
and receptor is evaluated by M-ZDOCK, the rest of the multimer (in this case
the subunit represented by the dashed lines) can be generated automatically.
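The geometry in this caption can be worked through numerically. The sketch below is an illustration under assumptions, not the published implementation: points in the x–y plane are represented as complex numbers, the receptor center is placed at the origin, the ligand is assumed to be the receptor rotated counterclockwise by β about the axis, and `reconstruct_centers` is a hypothetical name. Since the two monomer centers and the axis form an isosceles triangle with apex angle β, the distance to the axis is |L| / (2 sin(β/2)), which is the same value the law of sines gives, |L| sin α / sin β.

```python
import cmath
import math


def reconstruct_centers(L, n):
    """Return (axis position, list of n monomer centers) in the x-y plane.

    L is the displacement from the receptor center (at the origin) to the
    adjacent ligand center, written as the complex number x + iy.
    """
    beta = 2.0 * math.pi / n
    # Rotating the origin by beta about the axis must land on L:
    # axis - axis * exp(i * beta) = L.
    axis = L / (1.0 - cmath.exp(1j * beta))
    centers = [axis - axis * cmath.exp(1j * k * beta) for k in range(n)]
    return axis, centers


# Hypothetical C3 example: adjacent monomer centers 10 units apart along x.
L = complex(10.0, 0.0)
axis, centers = reconstruct_centers(L, 3)
d = abs(axis)                                      # magnitude of vector d
alpha = math.degrees(abs(cmath.phase(axis / L)))   # angle between L and d
```

In this example α evaluates to 30°, matching (180° − β)/2 for β = 120°, and the third center closes the ring at the same distance |L| from the first.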
the bound testcases. The monomer structure that is found in the PDB is used
as input to the docking algorithm, while the multimer structure in the PDB is
used to evaluate the docking results.
The second type of testcases is unbound structures. These testcases are signi-
proteins inherent in unbound docking and because of the low affinity of the
complexes, as these cases must exist in both monomer and multimer forms
to be found experimentally. Four proteins were found in the PDB (Berman
et al., 2000) for which different symmetric forms exist, according to Protein
Quaternary Structure server classification. Here is a brief summary of these
et al., 1992) and trimeric (Liu et al., 2002) forms. The trimer in this case is
about this structure is a domain-swapped C-terminal beta strand.
The Naja naja naja (Indian cobra) phospholipase
A2 (PLA2) was obtained from the venom and crystallized in trimeric form
was crystallized with a lower concentration of PLA2 and higher concentration
of Ca2+ (the Ca2+ is seen in the structure of the monomer but not the
trimer). Segelke et al. (1998) suggest that the trimeric form may be
a means of shielding the active site and thus ‘protecting the snake from its
Flavivirus envelope protein.
This is the fusion envelope protein of the
tick-borne encephalitis virus (TBEV E protein). The input structure is taken
from the homodimer structure (Rey et al., 1995). The trimeric form, which
occurs at low pH during membrane fusion, was recently solved (Bressanelli
et al., 2004).
Bovine trypsin inhibitor.
This testcase is the bovine pancreatic trypsin
inhibitor (BPTI), which occurs as a monomer (Wlodawer et al., 1984) at
Table 1. The unbound multimer testcases
Testcase PDB IDsa
protein (TBEV E)
Bovine trypsin inhibitor (BPTI)
aThe first PDB code is for the structure used as input for docking, while the second one
is the bound multimer.
bInterface Cα RMSD change between unbound/bound structures.
basic pH and a decamer (Hamiaux et al., 2000) at acidic pH levels. As the
is one half of the decamer.
Table 1 summarizes these testcases. To provide a measure of the difficulty
of docking each complex, interface Cα atoms from unbound monomers were
fitted to two adjacent subunits of the complex. As M-ZDOCK is a rigid-body
docking algorithm, the root mean square deviation (RMSD) in this case can
be seen as the lower limit for the RMSD of the predictions.
RMSD calculations and hits
To evaluate bound and unbound predictions, the RMSDs of interface alpha
carbon (Cα) atoms were used. The interface Cα atoms were determined
from the crystal structure of the multimer. If any atom of a residue is within
10 Å of any atom of another chain, the Cα atom from that residue is determ-
ined to be an interface Cα. In addition, to avoid false negatives due to large
domain movements, regions of residues with large movement from unbound
to bound (>4 Å) were removed before determining interface Cα atoms. See
Supplementary information for the removed residues.
Once the interface Cα atoms are known, two adjacent subunits of the predicted
Cαs, and the RMSD between the interface Cαs is computed. Hits are defined
as predictions that have an interface Cα RMSD ≤2.5 Å.
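The interface and hit definitions above translate into a short check. The sketch below is illustrative only: the data layout (a dict from residue id to a list of atom coordinates), the function names, and the treatment of "within 10 Å" as less than or equal to 10 Å are assumptions, and the predicted and reference Cα coordinates are assumed to be already superposed and listed in the same order.

```python
import math


def dist(p, q):
    """Euclidean distance between two (x, y, z) points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def interface_residues(chain, other_chain, cutoff=10.0):
    """Residue ids in chain with any atom within cutoff of other_chain.

    Both arguments map residue id -> list of (x, y, z) atom coordinates.
    """
    other_atoms = [a for atoms in other_chain.values() for a in atoms]
    return {
        res_id
        for res_id, atoms in chain.items()
        if any(dist(a, b) <= cutoff for a in atoms for b in other_atoms)
    }


def rmsd(coords_a, coords_b):
    """RMSD between two equally long, pre-superposed coordinate lists."""
    sq = sum(dist(p, q) ** 2 for p, q in zip(coords_a, coords_b))
    return math.sqrt(sq / len(coords_a))


def is_hit(pred_ca, ref_ca, threshold=2.5):
    """A prediction is a hit if its interface C-alpha RMSD is <= 2.5 A."""
    return rmsd(pred_ca, ref_ca) <= threshold
```

A usage example: with reference Cα atoms at `[(0, 0, 0), (1, 0, 0)]`, a prediction displaced by 2 Å in y counts as a hit, while one displaced by 3 Å does not.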
RESULTS AND DISCUSSION
Structure prediction: quasi-bound and bound
of predicting structures with Cn symmetry. For all of the structures,
the number one ranked prediction was a hit, and in addition there
were a number of hits in the top 20 for every testcase.
Structure prediction: unbound
The structure prediction capabilities of M-ZDOCK are shown to be
superior to filtering normal docking predictions, across the unbound
multimer benchmark (Table 3). For M-ZDOCK, all of the first hits
two cases where no hit was found, and the two other cases were in
the bottom third of the predictions.
M-ZDOCK successfully predicted a hit for RNase A (Fig. 3a),
while the near-symmetric predictions of ZDOCK failed to produce a
hit. This is despite the fact that 375 more predictions were produced
by ZDOCK plus filtering. This complex was difficult to predict due
to the strand swapping that takes place upon multimerization, which
explains the relatively high rank of 476 for the first M-ZDOCK hit.
Table 2. M-ZDOCK results for quasi-bound and bound testcases
(Morera et al., 1995; Gonin et al., 1999)
(Garavaglia et al., 2002; Werner et al., 2002)
(Taylor and Andersson, 1997)
(Gulbis et al., 1999, 2000)
(Wang et al., 1997)
(Auerbach et al., 2000)
(Kitov et al., 2000)
(Hunt et al., 1997)
aPDB IDs of the testcases, with the PDB ID of the input structure for M-ZDOCK listed first for the quasi-bound testcases.
bNumber of hits in the top 20 (out of 1500) predictions, as ranked by M-ZDOCK.
cRank of the first hit.
dRMSD (in Å) of the first hit.
eThe bound structures in these cases are in fact dimers of the Cn multimer; only the Cn contacts are predicted, so the other interface is ignored.
Table 3. M-ZDOCK results for unbound testcases
ZDOCK + filtering
aNumber of predictions produced by M-ZDOCK (the number is always 1500).
bNumber of hits among the predictions.
cRank of the first hit.
dRMSD (in Å) of the first hit.
eNumber of predictions remaining after running ZDOCK and filtering the 54000 predictions for symmetry.
lation, they were clearly a part of the interface making the prediction
non-trivial. The swapped strands are highlighted in Figure 3a.
The symmetric trimer PLA2 was successfully predicted by M-ZDOCK.
In this case M-ZDOCK predicted six hits, one of them
with the particularly high rank of 33. While ZDOCK + filtering
obtained a hit, the rank of the hit was 1417 and the RMSD of this hit
two hits were found by M-ZDOCK, while no hits were found with
ZDOCK. This protein is somewhat difficult to dock due to the large
C-terminal conformational change upon trimerization that helps to
stabilize the interaction. The difficulty is also reflected in the lower
error for rigid-body docking to obtain a hit (under 2.5 Å). However
M-ZDOCK is able to predict this structure, giving the first-ranked
hit an impressive rank of 62 (see Fig. 3c for the structure).
The BPTI pentamer (Fig. 3d) had a large number of predictions
produced by M-ZDOCK. As with the other testcases, M-ZDOCK
performed better with regard to hits and the rank of the first hit. In
prediction. But of the 20 M-ZDOCK hits, 6 of them had a better rank
and RMSD than the top ZDOCK prediction, so clearly M-ZDOCK
is superior in this case as well.
The docking predictions reported in this study were performed on
an IBM p690 workstation with 32 1.3 GHz Power4 processors, using
MPI for parallelization. Due to the increased efficiency of the M-
ZDOCK search, a significant reduction in running time can be seen
using this approach. The discretization of the receptor at every angle
set (as described in the Methods section), which costs more than
regular ZDOCK, is more than compensated for by the faster search. On
average, M-ZDOCK runs 30–40% faster than ZDOCK.
M-ZDOCK was also compiled and run on Linux in serial and
parallel, and on Mac OS X in serial. Versions of M-ZDOCK for all
of these platforms are available at: http://zlab.bu.edu/m-zdock.
A possible future modification to the M-ZDOCK algorithm would
be to incorporate the degree of packing into the algorithm. Since
the algorithm currently used considers only the interface between