R E V I E W S
NATURE REVIEWS | DRUG DISCOVERY
VOLUME 3 | NOVEMBER 2004 | 935
The number of proteins with a known three-dimensional structure is increasing rapidly, and
structures produced by structural genomics initiatives are beginning to become publicly
available1,2. The increase in the number of structural targets is in part due to improvements in
techniques for structure determination, such as high-throughput X-ray crystallography3. With
large-scale structure-determination projects driven by genomics consortia, many current target
proteins have been selected for their therapeutic potential.
Computational methodologies have become a crucial component of many drug discovery programmes,
from hit identification to lead optimization and beyond4–6, and approaches such as ligand-based4
or structure-based7 virtual screening techniques are widely used in many discovery efforts. One
key methodology, the docking of small molecules to protein binding sites, was pioneered during
the early 1980s8 and remains a highly active area of research7. When only the structure of a
target and its active or binding site is available, high-throughput docking is primarily used as
a hit-identification tool. However, similar calculations are often also used later on during lead
optimization, when modifications to known active structures can quickly be tested in computer
models before compound synthesis. Furthermore, docking can also contribute to the analysis of
drug metabolism using structures such as cytochrome P450 isoforms9,10.
Here, we review basic concepts and specific features of small-molecule–protein docking methods
and several selected applications, with particular emphasis on hit identification and lead
optimization, but do not specifically review protein–protein docking, which is less relevant for
small-molecule drug discovery. We attempt to distinguish between the problems of docking
compounds into target sites and of scoring docked conformations, because the available data
indicate that numerous robust and accurate docking algorithms are available, whereas
imperfections of scoring functions continue to be a major limiting factor.
DOCKING AND SCORING IN VIRTUAL SCREENING FOR DRUG DISCOVERY: METHODS AND APPLICATIONS
Douglas B. Kitchen*, Hélène Decornez*, John R. Furr* and Jürgen Bajorath‡,§

Abstract | Computational approaches that ‘dock’ small molecules into the structures of
macromolecular targets and ‘score’ their potential complementarity to binding sites are widely
used in hit identification and lead optimization. Indeed, there are now a number of drugs whose
development was heavily influenced by or based on structure-based design and screening
strategies, such as HIV protease inhibitors. Nevertheless, there remain significant challenges
in the application of these approaches, in particular in relation to current scoring schemes.
Here, we review key concepts and specific features of small-molecule–protein docking
methods, highlight selected applications and discuss recent advances that aim to address
the acknowledged limitations of established approaches.

21 Corporate Circle, Albany,
‡AMRI Bothell Research
18804 North Creek Parkway,
Correspondence to J.B.

POSING
The process of determining whether a given conformation and orientation of a ligand fits the
active site. This is usually a fuzzy procedure that returns many alternative results.

SCORING
Both posing and ranking involve scoring. The pose score is often a rough measure of the fit of a
ligand into the active site. The rank score is generally more complex and might attempt to
estimate binding energies.

RANKING
A more advanced process than pose scoring that typically takes several results from an initial
scoring phase and re-evaluates them. This process usually attempts to estimate the free energy
of binding as accurately as possible. Although the posing phase might use simple energy
calculations (electrostatic and van der Waals), ranking procedures typically involve more
elaborate calculations (perhaps including properties such as entropy or explicit solvation).

An introduction to docking
The docking process involves the prediction of ligand conformation and orientation (or posing)
within a targeted binding site (BOX 1). In general, there are two aims of docking studies:
accurate structural modelling and correct prediction of activity. However, the identification of
molecular features that are responsible for specific biological recognition, or the prediction of
compound modifications that improve potency, are complex issues that are often difficult to
understand and, even more so, to simulate on a computer.
In view of these challenges, docking is generally devised as a multi-step process in which each
step introduces one or more additional degrees of complexity11. The process begins with the
application of docking algorithms that POSE small molecules in the active site. This in itself is
challenging, as even relatively simple organic molecules can contain many conformational
degrees of freedom. Sampling these degrees of freedom must be performed with sufficient
accuracy to identify the conformation that best matches the receptor structure, and must be
fast enough to permit the evaluation of thousands of compounds in a given docking run.
Algorithms are complemented by SCORING FUNCTIONS that are designed to predict the biological
activity through the evaluation of interactions between compounds and potential targets. Early
scoring functions evaluated compound fits on the basis of calculations of approximate shape and
electrostatic complementarities. Relatively simple scoring functions continue to be heavily
used, at least during the early stages of docking simulations. Pre-selected conformers are often
further evaluated using more complex scoring schemes with more detailed treatment of
electrostatic and van der Waals interactions, and inclusion of at least some solvation or
entropic effects7. It should also be noted that ligand-binding events are driven by a
combination of enthalpic and entropic effects, and that either entropy or enthalpy can dominate
specific interactions. This often presents a conceptual problem for contemporary scoring
functions (discussed below), because most of them are much more focused on capturing energetic
than entropic effects.

Equations 1–3 (BOX 1):
$$[E]_{aq} + [I]_{aq} \rightleftharpoons [E+I]_{aq} \qquad (1)$$
$$\Delta G = -RT \ln K_A \qquad (2)$$
$$K_A = K_i^{-1} = \frac{[E+I]_{aq}}{[E]_{aq}\,[I]_{aq}} \qquad (3)$$
In addition to problems associated with scoring of compound conformations, other complications
exist that make it challenging to accurately predict binding conformations and compound
activity. These include, among others, limited resolution of crystallographic targets, inherent
flexibility, induced fit or other conformational changes that occur on binding, and the
participation of water molecules in protein–ligand interactions. Without doubt, the docking
process is scientifically complex.
Molecular representations for docking
To evaluate various docking methods, it is important to consider how the protein and ligand are
represented. There are three basic representations of the receptor: atomic, surface and grid12.
Among these, atomic representation is generally only used in conjunction with a potential energy
function13 and often only during final RANKING procedures (because of the computational
complexity of evaluating pair-wise atomic interactions).
Surface-based docking programs are typically, but not exclusively, used in protein–protein
docking14,15. Connolly’s early work on molecular surface representations is mainly responsible
for spawning much of the research in this area16,17. These methods attempt to align points on
surfaces by minimizing the angle between the surfaces of opposing molecules18. Therefore, a
rigid body approximation is still the standard for many protein–protein docking techniques.
The use of potential energy grids was pioneered by Goodford19, and various docking programs use
such grid representations for energy calculations. The basic idea is to store information about
the receptor’s energetic contributions on grid points so that it only needs to be read during
ligand scoring. In the most basic form, grid points store two types of potentials: electrostatic
and van der Waals (BOX 2). FIGURE 1 shows a representative grid for capturing electrostatic
potentials, and FIG. 2 illustrates the electrostatic potential of a bound inhibitor mapped on its
molecular surface.
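
The grid idea described above can be illustrated with a minimal sketch. It assumes point charges, a plain Coulomb potential without a dielectric model, a regular cubic grid and nearest-grid-point lookup instead of interpolation. The function names, the data layout (tuples of x, y, z, charge) and the clamping distance are hypothetical choices made for illustration and are not taken from any particular docking program; the factor 332.06 is the conventional conversion to kcal/mol for charges in units of e and distances in Å.

```python
import math

COULOMB_K = 332.06  # kcal*A/(mol*e^2); conventional unit conversion (assumed units)

def build_electrostatic_grid(receptor, origin, spacing, shape, min_dist=0.5):
    """Precompute the receptor potential once. receptor: list of (x, y, z, charge).

    Returns a dict mapping grid indices (i, j, k) to the potential at that point.
    """
    grid = {}
    for i in range(shape[0]):
        for j in range(shape[1]):
            for k in range(shape[2]):
                point = (origin[0] + i * spacing,
                         origin[1] + j * spacing,
                         origin[2] + k * spacing)
                phi = 0.0
                for x, y, z, q in receptor:
                    # Clamp very short distances so a grid point on an atom stays finite.
                    r = max(math.dist(point, (x, y, z)), min_dist)
                    phi += COULOMB_K * q / r
                grid[(i, j, k)] = phi
    return grid

def score_ligand_on_grid(grid, origin, spacing, ligand):
    """Ligand scoring only reads the grid: sum of q * phi(nearest grid point)."""
    total = 0.0
    for x, y, z, q in ligand:
        key = tuple(round((c - o) / spacing) for c, o in zip((x, y, z), origin))
        if key not in grid:
            raise ValueError("ligand atom lies outside the grid")
        total += q * grid[key]
    return total

# Toy usage: one receptor charge at the origin, one ligand charge 2 A away.
receptor = [(0.0, 0.0, 0.0, 1.0)]
grid = build_electrostatic_grid(receptor, (-2.0, -2.0, -2.0), 1.0, (5, 5, 5))
energy = score_ligand_on_grid(grid, (-2.0, -2.0, -2.0), 1.0, [(2.0, 0.0, 0.0, -1.0)])
```

The cost of building the grid is paid once per receptor, whereas each ligand pose then costs one lookup per ligand atom, which is the reason grids are attractive when many poses are scored.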
Search methods and molecular flexibility
This section focuses on algorithms used to treat ligand flexibility and, to some extent, protein
flexibility. Treatment of ligand flexibility can be divided into three basic categories11:
systematic methods (incremental construction, conformational search, databases); random or
stochastic methods (Monte Carlo, genetic algorithms, tabu search); and simulation methods
(molecular dynamics, energy minimization). A summary of the search approaches implemented in
widely used docking programs is presented in BOX 3.
Systematic search. These algorithms try to explore all the degrees of freedom in a molecule, but
ultimately face the problem of combinatorial explosion20 (BOX 4). Therefore, ligands are often
incrementally grown into active sites. A stepwise or incremental search can be accomplished in
different ways: for example, by docking various molecular fragments into the active-site region
and linking
Box 1 | Theoretical aspects of docking
For an enzyme and inhibitor, docking aims at correct prediction of the structure of the
complex [E+I] = [EI] under equilibrium conditions (see figure and equation 1).
The figure illustrates the binding of inhibitor Dmp323 to HIV protease and is based on
solution structures (PDB code: 1BVE). Multiple structures of enzyme–inhibitor
complexes revealed only limited structural variations.
The free energy of binding (∆G) is related to binding affinity by equations 2 and 3.
Prediction of the correct structure (posing) of the [E+I] complex does not require
information about KA. However, prediction of biological activity (ranking) requires this
information; scoring terms can therefore be divided in the following fashion. When
considering the term [EI], the following factors are important: steric, electrostatic,
hydrogen bonding, inhibitor strain (if flexible) and enzyme strain. When considering the
equilibrium shown in equation 1, the following factors are also important: desolvation,
rotational entropy and translational entropy.
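
As a worked illustration of the relationship between binding affinity and free energy in this box (∆G = −RT ln KA, with KA = 1/Ki), the following minimal sketch converts an inhibition constant into a binding free energy. The function name, the default temperature of 298.15 K and the use of kcal/mol are assumptions made for the example, and Ki is taken to be expressed in mol/l relative to a 1 M standard state.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)

def binding_free_energy(k_i_molar, temperature_k=298.15):
    """Return Delta G = -RT ln(K_A) in kcal/mol, with K_A = 1 / K_i."""
    if k_i_molar <= 0:
        raise ValueError("K_i must be positive")
    k_a = 1.0 / k_i_molar
    return -R_KCAL * temperature_k * math.log(k_a)

# A 1 nM inhibitor versus a 10 nM inhibitor at 298.15 K.
dg_1nm = binding_free_energy(1e-9)
dg_10nm = binding_free_energy(1e-8)
difference = dg_10nm - dg_1nm
```

Under these assumptions a 1 nM inhibitor corresponds to roughly −12.3 kcal/mol, and each tenfold change in Ki shifts ∆G by about 1.4 kcal/mol, which gives a sense of the accuracy a ranking function would need in order to separate compounds whose affinities differ by one order of magnitude.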
stage, energy minimization is performed after each
Another method of systematic search is the use of libraries of pre-generated conformations.
Library conformations are typically only calculated once and the search problem is therefore
reduced to a rigid body docking procedure. For example, FLOG29 generates database conformations
on the basis of distance geometry. Once acceptable conformations have been generated, the
algorithm explores them in a manner similar to DOCK11,29.
Random search. These algorithms (often called stochastic methods) operate by making random
changes to either a single ligand or a population of ligands. A newly obtained ligand is
evaluated on the basis of a pre-defined probability function. Two popular random approaches are
Monte Carlo and genetic algorithms (BOX 5). Alternative implementations of Monte Carlo search
have been reported30,31, including a popular form in AutoDock30. By contrast, several other
programs (including DOCK and GOLD) have implemented genetic algorithms32–34.
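
A minimal sketch of a random search that evaluates each new state with a probability function is given below. It uses the standard Metropolis acceptance criterion on a toy one-dimensional "pose" variable; it is not the implementation used in AutoDock or any other program named here. The function names, the double-well energy surface and all parameter values are hypothetical and chosen only for illustration.

```python
import math
import random

def metropolis_search(energy, start, step, n_steps, kT, rng):
    """Random search with Metropolis acceptance; returns the best state seen."""
    current, e_current = start, energy(start)
    best, e_best = current, e_current
    for _ in range(n_steps):
        trial = current + rng.uniform(-step, step)  # random change to the state
        e_trial = energy(trial)
        delta = e_trial - e_current
        # Downhill moves are always accepted; uphill moves are accepted with
        # Boltzmann probability exp(-delta / kT).
        if delta <= 0 or rng.random() < math.exp(-delta / kT):
            current, e_current = trial, e_trial
            if e_current < e_best:
                best, e_best = current, e_current
    return best, e_best

def toy_energy(x):
    """Hypothetical double well: deeper minimum near x = -1, shallower near x = +1."""
    return (x * x - 1.0) ** 2 + 0.3 * x

rng = random.Random(0)  # fixed seed so that a run is reproducible
best_x, best_e = metropolis_search(toy_energy, start=1.0, step=0.5,
                                   n_steps=5000, kT=0.5, rng=rng)
```

Because uphill moves are sometimes accepted, the walker started in the shallower well can cross the barrier and reach the deeper one, which a purely downhill search from the same starting point could not do.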
The basic idea of a tabu search algorithm is to take into consideration already explored areas
of conformational space35,36. To determine whether a molecular conformation is accepted or not,
the root mean square deviation is calculated between the current molecular coordinates and
every previously recorded conformation of the molecule. For example, PRO_LEADS makes use
of a tabu search algorithm35.
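
The root mean square deviation test at the heart of a tabu search can be sketched as follows. This is a generic illustration, not the PRO_LEADS implementation: the function names and the threshold value are hypothetical, atoms are assumed to be listed in the same order in every conformation, and coordinates are compared directly without superposition because the position of the ligand in the binding site is part of what distinguishes two poses.

```python
import math

def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equally ordered lists of (x, y, z)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))

def is_tabu(candidate, tabu_list, threshold):
    """A candidate is tabu if it lies within `threshold` of any recorded conformation."""
    return any(rmsd(candidate, visited) < threshold for visited in tabu_list)

# Toy usage with a two-atom "ligand" and an assumed threshold of 0.75 A.
visited_pose = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
nearby_pose = [(0.1, 0.0, 0.0), (1.1, 0.0, 0.0)]
distant_pose = [(2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
tabu_list = [visited_pose]
nearby_rejected = is_tabu(nearby_pose, tabu_list, threshold=0.75)
distant_rejected = is_tabu(distant_pose, tabu_list, threshold=0.75)
```

In this toy case the nearby pose is flagged as already explored and the distant pose is not, so the search is pushed towards regions of conformational space it has not yet visited.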
Simulation methods. Molecular dynamics is currently the most popular simulation approach.
However, molecular dynamics simulations are often unable to cross high-energy barriers within
feasible simulation time periods, and therefore might only accommodate ligands in local minima
of the energy surface11. Therefore, an attempt is often made to simulate different parts of a
protein–ligand system at different temperatures37. Another strategy for addressing the local
minima problem is starting molecular dynamics calculations from different ligand positions. In
contrast to molecular dynamics, energy minimization methods are rarely used as stand-alone
search techniques, as only local energy minima can be reached, but often complement other
search methods, including Monte Carlo38. DOCK performs a minimization step after each fragment
addition, followed by a final minimization before scoring.
Protein flexibility. The treatment of protein flexibility is less advanced than that of ligand
flexibility, but various approaches have been applied to flexibly model at least part of the
target39, including molecular dynamics and Monte Carlo calculations31–33, rotamer libraries40,41
and protein ensemble grids42. The idea behind using amino-acid side-chain rotamer libraries is
to model protein conformational space on the basis of a limited number of experimentally
observed and preferred side-chain conformations40. To reduce the number of discrete protein
conformations arising from combinations of rotamers, a dead-end elimination algorithm is often
used41. This algorithm recursively removes side-chain
them covalently (which is most popular as a de novo ligand-design strategy) or, alternatively,
by dividing docked ligands into rigid (core fragment) and flexible parts (side chains). In the
latter case, once the rigid cores have been defined, they are docked into the active site. Next,
flexible regions are added in an incremental fashion21–23. For example, DOCK 4.0 (REF. 24) poses
the core fragment by steric complementarity, and flexible side chains are grown one bond at a
time by systematically exploring each bond’s POSE SPACE. A pruning algorithm is applied to
remove unfavourable conformations early on, thereby reducing the complexity of the problem24,25.
FlexX differs from DOCK in that the placement of the rigid core fragment is based on interaction
geometries between fragments and receptor groups22,26. Interacting groups are primarily
hydrogen-bond donors and acceptors, as well as hydrophobic groups. FlexX further differs from
DOCK in that it uses a pose-clustering algorithm to classify the docked poses22,27.
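
The combinatorial explosion that motivates incremental construction can be made concrete with a rough count. The sketch below assumes that every rotatable bond is sampled independently on a regular torsion grid, so that the number of conformations is the product over bonds of 360° divided by the angular increment. This is a common back-of-the-envelope estimate stated here as an assumption rather than a formula quoted from the text, and the function name is hypothetical.

```python
def count_conformations(increments_deg):
    """Number of torsion-grid conformations: product over bonds of 360 / increment."""
    total = 1
    for increment in increments_deg:
        if increment <= 0 or 360 % increment != 0:
            raise ValueError("each increment must be a positive divisor of 360")
        total *= 360 // increment
    return total

# Five and ten rotatable bonds, each sampled in 30-degree steps.
five_bonds = count_conformations([30] * 5)    # 12**5  = 248832
ten_bonds = count_conformations([30] * 10)    # 12**10 = 61917364224
```

Doubling the number of rotatable bonds from five to ten raises the count from about 2.5 × 10^5 to about 6.2 × 10^10, which is why exhaustive enumeration is replaced by fragment-wise growth with early pruning.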
The Hammerhead algorithm28, in common with other incremental search algorithms, also divides
ligands into fragments. However, Hammerhead docks each fragment and then rebuilds the ligand
from fragments that have acceptable initial scores. During the fragment-growing
POSE SPACE
All degrees of freedom involved in the process of placing one molecule relative to another.
For example, for two rigid molecules the pose space simply consists of relative orientations.
When one of the molecules, the ligand, is allowed to be flexible, the pose space comprises both
the conformational space of the ligand and orientational space of ligand and receptor.
Box 2 | Standard potential energy functions
The electrostatic potential energy is represented as a pair-wise summation of Coulombic
interactions, as described in equation 1:
$$E_{coul}(r) = \sum_{i=1}^{N_A} \sum_{j=1}^{N_B} \frac{q_i q_j}{4\pi\varepsilon_0 r_{ij}} \qquad (1)$$
In equation 1, N is the number of atoms in molecules A and B, respectively, q is the charge on
each atom, r_ij is the distance between atoms i and j, and ε0 is the vacuum permittivity.
The van der Waals potential energy for the general treatment of non-bonded interactions is often
modelled by a Lennard–Jones 12–6 function, as shown in equation 2:
$$E_{vdW}(r) = \sum_{j=1}^{N} \sum_{i=1}^{N} 4\varepsilon \left[ \left( \frac{\sigma_{ij}}{r_{ij}} \right)^{12} - \left( \frac{\sigma_{ij}}{r_{ij}} \right)^{6} \right] \qquad (2)$$
In equation 2, ε is the well depth of the potential and σ is the collision diameter of the
respective atoms i and j.
[Figure: Lennard–Jones 12–6 function; vertical axis: van der Waals energy.]
The figure shows a representation of the Lennard–Jones 12–6 function. The twelfth-power term of
the equation is responsible for small-distance repulsion, whereas the sixth-power term provides
an attractive contribution that approaches zero as the distance between the two atoms increases.
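
The two standard potentials in this box, a pair-wise Coulomb sum and a Lennard–Jones 12–6 term of the form 4ε[(σ/r)^12 − (σ/r)^6], can be evaluated with a few lines of code. The sketch below is illustrative only: the function names and the tuple layout (x, y, z, charge) are hypothetical, and the factor 332.06 stands in for 1/(4πε0) under the assumption of charges in units of e, distances in Å and energies in kcal/mol.

```python
import math

COULOMB_K = 332.06  # kcal*A/(mol*e^2); replaces 1/(4*pi*eps0) in the assumed units

def coulomb_energy(mol_a, mol_b):
    """Pair-wise Coulomb sum over all atoms of A and B; atoms are (x, y, z, q)."""
    total = 0.0
    for xa, ya, za, qa in mol_a:
        for xb, yb, zb, qb in mol_b:
            r = math.dist((xa, ya, za), (xb, yb, zb))
            total += COULOMB_K * qa * qb / r
    return total

def lennard_jones(r, epsilon, sigma):
    """Lennard-Jones 12-6 energy of one atom pair at separation r."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)

# Toy usage: two opposite unit charges 2 A apart, and one pair at the LJ minimum.
e_coul = coulomb_energy([(0.0, 0.0, 0.0, 1.0)], [(2.0, 0.0, 0.0, -1.0)])
r_min = 2.0 ** (1.0 / 6.0) * 3.4   # separation of minimum energy for sigma = 3.4 A
e_lj_min = lennard_jones(r_min, epsilon=0.1, sigma=3.4)
```

Two properties of the 12–6 form are easy to check with this sketch: the energy is zero at r = σ, and it reaches its minimum value of −ε at r = 2^(1/6) σ.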
1. Berman, H. M. et al. The protein data bank and the
challenge of structural genomics. Nature Struct. Biol. 7,
2. Westbrook, J., Feng, Z., Chen, L., Yang, H. & Berman, H. M.
The protein data bank and structural genomics. Nucleic
Acids Res. 31, 489–491 (2003).
3. Blundell, T. L., Jhoti, H. & Abell, C. High-throughput
crystallography for lead discovery in drug design. Nature
Rev. Drug Discov. 1, 45–54 (2002).
4. Bajorath, J. Integration of virtual and high-throughput
screening. Nature Rev. Drug Discov. 1, 882–894 (2002).
5. Walters, W. P., Stahl, M. T. & Murcko, M. A. Virtual
screening – an overview. Drug Discov. Today 3,
6. Langer, T. & Hoffmann, R. D. Virtual screening: an effective
tool for lead structure discovery. Curr. Pharm. Design 7,
7. Gohlke, H. & Klebe, G. Approaches to the description and
prediction of the binding affinity of small-molecule ligands to
macromolecular receptors. Angew. Chem. Int. Ed. 41,
A very extensive and informative review with
emphasis on quantitative analysis of protein–ligand
8. Kuntz, I. D., Blaney, J. M., Oatley, S. J., Langridge, R. &
Ferrin, T. E. A geometric approach to macromolecule–ligand
interactions. J. Mol. Biol. 161, 269–288 (1982).
9. Venhorst, J. et al. Homology modeling of rat and human
cytochrome P450 2D (CYP2D) isoforms and computational
rationalization of experimental ligand-binding specificities.
J. Med. Chem. 46, 74–86 (2003).
10. Williams, P. A. et al. Crystal structure of human cytochrome
P450 2C9 with bound warfarin. Nature 424, 464–468
11. Brooijmans, N. & Kuntz, I. D. Molecular recognition and
docking algorithms. Annu. Rev. Biophys. Biomol. Struct.
32, 335–373 (2003).
Excellent review of research in the docking arena that
contains an instructive section on the conceptually
different processes involved in ligand–protein docking.
12. Halperin, I., Ma, B., Wolfson, H. & Nussinov, R. Principles of
docking: an overview of search algorithms and a guide to
scoring functions. Proteins 47, 409–443 (2002).
13. Burnett, R. M. & Taylor, J. S. DARWIN: a program for
docking flexible molecules. Proteins 41, 173–191 (2000).
14. Norel, R., Lin, S. L., Wolfson, H. & Nussinov, R. Shape
complementarity at protein–protein interfaces. Biopolymers
34, 933–940 (1994).
15. Norel, R., Petrey, D., Wolfson, H. & Nussinov, R.
Examination of shape complementarity in docking of
unbound proteins. Proteins 35, 403–419 (1999).
16. Connolly, M. L. Analytical molecular surface calculation.
J. Appl. Cryst. 16, 548–558 (1983).
17. Connolly, M. Solvent-accessible surface of proteins and
nucleic acids. Science 221, 709–713 (1983).
References 16 and 17 outline the theoretical
foundation of molecular surface calculations that
have also become a crucial component of many
shape-based docking algorithms.
18. Norel, R., Wolfson, H. & Nussinov, R. Small molecular
recognition: solid angles surface representation and shape
complementarity. Comb. Chem. High Throughput Screen
2, 177–191 (1999).
19. Goodford, P. J. A computational procedure for determining
energetically favorable binding sites on biologically important
macromolecules. J. Med. Chem. 28, 849–857 (1985).
This seminal paper introduced the idea of potential
energy grids and its application to understanding
protein–ligand interactions. This concept has been
applied and extended in many contemporary
20. Leach, A. R. Molecular Modelling: Principles and Applications
(Addison Wesley Longman Limited, Harlow, 1996).
21. DesJarlais, R. L. Docking flexible ligands to
macromolecular receptors by shape. J. Med. Chem. 29,
22. Klebe, G. & Rarey, M. A fast flexible docking method using
an incremental construction algorithm. J. Mol. Biol. 261,
23. Kuntz, I. D. & Leach, A. R. Conformational analysis of flexible
ligands in macromolecular receptor sites. J. Comput. Chem.
13, 730–748 (1992).
24. Ewing, T. J. A., Makino, S., Skillman, A. G. & Kuntz, I. D.
DOCK 4.0: search strategies for automated molecular
docking of flexible molecule databases. J. Comput. Aided
Mol. Des. 15, 411–428 (2001).
25. Conformation search [online], <http://dock.compbio.
26. Kramer, B., Rarey, M. & Lengauer, T. Evaluation of the FlexX
incremental construction algorithm for protein–ligand
docking. Proteins 37, 228–241 (1999).
27. Linnainmaa, S., Harwood, D. & Davis, L. S. Pose
determination of a three-dimensional object using triangle
pairs. IEEE Trans. Pattern Anal. Machine Intelligence 10,
An in-depth study of a computer vision technique
(pose clustering) that is utilized, for example, in FlexX.
28. Welch, W., Ruppert, J. & Jain, A. N. Hammerhead: fast, fully
automated docking of flexible ligands to protein binding
sites. Chem. Biol. 3, 449–462 (1996).
29. Kearsley, S. K., Underwood, D. J., Sheridan, R. P. &
Miller, M. D. Flexibase: a way to enhance the use of
molecular docking methods. J. Comput. Aided Mol. Des.
8, 565–582 (1994).
30. Olson, A. J. & Goodsell, D. S. Automated docking in
crystallography: analysis of the substrates of aconitase.
Proteins 17, 1–10 (1993).
31. Read, R. J. & Hart, T. N. A multiple-start Monte Carlo
docking method. Proteins 13, 206–222 (1992).
32. Dixon, J. S. & Oshiro, C. M. Flexible ligand docking using a
genetic algorithm. J. Comput. Aided Mol. Des. 9, 113–130
33. Morris, G. M., Goodsell, D. S., Halliday, R. S., Huey, R. &
Hart, W. E. Automated docking using a Lamarckian genetic
algorithm and an empirical free energy function. J. Comput.
Chem. 19, 1639–1662 (1998).
34. Jones, G., Willett, P., Glen, R. C., Leach, A. R. & Taylor, R.
Development and validation of a genetic algorithm for
flexible docking. J. Mol. Biol. 267, 727–748 (1997).
35. Westhead, D. R., Clark, D. E. & Murray, C. W. A
comparison of heuristic search algorithms for molecular
docking. J. Comput. Aided Mol. Des. 11, 209–228
36. Baxter, C. A., Murray, C. W., Clark, D. E., Westhead, D. R. &
Eldridge, M. D. Flexible docking using tabu search and an
empirical estimate of binding affinity. Proteins 33, 367–382
37. Di Nola, A., Berendsen, H. J. C. & Roccatano, D. Molecular
dynamics simulation of the docking of substrates to
proteins. Proteins 19, 174–182 (1994).
38. Trosset, J.-Y. & Scheraga, H. A. Reaching the global
minimum in docking simulations: a Monte Carlo energy
minimization approach using Bezier Splines. Proc. Natl
Acad. Sci. USA 95, 8011–8015 (1998).
39. Carlson, H. A. & McCammon, J. A. Accommodating protein
flexibility in computational drug design. Mol. Pharmacol. 57,
Informative review of approaches to treat protein
flexibility in the computational study of protein–ligand
40. Leach, A. R. Ligand docking to proteins with discrete side-
chain flexibility. J. Mol. Biol. 235, 345–356 (1994).
41. Desmet, J., Maeyer, M. D., Hazes, B. & Lasters, I. The dead
end elimination theorem and its use in protein side-chain
positioning. Nature 356, 539–542 (1992).
42. Knegtel, R. M. A., Kuntz, I. D. & Oshiro, C. M. Molecular
docking to ensembles of protein structures. J. Mol. Biol.
266, 424–440 (1997).
43. Kollman, P. A. Free energy calculations: applications to
chemical and biochemical phenomena. Chem. Rev. 93,
Review of the theory of free-energy calculations and
their areas of application, including ligand binding.
44. Simonson, T., Archontis, G. & Karplus, M. Free energy
simulations come of age: protein–ligand recognition. Acc.
Chem. Res. 35, 430–437 (2002).
45. Morris, G. M. et al. Automated docking using a
Lamarckian genetic algorithm and an empirical binding
free energy function. J. Comput. Chem. 19, 1639–1662
46. Weiner, S. J., Kollman, P. A., Nguyen, D. T. & Case, D. A. An
all-atom force field for simulations of proteins and nucleic
acids. J. Comput. Chem. 7, 252 (1986).
47. Verdonk, M. L., Cole, J. C., Hartshorn, M. J., Murray, C. W.
& Taylor, R. D. Improved protein–ligand docking using
GOLD. Proteins 52, 609–623 (2003).
48. Böhm, H.-J. LUDI: rule-based automatic design of new
substituents for enzyme inhibitor leads. J. Comput. Aided
Mol. Des. 6, 593–606 (1992).
49. Eldridge, M. D., Murray, C. W., Auton, T. R., Paolini, G. V. &
Mee, R. P. Empirical scoring functions: I. The development
of a fast empirical scoring function to estimate the binding
affinity of ligands in receptor complexes. J. Comput. Aided
Mol. Des. 11, 425–445 (1997).
50. Rarey, M., Kramer, B., Lengauer, T. & Klebe, G. A fast flexible
docking method using an incremental construction
algorithm. J. Mol. Biol. 261, 470–489 (1996).
51. Rognan, D., Lauemoller, S. L., Holm, A., Buus, S. &
Tschinke, V. Predicting binding affinities of protein ligands
from three-dimensional models: application to peptide
binding to class I major histocompatibility proteins. J. Med.
Chem. 42, 4650–4658 (1999).
52. Sitkoff, D. F., Sharp, K. A. & Honig, B. Accurate calculation
of hydration free energies using macroscopic continuum
models. J. Phys. Chem. 98, 1978–1983 (1994).
53. Huo, S., Wang, J., Cieplak, P., Kollman, P. A. & Kuntz, I. D.
Molecular dynamics and free energy analyses of
cathepsin D–inhibitor interactions: insight into structure-
based ligand design. J. Med. Chem. 45, 1412–1419
54. Muegge, I. A knowledge-based scoring function for
protein–ligand interactions: probing the reference state.
Perspect. Drug Discov. Des. 20, 99–114 (2000).
55. Muegge, I. Effect of ligand volume correction on PMF
scoring. J. Comput. Chem. 22, 418–425 (2001).
56. Muegge, I. & Martin, Y. C. A general and fast scoring
function for protein-ligand interactions: a simplified potential
approach. J. Med. Chem. 42, 791–804 (1999).
57. Gohlke, H., Hendlich, M. & Klebe, G. Knowledge-based
scoring function to predict protein-ligand interactions.
J. Mol. Biol. 295, 337–356 (2000).
58. DeWitte, R. S. & Shakhnovich, E. I. SMoG: de novo design
method based on simple, fast, and accurate free energy
estimates. 1. Methodology and supporting evidence. J. Am.
Chem. Soc. 118, 11733–11744 (1996).
59. Charifson, P. S., Corkery, J. J., Murcko, M. A. & Walters, W. P.
Consensus scoring: a method for obtaining improved hit rates
from docking databases of three-dimensional structures into
proteins. J. Med. Chem. 42, 5100–5109 (1999).
This study introduced the concept of consensus
scoring as an approach to balance imperfections of
single scoring functions and improve prediction
60. Wang, R., Lai, L. & Wang, S. Further development and
validation of empirical scoring functions for structure-based
binding affinity prediction. J. Comput. Aided Mol. Des. 16,
61. Perez, C. & Ortiz, A. R. Evaluation of docking functions for
protein–ligand docking. J. Med. Chem. 44, 3768–3785 (2001).
62. Good, A. C. et al. Analysis and optimization of structure-
based virtual screening protocols 2. Examination of docked
ligand orientations sampling methodology: mapping a
pharmacophore for success. J. Mol. Graph. Model. 22,
63. Baxter, C. A. et al. New approach to molecular docking and
its application to virtual screening of chemical databases.
J. Chem. Inf. Comput. Sci. 40, 254–262 (2000).
64. GOLD Version 1.2. [online], <http://www.ccdc.cam.ac.uk/
65. Sotriffer, C. A., Gohlke, H. & Klebe, G. Docking into
knowledge-based potential fields: a comparative evaluation
of DrugScore. J. Med. Chem. 45, 1967–1970 (2002).
66. Wang, R., Lu, Y. & Wang, S. Comparative evaluation of 11
scoring functions for molecular docking. J. Med. Chem. 46,
67. McGann, M. R., Almond, H. R., Nicholls, A., Grant, J. A. &
Brown, F. K. Gaussian docking functions. Biopolymers 68,
68. Schultz-Gasch, T. & Stahl, M. Binding site characteristics in
structure-based virtual screening: evaluation of current
docking tools. J. Mol. Model. 9, 47–57 (2003).
69. Erickson, J. A., Jalaie, M., Robertson, D. H., Lewis, R. A. &
Vieth, M. Lessons in molecular recognition: the effects of
ligand and protein flexibility on molecular docking accuracy.
J. Med. Chem. 47, 45–55 (2004).
70. Kontoyianni, M., McClellan, L. M. & Sokol, G. S. Evaluation
of docking performance: comparative data on docking
algorithms. J. Med. Chem. 47, 558–565 (2004).
71. Smith, R., Hubbard, R. E., Gschwend, D. A., Leach, A. R. &
Good, A. C. Analysis and optimization of structure-based
virtual screening protocols 3. New Methods and old
problems in scoring function design. J. Mol. Graph. Model.
22, 41–53 (2003).
72. Still, W. C., Tempczyk, A., Hawley, R. C. & Hendrickson, T.
Semianalytical treatment of solvation for molecular
mechanics and dynamics. J. Am. Chem. Soc. 112,
73. Ghosh, A., Rapp, C. S. & Friesner, R. A. A generalized Born
model based on a surface integral formulation. J. Phys.
Chem. B 102, 10983–10990 (1998).
74. Nissink, J. W. M. et al. A new test set for validating
predictions of protein–ligand interaction. Proteins 49,
75. Grzybowski, B. A., Ishchenko, A. V., Shimada, J. &
Shakhnovich, E. I. From knowledge-based potentials to
combinatorial lead design in silico. Acc. Chem. Res. 35,
76. Diller, D. J. & Li, Y. Kinases, homology models, and high
throughput docking. J. Med. Chem. 46, 4638–4647 (2003).
77. DesJarlais, R. L. et al. Using shape complementarity as an
initial screen in designing ligands for a receptor binding site
of known three-dimensional structure. J. Med. Chem. 31,
78. Dean, P. M. & Poornima, C. S. Hydration in drug design. 1.
Multiple hydrogen-bonding features of water molecules in
mediating protein–ligand interactions. J. Comput. Aided
Mol. Des. 9, 500–512 (1995).
79. McGovern, S. L., Caselli, E., Grigorieff, N. & Shoichet, B. K.
A common mechanism underlying promiscuous inhibitors
from virtual and high-throughput screening. J. Med. Chem.
45, 1712–1722 (2002).
80. Roche, O. et al. Development of a virtual screening method
for identification of ‘frequent hitters’ in compound libraries.
J. Med. Chem. 45, 137–142 (2002).
81. Doman, T. N. et al. Molecular docking and high-throughput
screening for novel inhibitors of protein tyrosine
phosphatase-1B. J. Med. Chem. 45, 2213–2221 (2002).
An impressive example of the performance of
structure-based virtual screening.
82. McGovern, S. L. & Shoichet, B. K. Information decay in
molecular docking screens against holo, apo and modeled
conformations of enzymes. J. Med. Chem. 46, 2895–2907
Informative analysis of the influence of chosen
protein-structure templates on the quality of docking
83. Lipinski, C. A. & Christopher, A. L. Experimental and
computational approaches to estimate solubility and
permeability in drug discovery and development settings.
Adv. Drug Deliv. Rev. 23, 3–25 (1997).
84. Nilakantan, R., Bauman, N. & Venkataraghavan, R. New
method for rapid characterization of molecular shapes:
applications in drug design. J. Chem. Inf. Comput. Sci. 33,
85. Good, A. C., Ewing, T. J. A., Gschwend, D. A. & Kuntz, I. D.
New molecular shape descriptors: application in database
screening. J. Comput. Aided Mol. Des. 9, 1–12 (1995).
86. Zauhar, R. J., Moyna, G., Tian, L., Li, Z. & Welsh, W. J.
Shape signatures: a new approach to computer-aided
ligand- and receptor-based drug design. J. Med. Chem. 46,
87. Rastelli, G. et al. Docking and database screening reveal
new classes of Plasmodium falciparum dihydrofolate
reductase inhibitors. J. Med. Chem. 46, 2834–2845 (2003).
88. Choong, I. C. et al. Identification of potent and selective
small-molecule inhibitors of caspase-3 through the use of
extended tethering and structure-based drug design.
J. Med. Chem. 45, 5005–5022 (2002).
89. Kick, E. K. et al. Structure-based design and combinatorial
chemistry yield low nanomolar inhibitors of cathepsin D.
Chem. Biol. 4, 297–307 (1997).
An instructive study highlighting the potential of
interfacing docking analysis and targeted library
90. Karplus, M. & Miranker, A. Functionality maps of binding
sites: a multiple copy simultaneous search method. Proteins
11, 29–34 (1991).
91. Caflisch, A. Computational combinatorial ligand design:
application to human α-thrombin. J. Comput. Aided Mol.
Des. 10, 372–396 (1996).
92. Böhm, H. J. The development of a simple empirical scoring
function to estimate the binding constant for a
protein–ligand complex of known three-dimensional
structure. J. Comput. Aided Mol. Des. 8, 243–256 (1994).
Pioneering development of an empirical scoring
function using multiple linear regression to calculate
coefficients for the most important terms
contributing to ligand binding.
93. Böhm, H. J. Prediction of binding constants of protein
ligands: a fast method for the prioritization of hits obtained
from de novo design or 3D database search programs.
J. Comput. Aided Mol. Des. 12, 309–323 (1998).
94. Rotstein, S. H. & Murcko, M. A. GroupBuild: a fragment-
based method for de novo drug design. J. Med. Chem.
36, 1700–1710 (1993).
95. Rotstein, S. H. & Murcko, M. A. GenStar: a method for de
novo drug design. J. Comput. Aided Mol. Des. 7, 23–43
96. Moon, J. B. & Howe, W. J. 3D database searching and de
novo construction methods in molecular design. Tetrahedron Comput.
Methodol. 3, 697–711 (1990).
97. Bohacek, R. S. & McMartin, C. Multiple highly diverse
structures complementary to enzyme binding sites: results
of extensive application of a de novo design method
incorporating combinatorial growth. J. Am. Chem. Soc.
116, 5560–5571 (1994).
98. Vinkers, H. M. et al. SYNOPSIS: SYNthesize and OPtimize
system in silico. J. Med. Chem. 46, 2765–2773 (2003).
99. Guimaraes, C. R. W. & de Alencastro, R. B. Thrombin
inhibition by novel benzamidine derivatives: a free-energy
perturbation study. J. Med. Chem. 45, 4995–5004 (2002).
100. Pearlman, D. A. & Charifson, P. S. Improved scoring of
ligand–protein interactions using OWFEG free energy grids.
J. Med. Chem. 44, 502–511 (2001).
101. Åqvist, J., Medina, C. & Samuelsson, J. E. A new method
for predicting binding affinity in computer-aided drug design.
Protein Eng. 7, 385–391 (1994).
This article presents an early formulation and use of
linear response and linear interaction approximations
in estimating binding affinity of protein ligands.
102. Tounge, B. A. & Reynolds, C. H. Calculation of the binding
affinity of β-secretase inhibitors using the linear interaction
energy method. J. Med. Chem. 46, 2074–2082 (2003).
103. Rizzo, R. C., Wang, D.-P., Tirado-Rives, J. & Jorgensen, W. L.
Validation of a model for the complex of HIV-1 reverse
transcriptase with sustiva through computation of resistance
profiles. J. Am. Chem. Soc. 122, 12898–12900 (2000).
104. Rizzo, R. C., Tirado-Rives, J. & Jorgensen, W. L. Estimation
of binding affinities for HEPT and nevirapine analogues
with HIV-1 reverse transcriptase via Monte Carlo
simulations. J. Med. Chem. 44, 145–154 (2001).
105. Kroeger-Smith, M. B. et al. Molecular modeling calculations
of HIV-1 reverse transcriptase nonnucleoside inhibitors:
correlation of binding energy with biological activity for novel
2-aryl-substituted benzimidazole analogues. J. Med. Chem.
46, 1940–1947 (2003).
106. Udier-Blagovic, M., Tirado-Rives, J. & Jorgensen, W. L.
Validation of a model for the complex of HIV-1 reverse
transcriptase with nonnucleoside inhibitor TMC125. J. Am.
Chem. Soc. 125, 6016–6017 (2003).
107. Rizzo, R. C. et al. Prediction of activity for nonnucleoside
inhibitors with HIV-1 reverse transcriptase based on Monte
Carlo simulations. J. Med. Chem. 45, 2970–2987 (2002).
108. Ostrovsky, D., Udier-Blagovic, M. & Jorgensen, W. L.
Analyses of activity for Factor Xa inhibitors based on Monte
Carlo simulations. J. Med. Chem. 46, 5691–5699 (2003).
109. van Lipzig, M. M. et al. Prediction of ligand binding affinity
and orientation of xenoestrogens to the estrogen receptor
by molecular dynamics simulations and the linear interaction
energy method. J. Med. Chem. 47, 1030 (2004).
This work provides a good example of linear
interaction methods applied to binding energies
ranging over many orders of magnitude.
110. Kollman, P. A. et al. Calculating structures and free energies
of complex molecules: combining molecular mechanics
and continuum models. Acc. Chem. Res. 33, 889–897
111. Masukawa, K. M., Kollman, P. A. & Kuntz, I. D. Investigation
of neuraminidase-substrate recognition using molecular
dynamics and free energy calculations. J. Med. Chem. 46,
112. Sheridan, R., Holloway, M. K., McGaughey, G. B., Mosley,
R. T. & Singh, S. B. A simple method for visualizing the
differences between related receptor sites. J. Mol. Graph.
Model. 21, 71–79 (2002).
113. Deng, Z., Chuaqui, C. & Singh, J. Structural interaction
fingerprint (SIFt): a novel method for analyzing three-
dimensional protein–ligand binding interactions. J. Med.
Chem. 47, 337–344 (2004).
114. Horvath, D. A virtual screening approach applied to the
search for trypanothione reductase inhibitors. J. Med.
Chem. 40, 2412–2423 (1997).
The study details many possible scoring terms for
protein–ligand complexes and is a good example of
the value of refitting parameters for a particular
protein class and series of ligands.
115. Matter, H. et al.Design and quantitative structure–activity
relationship of 3-amidinobenzyl-1H-indole-2-carboxamides
as potent, nonchiral and selective inhibitors of blood
coagulation factor Xa. J. Med. Chem. 45, 2749–2769 (2002).
116. Murcia, M. & Ortiz, A. R. Virtual screening with flexible docking
and COMBINE-based models. Application to a series of
factor Xa inhibitors. J. Med. Chem. 47, 805–820 (2004).
117. van de Waterbeemd, H. & Gifford, E. ADMET in silico
modelling: towards prediction paradise? Nature Rev. Drug
Discov. 2, 192–204 (2003).
118. Omiecinski, C. J. Concise review of the cytochrome P450s
and their roles in toxicology. Toxicol. Sci. 48, 151–156 (1999).
119. de Groot, M. J., Ackland, M. J., Horne, V. A., Alex, A. A. &
Jones, B. C. Novel approach to predicting P450-mediated
drug metabolism: development of a combined protein and
pharmacophore model for CYP2D6. J. Med. Chem. 42,
120. de Groot, M. J., Ackland, M. J., Horne, V. A., Alex, A. A. &
Jones, B. C. A novel approach to predicting P450 mediated
drug metabolism. CYP2D6 catalyzed N-dealkylation
reactions and qualitative metabolite predictions using a
combined protein and pharmacophore model for CYP2D6.
J. Med. Chem. 42, 4062–4070 (1999).
121. de Groot, M. J. Development of a combined protein and
pharmacophore model for cytochrome P450 2C9. J. Med.
Chem. 45, 1983–1993 (2002).
122. Park, J.-Y. & Harris, D. Construction and assessment of
models of CYP2E1: Predictions of metabolism from docking,
molecular dynamics, and density functional theoretical
calculations. J. Med. Chem. 46, 1645–1660 (2003).
123. Godden, J. W., Stahura, F. L. & Bajorath, J. Statistical analysis
of computational docking of large compound databases to
distinct protein binding sites. J. Comput. Chem. 20,
124. Briem, H. & Kuntz, I. D. Molecular similarity based on DOCK-
generated fingerprints. J. Med. Chem. 39, 3401–3408 (1996).
125. Su, A. I. et al. Docking molecules by families to increase the
diversity of hits in database screens: computational strategy
and experimental evaluation. Proteins 42, 279–293 (2001).
126. Rognan, D., Lauemoller, S. L., Holm, A., Buus, S. & Tschinke, V.
Predicting binding affinities of protein ligands from three-
dimensional models: application to peptide binding to class I
major histocompatibility proteins. J. Med. Chem. 42,
127. Wei, B. Q., Baase, W. A., Weaver, L. H., Matthews, B. W. &
Shoichet, B. K. A model binding site for testing scoring
functions in molecular docking. J. Mol. Biol. 322, 339–355
128. Fradera, X., Knegtel, M. A. & Mestres, J. Similarity-driven
flexible ligand docking. Proteins 40, 623–626 (2000).
129. Lamb, M. L. et al. Design, docking, and evaluation of multiple
libraries against multiple targets. Proteins 42, 296–318 (2001).
130. Aronov, A. M., Munagala, N. R., Kuntz, I. D. & Wang, C. C.
Virtual screening of combinatorial libraries across a gene
family in search of inhibitors of Giardia lamblia guanine
phosphoribosyltransferase. Antimicrob. Agents Chemother.
45, 2571–2576 (2001).
131. Wang, R., Liu, L., Lai, L. & Tang, Y. SCORE: a new empirical
method for estimating the binding affinity of a protein-ligand
complex. J. Mol. Model. 4, 379–394 (1998).
132. Tao, P. & Lai, L. Protein ligand docking based on empirical
method for binding affinity estimation. J. Comput. Aided Mol.
Des. 15, 429–446 (2001).
133. Chemical Computing Group. MOE. 2003. Montreal, Quebec,
134. Friesner, R. A. et al. Glide: a new approach for rapid, accurate
docking and scoring. 1. Method and assessment of docking
accuracy. J. Med. Chem. 47, 1739–1749 (2004).
135. Kearsley, S. K., Underwood, D. J., Sheridan, R. P. &
Miller, M. D. Flexibases: a way to enhance the use of
molecular docking methods. J. Comput. Aided Mol. Des. 8,
136. Peng, H. et al. Identification of novel inhibitors of BCR-ABL
tyrosine kinase via virtual screening. Bioorg. Med. Chem. Lett.
13, 3693–3699 (2003).
137. McNally, V. A. et al. Identification of a novel class of inhibitor
of human and Escherichia coli thymidine phosphorylase by
in silico screening. Bioorg. Med. Chem. Lett. 13, 3705–3709
138. Brenk, R. et al. Virtual screening for submicromolar leads of
tRNA-guanine transglycosylase based on a new unexpected
binding mode detected by crystal structure analysis. J. Med.
Chem. 46, 1133–1143 (2003).
139. Kamionka, M. et al. In silico and NMR identification of
inhibitors of the IGF-I and IGF-Binding protein-5 interaction.
J. Med. Chem. 45, 5655–5660 (2002).
140. Vangrevelinghe, E. et al. Discovery of a potent and selective
protein kinase CK2 inhibitor by high-throughput docking.
J. Med. Chem. 46, 2656–2662 (2003).
141. Enyedy, I. J. et al. Discovery of small-molecule inhibitors of
Bcl-2 through structure-based computer screening. J. Med.
Chem. 44, 4313–4324 (2001).
H.D. and J.R.F. contributed equally to this paper. This manuscript
is dedicated to Wolfram Saenger, Free University Berlin, on the
occasion of his sixty-fifth birthday.
Competing interests statement
The authors declare no competing financial interests.
Protein Structure Prediction Center:
Research Collaboratory for Structural Biology Protein Data
Biomolecular Interaction Network Database:
Drug Design Resources: http://www.drugdesign.org