Inferred Biomolecular Interaction Server (IBIS)—a web server to analyze and predict protein interacting partners and binding sites
IBIS is the NCBI Inferred Biomolecular Interaction Server. This server organizes, analyzes and predicts interaction partners and locations of binding sites in proteins. IBIS provides annotations for different types of binding partners (protein, chemical, nucleic acid and peptides), and facilitates the mapping of a comprehensive biomolecular interaction network for a given protein query. IBIS reports interactions observed in experimentally determined structural complexes of a given protein, and at the same time IBIS infers binding sites/interacting partners by inspecting protein complexes formed by homologous proteins. Similar binding sites are clustered together based on their sequence and structure conservation. To emphasize biologically relevant binding sites, several algorithms are used for verification in terms of evolutionary conservation, biological importance of binding partners, size and stability of interfaces, as well as evidence from the published literature. IBIS is updated regularly and is freely accessible via http://www.ncbi.nlm.nih.gov/Structure/ibis/ibis.html.
Benjamin A. Shoemaker, Dachuan Zhang, Ratna R. Thangudu, Manoj Tyagi,
Jessica H. Fong, Aron Marchler-Bauer, Stephen H. Bryant, Thomas Madej* and
Anna R. Panchenko*
National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health,
Bethesda, MD 20894, USA
Received August 15, 2009; Revised September 16, 2009; Accepted September 21, 2009
Proteins function by interacting with other biomolecules,
and a complete protein functional annotation is impossi-
ble without knowledge of the protein interactions.
Mapping biomolecular interactions is invaluable in deci-
phering the interactome, the entire set of molecular
interactions in a cell. Recent advances in the experimental
and computational tools for identifying proteins and their
complexes have spawned a wealth of information that
encourages such a mapping (1,2).
The most successful function prediction methods rely
on evolutionary relationships between proteins and the
conservation of their molecular function; they look for
sequence similarities between unknown queries and func-
tionally annotated proteins (3,4). A similar approach has
been used to infer protein interaction partners from a set
of homologous proteins, where an interaction between
two proteins is predicted if this interaction has been
observed between orthologs (interologs) in other species
(5). Homology inference methods have certain limitations,
though. Common descent does not necessarily imply
similarity in function or interactions, and annotations
transferred from one homologous protein to another
may result in incorrect functional or interolog assignment
at larger evolutionary distances (3,6–8). To verify and
guide annotations, it is often essential to detect function-
ally important binding sites. Current binding site predic-
tion methods can be subdivided into several major
categories: those which use evolutionary conservation of
binding site motifs, those which use information about a
structure of a complex, and docking methods (9).
The knowledge of protein structure may facilitate
and improve the annotation of protein function and the
characterization of protein binding partners and binding
sites. Structure-based methods use detailed knowledge of
the protein structure to identify binding sites on the basis
of the physico-chemical properties of individual residues,
their electrostatic contribution, and their location in the
3D structure (10–14). A number of servers have been
developed for predicting protein binding sites from
structures by locating the binding pockets, by identifying
sequence and structural features of homologous proteins
which are important for binding, or by using threading
and other approaches (14–22).
*To whom correspondence should be addressed. Tel: +1 301 435 5891; Fax: +1 301 480 4637; Email: firstname.lastname@example.org
Correspondence may also be addressed to Thomas Madej. Tel: +1 301 435 5998; Fax: +1 301 480 4637; Email: email@example.com
D518–D524 Nucleic Acids Research, 2010, Vol. 38, Database issue Published online 20 October 2009
Published by Oxford University Press 2009.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/
by-nc/2.5/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
We have developed a new database and server called
IBIS (Inferred Biomolecular Interaction Server), which
provides tools to investigate biomolecular interactions
observed in a given protein structure together with the
complex set of interactions inferred from its close
homologs. IBIS identifies and predicts a protein’s interaction
partners together with the locations of the corresponding
binding sites on the protein query. It does not
focus on one specific type of interacting molecule, but
provides annotations of binding sites for proteins, small
chemicals, nucleic acids and peptides (interactions with
ions are currently under development). This may allow
the mapping of a comprehensive biomolecular interaction
network for a given query, depending on the data avail-
able for its protein family.
To focus on biologically relevant binding sites, IBIS
clusters similar binding sites found in homologous
proteins based on the sites’ conservation of sequence
and structure. Binding sites which appear evolutionarily
conserved among non-redundant sets of homologous
proteins are given higher priority in the displays.
Additionally, binding site clusters are validated by
comparing them with binding site annotations from a
manually curated subset of the Conserved Domain
Database (CDD) (23), if available. In the case of
protein–protein binding sites, IBIS also compares its
findings to binding interfaces confirmed by the PISA algorithm
(24), which estimates the stability of protein–protein
interfaces observed in crystal structures. After binding
sites are clustered, position-specific scoring matrices
(PSSMs) are constructed from the corresponding
binding site alignments. Together with other measures,
the PSSMs are subsequently used to rank binding sites
to assess how well they match the query, and to gauge
the biological relevance of binding sites with respect to
MATERIAL AND METHODS
The current release of the Molecular Modeling Database
(MMDB) (25), an automatically parsed and validated
derivative of the Protein Data Bank (PDB) (26) hosted
by the National Center for Biotechnology Information
(NCBI), is used in this study. MMDB addresses several
issues in interpreting PDB’s 3D structure data and
provides standardized structural information. For
example, MMDB attempts to fix atom name ambiguity,
establishes chemical graphs that contain explicit bonding
information, extracts biopolymer sequences and small
molecules that get deposited into corresponding
databases, and cross-references its entries to GenBank,
PubMed, the NCBI taxonomy database and PubChem.
Defining a unit of interaction
Protein–protein interactions are identified and analyzed
at the level of domains. The Conserved Domain Search
service (CD-Search) provides domain annotation for
query sequences and pre-computed annotation for the
majority of all entries in NCBI’s Entrez Protein
database (27). If a complete protein chain is used as a
query, protein–protein interaction annotations are
provided separately for each domain identified on this
query. Interactions between protein domains may occur
on the same protein chain, not involving any other
molecule. For other types of binding partners (chemicals,
nucleic acids and peptides), interactions are defined
for a complete protein chain regardless of its domain
annotations and always involve another molecule.
To reduce nonbiological contacts, a chemical that interacts
with multiple chains is treated specially. If contacts to that
chemical are dominated by one of the chains (>75%), then its
interactions with the other chains are not considered;
otherwise each protein interaction with the chemical is
listed separately.
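The dominant-chain rule for chemicals can be illustrated with a minimal Python sketch. The function name, the input dictionary and the way contacts are counted are assumptions made for illustration only; the text specifies just the 75% threshold.

```python
def chains_to_report(contacts_per_chain, dominance=0.75):
    """Return the chains whose interaction with one chemical would be listed.

    contacts_per_chain maps a chain ID to its number of contacts with the
    chemical (how contacts are counted is an assumption of this sketch).
    If one chain contributes more than `dominance` of all contacts, only
    that chain is kept; otherwise every contacting chain is kept.
    """
    total = sum(contacts_per_chain.values())
    if total == 0:
        return []
    top_chain = max(contacts_per_chain, key=contacts_per_chain.get)
    if contacts_per_chain[top_chain] / total > dominance:
        return [top_chain]
    return sorted(chain for chain, n in contacts_per_chain.items() if n > 0)
```

With 18 contacts from chain A and 2 from chain B, only chain A would be reported; with 6 and 5 contacts, both chains would be listed.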
Defining interactions and binding site residues
An interaction and its binding site are defined if a protein has
at least five residues in contact with another protein,
chemical, DNA, RNA or peptide. A contact is defined
if any of the heavy-atom inter-atomic distances is
shorter than 4 Å. The binding site is defined as a group
of residues which make a contact with a given type of
interaction partner. For protein–DNA interactions each
DNA strand is considered separately. In the case of
protein–chemical interactions, chemical ligands are all
validated and standardized (if possible) by the PubChem
databases and have explicit links to PubChem (28) which
may provide extensive information on their known biolog-
ical activities. There are two types of interactions and
binding sites recorded in IBIS: ‘observed’ from experimen-
tal structures and ‘inferred’ from homologs.
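As a rough illustration of the contact definition above, the following Python sketch collects the residues that have at least one heavy atom within 4 Å of the partner and applies the five-residue minimum. The data layout (tuples of residue ID and coordinates) and the function names are hypothetical and are not taken from the IBIS implementation.

```python
import math

CONTACT_CUTOFF = 4.0    # angstroms, as stated in the text
MIN_SITE_RESIDUES = 5   # minimum residues in contact, as stated in the text


def binding_site(protein_atoms, partner_atoms, cutoff=CONTACT_CUTOFF):
    """Return sorted residue IDs that contact the partner.

    protein_atoms: list of (residue_id, (x, y, z)) for protein heavy atoms.
    partner_atoms: list of (x, y, z) for heavy atoms of the partner.
    A residue is in contact if any of its heavy atoms lies closer than
    `cutoff` to any partner heavy atom.
    """
    site = set()
    for residue_id, coord in protein_atoms:
        if residue_id in site:
            continue  # residue already known to be in contact
        if any(math.dist(coord, p) < cutoff for p in partner_atoms):
            site.add(residue_id)
    return sorted(site)


def is_interaction(site, min_residues=MIN_SITE_RESIDUES):
    """An interaction is recorded only if enough residues are in contact."""
    return len(site) >= min_residues
```

The brute-force double loop is chosen for readability; a spatial index would be the usual choice for large complexes.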
Clustering and Inferring binding sites
A flowchart that summarizes the inference of binding
partners and sites on a query is presented in Figure 1.
First we collect homologs with known structures and
higher than 30% identity to the query. To ensure good
quality alignments, the VAST structure–structure comparison
algorithm is used (29). If a query does not have a
structure, the BLAST heuristic is applied to find the
most closely related structure (30). The closest homolog
(with an E-value <0.01) is picked with a conservative
threshold for alignment extent, requiring 80% or more
of the query sequence to be aligned.
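The thresholds in this step can be sketched as two small filters in Python. The `Hit` record and its field names are hypothetical, and choosing the 'closest' homolog by lowest E-value is an assumption of this sketch; the text only gives the cutoffs (30% identity, E-value <0.01, 80% of the query aligned).

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """Hypothetical record for one structure aligned to the query."""
    structure_id: str
    percent_identity: float
    e_value: float
    query_coverage: float  # fraction of query residues aligned, 0..1


def homologs_for_structure_query(hits, min_identity=30.0):
    """Keep structural neighbors with more than 30% identity to the query."""
    return [h for h in hits if h.percent_identity > min_identity]


def closest_structure_for_sequence_query(hits, max_e=0.01, min_coverage=0.80):
    """Pick one structure for a query without a structure of its own.

    Hits must pass the E-value and coverage thresholds; the lowest
    E-value is then taken as 'closest' (an assumption of this sketch).
    Returns None if no hit qualifies.
    """
    eligible = [
        h for h in hits
        if h.e_value < max_e and h.query_coverage >= min_coverage
    ]
    return min(eligible, key=lambda h: h.e_value, default=None)
```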
A binding site cluster represents a collection of
structures which are related to the query, and where all
members of the cluster contain similar overlapping
binding sites when mapped onto the query. Similarity
between binding sites is measured in terms of sequence
similarity, and those positions which overlap structurally
are assigned an additional weight. Binding sites are
clustered by a hierarchical complete linkage clustering
procedure. To decide on the cutoff for clustering, we use
a recently described energy function which maximizes the
mean similarity of members within a cluster and minimizes
the complexity of the description provided by cluster
membership (number of bits required to describe the
data) (31). Clusters which contain an actual interaction
observed in the query structure are marked by the letter
‘O’. By expanding the cluster one can see additional infor-
mation about its members.
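Complete linkage clustering on a similarity matrix can be sketched in a few lines of Python. This is a generic illustration, not the IBIS code: the similarity values are supplied by the caller, and a fixed `cutoff` parameter stands in for the information-based criterion (31) that is used to choose the cutoff in the procedure described above.

```python
def complete_linkage(similarity, cutoff):
    """Agglomerative clustering with complete linkage.

    similarity: symmetric n x n list of lists of pairwise similarities.
    Two clusters are scored by the LOWEST similarity between any pair of
    their members (complete linkage). The best-scoring pair is merged
    repeatedly until no pair scores at least `cutoff`.
    Returns a list of clusters, each a sorted list of item indices.
    """
    clusters = [[i] for i in range(len(similarity))]

    def linkage(a, b):
        return min(similarity[i][j] for i in a for j in b)

    while len(clusters) > 1:
        best_score, best_pair = None, None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                score = linkage(clusters[x], clusters[y])
                if best_score is None or score > best_score:
                    best_score, best_pair = score, (x, y)
        if best_score < cutoff:
            break  # no remaining pair is similar enough to merge
        x, y = best_pair
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return [sorted(c) for c in clusters]
```

Because a merged cluster is scored by its least similar pair, one strong link between two otherwise dissimilar groups is not enough to join them, which keeps every member of a cluster similar to every other member.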
All binding site clusters are ranked in terms of their
predicted biological relevance and similarity to the
query. The components of the ranking score are the
sequence-PSSM score; the average sequence identity
between the query and cluster members calculated over
the whole structure–structure alignment; the number of
interfacial contacts and the average sequence conservation
of binding site alignment columns. All components of the
ranking score are then normalized and all clusters are
ranked with respect to the Z-scores.
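A possible reading of this ranking step is sketched below in Python. Each component is converted to a Z-score across clusters and the Z-scores are summed; the summation, the equal weighting and the dictionary keys are assumptions of this sketch, since the text states only that the components are normalized and that clusters are ranked by Z-scores.

```python
from statistics import mean, pstdev

# Hypothetical keys for the four components named in the text.
COMPONENTS = ("pssm_score", "avg_identity", "n_contacts", "site_conservation")


def z_scores(values):
    """Standardize a list of numbers; a constant list maps to zeros."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0] * len(values)
    return [(v - mu) / sigma for v in values]


def rank_clusters(clusters):
    """Rank clusters by the sum of their per-component Z-scores.

    clusters: list of dicts with a 'name' key plus the COMPONENTS keys.
    Returns (name, combined_score) tuples, best first.
    """
    if not clusters:
        return []
    combined = [0.0] * len(clusters)
    for key in COMPONENTS:
        for i, z in enumerate(z_scores([c[key] for c in clusters])):
            combined[i] += z
    order = sorted(range(len(clusters)), key=lambda i: combined[i], reverse=True)
    return [(clusters[i]["name"], combined[i]) for i in order]
```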
Evaluating biological relevance of binding sites
To emphasize biologically relevant binding sites we
validate sites according to a few criteria. First, we assess
the evolutionary conservation of binding site clusters.
Those sites which reoccur in diverse enough protein
complexes are ranked higher, an idea which was
previously implemented in the Conserved Binding
Modes (CBM) database (32). Clusters that have only
one non-redundant member (after members with >90%
identity are purged) are considered ‘singletons’ and are
displayed at the bottom of the interaction summary
table with a low rank. Another way to evaluate binding
sites is to compare them with manually curated site
annotations from the Conserved Domain Database
(CDD), which have been extracted from the published
literature or derived from manual interpretation of indi-
vidual three-dimensional structures (23). Binding site
clusters which overlap by >50% with a CDD annotation
are ranked first. For protein–chemical interactions, we
exclude by default chemicals such as buffers, salts,
detergents, solvents and ions that are typically added
for the purpose of crystallization and/or purification.
Most often, these are not relevant with respect to the
protein’s biological function. Finally, we employ the
PISA algorithm (24) to validate protein–protein interac-
tion interfaces and eliminate those interfaces which appear
to be the result of crystal packing.
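Two of the checks above, the singleton test and the comparison with CDD annotations, can be sketched in Python as follows. The greedy purge order, the choice of the CDD site as the denominator of the overlap fraction, and all function names are assumptions made for illustration; the text gives only the 90% identity and 50% overlap thresholds.

```python
def nonredundant(members, identity, max_identity=90.0):
    """Greedy purge of near-identical cluster members.

    members: list of member IDs in some fixed order.
    identity: callable returning percent identity between two members.
    A member is kept only if it is at most `max_identity` percent
    identical to every member already kept.
    """
    kept = []
    for m in members:
        if all(identity(m, k) <= max_identity for k in kept):
            kept.append(m)
    return kept


def overlap_fraction(cluster_site, cdd_site):
    """Fraction of the curated (CDD) site residues found in the cluster site."""
    cluster_site, cdd_site = set(cluster_site), set(cdd_site)
    if not cdd_site:
        return 0.0
    return len(cluster_site & cdd_site) / len(cdd_site)


def validation_flags(cluster_site, members, identity, cdd_site=None):
    """Report whether a cluster is a singleton and whether it matches CDD."""
    return {
        "singleton": len(nonredundant(members, identity)) <= 1,
        "cdd_match": bool(cdd_site)
        and overlap_fraction(cluster_site, cdd_site) > 0.5,
    }
```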
Figure 1. Overview of the binding site annotation procedure in IBIS.
RESULTS AND DISCUSSION
Summary statistics of the IBIS database
Currently, a total of 40 716 proteins (151 887 protein
chains/domains) are represented in IBIS with at least
one type of interaction observed in their structural
complexes. As can be seen from Figure 2, protein–protein
and protein–chemical interactions are the most
frequent types of interactions observed in protein structures.
Protein–protein interactions are the most prevalent
interactions as reflected by the number of domains
involved in interactions and the number of binding sites.
The number of inferred interactions is always higher than
the number of observed interactions, especially for
protein–peptide and protein–nucleic acid interactions,
where the number of inferred interactions exceeds the
number of observed ones (in terms of the number of
protein chains) almost 5-fold. This ratio is even higher
for binding site clusters (Figure 2B). Altogether, IBIS
provides information on binding partners and binding
site locations with averages of 3.4 protein–chemical
binding site clusters per chain, and eight protein–protein
binding site clusters per domain. The scale of such
annotations is approaching the scale of whole
Description of the IBIS interface
IBIS may be queried by supplying either a protein NCBI
GenBank identifier or PDB code (the one letter PDB chain
identifier is optional). For a given query, it is possible
to see different types of interactions, protein–protein,
protein–chemical, protein–DNA, protein–RNA and
protein–peptide, by navigating through different tabs at
the top of the page (the display of protein-ion interactions
is currently under development). Figure 3 illustrates an
IBIS Interaction Summary page. Observed and inferred
binding site clusters are sorted by the ranking score.
Each row in the table corresponds to a binding site
cluster and can be expanded to show the cluster members.
The main features of binding sites and interaction
partners in the Interaction Summary table are as follows:
‘Interaction partner’—name of the interaction partner
which interacts with either the actual query (‘observed’
interactions) or homologs of the query from within a
given binding site cluster (‘inferred’ interactions). For
protein–protein interactions, the CDD domain name of
the binding partner is listed. For protein–chemical
interactions, the column reports the name of the
chemical bound to a representative member of the
cluster. For protein–nucleic acid and protein–peptide
interactions, the column reports the sequence of the first
20 biopolymer residues from the interaction partner of a
representative cluster member.
‘Ranking score’—the score which ranks the binding site
clusters in terms of their biological relevance and similar-
ity to the query. The ranking score is not deﬁned for the
‘Number of cluster members’—the number of cluster
members. Upon cluster expansion only non-redundant
cluster members are displayed (at <90% identity level).
A complete list of members can also be viewed by
clicking the ‘See all members’ link.
‘Average percent identity to query’—the average
sequence identity between the query and the cluster
members calculated over all of their structural alignments
with the query.
‘Number of binding site residues’—the union of binding
sites mapped from all members of the cluster to the query.
‘Number of chemicals’ (for protein–chemical inter-
actions)—the number of unique, standardized chemicals
present in a given binding site cluster.
‘Curator annotation’—binding site annotation from the
CDD which overlaps by >50% with the sites annotated by
IBIS. Binding site clusters with matching CDD annotation
are top-ranked irrespective of their ranking score.
‘Taxonomic diversity’—the last common ancestor of the
proteins from a given cluster, listed with a link to NCBI’s
Taxonomy Browser, so that one can explore all taxonomic
groups represented by the cluster.
Figure 2. (A) Histogram depicting the number of proteins in PDB with
observed/inferred binding sites. (B) Histogram showing the number of
binding sites inferred by IBIS as compared to those observed in protein
The actual binding site residue alignment can be seen
upon expanding the clusters, including the PDB codes corresponding
to all complex structures summarized by the
clusters. It is also possible to view the inferred binding
sites projected onto the actual query structure using the
Cn3D visualization software (http://www.ncbi.nlm.nih.gov/Structure/CN3D/cn3d.shtml).
For the case of protein–protein interactions, the expanded table will
provide the PISA validation status for each interaction
interface. PISA may not be able to process a particular
complex structure; these cases are indicated by an ‘N/A’
The features of binding site clusters can be examined
by using the ‘Advanced search’ option found on the left
side bar. This option allows one to filter the interactions
within a given interaction type by various criteria like
level of sequence identity, structural similarity, names of
interacting partner and others. In the case of chemical
binding sites, for example, it is possible to pick and
inspect various sites a particular chemical may bind to
on a given query.
Annotating new binding sites using IBIS: example
of human spleen tyrosine kinase catalytic domain
Spleen tyrosine kinase (Syk) is a non-receptor tyrosine
kinase, expressed in a wide range of cell types, which
plays an important role in immunoreceptor signaling
(33). It is an attractive drug target for the treatment of
allergic and antibody mediated autoimmune diseases,
breast and gastric cancers. Syk is characterized by two
N-terminal SH2 adapter domains, a linker region and a
C-terminal catalytic domain. Several drugs/inhibitors
target the active site of the Syk catalytic domain and
decrease its activity.
Here, we demonstrate how IBIS can be used to annotate
the binding sites of the Syk catalytic domain. We start
with a Syk sequence for which a structure of the
complex with the ligands is available (PDB code: 1XBB);
we predict binding sites using IBIS, and finally compare
predicted sites with the actual binding sites observed in the
structure. First we find the closest homolog with a known
structure, a Zap-70 kinase (1U59 Chain A; BLAST E-value
of 6e-99 and 77% identity to the query sequence,
Figure 2). Second, we use the structure of 1U59 as a
query in IBIS and find nine protein–chemical binding
site clusters. The top two clusters overlap with the
‘active site/ATP binding site’ CDD annotations. The
first binding site cluster includes 360 homologous
structures bound to 170 different chemicals. The consensus
binding site alignment is 65 residues long, due to the
diversity and size variation of the chemicals bound, but it
highlights 13 highly conserved residues. The ATP-binding
site represents an attractive target for the design of kinase
inhibitors, and IBIS provides a concise summary of
interactions at that site, which would otherwise require
significant comparative analysis. Here IBIS groups
and identifies an ATP-binding site, and provides a list of
various chemicals, among them many kinase inhibitors,
which might bind to and inhibit the query
protein. All binding sites observed in the actual structure
of the complex with the anticancer drug imatinib (1XBB)
are correctly annotated by IBIS (see table in Figure 4).
Interestingly, imatinib binds not only to the ATP-binding
site but also to a regulatory myristoylation site
on the C-terminus (from the binding site cluster #8) that
can be annotated on the query sequence.
Figure 3. IBIS screen shot for 1U59, Chain A, displaying various chemical binding sites inferred from its homologs. A blowup of the expanded
cluster of the ATP binding site is also shown.
In addition to chemical binding sites, it is also possible
to predict protein interaction partners for the Syk protein.
For example, binding site cluster #1 under protein–protein
interactions points to a potential SH2 domain binding
site which is further validated by CDD curator annota-
tion, although no structural complexes have been solved
between Syk and SH2.
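The transfer of a binding site from a homolog to a query, as in the mapping of the 1U59 site onto the Syk sequence, amounts to following aligned columns of a pairwise alignment. The Python sketch below shows the idea on gapped alignment strings; the function name, its arguments and the toy sequences in the usage note are hypothetical and do not come from IBIS.

```python
def map_site_to_query(aligned_query, aligned_homolog, homolog_site,
                      query_start=1, homolog_start=1):
    """Map binding site residue numbers from a homolog onto the query.

    aligned_query, aligned_homolog: equal-length alignment rows that use
    '-' for gaps. homolog_site: set of residue numbers on the homolog.
    Residue numbering starts at query_start / homolog_start.
    Returns the query residue numbers aligned to the homolog site
    residues; site residues aligned to a gap in the query are dropped.
    """
    if len(aligned_query) != len(aligned_homolog):
        raise ValueError("alignment rows must have equal length")
    q_pos, h_pos = query_start - 1, homolog_start - 1
    mapped = []
    for q_char, h_char in zip(aligned_query, aligned_homolog):
        if q_char != "-":
            q_pos += 1
        if h_char != "-":
            h_pos += 1
        if q_char != "-" and h_char != "-" and h_pos in homolog_site:
            mapped.append(q_pos)
    return mapped
```

For the toy rows `"MK-LVG"` (query) and `"MKALV-"` (homolog) with homolog site residues 2, 3 and 4, residue 3 of the homolog faces a gap in the query and is dropped, while residues 2 and 4 map to query residues 2 and 3.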
In this paper, we presented a comprehensive, web-
accessible database, which organizes, analyzes and
predicts different types of interaction partners and
binding sites in proteins. For proteins with or without
known binding partners, IBIS provides a succinct and
informative representation of observed binding sites and
binding sites inferred from homologs with known 3D
structure. It provides analysis of how well a binding site
is conserved across members of a homologous protein
family. Several structures of the same protein or close
homologs with different binding partners may be available
in the Protein Data Bank, or the same protein may have
been crystallized under different physiological conditions.
In such cases, the IBIS database facilitates a detailed
classification and analysis of binding sites. IBIS also
attempts to validate binding sites by assessing their bio-
logical relevance and ranks them accordingly. It can be
used to annotate oligomeric states by inferring relevant
homo-oligomer interfaces and should prove useful in
studying the evolution of protein interactions.
IBIS is updated regularly (currently on a biweekly
schedule) to account for the growth of the GenBank,
PDB/MMDB, VAST and CDD databases. Recently, it
was estimated that almost half of all sequences in the
GenBank database have at least one structure homolog
with an extensive alignment and at least 30% identical
residues (34). As the on-going structural genomics initia-
tive continues to close the sequence-structure gap, IBIS
serves as a powerful knowledge-based annotation system
for proteins of unknown structure.
The authors would like to thank Yanli Wang and Lewis
Geer for useful discussions and Eugene Krissinel for help
with the PISA software.
National Institutes of Health/DHHS (Intramural
Research program of the National Library of Medicine).
Funding for open access charge: National Institutes of
Health/DHHS (Intramural Research program of the
National Library of Medicine).
Conflict of interest statement. None declared.
Figure 4. Mapping of the 1U59 inferred ATP binding site onto the sequence of Syk tyrosine kinase (1XBB chain A) and its agreement with the
observed binding site in Syk + complex with imatinib. MMDB residue numbering is used which starts from the beginning of the corresponding
GenBank protein sequence.
1. Giot,L., Bader,J.S., Brouwer,C., Chaudhuri,A., Kuang,B., Li,Y.,
Hao,Y.L., Ooi,C.E., Godwin,B., Vitols,E. et al. (2003) A protein
interaction map of Drosophila melanogaster. Science, 302,
2. Li,S., Armstrong,C.M., Bertin,N., Ge,H., Milstein,S., Boxem,M.,
Vidalain,P.O., Han,J.D., Chesneau,A., Hao,T. et al. (2004) A map
of the interactome network of the metazoan C. elegans. Science,
3. Bork,P. and Koonin,E.V. (1998) Predicting functions from protein
sequences—where are the bottlenecks? Nat. Genet., 18, 313–318.
4. Rentzsch,R. and Orengo,C.A. (2009) Protein function prediction—
the power of multiplicity. Trends Biotechnol., 27, 210–219.
5. Matthews,L.R., Vaglio,P., Reboul,J., Ge,H., Davis,B.P., Garrels,J.,
Vincent,S. and Vidal,M. (2001) Identification of potential
interaction networks using sequence-based searches for conserved
protein-protein interactions or ‘interologs’. Genome Res., 11,
6. Gerlt,J.A. and Babbitt,P.C. (2000) Can sequence determine
function? Genome Biol., 1, REVIEWS0005.
7. Yu,H., Luscombe,N.M., Lu,H.X., Zhu,X., Xia,Y., Han,J.D.,
Bertin,N., Chung,S., Vidal,M. and Gerstein,M. (2004) Annotation
transfer between genomes: protein-protein interologs and protein-
DNA regulogs. Genome Res., 14, 1107–1118.
8. Hegyi,H. and Gerstein,M. (1999) The relationship between protein
structure and function: a comprehensive survey with application to
the yeast genome. J. Mol. Biol., 288, 147–164.
9. Campbell,S.J., Gold,N.D., Jackson,R.M. and Westhead,D.R.
(2003) Ligand binding: functional site location, similarity and
docking. Curr. Opin. Struct. Biol., 13, 389–395.
10. Jones,S. and Thornton,J.M. (1997) Analysis of protein-protein
interaction sites using surface patches. J. Mol. Biol., 272, 121–232.
11. Teichmann,S.A., Murzin,A.G. and Chothia,C. (2001)
Determination of protein function, evolution and interactions by
structural genomics. Curr. Opin. Struct. Biol., 11, 354–363.
12. Landgraf,R., Xenarios,I. and Eisenberg,D. (2001) Three-
dimensional cluster analysis identifies interfaces and functional
residue clusters in proteins. J. Mol. Biol., 307, 1487–1502.
13. Pazos,F. and Sternberg,M.J. (2004) Automated prediction of
protein function and detection of functional sites from structure.
Proc. Natl Acad. Sci. USA, 101, 14754–14759.
14. Brylinski,M. and Skolnick,J. (2008) A threading-based method
(FINDSITE) for ligand-binding site prediction and functional
annotation. Proc. Natl Acad. Sci. USA, 105, 129–134.
15. Hernandez,M., Ghersi,D. and Sanchez,R. (2009)
SITEHOUND-web: a server for ligand binding site identification
in protein structures. Nucleic Acids Res., 37, W413–W416.
16. Huang,B. and Schroeder,M. (2006) LIGSITEcsc: predicting ligand
binding sites using the Connolly surface and degree of conservation.
BMC Struct Biol., 6
17. Laurie,A.T. and Jackson,R.M. (2005) Q-SiteFinder: an energy-
based method for the prediction of protein-ligand binding sites.
Bioinformatics, 21, 1908–1916.
18. Qin,S. and Zhou,H.X. (2007) meta-PPISP: a meta web server for
protein-protein interaction site prediction. Bioinformatics, 23,
19. Talavera,D., Laskowski,R.A. and Thornton,J.M. (2009) WSsas:
a web service for the annotation of functional residues through
structural homologues. Bioinformatics, 25, 1192–1194.
20. Snyder,K.A., Feldman,H.J., Dumontier,M., Salama,J.J. and
Hogue,C.W. (2006) Domain-based small molecule binding site
annotation. BMC Bioinformatics, 7, 152.
21. Chen,Y.C., Lo,Y.S., Hsu,W.C. and Yang,J.M. (2007) 3D-partner:
a web server to infer interacting partners and binding models.
Nucleic Acids Res., 35, W561–567.
22. Stein,A., Panjkovich,A. and Aloy,P. (2009) 3did Update: domain-
domain and peptide-mediated interactions of known 3D structure.
Nucleic Acids Res., 37, D300–D304.
23. Marchler-Bauer,A., Anderson,J.B., Chitsaz,F., Derbyshire,M.K.,
DeWeese-Scott,C., Fong,J.H., Geer,L.Y., Geer,R.C.,
Gonzales,N.R., Gwadz,M. et al. (2009) CDD: specific functional
annotation with the Conserved Domain Database. Nucleic Acids
Res., 37, D205–210.
24. Krissinel,E. and Henrick,K. (2007) Inference of macromolecular
assemblies from crystalline state. J. Mol. Biol., 372, 774–797.
25. Chen,J., Anderson,J.B., DeWeese-Scott,C., Fedorova,N.D.,
Geer,L.Y., He,S., Hurwitz,D.I., Jackson,J.D., Jacobs,A.R.,
Lanczycki,C.J. et al. (2003) MMDB: Entrez’s 3D-structure
database. Nucleic Acids Res., 31, 474–477.
26. Sussman,J.L., Lin,D., Jiang,J., Manning,N.O., Prilusky,J., Ritter,O.
and Abola,E.E. (1998) Protein Data Bank (PDB): database of
three-dimensional structural information of biological
macromolecules. Acta Crystallogr. D Biol. Crystallogr., 54,
27. Marchler-Bauer,A. and Bryant,S.H. (2004) CD-Search: protein
domain annotations on the fly. Nucleic Acids Res., 32, W327–W331.
28. Wang,Y., Xiao,J., Suzek,T.O., Zhang,J., Wang,J. and Bryant,S.H.
(2009) PubChem: a public information system for analyzing
bioactivities of small molecules. Nucleic Acids Res., 37,
29. Gibrat,J.F., Madej,T. and Bryant,S.H. (1996) Surprising similarities
in structure comparison. Curr. Opin. Struct. Biol., 6, 377–385.
30. Wang,Y., Bryant,S., Tatusov,R. and Tatusova,T. (2000) Links from
genome proteins to known 3-D structures. Genome Res., 10
31. Slonim,N., Atwal,G.S., Tkacik,G. and Bialek,W. (2005)
Information-based clustering. Proc. Natl Acad. Sci. USA, 102,
32. Shoemaker,B.A., Panchenko,A.R. and Bryant,S.H. (2006) Finding
biologically relevant protein domain interactions: conserved binding
mode analysis. Protein Sci., 15, 352–361.
33. Atwell,S., Adams,J.M., Badger,J., Buchanan,M.D., Feil,I.K.,
Froning,K.J., Gao,X., Hendle,J., Keegan,K., Leon,B.C. et al.
(2004) A novel mode of Gleevec binding is revealed by the structure
of spleen tyrosine kinase. J. Biol. Chem., 279, 55827–55832.
34. Wang,Y., Addess,K.J., Chen,J., Geer,L.Y., He,J., He,S., Lu,S.,
Madej,T., Marchler-Bauer,A., Thiessen,P.A. et al. (2007) MMDB:
annotating protein sequences with Entrez’s 3D-structure database.
Nucleic Acids Res., 35, D298–D300.