# Network-based models as tools hinting at nonevident protein functionality.

**ABSTRACT** Network-based models of proteins are popular tools employed to determine dynamic features related to the folded structure. They encompass all topological and geometric computational approaches idealizing proteins as directly interacting nodes. Topology makes use of neighborhood information of residues, and geometry includes relative placement of neighbors. Coarse-grained approaches efficiently predict alternative conformations because of inherent collectivity in the protein structure. Such collectivity is moderated by topological characteristics that also tune neighborhood structure: That rich residues have richer neighbors secures robustness toward random loss of interactions/nodes due to environmental fluctuations/mutations. Geometry conveys the additional information of force balance to network models, establishing the local shape of the energy landscape. Here, residue and/or bond perturbations are critically evaluated to suggest new experiments, as network-based computational techniques prove useful in capturing domain movements and conformational shifts resulting from environmental alterations. Evolutionarily conserved residues are optimally connected, defining a subnetwork that may be utilized for further coarsening.




Canan Atilgan,¹ Osman Burak Okan,² and Ali Rana Atilgan¹

¹ Faculty of Engineering and Natural Sciences, Sabanci University, 34956 Istanbul, Turkey; email: canan@sabanciuniv.edu; atilgan@sabanciuniv.edu

² Department of Materials Science and Engineering, Rensselaer Polytechnic Institute, Troy, New York, 12180; email: okano2@rpi.edu

Annu. Rev. Biophys. 2012. 41:205–25

First published online as a Review in Advance on February 23, 2012

The Annual Review of Biophysics is online at biophys.annualreviews.org

This article's doi: 10.1146/annurev-biophys-050511-102305

Copyright © 2012 by Annual Reviews. All rights reserved

1936-122X/12/0609-0205$20.00

**KEYWORDS** dynamic heterogeneity, collective motions, allosteric regulation, coarse-grained models, bond-orientational order, protein stability in cellular proteomes


Annu. Rev. Biophys. 2012.41:205-225. Downloaded from www.annualreviews.org

by Sabanci University on 05/14/12. For personal use only.


**ANM:** anisotropic network model

Contents

- INTRODUCTION
- GEOMETRY COMPLEMENTS TOPOLOGY IN SEARCH OF PERSISTENT ORDER
  - Contact Order
  - Redundancy and Collectivity
  - Packing Anisotropy
- HOW (WELL) DOES A NETWORK CONSTRUCTION OF THE FOLDED PROTEIN APPROXIMATE THE UNDERLYING ENERGY LANDSCAPE?
- RESPONSE SCANNING MONITORS CONFORMATIONAL CHANGES DUE TO ENVIRONMENTAL PERTURBATIONS
- ESSENTIAL SUBNETWORKS FROM PROTEIN STRUCTURE RETAIN PROTEIN-LIKE PROPERTIES

INTRODUCTION

A central goal in current molecular structural biology research is to determine protein functionality from the knowledge of a few possible structures determined under a given set of conditions. This in turn requires efficient prediction of the multitude of possible states of the protein, which is why coarse-grained models in general, and network-based models in particular, attract widespread attention. In the cell environment, there is ceaseless competition between proteins interacting with other molecules, leading to many bond-breaking events; bond breaking relaxes tension, triggering conformational changes. These successive perturbation-response processes may be quantified, e.g., by using a single-molecule system that contains both a dissociable bond and a protein that could undergo force-dependent conformational opening (42). What should the resolution of coarsening be in native protein models to undertake such a competition?

Coarse-grained models of proteins have proved useful in predicting a plethora of properties. At the lowest resolution, stability and thermal folding characteristics of proteins are deduced by using merely the number of residues and a fraction of charged amino acids (36, 76). These observations have been related to the mutation rates and population size, thereby establishing a connection between the Darwinian theories of evolution and the physics-based models of proteins (85, 89). Another coarse-grained model with uniform mass density successfully predicts the amplitudes and timescales of the motions in cells (55). Without the explicit description of atomic details or interaction sites, the model incorporates the excluded volume and surface characteristics of proteins by spherical harmonic representations. The ultimate goal is to develop such extremely coarsened models to explain how proteins can exist in multiple states that determine the fate of the cell.

At a more detailed level of coarse-graining, protein dynamics around the native state naturally hints at the use of fully elastic analysis for harmonic potential wells (15, 26, 28, 73, 79, 86). Such constructions with uniform pseudobond potentials served the biophysics community remarkably well despite caveats in parameterization (5, 14, 41, 78). The anisotropic network model (ANM) is a tool that seeks to establish detailed static balance around each residue and uses the inherent anisotropy of the contact distributions in space (5). Using this force balance argument in turn allows one to build structure-specific quadratic potential functions, which allows direct computation of relative displacements from the Hessian matrices of the underlying potential. The approach resolves many collective modes of motion such as hinge, bending, and shear (33). These motions are represented by a few collective modes for some proteins or by superposition of many for others (61, 74).

However, multibasin hopping or single potential wells decorated with appreciable rugged landscapes are not captured in such constructions. One way to proceed is to use molecular dynamics (MD) to account for quasi-harmonic effects and migrations among collective modes arising from competing basins around the global minima (35). Can we reconcile coarse-grained models with extensive MD simulations? Although some features of the free energy surface are intrinsically high dimensional, projection of conformations into lower-dimensional spaces sketches out a network among the accessible conformational states (22). A method that provides the transition vectors associated with the slow degrees of freedom has yet to be devised.

In this review, we explore both topological and geometric constructs that discretize protein structures, which we generally call network-based models. We evaluate the utility of this modeling approach to explore protein functionality, and we discuss hybrid procedures that make use of MD simulations together with techniques devised for network-based models. We first elaborate on recent developments in packing of amino acids, because the full strength of network-based models necessitates a better understanding of the residue environment in the native state (17, 51). One fundamental problem is the rigorous characterization of residue packing, which in turn defines local equilibrium and gives rise to flexibility and functionality. We bring new insights by drawing comparisons to crystalline and glassy systems bearing various forms of polyhedral order and local anisotropy.

The importance of anisotropy emerges in the Hessian matrix and correspondingly in the covariance matrix. Whereas the former may be obtained merely using the local force balance, the latter is constructed using MD trajectories. In the next section we discuss the distribution of timescales in the dynamics of folded proteins. Monitoring window length points to the presence of dynamic heterogeneity of the system, which is observed whenever a region's relaxation time is considerably different from that of the others (21). The covariance matrix obtained by fine-grained simulations is utilized as a kernel for calculating the first few fundamental modes or for evaluating the response characteristics upon perturbations. Therefore, it is of utmost interest to moderate the proper interactions by tuning the simulation time.

The following section studies the single-site or multisite perturbation analyses of proteins within the framework of network-based approaches. Elastic network construction helps one probe conformational changes due to altered physical and chemical environments induced by external fields or the change of scalar parameters such as pH. Tools such as these, when properly set up, are much needed for functional manipulation of proteins at the atomic level. In the final section, we review methods that determine the smallest set of interactions in the protein that retain the protein-like qualities of the constructed networks, using the network properties we have been investigating.

GEOMETRY COMPLEMENTS TOPOLOGY IN SEARCH OF PERSISTENT ORDER

Contact Order

Coarse-graining of a protein structure inevitably entails discretization of soft nanomatter at a suitable level of resolution, which is that of a single residue in all work reviewed in this manuscript. Such an approach permits one to examine the protein as a set of points and their interactions, bringing in a network view to protein structure characterization. We call these residue networks (3).

**Figure 1** (a) Radial distribution function of Cβ atoms (3). The first coordinating region of each residue ends at 6.7 Å, while the second shell, which has significantly lower density, ends at 8.5 Å (gray dashed line). (b) Contact map for hen egg white lysozyme at 8.5 Å cutoff. (c) The eigenvalue distributions of $L^*$ for residue networks and face-centered cubic crystals are remarkably similar. Multiplicity of λ = 1 indicates the amount of motif duplication in the network. The two network types depart from each other in the region λ → 0.

**Residue networks:** networks (graphs) constructed from protein coordinates, where Cα or Cβ atoms within a selected cutoff distance are connected

**Radial distribution function:** describes how number density varies with distance from a central bead

**FCC:** face-centered cubic

**Clustering coefficient (C):** probability that neighbors of a node are also connected to each other

Each residue of a folded protein is considered as a single point, which is centered on either its Cα or Cβ atom. The radial distribution function, shown in Figure 1a, marks the regions of the first and second coordination shells of amino acids residing in the protein. The incidence matrix of a protein is formed by associating each residue with its bonds. Here, the term bond refers to either all contacts along the contour of the chain or nonbonded contacts within a selected cutoff distance. Although the radial distribution function guides us toward the identities of the direct contacts of a given residue at ∼6.7 Å, we find longer-ranged neighbors to play an intriguing role in determining a plethora of protein properties.

The rows of the incidence matrix are the residue identities, and the columns are the bonds; whenever the ith residue is associated with the jth bond, the (i,j) entry of the incidence matrix is 1. For a protein with N residues, the incidence matrix is made of N × M entries, where M is the total number of bonds. One may multiply the incidence matrix by its transpose, summing up the number of bonds of each residue, to obtain an N × N matrix whose off-diagonal entries are 1 for contacting pairs of residues, thus forming the contact map (adjacency matrix A at the selected cutoff distance, Figure 1b). The diagonal matrix D is composed of the connectivity of the residue, obtained by the sum of the neighbors of i; average connectivity of the system is denoted K. The Kirchhoff matrix (also referred to as the Laplacian) is the difference D − A. The diagonal elements of the inverse of the Kirchhoff matrix are proportional to the Debye-Waller factors (14).
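The matrix constructions described above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the authors' code: the helix-like coordinates are hypothetical toy data, the function names are invented for this example, and the Moore-Penrose pseudoinverse is used in place of the inverse because the Kirchhoff matrix of a connected network always has one zero eigenvalue.

```python
import numpy as np

def contact_map(coords, cutoff=8.5):
    """Adjacency matrix A: 1 for residue pairs within `cutoff` (Å), 0 otherwise."""
    coords = np.asarray(coords, dtype=float)
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    A = (dist <= cutoff).astype(int)
    np.fill_diagonal(A, 0)  # a residue is not its own neighbor
    return A

def incidence_matrix(A):
    """N x M matrix with a 1 wherever residue i takes part in bond j."""
    i_idx, j_idx = np.nonzero(np.triu(A, k=1))  # each bond counted once
    n_bonds = len(i_idx)
    inc = np.zeros((A.shape[0], n_bonds), dtype=int)
    inc[i_idx, np.arange(n_bonds)] = 1
    inc[j_idx, np.arange(n_bonds)] = 1
    return inc

# Hypothetical toy coordinates (Å): a helix-like trace, NOT a real protein.
t = np.arange(30)
coords = np.column_stack([2.3 * np.cos(1.745 * t),
                          2.3 * np.sin(1.745 * t),
                          1.5 * t])

A = contact_map(coords)            # contact map at 8.5 Å
inc = incidence_matrix(A)          # N x M incidence matrix
D = np.diag(A.sum(axis=1))         # connectivity of each residue
G = D - A                          # Kirchhoff (Laplacian) matrix
K = A.sum() / len(A)               # average connectivity
# Pseudoinverse: G is singular, so the plain inverse does not exist.
msf = np.diag(np.linalg.pinv(G.astype(float)))  # proportional to B-factors
```

In this sketch the product `inc @ inc.T` equals `D + A`: its diagonal counts the bonds of each residue and its off-diagonal part is the contact map, which is the relation stated in the text.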

The Laplacian may be normalized to dispose of the effect of the number of nodes comprising the network, $L^* = (D^{-1})^{1/2}(D - A)(D^{-1})^{1/2}$. The eigenvalue spectrum of $L^*$ is used to categorize networks; e.g., the presence of an eigenvalue at λ = 2 implies the network is bipartite, the multiplicity of the eigenvalues at λ = 1 is a measure of motif duplication in the network, and the smallest nonzero eigenvalue indicates how well the network is connected (17). Note that an extension to a three-dimensional version of the normalized Laplacian was recently applied to the study of the local structural arrangements (9). The spectrum of $L^*$ obtained from proteins is remarkably similar to that of face-centered cubic (FCC) lattices (Figure 1c).
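The spectral signatures listed above can be checked on small graphs. The sketch below is illustrative only: the function name and the two toy graphs (a four-node ring and a triangle) are assumptions made for this example, not taken from the review.

```python
import numpy as np

def normalized_laplacian_spectrum(A):
    """Eigenvalues of L* = D^(-1/2) (D - A) D^(-1/2), in ascending order."""
    A = np.asarray(A, dtype=float)
    k = A.sum(axis=1)
    if np.any(k == 0):
        raise ValueError("isolated nodes make D^(-1/2) undefined")
    d = 1.0 / np.sqrt(k)
    # D^(-1/2) (D - A) D^(-1/2) simplifies to I - D^(-1/2) A D^(-1/2)
    L_star = np.eye(len(A)) - d[:, None] * A * d[None, :]
    return np.linalg.eigvalsh(L_star)

# Toy graph 1: a four-node ring, which is bipartite.
ring4 = np.array([[0, 1, 0, 1],
                  [1, 0, 1, 0],
                  [0, 1, 0, 1],
                  [1, 0, 1, 0]])
# Toy graph 2: a triangle, which is not bipartite.
triangle = np.array([[0, 1, 1],
                     [1, 0, 1],
                     [1, 1, 0]])

spec_ring = normalized_laplacian_spectrum(ring4)
spec_tri = normalized_laplacian_spectrum(triangle)
```

For the ring, the spectrum contains λ = 2 (bipartite) and a doubly degenerate λ = 1, reflecting that opposite nodes share exactly the same neighbors; the triangle spectrum stays below 2.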

**Figure 2** (a) Depth dependence of the clustering coefficient, C, averaged over proteins of fixed sizes (3); at residue depth, d > 4 Å, C → 1/3. (b) Nearest-neighbor degree correlations ($k_{nn}$) versus connectivity (k) plots for residue networks with N = 190–210; slope is C (82). There is significant correlation in the network, unlike random networks (gray dashed line).

**Nearest-neighbor degree correlations ($k_{nn}$):** average connectivity of the nearest neighbors of residues with a given connectivity

**Average path length (APL):** average over the minimum number of connections that must be traversed to connect a residue to all others

Local motifs are readily quantified in graph theory by the clustering coefficient, C. For random networks with Poisson connectivity distributions, C ≈ K/N → 0, since K ≪ N. For proteins, we find C to depart significantly from 0, while having a constant value in the core (75), approaching one-third beyond a depth of 4 Å (Figure 2a). This value of C is commensurate with that of FCC lattices that display eigenvalue spectra similar to that of proteins (Figure 1c). Finally, the nearest-neighbor degree correlation ($k_{nn}$) of residue networks departs significantly from random networks for which no correlation is expected (Figure 2b) (1, 13, 82). The slope in the $k_{nn}$ versus k curve is equal to C for the current network distributions (82). Positive slopes in such curves are associated with the so-called assortatively mixed networks, which percolate more easily and are more robust toward vertex removal (58).
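Both local measures can be computed directly from the adjacency matrix. The following sketch assumes an unweighted, symmetric contact map; the function names and the four-node toy graph (a triangle with one pendant node) are hypothetical and serve only to make the definitions concrete.

```python
import numpy as np

def clustering_coefficients(A):
    """Per-node C: fraction of neighbor pairs that are themselves in contact."""
    A = np.asarray(A, dtype=float)
    k = A.sum(axis=1)
    # diag(A^3) counts closed three-step walks: twice the triangles at a node
    triangles = np.diag(A @ A @ A) / 2.0
    pairs = k * (k - 1) / 2.0
    C = np.zeros_like(k)
    has_pairs = pairs > 0
    C[has_pairs] = triangles[has_pairs] / pairs[has_pairs]
    return C

def nearest_neighbor_degree(A):
    """Per-node k and k_nn (mean connectivity of that node's neighbors)."""
    A = np.asarray(A, dtype=float)
    k = A.sum(axis=1)
    knn = np.zeros_like(k)
    connected = k > 0
    knn[connected] = (A @ k)[connected] / k[connected]
    return k, knn

# Hypothetical toy graph: triangle 0-1-2 with pendant node 3 attached to 0.
A_toy = np.array([[0, 1, 1, 1],
                  [1, 0, 1, 0],
                  [1, 1, 0, 0],
                  [1, 0, 0, 0]])

C_toy = clustering_coefficients(A_toy)
k_toy, knn_toy = nearest_neighbor_degree(A_toy)
```

To produce a curve of the kind described for Figure 2b, one would average `knn` over all residues sharing the same `k` across a set of residue networks; that averaging step is left out here.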

Redundancy and Collectivity

Robustness to perturbations is an important issue in the fluctuating environment of a protein, where instantaneous loss of connectivity may often occur. Robustness may then be related to the number of alternative ways a residue may reach that lost contact. Let us call it redundancy. In fact, the average number of alternative two-step paths a given residue generates to its neighbors when their direct contact is removed is equal to 2C.

Robustness of a residue, however, is not only a local property. Consider at the same time how easily reachable that specific residue is by the other residues of the network. The average path length (APL) of a residue is an appropriate measure of reachability and is highly correlated with the experimentally measured residue fluctuations (3). The redundancies provided by the alternative paths may be regarded differently when the residue undertaken is a few steps away from the others (low APL), compared with the case in which the residue's accessibility is too costly (high APL). Imagine a residue whose local environment provides very few alternative paths (low redundancy) or a plethora of two-step links available to navigate among its neighbors (high redundancy). The degree of collectivity of motions in a protein depends on the propensity of its residues to find alternative routes to communicate with function-related destinations such as the active site (16).

Consider two limiting cases: high redundancy accentuated by low APL versus low redundancy compensated by high APL. Whereas the former resembles a highly coordinated group of residues placed in a blob, the latter looks like a group of residues arranged sequentially on a string. The number of alternative paths should therefore be normalized by the overall reachability, which defines the redundancy index in its final form, RI = C/APL (8).
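A minimal sketch of the ratio RI = C/APL follows, using a breadth-first search for the path lengths. Taking the ratio residue by residue is an assumption of this example; the text does not specify whether C and APL are averaged over the protein first. The function names and the toy graph are likewise hypothetical.

```python
from collections import deque

import numpy as np

def average_path_lengths(A):
    """Per-node APL: mean shortest-path length (in bonds) to all other nodes."""
    A = np.asarray(A)
    n = len(A)
    neighbors = [np.flatnonzero(A[i]) for i in range(n)]
    apl = np.zeros(n)
    for source in range(n):
        dist = np.full(n, -1)
        dist[source] = 0
        queue = deque([source])
        while queue:  # breadth-first search on the unweighted network
            u = queue.popleft()
            for v in neighbors[u]:
                if dist[v] < 0:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        if np.any(dist < 0):
            raise ValueError("network must be connected")
        apl[source] = dist.sum() / (n - 1)
    return apl

def clustering_coefficients(A):
    """Per-node clustering coefficient from triangle counts."""
    A = np.asarray(A, dtype=float)
    k = A.sum(axis=1)
    triangles = np.diag(A @ A @ A) / 2.0
    pairs = k * (k - 1) / 2.0
    return np.divide(triangles, pairs, out=np.zeros_like(k), where=pairs > 0)

def redundancy_index(A):
    """Assumed per-residue form of RI = C / APL."""
    return clustering_coefficients(A) / average_path_lengths(A)

# Hypothetical toy graph: triangle 0-1-2 with pendant node 3 attached to 0.
A_toy = np.array([[0, 1, 1, 1],
                  [1, 0, 1, 0],
                  [1, 1, 0, 0],
                  [1, 0, 0, 0]])
apl_toy = average_path_lengths(A_toy)
ri_toy = redundancy_index(A_toy)
```

In the toy graph, the pendant node has no alternative paths (C = 0) and the largest APL, so its RI vanishes, while the two triangle nodes away from the hub score highest.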


**Redundancy index (RI):** ratio of local clustering (C) to global efficiency of communication (APL) to quantify degree of collectivity

**Bond-orientational order (BOO):** an order pertaining to geometric centers of bonds emanating from a central bead

A low redundancy index (RI) signifies that the protein motions may be described by a few collective modes; a high RI, on the other hand, points out that the protein may not easily be decomposed into structural subunits, making a modal description of the observed motions problematic (8, 25). An attempt toward identifying rigid domains in the protein has been made in this direction, relating the rigidity to the collectivity of the conformational change (87), as well as directly clustering interatomic distance deviations from MD simulations (20). Although topology is important for determining the extent of the designability of conformations (50), once the principal components of the motion come into view, the directionality information between interacting pairs of nodes is necessary. In three dimensions, network construction at the coarse-graining level of the residue is achieved by using anisotropic networks and extending the representation from the N × N Kirchhoff matrix to the 3N × 3N Hessian of elastically interacting residues.

Packing Anisotropy

Cubic, hexagonal, and icosahedral orderings have all been formerly singled out as viable candidates to model proteins in their native conformation (60). Icosahedral and FCC-type coordination have been put forward after rotational superposition of residue environments onto their respective 13 atom clusters (12, 31, 63). Hexagonal close-packed systems have been suggested on the basis of generalized hydrophobicity arguments, and protein folding systems have been suggested on the basis of the hydrophobic-polar model (2, 11).

To analyze the anisotropy in connectivity around each residue, the bond-orientational order (BOO) is used (9, 70):

$$Q_l(i) = \left( \frac{4\pi}{2l+1} \sum_{m=-l}^{l} \left| \frac{1}{N_b(i)} \sum_{n=1}^{N_b(i)} Y_{lm}\left[\theta(\vec{r}_n - \vec{r}_i), \varphi(\vec{r}_n - \vec{r}_i)\right] \right|^2 \right)^{1/2}. \qquad (1)$$

$Y_{lm}$ are spherical harmonic functions for a bond vector from i to n; θ, φ are polar angles of this bond; and $N_b(i)$ is the total number of contacts of residue i. The lowest value of l for which cubic, hexagonal, and icosahedral ordering are all nonzero is 6, making them concurrently detectable. In defining a mean protein environment, we therefore examine the average $Q_6(i)$ over all residues (9) as well as its distribution (Figure 3a,b). For common crystallographic arrangements, as exemplified here by FCC, the addition of bonds to a larger coordination space leads to a gradual decrease in local anisotropy (71). However, for proteins, BOO is persistent at all cutoffs. The plateau reached by $Q_6$ hints at a nonnegligible residual anisotropy and is comparable in value to what has been computed for supercooled Lennard-Jones systems (70). The smallest distance at which this particular BOO builds up can be seen as a core that contains the essence of structural dynamics exhibited by the molecule.

**Figure 3** Bond-orientational order of 48 proteins (140 < N < 160, ⟨N⟩ = 150) compared with those of face-centered cubic (FCC) lattices (bead diameter set to average Cα–Cα distance along chain, 3.7 Å). (a) $Q_6$ versus cutoff distance. (b) $Q_6$ distribution of proteins obtained at different cutoffs. At 5 Å, orientational order is peaked around the FCC value. (c) Comparison of $\hat{W}_6$ for proteins and FCC lattices. FCC lattices have cutoff-independent $\hat{W}_6$ (teal). Optimal cutoff values reported in literature are ∼13 Å, at which residues sense strong FCC-like order in their coordination shells. (d) $\hat{W}_6$ distribution of proteins obtained at different cutoffs. Extensive triangular/tetrahedral local order exists in protein structures as indicated by the sharp peaks at 5 Å.

In characterizing local symmetries of the coordination environment, a third-order rotational invariant, $\hat{W}_6$, was proposed (71):

$$\hat{W}_l = \frac{\displaystyle\sum_{m_1=-l}^{l} \sum_{m_2=-l}^{l} \sum_{m_3=-l}^{l} \begin{pmatrix} l & l & l \\ m_1 & m_2 & m_3 \end{pmatrix} \bar{Q}_{lm_1} \bar{Q}_{lm_2} \bar{Q}_{lm_3}\, \delta_{m_1+m_2+m_3,0}}{\left( \displaystyle\sum_{m=-l}^{l} \left| \bar{Q}_{lm} \right|^2 \right)^{3/2}}, \qquad (2)$$

where $\begin{pmatrix} l & l & l \\ m_1 & m_2 & m_3 \end{pmatrix}$ represents a Wigner 3-j symbol and the overbar denotes neighbor average for a given residue. The unique property of $\hat{W}_6$ is its definitive magnitude. For example, for simple cubic, body-centered cubic, and FCC systems, the absolute value is identically 0.013161 and is not altered when surface bonds are added to the computation (71). This remarkable property enables one to assign different ordering types to different residue environments (Figure 3c,d). It shows that significant order exists in the protein structure, clearly detected at 5 Å. At longer distances, the smoother distribution points to the increasing symmetry in all directions, in accordance with the robustness of the slowest modes of motion (9).
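The second-order invariant of Equation 1 can be evaluated without computing spherical harmonics explicitly. The sketch below swaps in the spherical harmonic addition theorem, under which $Q_l^2$ equals the average of the Legendre polynomial $P_l(\cos\gamma)$ over all pairs of bonds around the central bead, where γ is the angle between two bonds. This is an equivalent reformulation chosen for this example, not the procedure described in the cited work; the function name and the ideal 12-neighbor FCC shell are illustrative assumptions.

```python
from itertools import product

import numpy as np
from numpy.polynomial import legendre

def bond_orientational_order(center, neighbors, l=6):
    """Q_l of one bead from its bond vectors, via the addition theorem:
    Q_l^2 = (1 / N_b^2) * sum over bond pairs (a, b) of P_l(cos gamma_ab)."""
    bonds = np.asarray(neighbors, dtype=float) - np.asarray(center, dtype=float)
    unit = bonds / np.linalg.norm(bonds, axis=1, keepdims=True)
    cos_gamma = np.clip(unit @ unit.T, -1.0, 1.0)
    coeffs = np.zeros(l + 1)
    coeffs[l] = 1.0                      # selects the single polynomial P_l
    p_l = legendre.legval(cos_gamma, coeffs)
    return float(np.sqrt(max(p_l.mean(), 0.0)))

# Ideal FCC first shell: the 12 vectors of type (±1, ±1, 0) and permutations.
fcc_shell = []
for a, b in [(0, 1), (0, 2), (1, 2)]:
    for sa, sb in product((1.0, -1.0), repeat=2):
        v = np.zeros(3)
        v[a], v[b] = sa, sb
        fcc_shell.append(v)
fcc_shell = np.array(fcc_shell)

q6_fcc = bond_orientational_order(np.zeros(3), fcc_shell, l=6)
q4_fcc = bond_orientational_order(np.zeros(3), fcc_shell, l=4)
```

For the ideal FCC shell this gives $Q_6 \approx 0.5745$ and $Q_4 \approx 0.1909$. For a residue network, `neighbors` would hold the coordinates of all residues within the chosen cutoff of residue i.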

In three dimensions, a given bond (contact) of a selected residue is associated with its direc-

tionality, quantified by the direction cosines. For a given residue with m bonds, this information

may be stored in a 3 × m coefficient matrix, where each column records the direction cosines

along the bond representing the interaction. Generalizing to the whole system of N nodes and

a total of M interactions, one gets the 3N × M direction cosine matrix B, which is the three-

dimensionalcounterpartoftheincidencematrix.Thus,analogoustotheKirchhoffmatrixobtained

by operating on the incidence matrix, BBTis exactly the Hessian if harmonic interactions with

uniform force constants for all M bonds in the network are assumed. It may also be viewed as an

N × N supermatrix, composed of 3 × 3 submatrices. Each off-diagonal term of the supermatrix

www.annualreviews.org • Network-Based Models of Proteins211

Annu. Rev. Biophys. 2012.41:205-225. Downloaded from www.annualreviews.org

by Sabanci University on 05/14/12. For personal use only.

Page 8

CORRESPONDENCE BETWEEN TOPOLOGICAL AND GEOMETRICAL

QUANTITIES EMPLOYED IN PROTEIN PHYSICS

In the topological case of network theory, local parameters are represented as moments of A. For the geometric

counterpart, local information is encoded in Qlm, showing up in series expansion of bond density on a unit sphere.

A geometric nth moment of bond density is matched by a topological moment of the same order calculated from A

as follows. (a) Degree connectivity of each node is computed from column-wise summation of the corresponding

entries in A, also recovered when bond density expansion is integrated over the unit sphere around the node.

(b) Second moments deduce how the coordination is formed/translated around a node, given the local environment

is fixed at its neighbors. The local environment is characterized by the degree connectivity in the topological and

bond-orientational order (BOO) in the geometric picture. The second-order geometric invariant is a two-point

spatial correlation function between a pair of nodes; i = j is an autocorrelation function leading to Ql. (c) Third

moments are measures of local compactness; both C and Wluse the coupling of three vectors/edges via a triangle

constraint. In BOO this condition is enforced by Wigner 3-j symbols appearing in the numerator of Wl.

ENM: elastic network

model

contains the direction cosines of the interacting pair, and the diagonals have the total interactions acting on a residue. $(BB^T)^{-1}$ is the covariance matrix C for a given configuration, which is also an N × N supermatrix whose ijth element is the 3 × 3 matrix of correlations between the x-, y-, and z-components of the fluctuations $\Delta R_i$ and $\Delta R_j$ of residues i and j. The trace of the 3 × 3 diagonal elements of C is proportional to the Debye-Waller factors (5, 15).
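The supermatrix construction described above can be illustrated with a minimal sketch. It assumes a uniform force constant (`gamma`, set to 1), a hypothetical cutoff `r_cut`, and toy random coordinates rather than a real structure; the function names are illustrative and do not come from the original work. The inverse is taken as a pseudo-inverse that skips the six rigid-body modes, which is an assumption about how $(BB^T)^{-1}$ is evaluated in practice.

```python
import numpy as np

def anm_hessian(coords, r_cut=8.0, gamma=1.0):
    """Assemble the 3N x 3N Hessian of a harmonic network with uniform springs."""
    n = len(coords)
    hessian = np.zeros((3 * n, 3 * n))
    for i in range(n):
        for j in range(i + 1, n):
            d = coords[j] - coords[i]
            dist = np.linalg.norm(d)
            if dist > r_cut:
                continue
            u = d / dist                       # direction cosines of the pair
            block = -gamma * np.outer(u, u)    # off-diagonal 3 x 3 submatrix
            hessian[3*i:3*i+3, 3*j:3*j+3] = block
            hessian[3*j:3*j+3, 3*i:3*i+3] = block
            hessian[3*i:3*i+3, 3*i:3*i+3] -= block   # diagonals sum the interactions
            hessian[3*j:3*j+3, 3*j:3*j+3] -= block
    return hessian

def covariance_from_hessian(hessian, n_zero=6):
    """Pseudo-inverse of the Hessian, skipping the rigid-body (zero) modes."""
    vals, vecs = np.linalg.eigh(hessian)
    vals, vecs = vals[n_zero:], vecs[:, n_zero:]
    return (vecs / vals) @ vecs.T

def mean_square_fluctuations(cov):
    """Trace of each 3 x 3 diagonal block of the covariance matrix."""
    n = cov.shape[0] // 3
    return np.array([np.trace(cov[3*i:3*i+3, 3*i:3*i+3]) for i in range(n)])

# Toy example: 8 random nodes, all within the cutoff (not a protein structure).
rng = np.random.default_rng(0)
coords = rng.uniform(0.0, 5.0, size=(8, 3))
H = anm_hessian(coords, r_cut=10.0)
C = covariance_from_hessian(H)
msf = mean_square_fluctuations(C)
```

The per-residue traces in `msf` are the quantities that the text relates, up to a proportionality constant, to the Debye-Waller factors.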

The success of ANMs for predicting residue fluctuations as well as the principal components of $(BB^T)^{-1}$ to describe various binding and functioning modes of a plethora of proteins suggests the harmonic assumption to be valid under many circumstances, despite its simplistic treatment of the residue environment (24, 28, 48, 54, 62). A more realistic approach to obtaining the covariance matrix of a folded protein is to use MD simulations with their all-atom descriptions and improved potentials that directly take into account the distance-dependent interactions. An MD trajectory may also be viewed at a coarse-grained level by following the coordinates of, e.g., the Cα atoms. The deviation of residue j from the average structure obtained over a selected time window w is $\Delta R_j(t) = R_j(t) - \langle R_j(t) \rangle_w$. For all residues, the deviations may be recorded in the 3N × w trajectory matrix $\Delta R$. The covariance matrix C is the product $\Delta R \Delta R^T$. The correspondence between C matrices obtained from MD simulations and elastic network models (ENMs) has been previously established for autocorrelations (19, 29) and cross-correlations (7). We next explore the extent to which the approximation $(BB^T)^{-1} \approx \Delta R \Delta R^T$ may be generalized.
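A minimal sketch of the trajectory-based covariance follows. It assumes the frames of one window w have already been superposed on a common reference, and it divides by the number of frames, a normalization the text does not state explicitly. The synthetic trajectory and the function names are illustrative only.

```python
import numpy as np

def trajectory_covariance(traj):
    """traj: array of shape (n_frames, N, 3) holding C-alpha coordinates of one window."""
    n_frames, n_res, _ = traj.shape
    flat = traj.reshape(n_frames, 3 * n_res)
    delta = (flat - flat.mean(axis=0)).T      # 3N x w trajectory matrix of deviations
    return delta @ delta.T / n_frames         # window-averaged product of deviations

def slowest_mode(cov):
    """Largest-variance eigenpair of the covariance matrix."""
    vals, vecs = np.linalg.eigh(cov)
    return vals[-1], vecs[:, -1]

# Synthetic fluctuations around a fixed structure (not MD data).
rng = np.random.default_rng(1)
mean_structure = rng.uniform(0.0, 10.0, size=(5, 3))
traj = mean_structure + 0.3 * rng.normal(size=(200, 5, 3))
cov = trajectory_covariance(traj)
top_val, top_vec = slowest_mode(cov)
```

Because `cov` has the same 3N × 3N block layout as the network-derived matrix, the same downstream analyses can be applied to either.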

HOW (WELL) DOES A NETWORK CONSTRUCTION OF THE FOLDED PROTEIN APPROXIMATE THE UNDERLYING ENERGY LANDSCAPE?

The free energy surface of a folded protein is rich and complex, accommodating a plethora of conformations sampled via motions spanning the femtosecond to second timescales (39). The conformations may be classified at different resolutions, e.g., jumps between side chain rotamers on the picosecond to nanosecond range or domain motions on the microsecond to millisecond range. These motions may be observed via spectroscopic techniques operating at different timescales or by MD simulations. The size of the time window of observation determines the upper bound of protein motions probed (30). Equally important is that there are durations of observation at which no new motions enter into the window. Following these timescales at which plateaus in the relaxation functions appear, new motions relevant to protein function at the next level of hierarchy

212 Atilgan·Okan·Atilgan


[Figure 4 plot: (a) C(t), from 0.0 to 1.0, versus time (ns), with curves labeled 1, 5, 10, 20, and 40 ns; (b) schematic free energy landscape.]

Figure 4

(a) C(t) of hen egg white lysozyme (100 ns molecular dynamics simulations at 300 K, 1 atm) (59). Curves are calculated for w = 1, 5, 10, 12, 14, 16, 18, 20, 40 ns. Initial decay on the picosecond timescale is the same for all. The longer relaxation time, on the order of nanoseconds, has an increasingly higher contribution as w is increased from 1 to 10 ns. We observe that the curves display the same relaxation pattern in the range of w = 10–20 ns. An additional, slower relaxation time comes into the window of observation for larger w, exemplified by w = 40 ns. (b) Schematic of the free energy landscape, resolved at a given tier of the many hierarchical motions occurring in proteins. Assuming this corresponds to local rearrangements of chain units taking place on nanosecond timescales, each displayed well would incorporate smaller undulations (not shown) corresponding to faster vibrational motions or methyl rotations. The whole picture would be embedded in a more complex energy landscape, where rearrangements of domains on a microsecond to millisecond range would be apparent. Envelope of intradomain motions is schematically shown (red dashed curve), arbitrarily chosen here as the parabola having a minimum around the initial structure (red dot). Sampling probability of conformations within wells is projected (gradations of gray). Monotonous probability distribution implied by envelope is shown as gradations of red.

HEWL: hen egg white lysozyme

set in (67). In this section, we shall use the well-studied hen egg white lysozyme (HEWL) as an example, and we shall monitor $\Delta R_j(t)$. Figure 4a shows the relaxation of Cα atoms from the average structure obtained in w:

$$C(t) = \overline{C_i(t)} = \overline{\left[ \frac{\langle \Delta R_i(0) \cdot \Delta R_i(t) \rangle_w}{\langle \Delta R_i(0)^2 \rangle_w} \right]}, \qquad (3)$$

where averages over time and residue index are shown by the brackets and overbar, respectively.
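As a minimal sketch of how Equation 3 might be evaluated, the function below averages over all time origins available within one window and then over residues. The frames are assumed to be superposed beforehand, and the synthetic trajectory (an autoregressive process with memory factor 0.9) merely stands in for MD data; the names are illustrative.

```python
import numpy as np

def relaxation_function(traj, max_lag):
    """Residue-averaged, normalized autocorrelation of deviations (Equation 3).

    traj: array of shape (n_frames, N, 3) for one observation window w.
    """
    delta = traj - traj.mean(axis=0)                          # deviation from window average
    n_frames = delta.shape[0]
    norm = np.mean(np.sum(delta * delta, axis=2), axis=0)     # mean-square deviation per residue
    c = np.empty(max_lag + 1)
    for lag in range(max_lag + 1):
        # dot products between frames separated by `lag`, for every time origin
        dots = np.sum(delta[: n_frames - lag] * delta[lag:], axis=2)
        c[lag] = np.mean(dots.mean(axis=0) / norm)            # overbar: average over residues
    return c

# Synthetic trajectory with exponentially decaying memory (not MD data).
rng = np.random.default_rng(2)
n_frames, n_res = 2000, 4
traj = np.zeros((n_frames, n_res, 3))
for t in range(1, n_frames):
    traj[t] = 0.9 * traj[t - 1] + rng.normal(size=(n_res, 3))
c = relaxation_function(traj, max_lag=20)
```

Repeating the calculation on windows of different length w is what produces the family of curves discussed for Figure 4a.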

For observation windows less than 10 ns, there is a continuous shift in the relaxation curves to incorporate slower timescales (Figure 4a). In the observation window range of w = 10–20 ns, however, the relaxation profiles are packed together. Thus, at short w there are some partially relaxed processes contributing to the decays, and by extending the observation time, a larger portion of these processes is included in the decay profiles. For certain w, however, all contributing processes have essentially relaxed, forming a plateau. Above w = 20 ns, the relaxation curves once again display shifts, owing to longer timescale processes beginning to influence the dynamics in these wider windows.

From a free energy landscape perspective, the character of the relaxation profiles depends on the resolution at which the system is sampled. The black curve in Figure 4b represents the landscape at a selected level of resolution. Starting from an initial configuration, for example, corresponding to the red dot, a 1-ns relaxation profile represents the memory of the motion of the protein in part of that particular substate. A longer window of 5 ns corresponds to more sampling within


RMSD: root-mean-square deviation

the same well. Observation windows in the range of 10 to 20 ns represent equivalent relaxation

profiles, implying that the well is sampled to its full extent and that no jumps to neighboring wells

occur. Finally, the 40-ns window now incorporates one or more jumps between adjacent wells,

thus incorporating a new channel of relaxation manifested as a slower relaxation time component

superposed on those observed earlier.

How well does the approximation $(BB^T)^{-1} \approx \Delta R \Delta R^T$ work? Figure 4 implies that the ruggedness of the landscape captured by long trajectories and included in the quasi-harmonic description (43) of the protein dynamics will be lost by the quadratic approximation around a selected conformation. Conversely, short trajectories do not approximate the curvature of the envelope of conformational substates. The curvature of the network model is controlled by the cutoff distance used in the construction (29).

The problem is also tractable at the static limit (t = 0) of residue autocorrelations and cross-correlations. Figure 5 illustrates the time dependence of relevant properties for the case of HEWL. The average structures are similar to each other; the largest root-mean-square deviation (RMSD)

[Figure 5 panels: rows show the time interval, covariance matrix, mean-square fluctuations, and slowest mode; columns correspond to 5, 10, 20, 40, and 90 ns; residue-index axes are marked at 20, 60, and 100.]

Figure 5

Simulation time dependence of hen egg white lysozyme properties. A portion of the trajectory used in the analysis is shown (90 ns molecular dynamics simulations, following the 10 ns discarded for equilibration). The main function of hydrolysis of glycosidic linkages occurs via relative motion of the two domains. The top and bottom halves in the structures shown are the α- and β-domains, respectively. Loop residues 46–50 and 66–70 interact throughout the trajectory, displaying peaks in all covariances, whereas loop residues 16–21 and 100–103 interact partially. The long-range effect of residues 16–21 on all others, best viewed by the cross-correlation line clearly visible from the 40 ns covariance matrix, begins to show traces only at w = 20 ns and is retained thereafter. Mean-square fluctuations are also displayed, along with the shape of the slowest nontrivial mode belonging to that portion of the trajectory. The root-mean-square deviation between any pair of the displayed average structures ranges from 0.2 to 0.9 Å.


PRS: perturbation-response scanning

FBP: ferric-binding protein

is 0.9 Å between the total (90-ns) trajectory and any one of the 5-, 10-, or 20-ns portions, and the RMSD between the 5- and 40-ns average structures is only 0.3 Å. The covariance matrices obtained from these trajectory pieces, on the other hand, may be different, although different portions of 5-, 10-, and 20-ns pieces are similar. The similarity of C at short times is also reflected in mean-square fluctuations and the shape of the slowest mode. C constitutes the quasi-harmonic description of the protein dynamics for a given time window. It captures similar features of the conformational space up to 20 ns; new features are added at longer times. This is best manifested in the mode shape at 90 ns, which has little overlap with all others displayed. The similarity/difference between the covariance matrices may be further accentuated by low-pass filtering.

On the other hand, for this example of HEWL, any ENM approximation (15, 73) of a selected average structure leads to the same slowest mode because of the low RMSD between the averages obtained at the various durations. In fact, the slowest ANM mode obtained from the structure averaged over the 90-ns trajectory has overlaps of 0.70, 0.83, and 0.83 with those modes obtained from the covariance matrices averaged over 5-, 10-, and 20-ns trajectories, respectively. The overlap decreases to 0.4 for the 40-ns trajectory and to 0.2 for the total trajectory. Thus, the resolution of ANM coincides with that of the quasi-harmonic description of tens of nanoseconds and not longer timescales at which intradomain motions enter into the window of observations (59, 68). However, insofar as the dynamics related to the slowest motions are concerned, the toolkits developed for the ENMs are equally applicable for MD-calculated covariance matrices covering any range of timescale of interest. It is only the definition of “slowest motions” that differs, depending on the level of hierarchy of the energy well within the landscape, accessible by the selected timescale.
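The overlap between two mode shapes quoted above is not defined in this section. A common choice, assumed here, is the absolute value of the normalized dot product between the two 3N-dimensional vectors, which is insensitive to the arbitrary sign of an eigenvector. The function name and the toy vectors are illustrative.

```python
import numpy as np

def mode_overlap(u, v):
    """Absolute cosine of the angle between two mode vectors (assumed definition)."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    return abs(u @ v) / (np.linalg.norm(u) * np.linalg.norm(v))

# Toy vectors standing in for an ANM mode and an MD-derived mode.
mode_a = np.array([1.0, 0.0, 0.0])
mode_b = np.array([1.0, 1.0, 0.0])
same = mode_overlap(mode_a, -mode_a)      # sign flips do not matter
partial = mode_overlap(mode_a, mode_b)    # 45 degrees apart
none = mode_overlap(mode_a, [0.0, 0.0, 2.0])
```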

RESPONSE SCANNING MONITORS CONFORMATIONAL CHANGES DUE TO ENVIRONMENTAL PERTURBATIONS

To get useful information similar to that obtained in experiments, it is of utmost interest to use a methodology that puts the system slightly out of equilibrium and monitors the evolution of the response (44). Experimentally, the perturbation may arrive in the form of changing environmental factors such as pH, or it may act directly on the chain as in pulling (37, 90) or other single-molecule experiments (56), as well as through mutations or ligand binding. Theoretically, the perturbation might be a force given to the system mimicking the abovementioned forms (4, 43). The response of the system provides information other than the operating modes of motion discovered by, e.g., ANM.

Examples of such approaches are in the literature, all of which operate in the linear response regime. In one all-atom study, the perturbation is applied as a frozen displacement to selected atoms of the protein, followed by energy minimization; the response is measured as the accompanying displacements of all other atoms of the protein (18, 19). This has led to finding the shifts in the energy landscape that accompany binding (18, 81). The method based on molecular mechanics scans all residues to produce comparative results. In various studies, the perturbations on residues are introduced by modifying the effective force constants (65), links between contacting residue pairs (88, 91), or both (57). The more recent perturbation-response scanning (PRS) methodology has successfully demonstrated that the conformations of a variety of proteins may be manipulated by single-residue perturbations (8). By using PRS to study proteins in detail, residues that are structurally amenable to inducing the necessary conformational change upon binding in ferric-binding protein (FBP) (7) and calmodulin (4) have been mapped.

The impetus for developing theoretical models of proteins is to relate experimental observations to the detailed interactions within the structure. One example is provided by pH, which affects protein function by altering accessible conformations (69). Various approaches incorporate the


[Figure 6 plot: degree of ionization (0.0 to −1.0) versus pH; the curves for E45 and D52 are labeled.]

Figure 6

Degree of ionization curves of the 39 negatively charged residues calculated via the PHEMTO (protein pH-dependent electric moment tools) server (45) for the iron-loaded ferric-binding protein structure. Most residues residing in the protein environment display values near the standard pKa value of an isolated Asp or Glu residue (∼4). However, the pKa values of two residues have upshifted toward 6.5, the estimated pH of the environment when the protein resides in vivo (gray dashed line) (32).

effect of pH-induced conformational changes at the proteome level (34, 36). Consider the case of iron transport. In vertebrates, transferrins are nonheme iron-binding bilobed proteins, capable of scavenging free ferric ions and transporting one ion in each lobe through the body. Iron is released in the endocytic vesicle, assisted by low pH (46). Bacteria have developed similar FBPs, with structural similarity to that of a single lobe of transferrin, for sequestering iron from the host organism. In Figure 6, we display the calculated pKa values of the weak acids in this protein. Two of the residues have upshifted values from the standard, and it is of interest to relate this observation to the conformational change.

The ligand-bound state of a protein may be described by a perturbation of the Hamiltonian of the unbound state. Under the linear response theory, the shift in the coordinates is approximated by (43, 88)

$$\Delta R_1 = \langle R \rangle_1 - \langle R \rangle_0 \approx \frac{1}{k_B T} \langle \Delta R \Delta R^T \rangle_0 \, \Delta F = \frac{1}{k_B T} C \, \Delta F, \qquad (4)$$

where the subscripts 1 and 0 denote perturbed and unperturbed configurations of the protein. The $\Delta F$ vector contains the components of the externally inserted force vectors on the selected residues; e.g., for the perturbation of a single residue i, $(\Delta F)^T = \{0\ 0\ 0 \ldots \Delta F_x^i\ \Delta F_y^i\ \Delta F_z^i \ldots 0\ 0\ 0\}_{1 \times 3N}$.

The PRS technique relies on repeating the above linear response theory calculation by scanning the residues of the protein one by one and focusing further on those perturbations that overlap with the conformational change $\Delta R_1 = \langle R \rangle_1 - \langle R \rangle_0$. There is no a priori assumption about how a force might be generated at a particular point. Conversely, after finding the force/residue pair that best leads to the conformational change of interest, this finding is related to the possible causes. For the case of FBP, charged residues E45, D47, and D52, which reside 25 to 30 Å away from the ferric ion, have high overlaps with the closed-to-open-form conformational change, and two of these residues display the upshifted pKa values (Figure 6). Similarly, charged residues with irregular pKa values that remotely control the conformational change of calmodulin have been revealed by PRS (4).

One may further analyze the singular values of the 3 × 3 submatrices of C (7) to better understand the nature of the response by the transformation $C^{ki}_{3 \times 3} = U_{3 \times 3} \Lambda_{3 \times 3} U^T_{3 \times 3}$. If $C^{ki}$ has


Figure 7

(a) FBP in truss representation. Interactions between residue pairs within the cutoff distance (8 Å) are shown by thin gray lines; the backbone is shown by thick black wireframe. Fe is shown as a gold sphere and resides between the moving and fixed domains. (b) The red-boxed region of the protein in panel a is magnified, and representative results of PRS are demonstrated. In PRS, each residue is sequentially perturbed (e.g., following Cα atoms one by one along the dashed blue trace of the backbone) by an external force in a randomly selected direction. The PRS method implies charged residues residing 23 to 29 Å away from Fe to cause the conformational change between holo and apo forms. Overlap greater than 0.8 is obtained by C matrix from both 10-ns MD trajectory and ANM by perturbing residues E45(MD), D47(ANM/MD), and D52(ANM). Perturbations on D52 are shown by 20 different vectors within the red sphere; only the response of direct contacts of Fe is displayed for clarity (red vectors). Moving-domain residues in direct contact with Fe are aligned in the direction of the opening ligand entry-exit pathway. Responses of the moving-domain contacts 8, 9, 57 are aligned (black dashed arrows), and the maximum angle between any pair of those responses is 20°. Responses of the fixed domain contacts are distributed in a plane. Abbreviations: FBP, ferric-binding protein; PRS, perturbation-response scanning; MD, molecular dynamics; ANM, anisotropic network model.

one dominant singular value, i.e., $\lambda_{max} / \sum_{i=1,3} \lambda_i \approx 1$, then irrespective of the force applied on i, the response on k will be projected onto the associated eigenvector, $u_{max}$. Therefore, the collection of responses $\Delta R^{ki}$ to a number of perturbations will align along $u_1$, as shown in Figure 7b. In the case of FBP, the response of moving-domain residues contacting the bound ion moves in a concerted fashion to simultaneously expose the ion for easy dissociation. Many authors have discussed the so-called ferric binding dilemma, i.e., that dissociation of ferric ions actually occurs, although the association constant is extremely high, on the order of $10^{17}$ to $10^{22}$ M$^{-1}$. The changing pH of the environment in FBP-like proteins is also a possible route to dissociation (46). PRS provides evidence of how local forces propagate to the active site and, along with shifted pKa values, suggests a coupling between the electrostatic environment and mechanical response, leading to new experiments.

The application of PRS-like methods is attractive for studying the shape change of interacting proteins in the crowded cell environment (66). However, to achieve these goals, one must be able to further coarse-grain proteins, beyond the scale of single residues (55, 64). Using the network properties we have been investigating, we next seek the smallest set of protein interactions that retain the protein-like qualities of the constructed networks.
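The residue-by-residue scan based on the linear response relation of Equation 4, together with the singular value analysis of a 3 × 3 block, can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: $k_BT$ is set to 1, forces are unit vectors in random directions (20 per residue, echoing the Figure 7 caption), the overlap is taken as the signed cosine between predicted and target displacements, and the covariance matrix is a synthetic one with a single dominant collective direction. All names are hypothetical.

```python
import numpy as np

def prs_scan(cov, target, n_dirs=20, kT=1.0, seed=0):
    """Best overlap per residue between C*dF/kT and a target displacement."""
    rng = np.random.default_rng(seed)
    n = cov.shape[0] // 3
    unit_target = target / np.linalg.norm(target)
    best = np.full(n, -1.0)
    for i in range(n):
        for _ in range(n_dirs):
            force = rng.normal(size=3)
            force /= np.linalg.norm(force)             # random unit force on residue i
            # Only the three columns of C belonging to residue i contribute.
            response = cov[:, 3*i:3*i+3] @ force / kT
            norm = np.linalg.norm(response)
            if norm > 0.0:
                best[i] = max(best[i], response @ unit_target / norm)
    return best

def block_dominance(cov, k, i):
    """Share of the largest singular value of the 3 x 3 block C^{ki}, and its direction."""
    u, s, _ = np.linalg.svd(cov[3*k:3*k+3, 3*i:3*i+3])
    return s[0] / s.sum(), u[:, 0]

# Synthetic covariance for 4 residues, dominated by one collective direction v.
rng = np.random.default_rng(3)
v = rng.normal(size=12)
v /= np.linalg.norm(v)
cov = np.outer(v, v) + 0.01 * np.eye(12)
overlaps = prs_scan(cov, target=v)
ratio, u1 = block_dominance(cov, k=0, i=2)
```

When `ratio` is close to 1, the response of residue k points along `u1` whatever the direction of the force on residue i, which is the alignment behavior described for Figure 7b.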


[Figure 8 plots: (left) optimal path length versus log N / log K for random, weak, and strong paths; (right) link weight distribution versus link weight (units of $k_BT$), with $w_{cut}$ = 0 and $w_{cut}$ = −0.6 marked.]

Figure 8

(Left) Dependence of weak path length (WPL) versus strong path length (SPL) on log N/log K (6). For random networks, the theoretical value of the slope is 1. WPLs calculated from protein structures are only slightly longer than average path length (APL, not shown). SPLs are the longest. (Right) Link weights assigned to residue pairs as obtained from a knowledge-based potential (77) have the distribution shown. Screening protein interactions weaker than a specified threshold up to $w_{cut}$ = 0 does not modify the APL (red circles). Screening at $w_{cut}$ = −0.6 $k_BT$ leads to path lengths approximately equal to those of SPL (blue circles). Backbone connectivity is held during screening, whereas nonbonded interactions are broken if the interaction strength between a pair of contacting residues at a cutoff distance of 6.7 Å is greater than the interaction strength of $w_{cut}$.

Weak path length (WPL): optimized path length that minimizes the sum of the link weights along the path

Strong path length (SPL): optimized path length that minimizes the maximum weight encountered along a given path

ESSENTIAL SUBNETWORKS FROM PROTEIN STRUCTURE RETAIN PROTEIN-LIKE PROPERTIES

Proteins are molecular machines that carry out a specific set of functions (73). The function either utilizes the dynamics as a whole during the interactions between proteins and their environment or shuttles information within the protein structure. The plethora of interactions sustains a redundant network structure to guarantee proper functioning. However, not all interactions are present at all times, owing to possible extreme events in the fluctuating environment; nor are they present with the same strength.

The first step in the analysis is to impose interaction strength on the residue pairs. Methods include distance-based force constants (47), cutoff-free approaches (27), and the number of atom-atom contacts (83). Another approach is to assign weights to interacting residue pairs via knowledge-based potentials (6). Evaluations of weight-optimized paths (weak path length, WPL; strong path length, SPL) were made for various network types (23). APLs display a linear dependence on log N/log K, while the slope deviates substantially from 1, which is the analytical result for Poisson-distributed random networks (Figure 8).
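The two weight-optimized paths can be found with one Dijkstra-style search that differs only in how a link weight is combined with the cost so far: addition for the weak path, the maximum for the strong path. The sketch below assumes non-negative link weights for the weak case (knowledge-based weights may be negative and would first need shifting, an assumption not discussed in the text) and uses a four-node toy graph. The function names are illustrative, and the path length is taken as the number of links along the returned path.

```python
import heapq
from itertools import count

def build_graph(links):
    """Undirected weighted graph as a dict of dicts from (a, b, weight) triples."""
    graph = {}
    for a, b, w in links:
        graph.setdefault(a, {})[b] = w
        graph.setdefault(b, {})[a] = w
    return graph

def optimal_path(graph, source, target, strong=False):
    """Return (cost, path); cost is the sum (weak) or the maximum (strong) of link weights."""
    tie = count()                              # tie-breaker so heap entries always compare
    start = float("-inf") if strong else 0.0
    heap = [(start, next(tie), source, None)]
    parent = {}
    while heap:
        cost, _, node, prev = heapq.heappop(heap)
        if node in parent:
            continue                           # already settled with a better cost
        parent[node] = prev
        if node == target:
            path = [node]
            while parent[path[-1]] is not None:
                path.append(parent[path[-1]])
            return cost, path[::-1]
        for nbr, w in graph[node].items():
            if nbr not in parent:
                new_cost = max(cost, w) if strong else cost + w
                heapq.heappush(heap, (new_cost, next(tie), nbr, node))
    return float("inf"), []

# Toy network: three routes from node 0 to node 3.
graph = build_graph([(0, 1, 1.0), (1, 3, 1.0), (0, 2, 0.5), (2, 3, 1.25), (0, 3, 2.5)])
weak_cost, weak_path = optimal_path(graph, 0, 3)
strong_cost, strong_path = optimal_path(graph, 0, 3, strong=True)
```

In this toy case the weak path runs through node 2 (smallest sum) while the strong path runs through node 1 (smallest bottleneck), showing that the two criteria can select different routes through the same network.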

Interestingly, one may find equivalent homogenous subnetworks from the initial protein structure, given that the weakest interactions are screened (Figure 8), rather than from random screening. Deleting 50% of the nonbonded interactions with the weakest links leads to WPL, whereas retaining only 20% of the strongest pairs (which are mostly hydrophobic-hydrophobic interactions) mimics SPL. During the screening process, the average number of neighbors is reduced;


[Figure 9 plots: (a) clustering coefficient, C, versus weight cutoff, $w_{cut}$; (b) lowest eigenvalue, $\lambda_o$, versus weight cutoff, $w_{cut}$; (right) eigenvalue spectra, p(λ) versus λ, at different cutoffs.]

Figure 9

Local and global protein-like parameters of screened protein subnetworks. (a) Clustering coefficient, C,

measures local order. (b) Lowest nonzero eigenvalue quantifies collectivity of motions. Both display a

transition behavior. Spectra of the eigenvalues for the subnetworks are shown on the right.

not so evident is how this behavior is paralleled by the change in measurable local parameters such as C and global parameters such as the lowest eigenvalue (Figure 9). Local order is retained even for protein subnetworks whose communication paths approximate those of SPL. The similar transition in these local and global network parameters is reflected in their distributions: for the substantially reduced networks at $w_{cut}$ = −0.6 $k_BT$, the eigenvalue spectra also transform from a condensed-matter-like character to a single-polymer-chain-like character (Figure 9b).
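A minimal sketch of the screening procedure and the two monitored parameters is given below. It assumes that the lowest nonzero eigenvalue refers to the Kirchhoff (connectivity) matrix, that weights are energies so that links with weight above `w_cut` are removed, and that backbone links between sequence neighbors are always kept. The random weight matrix and the function names are illustrative only.

```python
import numpy as np

def screen_network(weights, w_cut):
    """Adjacency matrix keeping backbone links and contacts with weight <= w_cut.

    weights: symmetric N x N array; np.nan marks pairs that are not in contact.
    """
    n = weights.shape[0]
    adj = np.zeros((n, n), dtype=int)
    for i in range(n):
        for j in range(i + 1, n):
            in_contact = not np.isnan(weights[i, j])
            if j == i + 1 or (in_contact and weights[i, j] <= w_cut):
                adj[i, j] = adj[j, i] = 1
    return adj

def clustering_coefficient(adj):
    """Average over nodes of (triangles through node) / (pairs of its neighbors)."""
    k = adj.sum(axis=1)
    triangles = np.diag(adj @ adj @ adj) / 2.0
    pairs = k * (k - 1) / 2.0
    local = np.divide(triangles, pairs, out=np.zeros(len(k)), where=pairs > 0)
    return local.mean()

def lowest_nonzero_eigenvalue(adj, tol=1e-9):
    """Smallest nonzero eigenvalue of the Kirchhoff matrix of the network."""
    kirchhoff = np.diag(adj.sum(axis=1)) - adj
    vals = np.linalg.eigvalsh(kirchhoff)
    return vals[vals > tol][0]

# Toy example: 6 residues, every pair in contact, random weights (not a real potential).
rng = np.random.default_rng(4)
n = 6
w = rng.uniform(-1.8, 1.4, size=(n, n))
weights = np.triu(w, 1) + np.triu(w, 1).T
dense = screen_network(weights, w_cut=2.0)    # nothing screened
chain = screen_network(weights, w_cut=-5.0)   # only the backbone survives
```

The two limits bracket the transition described in the text: the fully connected toy network has a clustering coefficient of 1, whereas the bare chain has no triangles and a much smaller lowest nonzero eigenvalue.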

Thus, interactions in proteins may be represented as the superposition of the most cohesive, essential contacts, the lack of which makes the network nonfunctional, and a redundant set that bootstraps the function to the structure. The latter ensures that in the fluctuating environment of the protein, which may be viewed as a network under continuous attack, function is carried out even at the most extreme conditions. This is at the root of the robustness observed in networks following the introduction of an overwhelming number of mutations, as well as its vulnerability toward a select subset of single site mutations that proves detrimental to function (10). It is possibly the means by which evolution proves to be such an efficient instrument (80).

Different theories developed to explain long-range communication within proteins have been

extensively reviewed (84). Evolutionarily couple-conserved residues also residing on the SPL (6)

discovered via statistical coupling analysis (53) have led to the idea of a physically connected
