Journal of the Korean Physical Society, Vol. 46, No. 3, March 2005, pp. 625–630
Identification of the Protein Native Structure by Using a
Sequence-Dependent Feature in Contact Maps
Jaewoon Jung∗ and Hie-Tae Moon
Department of Physics, Korea Advanced Institute of Science and Technology, Yuseong-gu, Daejeon 305-701
Jooyoung Lee†
School of Computational Sciences, Korea Institute for Advanced Study, Dongdaemun-gu, Seoul 130-722
(Received 12 October 2004)
We present a new approach to fold recognition that identifies native and near-native protein
structures among decoy structures by using pair-wise contact potentials between amino acid
residues. For a given protein structure, a new scoring function is defined as the difference between
the contact energy for its native sequence and the average contact energy for random sequences on
the same contact map. We have tested the new scoring function on various decoy sets available
in the literature and have found that it is more useful than the original contact energy,
especially for decoy sets in which the total number of contacts in the native structure
is similar to that in the decoy conformations. From this observation, we conclude that the more
native-like a structure is, the more likely it is to distinguish the native sequence from random
sequences. We demonstrate that, for a given contact potential, a simple, but more efficient, new
scoring function can be constructed.
PACS numbers: 87.14.Ee, 87.15.By, 87.15.Cc
Keywords: Protein folding, Native structure, Decoy structure
I. INTRODUCTION
Determination of the three-dimensional structure of a protein from its amino-acid sequence
is a major unsolved problem in structural biology [2,3]. One way to achieve this goal is to
develop a scoring function that can distinguish the native structure of a protein from a large
number of decoy conformations. For this reason, many scoring functions, including empirical
contact energies, have been investigated [4–15,18–21]. Traditional empirical contact energies
are typically based on a residue-residue contact function obtained by using a quasi-chemical
approximation [4–7]. The basic idea of this method is to compare the pairing frequencies
between two amino acids observed in native structures from the Protein Data Bank with those
expected from random pairing. Other contact energy functions are obtained by optimizing
interaction potentials so that the energy of the native structure becomes lower than the energy
of competing decoy structures [9] and/or by maximizing the energy gap between the native
state and the decoy states, normalized by the energy variance of the decoy states [14]. Most of
these energy functions are knowledge-based potentials, and indications are that they are
correlated with atomic potentials [16].

∗ Current address: Department of Chemistry, Korea Advanced Institute of Science and Technology, Yuseong-gu, Daejeon 305-701
† Corresponding author; E-mail: jwjung@kaist.ac.kr
These previous efforts have focused on developing pair-wise contact potentials that are able
to discriminate the native states of a set of proteins from many structural decoys. In this work,
we propose a different kind of approach in which we adopt the pair-wise contact potential
developed by Miyazawa and Jernigan (MJ) [6]. The MJ contact potential has been quite
successful [6,17]. The new scoring function is defined as the difference between the original
energy function and the average energy obtained from random sequences.
In the present work, we examine the performance of the new scoring function in fold
recognition for various decoy sets. In particular, we investigate whether the new scoring
function is more useful than the original MJ contact energy in identifying native structures
among many decoys. We find that the new scoring function correlates more strongly with the
RMSD (root mean square deviation) measured from the native structure than the original
contact energy does.
II. METHOD
1. Structure Comparison
To compare two structures, the RMSD (root mean square deviation) is used. The RMSD
between two structures a and b is defined as

$$\mathrm{RMSD} = \sqrt{\frac{\sum_{1 \le i \le N} |r_{ai} - r_{bi}|^{2}}{N}}, \tag{1}$$

where structures a and b are superimposed so that the value of the RMSD is minimized,
$r_{ai}$ and $r_{bi}$ are the coordinates of the alpha-carbon atom of residue i in structures
a and b, respectively, and N is the number of amino acids.
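As a minimal sketch of Eq. (1), the function below evaluates the RMSD for two lists of alpha-carbon coordinates. It assumes the two structures have already been optimally superimposed; the superposition step itself is not shown, and the function name `rmsd` and the toy coordinates are illustrative only.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD of Eq. (1) for two equally long lists of (x, y, z) coordinates.

    Assumption: the structures are ALREADY superimposed; no fitting is done here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("structures must be non-empty and of equal length")
    total = 0.0
    for ra, rb in zip(coords_a, coords_b):
        # Squared distance between the two positions of the same residue.
        total += sum((xa - xb) ** 2 for xa, xb in zip(ra, rb))
    return math.sqrt(total / len(coords_a))


# Toy three-residue example (made-up coordinates, not from any decoy set).
a = [(-4.0, 0.0, 0.0), (0.0, 0.0, 0.0), (4.0, 0.0, 0.0)]
b = [(-4.0, 1.0, 0.0), (0.0, -2.0, 0.0), (4.0, 1.0, 0.0)]
print(rmsd(a, b))  # squared deviations 1, 4, 1 give sqrt(6 / 3)
```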
2. Contact Energy of a Protein
Before considering a new scoring function, we describe
the definition of the contact energy of a protein. The
contact energy of a protein is defined as
$$E_k = \sum_{i,j} \Delta_{i,j} B(a_i, a_j). \tag{2}$$

In this equation, $\Delta_{i,j} = 1$ if the amino acids at positions i and j are in contact;
$\Delta_{i,j} = 0$ otherwise. A contact between amino acids i and j is defined to exist if
their side-chain centroids are within 6.5 Å [5–7]. $B(a_i, a_j)$ is the pair-wise contact
energy between amino acids of types $a_i$ and $a_j$.
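The contact definition and Eq. (2) can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: each contacting pair is counted once (i < j), no pairs are excluded by sequence separation, and `TOY_POTENTIAL` is a hypothetical two-letter potential standing in for the 20 × 20 MJ table, whose values are not reproduced here.

```python
import math
from itertools import combinations

CUTOFF = 6.5  # angstroms; centroid-centroid distance defining a contact

# Hypothetical toy potential (NOT the MJ values), keyed by the sorted residue pair.
TOY_POTENTIAL = {("H", "H"): -2.0, ("H", "P"): -1.0, ("P", "P"): 0.0}


def contact_pairs(centroids, cutoff=CUTOFF):
    """Pairs (i, j), i < j, whose side-chain centroids lie within `cutoff`."""
    return [
        (i, j)
        for i, j in combinations(range(len(centroids)), 2)
        if math.dist(centroids[i], centroids[j]) <= cutoff
    ]


def contact_energy(pairs, sequence, potential):
    """Eq. (2), with every contacting pair counted once."""
    return sum(
        potential[tuple(sorted((sequence[i], sequence[j])))] for i, j in pairs
    )


# Four centroids on a square of side 5 A: sides are contacts, diagonals are not.
square = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (5.0, 5.0, 0.0), (0.0, 5.0, 0.0)]
pairs = contact_pairs(square)
print(pairs)
print(contact_energy(pairs, "HHHP", TOY_POTENTIAL))
```

Keeping the contact list separate from the sequence makes it easy to evaluate many sequences on the same contact map, which is what the averaging over random sequences requires.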
3. Calculation of New Scoring Functions
We assume that the probability that a protein adopts structure X and sequence S follows
the Boltzmann distribution

$$P(X, S) \propto \exp(-\beta E(X, S)). \tag{3}$$

Then, the probability that a structure $X_k$ and the native sequence $S_N$ are selected
together becomes

$$P(X_k, S_N) \propto \exp(-\beta E(X_k, S_N)) = \exp(-\beta (E(X_k, S_N) - \langle E \rangle_k)) \exp(-\beta \langle E \rangle_k), \tag{4}$$

where $\langle E \rangle_k$ is the average contact energy when the native sequence is replaced
by random sequences. It should be noted that the energy contribution $\langle E \rangle_k$ is
independent of the sequence $S_N$ and depends only on the structure $X_k$.
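Why the average over random sequences depends only on the structure can be made explicit under one additional assumption that the text does not state: that the residues of a random sequence are drawn independently with fixed frequencies $p_a$. Averaging Eq. (2) over such sequences then gives the following sketch, in which the contact map $\Delta^{(k)}$ of structure k, the contact count $N_c(k)$, and the mean pair energy $\bar{B}$ are symbols introduced here for illustration.

```latex
\begin{align}
\langle E \rangle_k
  &= \sum_{i,j} \Delta^{(k)}_{i,j}\, \langle B(a_i, a_j) \rangle
   = \bar{B} \sum_{i,j} \Delta^{(k)}_{i,j}
   = \bar{B}\, N_c(k), \\
\bar{B} &= \sum_{a,b} p_a\, p_b\, B(a, b).
\end{align}
```

The second equality holds because contacts only involve distinct positions, so every contacting pair contributes the same average. Under this assumption the averaged energy is proportional to the total number of contacts of the structure.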
The probability $P(X_k, S_N)$ is thus considered to have two parts. One is governed by the
sequence- and structure-dependent term $E - \langle E \rangle_k$, and the other by the
sequence-independent, structure-only-dependent term $\langle E \rangle_k$. Since the
sequence-independent term $\langle E \rangle_k$ plays the role of estimating only the total
number of contacts for the given structure $X_k$, we assume that it is not as important as
the sequence- and structure-dependent term $E - \langle E \rangle_k$. Finally, we assume that

$$\exp(-\beta (E(X_N, S_N) - \langle E \rangle_N)) > \exp(-\beta (E(X_k, S_N) - \langle E \rangle_k)), \quad k \neq N. \tag{5}$$

Then, because the exponential is monotonic and β > 0, the following inequality is satisfied:

$$E(X_N, S_N) - \langle E \rangle_N < E(X_k, S_N) - \langle E \rangle_k. \tag{6}$$

From this, the new scoring function is defined as $E - \langle E \rangle$.
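A minimal sketch of the resulting score, the native-sequence energy minus the average over random sequences on the same contact map, is given below. Two points are assumptions rather than statements from the text: random sequences are generated by shuffling the native sequence (which preserves composition), and `TOY_POTENTIAL` is a hypothetical two-letter stand-in for the MJ table. The default of 1000 random sequences follows the caption of Fig. 2.

```python
import random

# Hypothetical toy potential (NOT the MJ values), keyed by the sorted residue pair.
TOY_POTENTIAL = {("H", "H"): -2.0, ("H", "P"): -1.0, ("P", "P"): 0.0}


def contact_energy(pairs, sequence, potential):
    """Eq. (2): sum of pair energies over the contacting pairs (i, j)."""
    return sum(
        potential[tuple(sorted((sequence[i], sequence[j])))] for i, j in pairs
    )


def sequence_dependent_score(pairs, native_seq, potential, n_random=1000, seed=0):
    """E - <E>: native energy minus the mean energy of shuffled sequences."""
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    e_native = contact_energy(pairs, native_seq, potential)
    shuffled = list(native_seq)
    total = 0.0
    for _ in range(n_random):
        rng.shuffle(shuffled)  # assumption: random sequence = shuffled native
        total += contact_energy(pairs, shuffled, potential)
    return e_native - total / n_random


# One contact between the two H residues of "HHPP": the native sequence places
# the most favourable pair on the contact, so the score comes out negative.
score = sequence_dependent_score([(0, 1)], "HHPP", TOY_POTENTIAL)
print(score)
```

For a contact map in which every pair is in contact, the energy does not depend on the order of the residues, so the score vanishes; the score is therefore sensitive only to how well the sequence matches the pattern of contacts.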
III. RESULTS AND DISCUSSION
Figure 1 shows the relationship between the RMSD measured from the native structure and
the original MJ scoring function for the protein 1ctf in the 4-state reduced decoys. Figure 2
corresponds to the results obtained by using the new scoring function. Each data
Fig. 1. Relationship between the original MJ contact energy E and the RMSD. The energy is
the sum of pair-wise MJ contact potentials when the native sequence is mounted on each
structure. Here, the correlation is 0.34.
Fig. 2. Relationship between E − ⟨E⟩ and the RMSD. ⟨E⟩ is the average of the sum of
pair-wise contact potentials calculated from 1000 random sequences. The correlation is 0.58.
Table 1. Z-scores calculated using the original MJ contact energy E and the modified scoring function E − ⟨E⟩.

Decoy set: 4-state reduced; average (a): 3.31/3.47; comparison (b): 3/4
  1ctf 3.60/3.40; 1r69 4.52/4.37; 1sn3 2.40/3.03; 2cro 4.05/4.36; 3icb 2.12/2.48; 4pti 3.60/3.33; 4rxn 2.91/3.33

Decoy set: fisa; average: 3.74/2.34; comparison: 4/0
  1fc2 1.59/−0.01; 1hdd-C 3.07/2.24; 2cro 4.33/2.73; 4icb 5.98/4.40

Decoy set: fisa casp3; average: 3.36/1.76; comparison: 5/0
  1bg8-A 3.27/2.19; 1bl0 1.77/−0.17; 1eh2 2.92/2.32; 1jwe 4.77/2.01; smd3 4.09/2.43

Decoy set: lattice ssfit; average: 2.91/4.99; comparison: 1/3
  1beo 2.67/5.02; 1ctf 3.39/5.49; 1fca 3.08/3.03; 1nkl 2.48/6.42

Decoy set: lmds; average: 2.64/0.77; comparison: 10/0
  1b0n-B 1.91/−0.05; 1bba −0.16/−1.63; 1ctf 3.79/3.31; 1dtk 3.99/0.67; 1fc2 −3.64/−6.43; 1igd 3.45/3.34; 1shf-A 2.47/1.16
  2cro 7.37/4.25; 2ovo 3.06/2.12; 4pti 4.19/0.98

Decoy set: semfold; average: 2.27/2.77; comparison: 2/3
  1ctf 2.89/2.51; 1eh2 3.62/4.22; 1khm 1.60/2.65; 1nkl 1.58/2.90; 1pgb 1.66/1.55

Decoy set: hg structal; average: 1.50/1.85; comparison: 5/23
  1ash 2.96/3.22; 1bab-B 1.45/1.71; 1col-A 4.79/4.84; 1cpc-A 3.55/3.94; 1ecd 1.81/1.78; 1emy 0.99/1.42; 1flp 2.40/2.42
  1gdm 2.65/2.53; 1hbg 2.16/1.87; 1hbh-A 0.89/1.05; 1hbh-B 0.96/1.35; 1hda-A 0.94/1.87; 1hda-B 1.89/1.85; 1hlb 0.83/2.17
  1hlm −2.69/−0.73; 1hsy 0.44/1.54; 1ith-A 1.71/1.92; 1mba 2.36/2.40; 1mbs −0.75/−0.26; 1myg-A 1.34/1.81; 1myj-A 1.38/1.71
  1myt 1.81/2.15; 2dhb-A 1.53/1.81; 2dhb-B 0.36/1.06; 2lhb 1.50/1.54; 2pgh-A 1.31/1.56; 2pgh-B 0.45/1.45; 4sdh-A 2.93/1.93

Decoy set: ig structal; average: −0.50/0.61; comparison: 1/58
  1acy −0.74/0.69; 1baf −0.71/0.10; 1bbd −0.38/0.77; 1bbj −0.37/1.29; 1dbb −1.30/0.30; 1dfb −1.02/0.11; 1dvf 0.20/0.40
  1eap −0.74/0.40; 1fai −0.16/0.44; 1fbi −0.85/0.18; 1fgv −0.82/0.20; 1fig −1.50/0.80; 1flr 0.20/0.49; 1for −2.57/0.02
  1fpt −1.31/0.22; 1frg 1.13/1.23; 1fvc −0.12/0.43; 1fvd 0.04/0.87; 1gaf −1.85/−0.06; 1ggi 0.18/1.05; 1gig 0.96/1.08
  1hil −0.16/0.37; 1hkl −1.63/0.17; 1iai −1.45/0.38; 1ibg 0.22/0.17; 1igc −0.44/0.68; 1igf 0.63/0.77; 1igi −0.36/0.44
  1igm −0.68/0.40; 1ikf −0.58/0.76; 1ind −0.08/1.58; 1jel −1.02/−0.33; 1jhl −0.01/0.79; 1kem −0.65/1.13; 1mam −0.40/0.81
  1mcp −0.26/0.44; 1mlb −0.67/0.74; 1mrd −0.56/0.39; 1nbv 0.29/0.77; 1ncb −0.09/0.75; 1ngq −0.42/0.75; 1nmb −0.65/0.69
  1nsn −2.31/0.38; 1opg −0.32/−0.01; 1plg 0.08/0.89; 1rmf −1.54/−0.27; 1tet −0.72/0.31; 1ucb −0.69/0.55; 1vfa −0.54/0.58
  1vge −0.67/0.40; 1yuh −1.03/0.13; 2cgr 0.09/1.35; 2fb4 0.68/1.82; 2fbj −0.60/0.80; 2gfb 0.11/0.35; 3hfl −0.78/0.30
  3hfm 0.38/1.04; 6fab −0.62/0.75; 7fab −0.06/1.74

Decoy set: ig structal hires; average: −0.21/0.71; comparison: 0/18
  1dvf 0.24/0.37; 1fgv −0.61/0.27; 1flr 0.26/0.52; 1fvc −0.03/0.46; 1gaf −1.45/0.07; 1hil 0.02/0.46; 1ind −0.04/1.23
  1kem −0.59/1.02; 1mlb −0.44/0.77; 1nbv 0.23/0.65; 1opg −0.40/0.01; 1vfa −0.38/0.56; 1vge −0.32/0.55; 2cgr 0.22/1.31
  2fb4 0.48/1.35; 2fbj −0.52/0.78; 6fab −0.46/0.75; 7fab −0.07/1.70

Total comparison: 36/109

(a) For A/B, A and B correspond to the Z-scores of E and E − ⟨E⟩, respectively.
(b) For A/B, A is the number of proteins for which E is superior, and B is the number of proteins for which E − ⟨E⟩ is superior.
point represents a structure in the decoy set, and the point with RMSD = 0 corresponds to
the native X-ray structure. From these figures, we observe that the new scoring function
E − ⟨E⟩ has a higher correlation with the RMSD than the original contact energy E.
To compare the performances of E and E − ⟨E⟩ in more detail, we calculated the Z-scores of
the native structures, the correlations between the RMSD and the scoring function, and the
ranks of the native structures in various decoy sets. The results are summarized in
Table 2. Correlations with the RMSD calculated using the original MJ contact energy E and the modified scoring function E − ⟨E⟩.

Decoy set: 4-state reduced; average (a): 0.28/0.50; comparison (b): 0/7
  1ctf 0.34/0.58; 1r69 0.21/0.50; 1sn3 0.19/0.38; 2cro 0.39/0.57; 3icb 0.43/0.68; 4pti 0.16/0.29; 4rxn 0.27/0.49

Decoy set: fisa; average: 0.19/0.25; comparison: 1/3
  1fc2 0.22/0.35; 1hdd-C 0.17/0.31; 2cro 0.18/0.20; 4icb 0.17/0.12

Decoy set: fisa casp3; average: 0.19/0.09; comparison: 4/1
  1bg8-A 0.26/0.19; 1bl0 0.38/0.40; 1eh2 0.26/0.23; 1jwe −0.12/−0.22; smd3 0.19/−0.14

Decoy set: lattice ssfit; average: 0.02/0.02; comparison: 3/1
  1beo 0.04/0.02; 1ctf 0.06/0.04; 1fca −0.02/−0.03; 1nkl 0.01/0.04

Decoy set: lmds; average: 0.08/0.02; comparison: 8/2
  1b0n-B −0.13/−0.29; 1bba 0.03/0.11; 1ctf 0.18/0.09; 1dtk 0.19/0.08; 1fc2 −0.03/−0.14; 1igd 0.13/0.12; 1shf-A 0.07/0.06
  2cro 0.11/0.02; 2ovo 0.17/0.21; 4pti 0.06/−0.06

Decoy set: semfold; average: 0.06/0.06; comparison: 1/3
  1ctf 0.09/0.10; 1khm 0.08/0.04; 1nkl 0.02/0.04; 1pgb 0.04/0.06

Decoy set: hg structal; average: 0.66/0.72; comparison: 5/23
  1ash 0.50/0.51; 1bab-B 0.82/0.85; 1col-A 0.67/0.53; 1cpc-A 0.69/0.60; 1ecd 0.63/0.67; 1emy 0.62/0.74; 1flp 0.55/0.71
  1gdm 0.78/0.85; 1hbg 0.46/0.60; 1hbh-A 0.81/0.81; 1hbh-B 0.78/0.82; 1hda-A 0.85/0.84; 1hda-B 0.84/0.89; 1hlb 0.52/0.57
  1hlm 0.05/0.12; 1hsy 0.62/0.70; 1ith-A 0.60/0.72; 1mba 0.67/0.78; 1mbs 0.57/0.58; 1myg-A 0.71/0.77; 1myj-A 0.73/0.81
  1myt 0.65/0.71; 2dhb-A 0.89/0.86; 2dhb-B 0.77/0.89; 2lhb 0.47/0.56; 2pgh-A 0.92/0.90; 2pgh-B 0.83/0.86; 4sdh-A 0.60/0.81

Decoy set: ig structal; average: 0.38/0.44; comparison: 7/49
  1acy 0.49/0.57; 1baf 0.55/0.51; 1bbd 0.39/0.49; 1bbj 0.44/0.50; 1dbb 0.47/0.54; 1dfb 0.37/0.40; 1dvf 0.48/0.50
  1eap 0.33/0.38; 1fai 0.44/0.51; 1fbi 0.36/0.44; 1fgv 0.44/0.49; 1fig 0.31/0.42; 1flr 0.43/0.50; 1for 0.32/0.49
  1fpt 0.40/0.49; 1frg 0.54/0.60; 1fvc 0.17/0.09; 1fvd 0.50/0.58; 1gaf 0.38/0.43; 1ggi 0.49/0.52; 1gig 0.36/0.33
  1hil 0.51/0.58; 1hkl 0.37/0.44; 1iai 0.45/0.54; 1ibg 0.22/0.21; 1igc 0.53/0.55; 1igf 0.53/0.58; 1igi 0.20/0.23
  1igm 0.43/0.55; 1ikf 0.33/0.36; 1ind 0.39/0.43; 1jel 0.36/0.45; 1jhl 0.40/0.36; 1kem 0.45/0.52; 1mam 0.17/0.27
  1mcp 0.42/0.57; 1mlb 0.41/0.46; 1mrd 0.19/0.26; 1nbv 0.42/0.49; 1ncb 0.53/0.54; 1ngq 0.34/0.42; 1nmb 0.05/−0.01
  1nsn 0.32/0.52; 1opg 0.45/0.45; 1plg 0.47/0.52; 1rmf 0.44/0.50; 1tet 0.43/0.56; 1ucb 0.58/0.58; 1vfa 0.18/0.25
  1vge 0.01/0.13; 1yuh 0.16/0.13; 2cgr 0.42/0.57; 2fb4 0.44/0.54; 2fbj 0.42/0.42; 2gfb 0.23/0.19; 3hfl 0.02/0.12
  3hfm 0.45/0.50; 6fab 0.45/0.52; 7fab 0.47/0.48

Decoy set: ig structal hires; average: 0.36/0.50; comparison: 1/17
  1dvf 0.47/0.55; 1fgv 0.45/0.60; 1flr 0.60/0.61; 1fvc 0.11/0.05; 1gaf 0.29/0.46; 1hil 0.53/0.65; 1ind 0.23/0.47
  1kem 0.47/0.65; 1mlb 0.38/0.53; 1nbv 0.36/0.49; 1opg 0.35/0.41; 1vfa 0.09/0.25; 1vge −0.10/0.08; 2cgr 0.43/0.72
  2fb4 0.41/0.66; 2fbj 0.55/0.57; 6fab 0.39/0.62; 7fab 0.49/0.66

Total comparison: 30/106

(a) For A/B, A and B correspond to the correlations with the RMSD of E and E − ⟨E⟩, respectively.
(b) For A/B, A is the number of proteins for which E is superior, and B is the number of proteins for which E − ⟨E⟩ is superior.
Tables 1–3.
First, the Z-scores are shown in Table 1. For each scoring function f (E and E − ⟨E⟩), the
Z-score is defined as

$$Z = -\frac{f_N - \langle f \rangle}{\sigma},$$

where $f_N$ is the value of the scoring function for the native structure, $\langle f \rangle$
is the average of the scoring function over the decoy structures, and σ is the standard
deviation of the scoring function over the decoy structures. A large value of the Z-score
indicates that the native structure can be distinguished well from the decoy structures. Thus,
Table 3. Ranks of the native structure calculated using the original MJ contact energy E and the modified scoring function E − ⟨E⟩.

Decoy set: 4-state reduced; comparison (b): 0/3
  1ctf 1/1 (a); 1r69 1/1; 1sn3 3/1; 2cro 1/1; 3icb 10/2; 4pti 1/1; 4rxn 2/1

Decoy set: fisa; comparison: 3/0
  1fc2 30/263; 1hdd-C 2/10; 2cro 1/3; 4icb 1/1

Decoy set: fisa casp3; comparison: 5/0
  1bg8-A 1/17; 1bl0 42/552; 1eh2 2/28; 1jwe 1/33; smd3 1/10

Decoy set: lattice ssfit; comparison: 0/3
  1beo 13/1; 1ctf 2/1; 1fca 1/1; 1nkl 13/1

Decoy set: lmds; comparison: 6/0
  1b0n-B 17/261; 1bba 281/482; 1ctf 1/1; 1dtk 1/57; 1fc2 501/501; 1igd 1/1; 1shf-A 4/52
  2cro 1/1; 2ovo 1/8; 4pti 1/62

Decoy set: semfold; comparison: 2/2
  1ctf 27/54; 1eh2 2/2; 1khm 1245/68; 1nkl 654/24; 1pgb 562/696

Decoy set: hg structal; comparison: 0/14
  1ash 1/1; 1bab-B 3/1; 1col-A 1/1; 1cpc-A 1/1; 1ecd 1/1; 1emy 6/4; 1flp 1/1
  1gdm 1/1; 1hbg 1/1; 1hbh-A 9/9; 1hbh-B 8/2; 1hda-A 6/1; 1hda-B 1/1; 1hlb 7/1
  1hlm 30/22; 1hsy 9/3; 1ith-A 1/1; 1mba 1/1; 1mbs 25/19; 1myg-A 5/2; 1myj-A 5/2
  1myt 2/1; 2dhb-A 3/1; 2dhb-B 15/2; 2lhb 2/2; 2pgh-A 3/3; 2pgh-B 13/1; 4sdh-A 1/1

Decoy set: ig structal; comparison: 0/59
  1acy 51/10; 1baf 52/34; 1bbd 46/7; 1bbj 46/1; 1dbb 58/22; 1dfb 55/36; 1dvf 28/18
  1eap 54/20; 1fai 29/18; 1fbi 54/28; 1fgv 53/32; 1fig 58/4; 1flr 30/17; 1for 59/39
  1fpt 58/29; 1frg 7/2; 1fvc 42/20; 1fvd 35/5; 1gaf 59/40; 1ggi 29/1; 1gig 8/4
  1hil 41/24; 1hkl 58/35; 1iai 58/18; 1ibg 27/2; 1igc 50/12; 1igf 14/10; 1igi 41/17
  1igm 51/22; 1ikf 46/8; 1ind 33/1; 1jel 56/51; 1jhl 36/4; 1kem 51/1; 1mam 45/4
  1mcp 39/19; 1mlb 53/4; 1mrd 54/21; 1nbv 27/6; 1ncb 43/3; 1ngq 47/6; 1nmb 50/9
  1nsn 59/20; 1opg 45/40; 1plg 32/4; 1rmf 57/50; 1tet 54/24; 1ucb 54/13; 1vfa 48/12
  1vge 52/20; 1yuh 56/28; 2cgr 33/1; 2fb4 13/2; 2fbj 51/8; 2gfb 31/20; 3hfl 56/25
  3hfm 23/2; 6fab 50/6; 7fab 36/1

Decoy set: ig structal hires; comparison: 0/18
  1dvf 10/7; 1fgv 17/12; 1flr 9/4; 1fvc 15/6; 1gaf 19/13; 1hil 12/7; 1ind 11/1
  1kem 17/1; 1mlb 16/3; 1nbv 12/4; 1opg 17/14; 1vfa 16/4; 1vge 16/5; 2cgr 9/1
  2fb4 6/2; 2fbj 15/3; 6fab 18/1; 7fab 12/1

Total comparison: 16/99

(a) For A/B, A and B correspond to the ranks of the native structure calculated by E and E − ⟨E⟩, respectively.
(b) For A/B, A is the number of proteins for which E is superior, and B is the number of proteins for which E − ⟨E⟩ is superior.
we want a scoring function that has a large Z-score. In Table 1, E performs better than
E − ⟨E⟩ for proteins in the fisa, the fisa casp3, and the lmds decoy sets, and E − ⟨E⟩
performs better than E for most proteins in the lattice ssfit, the hg structal, the
ig structal, and the ig structal hires decoy sets.
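The Z-score of a native structure can be sketched as below. The sketch takes σ to be the standard deviation of the decoy scores, which is the usual Z-score convention, and uses the population form (`statistics.pstdev`); the choice between population and sample form is an assumption, and the scores in the example are made up.

```python
import statistics


def z_score(native_score, decoy_scores):
    """Z = -(f_N - <f>) / sigma, with <f> and sigma taken over the decoys."""
    mean = statistics.fmean(decoy_scores)
    sigma = statistics.pstdev(decoy_scores)  # assumption: population std. dev.
    if sigma == 0:
        raise ValueError("decoy scores have zero spread")
    return -(native_score - mean) / sigma


# Made-up scores: the native lies well below the decoy average,
# so the Z-score is large and positive.
decoys = [-10.0, -12.0, -8.0, -10.0]
print(z_score(-16.0, decoys))
```

The leading minus sign makes a native score below the decoy average (lower energy) give a positive Z-score, so larger is better.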
In Table 2, the correlations between each scoring function and the RMSD (from the native
structure) are shown. Even for the fisa, the fisa casp3, and the lmds proteins, E does not
seem to have a higher correlation than E − ⟨E⟩, whereas E − ⟨E⟩ is superior to E for the
4-state reduced, hg structal, ig structal, and ig structal hires sets. This means that
E − ⟨E⟩ is better than E in selecting native-like structures. It should be noted that the
contact map of the native structure does not determine the native structure in a unique
fashion; i.e., the reconstruction of the native structure from its contact map is not
straightforward. However, the contact maps here are constructed from predetermined decoy
structures, and the task of identifying the correct contact map of the native structure from
among these decoy structures is important. Therefore, a good scoring function should show a
good correlation with the RMSD measured from the native structure. When the correlation is
high, a native-like structure is more likely to be identified as a native fold. Table 3
shows the ranks of the native structures. As in Table 1, E is superior to E − ⟨E⟩ for the
fisa, the fisa casp3, and the lmds proteins, but for the other sets, E − ⟨E⟩ is superior.
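The other two measures, the correlation with the RMSD and the rank of the native structure, can be sketched in the same spirit. The sketch assumes the Pearson correlation coefficient (the text does not name the coefficient used) and ranks the native as one plus the number of decoys with a strictly lower score; the tie handling and all numbers in the example are assumptions for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def native_rank(native_score, decoy_scores):
    """Rank 1 means no decoy scores lower (better) than the native."""
    return 1 + sum(1 for s in decoy_scores if s < native_score)


# Made-up data: scores rise linearly with RMSD, so the correlation is maximal.
rmsds = [1.0, 2.0, 3.0, 4.0]
scores = [-8.0, -6.0, -4.0, -2.0]
print(pearson(rmsds, scores))
print(native_rank(-6.0, [-5.0, -7.0, -3.0]))  # one decoy scores lower: rank 2
```

With lower scores meaning better structures, a positive correlation with the RMSD is the desired behavior: structures closer to the native receive lower scores.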
If Tables 1–3 are considered together, E seems to be better only for the fisa, the
fisa casp3, and the lmds proteins, and E − ⟨E⟩ is better for the other sets. The reason
that E is better for fisa, fisa casp3, and lmds is as follows: For most proteins where E
performs better than E − ⟨E⟩, the native structures contain more contacts than the decoy
structures do. That is, for these proteins, the native structures can be identified by
considering only the total number of contacts, and the characteristics of E are not a
discriminating factor. The rest of the cases where E − ⟨E⟩ did not perform better than E
are for very small chains (fewer than 45 amino acids). In summary, for decoy sets where
the total number of contacts in the native structure is more or less similar to the total
numbers of contacts in the decoy structures, E − ⟨E⟩ performs consistently better than E
based on the Z-score, the correlation with the RMSD, and the rank of the native structure.
For the set of 4-state reduced decoys, we investigated the difference between E and
E − ⟨E⟩. These decoys are generated by keeping most of the native conformation fixed in its
native form [1]; therefore, their conformations have evenly distributed RMSD values. The
set of 4-state reduced decoys has many near-native conformations, and the RMSDs are well
distributed at low and high values. If these 4-state reduced decoys are considered, the
difference in performance between E and E − ⟨E⟩ in Table 1 and Table 3 is not significant,
but Table 2 shows that E − ⟨E⟩ has a higher correlation with the RMSD than E does for all
proteins. This indicates that E − ⟨E⟩ could be more useful in finding native-like structures.
IV. CONCLUSION
For a given contact energy, we introduce a new scoring function, the difference between
the original contact energy and the average contact energy calculated from random
sequences. The new scoring function is shown to perform better than the original contact
energy for decoy sets in which the decoy structures have total numbers of contacts similar
to that of the native structure. Out of 145 proteins from 9 decoy sets, the new scoring
function is shown to be more useful for about 75% of the proteins. From these results, we
suggest a better approach to distinguishing the native structure in decoy sets.
ACKNOWLEDGMENTS
This work was supported by the Ministry of Science
and Technology (Jung & Moon) and by grant No. R01-
2003-000-11595-0 (Lee) from the Basic Research Pro-
gram of the Korean Science & Engineering Foundation.
REFERENCES
[1] B. Park and M. Levitt, J. Mol. Biol. 258, 367 (1996).
[2] C. Anfinsen, Science 181, 223 (1973).
[3] C. Branden and J. Tooze, Introduction to protein structure (New York, Freedman, 1991).
[4] S. Tanaka and H. Scheraga, Macromolecules 9, 945
(1976).
[5] S. Miyazawa and R. L. Jernigan, Macromolecules 18, 534
(1985).
[6] S. Miyazawa and R. L. Jernigan, J. Mol. Biol. 256, 623 (1996).
[7] S. Miyazawa and R. L. Jernigan, Proteins: Struct. Funct.
Genet. 34, 49 (1999).
[8] D. Hinds and M. Levitt, Proc. Natl. Acad. Sci. USA 89,
2536 (1992).
[9] D. Tobi, G. Shafran, N. Linial and R. Elber, Proteins: Struct. Funct. Genet. 40, 71 (2000).
[10] J. Skolnick, A. Kolinski and A. Ortiz, Proteins: Struct. Funct. Genet. 38, 3 (2000).
[11] I. Bahar and R. L. Jernigan, J. Mol. Biol. 266, 195 (1997).
[12] E. Huang, S. Subbiah and M. Levitt, J. Mol. Biol. 252, 709 (1995).
[13] E. Huang, S. Subbiah, J. Tsai and M. Levitt, J. Mol. Biol. 257, 716 (1996).
[14] L. Mirny and E. Shakhnovich, J. Mol. Biol. 264, 1164 (1996).
[15] B. Park and M. Levitt, J. Mol. Biol. 266, 831 (1997).
[16] D. Mohanty, B. N. Dominy, A. Kolinski, C. L. Brooks and J. Skolnick, Proteins: Struct. Funct. Genet. 35, 447 (1999).
[17] E. I. Shakhnovich, Phys. Rev. Lett. 72, 3907 (1994).
[18] J. Lee, S. Y. Kim and J. Lee, J. Korean Phys. Soc. 44, 594 (2004).
[19] J. Sim, S. Y. Kim, A. Yoo and J. Lee, J. Korean Phys. Soc. 44, 611 (2004).
[20] M. Heo, S. Kim, E. J. Moon, M. Cheon, K. Chung and I. Chang, J. Korean Phys. Soc. 44, 1571 (2004).
[21] M. Cheon, M. Heo, E. J. Moon, S. Kim, K. Chung, I. Chang and H. Kim, J. Korean Phys. Soc. 44, 550 (2004).