Structural analysis of natural killer cell receptor protein 1 (NKR-
P1) extracellular domains suggests a conserved long loop region
involved in ligand specificity
Received: 18.07.2010 / Accepted: 24.08.2010
Žofie Sovová1,2, Vladimír Kopecký Jr.3, Tomáš Pazderka3, Kateřina Hofbauerová3,4, Daniel
Rozbeský4,5, Ondřej Vaněk4,5, Karel Bezouška4,5, Rüdiger Ettrich1,2
1Laboratory of Structural Biology, Institute of Systems Biology and Ecology, Academy of
Sciences of the Czech Republic, Zámek 136, 37333 Nové Hrady, Czech Republic
2Faculty of Sciences, University of South Bohemia, Zámek 136, 37333 Nové Hrady, Czech
Republic
3Institute of Physics, Faculty of Mathematics and Physics, Charles University in Prague,
Ke Karlovu 5, 12116 Prague 2, Czech Republic
4Institute of Microbiology, Academy of Sciences of the Czech Republic, Vídeňská 1083,
14220 Prague 4, Czech Republic
5Department of Biochemistry, Faculty of Science, Charles University in Prague, Albertov
2030, 12840 Prague 2, Czech Republic
Tel: +420-386 361 297, Fax: +420-386 361 279; E-mail: ettrich@nh.usbe.cas.cz
Abstract
Receptor proteins at the cell surface regulate the ability of natural killer cells to recognize and
kill a variety of aberrant target cells. The structural features determining the function of
natural killer receptor proteins 1 (NKR-P1s) are largely unknown. In the present work, refined
homology models are generated for the C-type lectin-like extracellular domains of rat NKR-
P1A and NKR-P1B, mouse NKR-P1A, NKR-P1C, NKR-P1F, and NKR-P1G, and human
NKR-P1 receptors. Experimental data on secondary structure, tertiary interactions, and
thermal transitions are acquired for four of the proteins using Raman and infrared
spectroscopy. The experimental and modeling results are in agreement with respect to the
overall structures of the NKR-P1 receptor domains, while suggesting functionally significant
local differences among species and isoforms. Two sequence regions that are conserved in all
analyzed NKR-P1 receptors do not correspond to conserved structural elements as might be
expected, but are represented by loop regions, one of which is arranged differently in the
constructed models. This region displays high flexibility but is anchored by conserved
sequences, suggesting that its position relative to the rest of the domain might be variable.
This loop may contribute to ligand-binding specificity via a coupled conformational
transition.
Keywords
Molecular dynamics · Two-dimensional correlation analysis · Differential
scanning calorimetry · Thermal dynamics · Raman spectroscopy · FTIR · Topology · RMSF ·
Cladogram
Introduction
Natural killer (NK) cells are large granular lymphocytes able to recognize and kill a large
variety of target cells and to regulate the reactions at the interface of innate and adaptive
immunity through secretion of lymphokines and by direct killing [1, 2]. The cytotoxic activity
of NK cells is tightly regulated via activating and inhibiting cell-surface receptors, one group
of them being the NK cell lectin-like receptor proteins, one subtype of which are the NKR-
P1s. The NKR-P1s are transmembrane glycoproteins classified as type II due to their external
C-terminus, with an extracellular C-type lectin-like domain (CTLD) and a short cytoplasmic
domain. To initiate the NK cell response, activating NK cell receptors recognize a diverse
range of ligands including cytokines, antibody Fc domains and other proteins and saccharides
presented by target cells. Inhibitory NK cell receptors recognize MHC class I molecules and
other proteins serving as a marker of cell health. Tumor, virally infected, stressed or otherwise
damaged cells that can resist T cell mediated immunity because of the low levels of these
markers expressed at their surface are no longer protected by their inhibitory signals and this
may lead to NK cell activation and elimination of the target cells [3]. The functions of
NKR-P1s were enigmatic until their ligands were found to be the closely related family of Clr
lectin-like receptors [2]. Although the NKR-P1 family includes both activating and inhibitory
receptors, the CTLDs of NKR-P1 family members share considerable homology; thus the
structural origins of ligand-binding specificity are of interest. In the present work, multiple
sequence alignment, protein homology modeling, and molecular dynamics simulations are
combined with protein expression, vibrational spectroscopy, and thermal analysis to examine
evolutionary and structural divergences within the family of NKR-P1 receptor CTLDs.
Methods
Protein preparation
DNA coding for the extracellular part of the mNKR-P1A/C proteins (to specify
the origin of the receptor, the first letter of each organism name is prefixed to the receptor
name, i.e., m for mouse, r for rat, etc.) was amplified from total mRNA isolated from the
spleen of a C57BL/6 mouse and subcloned into the expression vector pET-30a (Novagen).
Expression plasmids were transformed into Escherichia coli BL-21 (DE3) Gold (Stratagene).
Bacteria were grown in LB medium; induction was performed with 0.1 mM
isopropyl-β-D-thiogalactopyranoside, and the induced culture was grown for 2 h. Proteins were refolded from
inclusion bodies and purified by HPLC ion exchange and gel filtration chromatography.
The protein samples for experiments were concentrated to 10.4 mg/mL in a 15 mM Tris–HCl
buffer with 150 mM NaCl, pH 8.0.
The rNKR-P1A/B proteins were prepared as described previously [4]. Proteins were dissolved
in a 10 mM Tris-HCl buffer with 50 mM NaCl, pH 7.4, in concentrations of 9.1 mg/mL and
3.8 mg/mL, respectively. The concentrations were determined by Bradford assay [5].
Raman spectroscopy
Raman spectra of aqueous solutions of mNKR-P1A/C proteins were recorded in a standard
90° geometry on a multichannel instrument based on Spex 270M single spectrograph with
1800 grooves/mm grating (Jobin-Yvon), a holographic notch-plus filter (Kaiser Optical
Systems) and a liquid nitrogen cooled CCD detection system (Princeton Instruments)
measuring 1340 pixels along the dispersion axis. The spectral resolution was approximately
5 cm⁻¹. Samples in a capillary micro-cell (5 µL inner volume) were excited with the 532.2 nm
line (300 mW of radiant power per sample) of an Nd:YAG Verdi 2 laser (Coherent) and kept at 4 °C
during all experiments using an external water bath (Neslab). The acquisition time for the
spectra was 60 minutes. For the temperature-dependence series, each spectrum consisted of 5 min
exposures. Temperatures in the 5–90 °C range were adjusted in 5 °C increments using the
external water bath (Neslab) and equilibrated for 4 minutes before measurement. The
wavenumber scale was calibrated with neon glow-lamp lines; Raman frequencies
of well-resolved bands are therefore accurate to ±0.5 cm⁻¹.
The solutions of rNKR-P1A/B proteins were excited with a 514.5 nm Ar-ion laser Innova 300
(Coherent) using the same Raman spectrometer with the same experimental setup. Both
spectra were accumulated for 600 min to produce traces of the highest quality.
Drop coating deposition Raman spectroscopy
The samples of mNKR-P1A/C (4 µL) were dialyzed (on Millipore filters 0.025µm/white
VSWP/13 mm) against deionized distilled water for 35 minutes. A 2 µL volume of protein
solution, with an approximate concentration of 1 mg/mL, was deposited on a standard DCDR
substrate SpectRIM™ (Tienta Sciences) consisting of a polished stainless steel plate coated
with a thin layer of Teflon [6]. After air-drying at room temperature for approximately 20
minutes, Raman spectra were collected from the "coffee rings" of the former droplets [7] using a
Raman microspectrometer HR800 (Horiba Jobin Yvon) with a 514.5 nm Ar-ion excitation
laser (Melles Griot). A 50× microscope objective (N.A. 0.75, Olympus) was used to focus the
5 mW excitation laser to a spot approximately 1.5 µm in diameter on the sample, and the spectra
were integrated for 20 min using a 600 grooves/mm grating and a liquid nitrogen cooled CCD
detector (1024 × 256 pixels, Symphony). The spectrometer was calibrated using the Si
vibration band at 520.7 cm⁻¹. The spectral resolution was approximately 5 cm⁻¹.
The samples of rNKR-P1A/B proteins were treated in the same way as the mouse proteins,
except that the spectra of the rNKR-P1A/B proteins were integrated for 4 min using 632.8 nm
He-Ne laser excitation in the same setup.
Infrared spectroscopy
Infrared spectra of mNKR-P1A/C proteins were recorded with a Bruker Vector 33 FTIR
spectrometer using a standard MIR source, a KBr beamsplitter and a DTGS detector. A total of 5000
scans were collected with a Blackman-Harris 3-term apodization function at a spectral
resolution of 2 cm⁻¹. Aqueous protein solutions were measured at room temperature in a CaF₂
cell with a 10 µm path length. Thermal dynamics measurements were performed using a
thermal cell holder BioJACK™ (BioTools). Temperature was adjusted from 5 to 90 °C in
5 °C increments using an external water bath (Neslab) and equilibrated for 4 minutes
before each measurement; 1000 scans were collected. Attenuated total reflection
(ATR) FTIR measurements (5000 scans) were made using an ATR-MIRacl™AG –
single diamond horizontal ATR (Pike Technologies).
The rNKR-P1A/B proteins were measured using a Bruker IFS 66/S FTIR spectrometer
equipped with an MCT detector. A total of 4000 scans were collected with a Happ-Genzel apodization
function at a spectral resolution of 4 cm⁻¹. The rest of the FTIR setup was the same as
for the mNKR-P1 proteins. The spectral contribution of the buffer was corrected
following the standard algorithm [8]. The water vapor spectrum was subtracted and, finally, all
spectra were normalized.
Differential scanning calorimetry
The protein stock solutions were diluted to the desired concentrations, i.e., mNKR-P1A to
0.20 mg/mL and mNKR-P1C to 0.52 mg/mL. Calorimetric measurements were performed
using the Model 6100 Nano II Differential Scanning Calorimeter – N-DSC II
(Microcalorimetry Sciences Corporation). The samples were scanned from 5 to 90 °C at a
heating rate of 1 °C/min under a constant excess pressure of 3 atmospheres. The
DSCRun and CpCalc 2.2 software was used for data acquisition and analysis. After
subtraction of the buffer–buffer baseline, the molar excess heat capacity function was obtained
by dividing by the protein concentration and cell volume (0.299 mL).
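The normalization described above can be sketched in a few lines of Python. This is only an illustration, not the CpCalc implementation: the function name, the assumption that the raw signal has already been converted to heat capacity (for example in J/K), and the molar mass passed in the usage example are all hypothetical.

```python
def molar_excess_heat_capacity(sample_cp, baseline_cp, conc_mg_per_ml,
                               molar_mass_g_per_mol, cell_volume_ml=0.299):
    """Baseline-subtract a DSC trace and normalize it per mole of protein.

    sample_cp and baseline_cp are equal-length sequences of heat capacity
    values (assumed units: J/K) for the protein scan and the buffer-buffer scan.
    """
    # Amount of protein in the cell: (mg/mL * mL) -> mg -> g -> mol
    moles_in_cell = conc_mg_per_ml * cell_volume_ml * 1e-3 / molar_mass_g_per_mol
    # Subtract the baseline point by point, then divide by the amount of protein
    return [(s - b) / moles_in_cell for s, b in zip(sample_cp, baseline_cp)]
```

For instance, `molar_excess_heat_capacity([2e-6], [1e-6], 0.5, 14950.0)` treats 0.5 mg/mL of a hypothetical 14.95 kDa protein in the 0.299 mL cell and returns the excess heat capacity per mole.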
Sequence and phylogeny analysis
Sequences specified in Table 1 were aligned in ClustalX [21] and used for phylogeny
analysis. For sequence analysis purposes, we used only their C-type lectin-like domains
(CTLDs). Phylogenetic analysis was originally performed for protein sequences only;
however, the resulting trees contained too many polytomies. The number of polytomies was
decreased by using nucleotide sequences for tree construction. Except for the Bayesian analyses,
all protein and DNA sequence analyses were performed using Phylip 3.69 software [22] with
the following algorithms: Neighbor-Joining [23], maximum parsimony [24], maximum
likelihood assuming a molecular clock [25], and the Fitch-Margoliash method assuming a
molecular clock (Kitsch method) [26]. Sequences were bootstrapped 1000×, with the
exception of the computationally most expensive method, maximum likelihood, where
sequences were bootstrapped 100×. To confirm that these numbers of bootstrap replicates were high
enough, we also constructed the trees with the same settings but bootstrapped 1100× or 110×. To
calculate the distance matrix, and for the maximum likelihood method, we used the Jones-
Taylor-Thornton matrix for protein sequences and the F84 matrix for DNA sequences. Where
possible, the sequences were jumbled 5×. In the Neighbor-Joining method the order of
sequences was randomized. Probabilities of branch occurrence were calculated according to
one of the most commonly used tests of the reliability of an inferred tree, Felsenstein's
bootstrap test [27], which was evaluated using Efron's [28] bootstrap resampling technique.
The resulting consensus cladograms were built using the 50% majority consensus rule. Bayesian
analysis was performed using MrBayes 3.1.2 [29] with the same initial sequences used
previously. A GTR model with an invariable-gamma rate distribution was used to describe the
parameters of the likelihood model, with 150 000 Markov chain Monte Carlo cycles for protein
sequences and 100 000 for DNA sequences, and with the state being swapped every 100 cycles.
[Note: Refs. 9–20 are cited in Table 1.]
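The two bookkeeping steps of the bootstrap test described in this section, column resampling and the 50% majority consensus rule, can be sketched as follows. This is a minimal illustration under stated assumptions, not the Phylip code: the function names are hypothetical, tree building itself is left to an external program, and each replicate tree is assumed to be summarized as a list of clades (frozensets of taxon names).

```python
import random
from collections import Counter


def bootstrap_alignment(alignment, rng):
    """Return one bootstrap replicate: alignment columns drawn with replacement.

    alignment maps sequence names to equal-length aligned strings.
    """
    n_cols = len(next(iter(alignment.values())))
    # The same random columns are used for every sequence, so columns stay intact
    cols = [rng.randrange(n_cols) for _ in range(n_cols)]
    return {name: "".join(seq[c] for c in cols) for name, seq in alignment.items()}


def majority_rule(replicate_clades, threshold=0.5):
    """Return {clade: support} for clades found in more than `threshold` of trees."""
    n_trees = len(replicate_clades)
    counts = Counter(clade for clades in replicate_clades for clade in set(clades))
    return {clade: count / n_trees
            for clade, count in counts.items()
            if count / n_trees > threshold}
```

Each replicate produced by `bootstrap_alignment` would be passed to the tree-building method of choice, and the clades of the resulting trees collected for `majority_rule`; the support values correspond to the branch occurrence probabilities mentioned above.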
Homology modeling
Primary structures of mouse NKR-P1A, mNKR-P1C, rNKR-P1A, and rNKR-P1B
extracellular domains were extracted from the database (Table 1). Templates were identified
using BLAST [30] with matrix BLOSUM62. The identified homologs were almost the same
for all seven models that were built. The template with highest identity to the target sequence
was used. If some structures had the same identity, the one with the better resolution was
used. Templates and identities used were: 1YPQ (C-type lectin-like domain of human
oxidized low density lipoprotein receptor 1 (LOX-1) [31]) for mNKR-P1A/C (identity
31 %/32 %), 1XPH (CD209 antigen-like protein 1 [32]) for rNKR-P1B (identity 33 %),
3HUP (early activation antigen CD69 [33]) for mNKR-P1F (identity 33 %), 1E87 (early
activation antigen CD69 [34]) for mNKR-P1G (identity 35 %) and 2BPD (beta-glucan
receptor Dectin-1 [35]) for hNKR-P1 (identity 32 %). Only in the case of rNKR-P1A was an
alternative technique tried: the structure of the neighboring sequence in the Neighbor-Joining
phylogenetic tree (1000× bootstrapped) was used, i.e., structure 3CAD (lectin-related NK cell
receptor LY49G1 [36]). Template and target sequences were aligned using ClustalX.
Alignments were checked to ensure that the general features of sequence-conserved regions of
CTLDs were maintained as found in multiple sequence alignments of CTLDs. The resulting
aligned sequences used for modeling are shown in Fig. 8. Ten models were calculated for
every protein and template using Modeller 9v4 [37]. The best model from every group was
chosen by the distribution of amino acids in the Ramachandran plot and the stereochemical
G-factor (both calculated by Procheck [38]), the Modeller distribution function, and visual
inspection using SwissPDBViewer [39]. Secondary structure was determined by Procheck
according to Kabsch and Sander [40].
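The template selection rule stated in this section (highest sequence identity, with ties broken by better resolution) can be written as a single sort key. The sketch below is illustrative only; the `Template` class, the placeholder identifiers, and all numbers in the usage example are hypothetical and are not data from this study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Template:
    pdb_id: str                 # placeholder identifier, not a real PDB entry
    identity_percent: float     # sequence identity to the modeling target
    resolution_angstrom: float  # lower value means better resolution


def pick_template(candidates):
    """Pick the candidate with the highest identity; ties go to better resolution."""
    # Negating the resolution makes max() prefer the smaller (better) value
    return max(candidates,
               key=lambda t: (t.identity_percent, -t.resolution_angstrom))
```

With the made-up candidates `Template("TMPL_A", 31.0, 2.4)`, `Template("TMPL_B", 33.0, 2.8)` and `Template("TMPL_C", 33.0, 1.9)`, the function selects `TMPL_C`, because it ties for the highest identity and has the better resolution.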
Molecular dynamics
All selected models were minimized in SPC water with the Gromacs 3.3.3 package [41],
using virtual site hydrogens. The modified version of the Gromos87 force field (usually called
Gromacs) [42, 43] was used. Temperature was held at 300 K by separately coupling the
protein and solvent to an external temperature bath (τ = 0.1 ps), while the pressure was held
at 1 bar by coupling to a pressure bath (τ = 0.1 ps). The SETTLE (for water) and
LINCS (for protein) algorithms were used to constrain covalent bond lengths, and long-range electrostatic
interactions were calculated using the Particle-Mesh Ewald method. Optimization with
steepest descent energy minimization was followed by solvent optimization using a time step
of 1 fs for 10 ps. Counter ions were added to neutralize the simulation box and were
subsequently minimized for 10 ps using a time step of 1 fs. Finally, the protein was
minimized for 20 ps using a time step of 2 fs. Production simulation runs were then started
using a time step of 5 fs. Root mean square deviation and radius of gyration analyses were
performed every 10 ns to check whether the system had reached equilibrium. The total production run
time was 110 ns for mNKR-P1A, 10 ns for rNKR-P1A, 20 ns for rNKR-P1B and 100 ns for
mNKR-P1C. The human NKR-P1 and mouse NKR-P1F/G models were equilibrated for 10 ns.
After the dynamics run, all structures were minimized using the steepest descent algorithm.
The representation of secondary structure types according to Kabsch and Sander [40] was
calculated by Procheck software, and the root mean square fluctuation for each residue was
calculated in Gromacs.
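For readers unfamiliar with the two quantities used to monitor the simulations, the following sketch shows how a root mean square deviation between two frames and a per-atom root mean square fluctuation over a trajectory are defined. It is not the Gromacs implementation; it assumes that all frames have already been superimposed on a common reference and that each frame is a list of (x, y, z) tuples, one per Cα atom.

```python
import math


def rmsd(frame, reference):
    """Root mean square deviation between two pre-aligned coordinate sets."""
    squared = [sum((a - b) ** 2 for a, b in zip(p, q))
               for p, q in zip(frame, reference)]
    return math.sqrt(sum(squared) / len(squared))


def rmsf(frames):
    """Per-atom root mean square fluctuation about the time-averaged position."""
    n_frames = len(frames)
    n_atoms = len(frames[0])
    fluctuations = []
    for i in range(n_atoms):
        # Time-averaged position of atom i
        mean = [sum(f[i][k] for f in frames) / n_frames for k in range(3)]
        # Mean squared displacement of atom i from its average position
        msf = sum(sum((f[i][k] - mean[k]) ** 2 for k in range(3))
                  for f in frames) / n_frames
        fluctuations.append(math.sqrt(msf))
    return fluctuations
```

A residue whose value in the `rmsf` output is well above the average for the protein is one that moves substantially during the analyzed part of the trajectory.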
Results and discussion
Structural analysis of NKR-P1 proteins
Four NKR-P1 extracellular domains, rNKR-P1A and B and mNKR-P1A and C, were
overexpressed in E. coli, refolded from inclusion bodies, and purified for spectroscopic
analysis as described in Methods. Receptor domains were structurally analyzed by Raman,
FTIR, and ATR-FTIR spectroscopy, including a novel drop-coating deposition Raman
method that yields native-state spectra from very small solid samples [7]. Spectral
assignments are given in Table 2, and secondary structure contents estimated using the pattern
recognition least-squares method (LSA) are given in Table 3.
Fig. 1 compares the FTIR spectra of mNKR-P1A and C domains. The similarity of the spectra
is immediately obvious, indicating that mouse A and C proteins share the same fold with very
similar secondary structure content. The slight differences in the spectra can be explained by
their different amino acid compositions. For example, the second derivative band at 1517 cm⁻¹
is connected with Tyr ring vibrations [54], and is more intense in mNKR-P1A, which has five
Tyr residues, than in mNKR-P1C, which has only one Tyr. ATR-FTIR spectra of the proteins were also
measured for better resolution of subtle differences, and no further differences were detected
(data not shown). The FTIR results are in excellent agreement with Raman spectroscopy data
(Fig. 2 and Table 3). Solution Raman spectra (not shown) are also highly similar to drop-
coated deposition spectra, which provide better signal-to-noise ratio.
The secondary structure estimates using the LSA method, which analyzes the amide I band in
the case of Raman spectroscopy [49], and amide I and II bands in FTIR spectra [50], are not
perfect for mNKR-P1A and C domains, suggesting a protein fold whose spectral pattern is not
included in the LSA reference set [51]. Thus, the absolute percentages in the secondary
structure content probably contain larger errors than estimated in Table 3. The fits for Raman
spectral data are better. Nevertheless, FTIR and Raman spectroscopy predictions differ
by no more than 4 % in each method for mouse A and C variants, and for each variant some
predictions are in agreement within the margin of error. These results thus suggest only a
small difference in the content of α-helices and β-strands of about 10 % for the two mouse
domains.
Negative bands are observed in the Raman difference spectrum between mouse variants A
and C in the region reporting on S-S-bridge conformations (Fig. 2). The presence of these
intense and narrow vibration bands reflects a better-defined conformation of S–S bridges
in mNKR-P1C and thus lower flexibility. The C variant is therefore probably more rigid than the
A variant, in agreement with the higher stability detected in calorimetric data as discussed
below.
More distinctive differences can be seen between the FTIR spectra of rNKR-P1A and B
domains (Fig. 3). The band positions of the rat A variant differ from those of the mouse
variants only slightly more than the two mouse variants differ from each other, and therefore
the rat A variant can be structurally clustered with both mouse variants; rNKR-P1B is the most
distinctive protein within the measured group. Raman difference spectroscopy (Fig. 4) adds
detail to these distinctions, with significant differences between A and B variants in the
amide I region, ca. 1671 cm⁻¹, and the amide III region, ca. 1237 cm⁻¹. In the amide III
region, the positive band, ca. 1230 cm⁻¹, corresponds to an excess of β-structures in
rNKR-P1A relative to the B variant. The negative band at 1280 cm⁻¹ corresponds to lower
α-helix content in the A variant when compared with the B variant. The aromatic amino acid
composition of the rNKR-P1A/B domains is the same; thus
the Raman difference spectrum in Fig. 4 reflects mostly secondary structure differences
between the proteins. Comparing all Raman spectra, the rNKR-P1B protein is again the most
distinctive, with mouse A and C variants very similar, and rat variant A being somewhere in
between.
[Note: Refs. 44–47 are cited in Table 2; Ref. 48 in the caption of Fig. 2; Refs. 49–51 in Table 3; Refs. 52–53 in the caption of Fig. 1.]
Thermal dynamics
Differential scanning calorimetry showed an irreversible denaturation transition for the
mNKR-P1A and C domains corresponding to a single, noncooperative thermal transition, with
melting temperatures of 69 °C for mNKR-P1A and 72 °C for the C variant (data not shown).
Thermodynamic parameters for the transitions, determined as described in Methods, were the
same for both proteins: ΔH = 48 kcal·mol⁻¹ and ΔS = 0.14 kcal·K⁻¹·mol⁻¹. Judged by its higher
melting temperature, the mNKR-P1C domain is slightly more thermostable than the A variant.
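As a rough plausibility check of the reported parameters, the transition midpoint can be estimated from ΔH and ΔS under the assumption of a reversible two-state transition, where ΔG = ΔH − TΔS vanishes at the midpoint. That assumption is ours and is only approximate here, since the observed transition is irreversible; the function name is likewise hypothetical.

```python
def midpoint_temperature_celsius(delta_h_kcal_per_mol, delta_s_kcal_per_mol_k):
    """Estimate the midpoint temperature from dG = dH - T*dS = 0, in degrees C."""
    t_kelvin = delta_h_kcal_per_mol / delta_s_kcal_per_mol_k
    return t_kelvin - 273.15


# Reported values: 48 kcal/mol and 0.14 kcal/(K mol)
estimate = midpoint_temperature_celsius(48.0, 0.14)
```

The estimate comes out at roughly 70 °C, which lies between the two measured melting temperatures; with ΔS quoted to only two significant figures, the calculation cannot distinguish the two variants.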
Two-dimensional correlation spectroscopy (2DCoS) was applied to FTIR and Raman spectra
to add structural detail to the thermal unfolding transitions of mNKR-P1A/C proteins. The
aim of this technique is to identify in-phase and out-of-phase correlations between spectral
intensity variations occurring at different wavenumbers that are induced by external
perturbation of the studied system; a generalized formalism for 2DCoS can be found in
refs. [57–59]. The main advantage of 2DCoS is that it allows enhancement of spectral resolution by
spreading overlapping bands over a second dimension. In addition, sign analysis of the
correlation peaks in the 2D maps may improve band assignment and permit establishment of a
sequence of events during the perturbation process [57–59].
Synchronous 2D correlation FTIR spectra (Fig. 5 left) suggest slightly lower stability in the A
variant relative to C, in agreement with calorimetric data, and suggest that this transition is
correlated with exposure of hydrophobic regions in β-strands. The Raman synchronous
spectrum is significantly less complicated (Fig. 6 left) and reports on the same denaturation
process, consistent with the FTIR results. The asynchronous 2DCoS spectra (Figs. 5 and 6
right) reveal sequential, but not coincidental, spectral changes. The asynchronous spectrum
has no autopeaks and consists exclusively of cross peaks that are antisymmetric with respect to
the diagonal line. Asynchronous cross peaks develop only if the intensities of two spectral
features change out of phase with each other. If the sign of an asynchronous peak is
positive, then the intensity change at the wavenumber on the x-axis occurs
predominantly before the change at the wavenumber on the y-axis, and vice versa. This sign rule is
reversed if the synchronous correlation intensity at the same coordinate is negative.
[Note: Refs. 55–56 are cited in the caption of Fig. 2.]
Application of these so-called Noda rules [57–59] to mNKR-P1C second derivative FTIR
spectra indicates the following sequential order of structural changes. Changes in β-turns occur
first, followed by an increase in β-sheet content, then changes in α-helices followed by
changes in loops. These structural changes precede the emergence of β-aggregated structures,
which is followed by changes in β-turns. The sequential order in the mNKR-P1A variant is
the same, although the emergence of β-aggregates was not observable. The sequential order of
secondary structure changes from FTIR is in agreement with that from Raman data.
Asynchronous analysis of Raman data shows that changes in the Tyr/Phe vibration at 1600 cm⁻¹
precede changes in β-turns, and the increase of β-sheet content is followed by changes in
Trp/Phe vibrations at 1590 cm⁻¹.
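The synchronous and asynchronous maps and the sign rules used above can be sketched as follows. This is a minimal pure-Python illustration of the generalized 2D correlation formalism with the discrete Hilbert-Noda transformation matrix, not the software used in this work; the function names are hypothetical, the spectra are assumed to be recorded at evenly spaced values of the perturbation (here temperature), and the mean spectrum is taken as the reference.

```python
import math


def correlation_maps(spectra):
    """Return (synchronous, asynchronous) 2D maps.

    spectra[j][i] is the intensity at spectral point i in the j-th spectrum
    of an evenly spaced perturbation series.
    """
    m, n = len(spectra), len(spectra[0])
    mean = [sum(s[i] for s in spectra) / m for i in range(n)]
    # Dynamic spectra: deviations from the perturbation-averaged spectrum
    dyn = [[s[i] - mean[i] for i in range(n)] for s in spectra]
    # Hilbert-Noda matrix: N[j][k] = 0 if j == k, else 1 / (pi * (k - j))
    noda = [[0.0 if j == k else 1.0 / (math.pi * (k - j)) for k in range(m)]
            for j in range(m)]
    transformed = [[sum(noda[j][k] * dyn[k][b] for k in range(m))
                    for b in range(n)] for j in range(m)]
    sync = [[sum(dyn[j][a] * dyn[j][b] for j in range(m)) / (m - 1)
             for b in range(n)] for a in range(n)]
    asyn = [[sum(dyn[j][a] * transformed[j][b] for j in range(m)) / (m - 1)
             for b in range(n)] for a in range(n)]
    return sync, asyn


def changes_first(sync, asyn, a, b):
    """Sign rule: True if point a changes before point b, False if after.

    Returns None when either correlation intensity is zero (order undetermined).
    """
    product = sync[a][b] * asyn[a][b]
    if product > 0:
        return True
    if product < 0:
        return False
    return None
```

In a toy series where one band steps up early (`[0, 1, 1]`) and another steps up late (`[0, 0, 1]`), both the synchronous and the asynchronous cross peaks at (early, late) are positive, so `changes_first` reports that the early band changes first, matching the sign rule stated in the text.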
Heterospectral 2DCoS was used because it can reveal correlations between different spectral
regions or even between two different spectroscopic techniques and may help to detect
vibrations of a similar nature or connected with the same process [59]. Only synchronous
heterospectral 2DCoS of Raman spectra was used, to permit direct correlation of secondary
structure elements with specific residue types, and because Raman spectra contain more
residue-specific spectral information than FTIR spectra. The synchronous spectra display only
cross peaks (Fig. 7) because autopeaks are generated only by the correlation of the same
bands, as in Figs. 5 and 6. Correlations were investigated between the changes of the secondary
structure, represented by the Raman amide I region (1580–1700 cm⁻¹), and the regions where
aromatic side chains have intense Raman bands. The strong correlations reported in Fig. 7
suggest that one or more Trp residues experience major changes in their surroundings during
the early stages of thermal denaturation.
The structural changes occurring in NKR-P1 proteins during the increase of temperature can
be interpreted tentatively as follows. Some large flexible part with a high content of β-turns is
rearranged first. Trp residues appear to be involved in significant dynamical behavior during
these early stages of denaturation. Next, the β-sheet content increases before or during a
continuous decrease of α-helix content. These changes lead to the exposure of a hydrophobic
region, perhaps containing Tyr and/or Phe residues, leading to β-aggregates followed by
changes in β-turns.
Modeling
Homology modeling and molecular dynamics
Candidate template sequences were identified using BLAST [30] and pair-aligned with
modeling target sequences using ClustalX as described in Methods. The template structure
with highest sequence identity to each target was used for modeling; identities ranged from 31
to 35%, and the resulting aligned sequences used for modeling are shown in Fig. 8. Template
structures used were: 1YPQ (C-type lectin-like domain of human oxidized low density
lipoprotein receptor 1 (LOX-1) [31]) for mNKR-P1A/C, 1XPH (CD209 antigen-like protein 1
[32]) for rNKR-P1A/B, 3HUP (early activation antigen CD69 [33]) for mNKR-P1F, 1E87
(early activation antigen CD69 [34]) for mNKR-P1G, and 2BPD (beta-glucan receptor
Dectin-1 [35]) for hNKR-P1. These template structures are similar to each other as judged by
Cα root-mean-square deviations ranging from 0.94 to 1.50 Å among them, and all have
functions close to those of the NKR-P1 proteins [61]. C-type lectin-like NK receptors are
dimeric in their nature, but as the mode of dimerization is not conserved within this receptor
family, we did not attempt to model NKR-P1s as dimers. Moreover, in native receptors
covalent linkage by one or more disulfide bridges in the so-called "stalk region" close to the
cell membrane is involved in dimerization. This region is usually omitted from soluble
recombinant receptor domain constructs and thus is also missing in available crystal
structures. Finally, all recombinant NKR-P1s prepared in this study behaved like
monomeric species during their purification (data not shown), pointing to rather weak
dimer formation within the NKR-P1 family; however, we cannot exclude that cooperativity
between monomers occurs upon ligand binding. For consistency, the numbering of
sequences begins at residue 89 in all sequences, and specific residue numbers discussed in the
text are those of the mNKR-P1A sequence. Ten models were calculated for every sequence-
template pair using Modeller 9v4 [37]. For further analysis the best model from every group
was chosen as described in Methods and refined by at least 10 ns of molecular dynamics
simulations at room temperature in explicit solvent.
[Note: Ref. 60 is cited in the caption of Fig. 7.]
All models share the same basic α/β fold represented in Fig. 9 by rNKR-P1B. Helices
surround a beta core composed of long antiparallel β-strands 2 and 8 that form a central
'pillar' flanked at one end by short strands 1 and 3 and at the other end by a small antiparallel
sheet formed by strands 4-7. Although β-strand 3 is very short, its residues are highly
conserved and they adopt very similar positions in all models (Table 4). The lengths of
secondary structure elements (Table 5) reveal that a conserved number of residues forms
strands 2 and 8 of the central pillar, except in mNKR-P1F, where these strands are 1-2 residues shorter. These
strands define the height of the receptor on the outside surface of the cell membrane. The
models contain the three disulfide bonds identified in experimental data [62], except for
mNKR-P1C where Cys122 is replaced by Ser.
Although the core of the protein is nearly the same for all seven models (Cα rmsd < 2 Å for
the beta core defined above), the number and orientation of helices as well as the topological
organization of secondary elements vary somewhat among the models, defining four groups
(Fig. 10). Mouse NKR-P1A/C models are more similar to each other than rat models are to
each other. In addition to the topology differences, the prominent loops anchored by small
antiparallel sheets differ in their arrangement among models, making rmsd comparisons
meaningless for these segments. In all cases one small sheet (residues Ala127, Tyr128 and
Leu129 in mNKR-P1A) anchors one loop of ~5 to 7 residues (residues Met130 to Gln135 in
mNKR-P1A, green loop in Fig. 10, first half of region III in Fig. 12) and another small sheet
(residues Trp165, Lys166 and Trp167 in mNKR-P1A) anchors one long loop of ~ 19 residues
(e.g., residues Arg168 to Asp187 in mNKR-P1A, blue loop between sheets 5 and 6 in Fig. 10,
region V until beginning of region IVb in Fig. 12). In rNKR-P1B and hNKR-P1 the long loop
contains an additional short helix that packs on one side of the core together with helix B. In
rNKR-P1B the connectivity of the loops to the small sheet also differs.
These differences in the spatial arrangement of the loops among the models are reflected also
in the behavior of each model in molecular dynamics simulations. In all simulations the
protein core is stable, with Cα rmsds reaching plateau values of ~ 1.3 to 2.5 Å (data not
shown). The root mean square fluctuations for Cα calculated from the last nanosecond of the
equilibrated part of each trajectory clearly identify the long loop region as very flexible (Fig.
11). Both ends of the loop, immediately after the N- and C-terminal anchoring residues, are
extremely flexible. The root mean square fluctuations of the other residues in the loop are
slightly higher than the average for the whole protein, although the structure within the loop is
stable in the time period that could be simulated. The N-terminal anchoring residues are the
highly conserved WKW sequence motif (residues Trp165, Lys166 and Trp167 in mNKR-
P1A). The loop occasionally folds back onto the protein surface, where it forms a
hydrophobic interaction with the two tryptophan residues. Together these results suggest that
the loop region is stably anchored to the core, but can adopt alternative positions relative to
the core.
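The per-residue root mean square fluctuation used above can be sketched as follows. The sketch assumes that the Cα coordinates of the analyzed trajectory window are available as a NumPy array and that all frames have already been superimposed on a common reference; the function names and the threshold for calling a residue flexible are hypothetical choices made for illustration.

```python
import numpy as np

def ca_rmsf(coords):
    """Per-residue RMSF of Cα atoms.

    coords: array of shape (n_frames, n_residues, 3); frames are assumed
    to be superimposed on a common reference beforehand.
    """
    coords = np.asarray(coords, dtype=float)
    mean_pos = coords.mean(axis=0)                    # average structure
    sq_dev = ((coords - mean_pos) ** 2).sum(axis=2)   # squared deviation per frame
    return np.sqrt(sq_dev.mean(axis=0))               # one value per residue

def flexible_residues(rmsf, factor=1.0):
    """Indices of residues whose RMSF exceeds factor * mean RMSF
    (the threshold is an illustrative assumption)."""
    rmsf = np.asarray(rmsf, dtype=float)
    return np.flatnonzero(rmsf > factor * rmsf.mean())
```

Applied to the last nanosecond of a trajectory, such a profile would be expected to show the highest values at the two ends of the long loop, in line with Fig. 11.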
Sequence analysis
The 33 available NKR-P1 CTLD sequences (Table 1) were aligned in ClustalX [21].
Seven conserved regions are identifiable (Fig. 12). Among these regions, conservation among
orthologs is highest in the beta core and lowest in the two loops, the smaller loop
corresponding to region III and the large loop corresponding to region V. In region III the
chemical properties of loop residues are preserved despite the sequence variations. In contrast,
substitutions in extended region V (L160 to T183) cause significant changes in the chemical
character of some loop residues among natural NKR-P1 variants, leading to two groups
typified by NKR-P1 subfamilies A-D and F/G. The NKR-P1A-D group presents L160, N164,
T171, R178 and T183, whereas the NKR-P1F/G group substitutes a polar residue (Glu or
Gln) for Leu160, a beta-branched residue (Thr, Val, or Ile) for Asn164, a more hydrophobic
residue (Ile or Val) for Thr171, a Ser or Lys residue for Arg178, and a more polar residue
(Asn, Asp, or Glu) for Thr183. The results of in vivo binding experiments with C-type lectin
receptor (Clr) isoforms [11] suggest that these sequence differences may be related to ligand
specificity: rNKR-P1A and rNKR-P1B bind to Clr11 only; rNKR-P1F and rNKR-P1G bind to
Clr2, Clr6, and Clr7 but not to Clr11, and rNKR-P1F also binds to Clr3 and Clr4. Thus, the
long adaptable loop of the C-type lectin-like domain may encode its ligand specificity.
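One simple way to quantify per-position conservation in an alignment such as that in Fig. 12 is the Shannon entropy of each column. The sketch below is illustrative only: the scoring scheme and the function names are assumptions, and the text does not state how conservation was scored. Lower entropy means higher conservation.

```python
import math
from collections import Counter

def column_entropy(column):
    """Shannon entropy (bits) of one alignment column; gaps ('-') are ignored."""
    residues = [c for c in column if c != "-"]
    if not residues:
        return 0.0
    total = len(residues)
    counts = Counter(residues)
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())

def conservation_profile(aligned):
    """Entropy of every column of a set of equal-length aligned sequences."""
    if len({len(s) for s in aligned}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    return [column_entropy(col) for col in zip(*aligned)]
```

In such a profile, an invariant motif like WKW gives an entropy of zero at each of its positions, whereas variable loop positions give higher values.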
Phylogenetic analysis
Phylogenetic analysis was originally performed with only the protein sequences of the C-type
lectin-like domains, starting at position 89 for mouse NKR-P1A. Although five methods gave
similar results, the cladograms contained a large number of polytomies. Nucleotide
sequences were therefore used to provide additional information for tree construction.
Cladograms based on DNA sequences were constructed with several different approaches as
described in Methods, leading to nearly identical results. In every case, the major branches of
the cladogram were the same, as confirmed by bootstrapping. Bootstrapping failed only when
the arrangement of species within the clade was different. This robustness of the final result
supports the overall correctness of the tree. The resulting cladogram obtained by the
maximum-likelihood method is shown in Fig. 13. Where some methods suggest polytomy or
further branching, both possibilities are drawn.
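The bootstrapping step mentioned above can be sketched in two parts: resampling alignment columns with replacement to create pseudo-replicate alignments, and counting how often a given clade is recovered among the replicate trees. Tree construction itself is left to whichever method is used. The function names and the representation of a tree as a set of clades are hypothetical choices for this illustration.

```python
import random

def bootstrap_alignment(aligned, rng):
    """Return one pseudo-replicate alignment by sampling columns with
    replacement; the same columns are drawn for every sequence."""
    n_cols = len(aligned[0])
    picks = [rng.randrange(n_cols) for _ in range(n_cols)]
    return ["".join(seq[i] for i in picks) for seq in aligned]

def clade_support(clade, replicate_clades):
    """Fraction of replicate trees that contain `clade`.

    clade: iterable of taxon names.
    replicate_clades: one set of frozensets (clades) per replicate tree,
    produced by any tree-building method (not shown here).
    """
    target = frozenset(clade)
    hits = sum(1 for clades in replicate_clades if target in clades)
    return hits / len(replicate_clades)
```

A major branch that appears in nearly all replicates receives a support value close to 1, whereas arrangements that differ among replicates receive low support.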
The cladogram displays three major branches. The first is composed of human and chicken
NKR-P1s. The second main branch is composed of rodent NKR-P1s from the telomeric part
of the NK gene complex. In this branch the sequences are first branched according to the
receptor subfamily and then according to the species. The third major group consists of rodent
receptors from the centromeric part of the NK gene complex. Here, the sequences are ordered
first according to species and then according to receptor subfamily. The least resolved part
of the cladogram includes the rat NKR-P1As, for which some methods even yield a pentatomy.
Comparison of modeling and spectroscopic results
The extent of predicted α-helical structure in rNKR-P1B agrees closely with experiments,
especially with the data from FTIR. In the case of mNKR-P1A/C, more α-helical structure is
detected in the spectra than is present in the models. One likely explanation that would be in
quantitative agreement with the data is that short α-helix C, which is present in human and rat
NKR-P1B models, is also present in both mouse structures. In the case of β-strands,
discrepancies may arise because the algorithm used to identify hydrogen bonds in the models
applies an average cutoff value, whereas in nature this cutoff is not so sharp and also depends
on the β-strand environment and other factors. The agreement between the spectroscopic data and the models is particularly strong
for β-turns, especially when measured by FTIR, whereas Raman spectra give a slightly
higher turn content. The phylogeny depicted in Fig. 13 is also consistent with the vibrational
spectra, which group mouse A/C variants together as most distinct from rNKR-P1B, with
rat variant A lying somewhere in between.
Disulfide bond conformations for Cys122-210 and Cys94-105 are unambiguously in a GGG
conformation in all computational models. Therefore these two bridges can be assigned to the
two clear GGG conformations determined from the band at 509 cm⁻¹ (Fig. 2) in the region of
disulfide S-S stretching vibrations. The third disulfide bridge, Cys189-202, adopts
conformations in the models that are intermediate between the rarer GGT and TGT
conformations, and could be assigned to the TGT conformation observed experimentally (Fig.
2).
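The assignment of a disulfide bridge in a model to a GGG, GGT or TGT conformation can be sketched from the three dihedral angles along the Cα-Cβ-S-S-Cβ-Cα chain. The sketch below assumes that the six atomic positions are available as Cartesian coordinates; the function names and the 120° boundary between gauche (G) and trans (T) are illustrative assumptions, not the criteria applied to the models described here.

```python
import numpy as np

def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees defined by four points."""
    p0, p1, p2, p3 = (np.asarray(p, dtype=float) for p in (p0, p1, p2, p3))
    b0 = p0 - p1
    b1 = p2 - p1
    b2 = p3 - p2
    b1 = b1 / np.linalg.norm(b1)
    # Components of b0 and b2 perpendicular to the central bond.
    v = b0 - np.dot(b0, b1) * b1
    w = b2 - np.dot(b2, b1) * b1
    x = np.dot(v, w)
    y = np.dot(np.cross(b1, v), w)
    return float(np.degrees(np.arctan2(y, x)))

def classify_disulfide(ca1, cb1, s1, s2, cb2, ca2, trans_cutoff=120.0):
    """Three-letter label (e.g. 'GGG', 'TGT') for the Cα-Cβ-S-S,
    Cβ-S-S-Cβ and S-S-Cβ-Cα dihedrals. An angle with absolute value at or
    above trans_cutoff is labeled 'T', otherwise 'G' (assumed boundary)."""
    angles = (
        dihedral(ca1, cb1, s1, s2),
        dihedral(cb1, s1, s2, cb2),
        dihedral(s1, s2, cb2, ca2),
    )
    return "".join("T" if abs(a) >= trans_cutoff else "G" for a in angles)
```

A bridge whose angles lie close to the assumed boundary would be labeled ambiguously by such a scheme, which corresponds to the intermediate conformations noted for Cys189-202.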