structural communications
342 doi:10.1107/S1744309113003369 Acta Cryst. (2013). F69, 342–345
Acta Crystallographica Section F: Structural Biology and Crystallization Communications
ISSN 1744-3091
Structure of the hypothetical DUF1811-family protein GK0453 from Geobacillus kaustophilus HTA426
Balasundaram Padmanabhan,a,b* Yoshihiro Nakamura,b Svetlana V. Antonyuk,c Richard W. Strange,c S. Samar Hasnain,c Shigeyuki Yokoyamab,d and Yoshitaka Besshob,e*
a Department of Biophysics, National Institute of Mental Health and Neuro Sciences (NIMHANS), Bangalore 560 029, India
b RIKEN Systems and Structural Biology Center, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
c Molecular Biophysics Group, Institute of Integrative Biology, University of Liverpool, Liverpool L69 7ZB, England
d Laboratory of Structural Biology and Department of Biophysics and Biochemistry, Graduate School of Science, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-0033, Japan
e RIKEN SPring-8 Center, Harima Institute, 1-1-1 Kouto, Sayo, Hyogo 679-5148, Japan
Correspondence e-mail: paddy@nimhans.kar.nic.in, bessho@spring8.or.jp
Received 28 December 2012
Accepted 2 February 2013
PDB Reference: GK0453, 2yxy
The crystal structure of a conserved hypothetical protein, GK0453, from Geobacillus kaustophilus has been determined to 2.2 Å resolution. The crystal belonged to space group P4₃2₁2, with unit-cell parameters a = b = 75.69, c = 64.18 Å. The structure was determined by the molecular-replacement method and was refined to a final R factor of 22.6% (Rfree = 26.3%). Based on structural homology, the GK0453 protein possesses two independent binding sites and hence it may simultaneously interact with two proteins or with a protein and a nucleic acid.
1. Introduction
As part of the RIKEN Structural Genomics Initiative (RSGI) project, in collaboration with UK Structural Genomics, we selected the hypothetical protein GK0453 (13 kDa, 113 residues) from Geobacillus kaustophilus HTA426 to predict its function from analysis of its crystal structure. The GK0453 protein is a member of the DUF1811 family in the Pfam database (Bateman et al., 2002). G. kaustophilus, from the Bacillaceae family, was isolated from deep-sea sediment from the Mariana Trench (Takami et al., 1997). It is an aerobic, endospore-forming, Gram-positive bacterium that grows optimally at 333 K, with an upper temperature limit of 347 K (Takami et al., 2004). There are 174 uncharacterized proteins in the DUF1811 family, and many are from Bacillus and Staphylococcus species that are known to cause a wide variety of diseases such as nosocomial infections. Thus, the proteins in this family may represent potential drug targets for highly selective bactericides or novel chemotherapies for these pathogens. The crystal structure of YfhH from B. subtilis, which belongs to this family, has been determined (PDB entry 1sf9; Midwest Center for Structural Genomics, unpublished work); however, the function of this protein is still unclear. Here, we describe the crystal structure of the hypothetical DUF1811-family protein GK0453 from G. kaustophilus and discuss its function based on structural homology.
2. Methods and materials
2.1. Cloning, expression and purification
The gene encoding the GK0453 protein (gi:56418988) was amplified via PCR using G. kaustophilus HTA426 genomic DNA and was cloned into the pET-15b expression vector (Merck Novagen, Darmstadt, Germany). The tobacco etch virus (TEV) protease recognition sequence was inserted in the N-terminal tag region of the expression vector, which was then introduced into the Escherichia coli Rosetta (DE3) strain (Merck Novagen, Darmstadt, Germany). The recombinant strain was cultured in 5 l LB medium containing 30 µg ml⁻¹ chloramphenicol and 50 µg ml⁻¹ ampicillin. The harvested cells (23.3 g) were lysed by sonication on ice in 35 ml of 20 mM Tris–HCl buffer pH 8.0 containing 500 mM NaCl, 5 mM β-mercaptoethanol and 1 mM phenylmethylsulfonyl fluoride. The cell lysate was heat-treated at 343 K for 13 min and then centrifuged at 15 000g for 30 min at 277 K. The supernatant was applied onto a HisTrap HP column (GE Healthcare Biosciences) equilibrated with 20 mM Tris–HCl buffer pH 8.0 containing 500 mM NaCl and 20 mM imidazole and eluted with a linear (20–500 mM) gradient of imidazole. The target sample, which eluted in the 500 mM imidazole fraction, was collected and applied onto a HiLoad 16/60 Superdex 200 pg column (GE Healthcare Biosciences) equilibrated with 20 mM Tris–HCl buffer pH 8.0 containing 200 mM NaCl and 20 mM imidazole. The eluted fractions containing the target sample were collected and treated with TEV protease at 303 K for 60 min. The sample was then applied onto a HisTrap HP column (GE Healthcare Biosciences) equilibrated with 20 mM Tris–HCl buffer pH 8.0 containing 500 mM NaCl and 20 mM imidazole. The flowthrough fraction was collected and desalted by fractionation on a HiPrep 26/10 column (GE Healthcare Biosciences) with 20 mM Tris–HCl buffer pH 8.0 containing 200 mM NaCl. The protein sample was analyzed by SDS–PAGE and its identity was confirmed by N-terminal amino-acid sequencing. After concentration to 28.5 mg ml⁻¹ by ultrafiltration, the protein yield was 42.8 mg from 23.3 g of cells.
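The yield figures quoted above can be turned into two derived quantities: the yield per gram of cells and the approximate volume of the concentrated sample. The short sketch below only rearranges the numbers stated in the text; the function names are illustrative and the calculation assumes that the whole 42.8 mg was present at the final 28.5 mg ml⁻¹ concentration.

```python
# Figures quoted in the purification paragraph.
CELL_MASS_G = 23.3               # mass of harvested cells
PROTEIN_YIELD_MG = 42.8          # purified GK0453
CONCENTRATION_MG_PER_ML = 28.5   # after ultrafiltration


def yield_per_gram(protein_mg: float, cells_g: float) -> float:
    """Purified protein (mg) obtained per gram of cells."""
    return protein_mg / cells_g


def sample_volume_ml(protein_mg: float, conc_mg_per_ml: float) -> float:
    """Volume implied by a protein mass at a given concentration."""
    return protein_mg / conc_mg_per_ml


print(f"yield: {yield_per_gram(PROTEIN_YIELD_MG, CELL_MASS_G):.2f} mg per g of cells")
print(f"volume: {sample_volume_ml(PROTEIN_YIELD_MG, CONCENTRATION_MG_PER_ML):.2f} ml")
```

Under these assumptions the yield is a little under 2 mg of protein per gram of cells, in a final volume of about 1.5 ml.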
2.2. Protein crystallization, data collection and processing
Crystallization was performed by the microbatch-under-oil method at 291 K. A 0.5 µl aliquot of crystallization reagent was mixed with 0.5 µl of the 28.5 mg ml⁻¹ protein solution and was covered with 15 µl of silicone and paraffin oil. In the initial screening, small crystals appeared in a drop composed of 0.1 M Tris–HCl buffer pH 8.5 containing 20%(w/v) PEG MME 2000 and 0.01 M nickel(II) chloride hexahydrate (Crystal Screen 2 condition No. 45; Hampton Research). After optimization, large crystals were obtained from a crystallization reagent consisting of 0.1 M Tris–HCl buffer pH 8.1 containing 13.3%(w/v) PEG MME 2000 and 0.01 M NiCl₂. Crystals suitable for X-ray data collection appeared within 1 d and reached dimensions of 0.42 × 0.15 × 0.12 mm (Fig. 1a). The crystals were flash-cooled in a nitrogen-gas stream at 100 K using 10%(v/v) glycerol as a cryoprotectant. An X-ray diffraction data set was collected using a MAR Mosaic 225 CCD detector on beamline PX10.1 at the Daresbury Synchrotron Radiation Source (SRS), England. The data were integrated and scaled using the HKL-2000 software package. The data-reduction statistics are summarized in Table 1.
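To relate the resolution limits to the diffraction geometry, Bragg's law (λ = 2d sin θ) gives the scattering angle 2θ at which a given d-spacing is recorded. The sketch below is illustrative only: it uses the wavelength (1.117 Å) and resolution range (20.0–2.2 Å) listed in Table 1, and the function name is an assumption rather than part of any data-processing package.

```python
import math


def bragg_two_theta_deg(wavelength_A: float, d_spacing_A: float) -> float:
    """Scattering angle 2-theta (degrees) from Bragg's law, lambda = 2 d sin(theta)."""
    sin_theta = wavelength_A / (2.0 * d_spacing_A)
    if not 0.0 < sin_theta <= 1.0:
        raise ValueError("d-spacing is not reachable at this wavelength")
    return 2.0 * math.degrees(math.asin(sin_theta))


WAVELENGTH_A = 1.117  # from Table 1

for d in (20.0, 2.2):  # low- and high-resolution limits from Table 1
    print(f"d = {d:4.1f} A  ->  2-theta = {bragg_two_theta_deg(WAVELENGTH_A, d):.2f} deg")
```

At this wavelength the 2.2 Å limit corresponds to a scattering angle of roughly 29°.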
2.3. Structure determination and refinement
The crystal structure of GK0453 was determined by the molecular-replacement method, using the YfhH protein structure as a search model (PDB entry 1sf9; Midwest Center for Structural Genomics, unpublished work). The program MOLREP from the CCP4 suite (Winn et al., 2011) was used for structure determination. It generated a distinct peak with an R factor of 48.9% and a correlation coefficient of 45.1% for data in the resolution range 20–4 Å. The structure unambiguously revealed that the crystal belonged to space group P4₃2₁2 and contained one molecule in the asymmetric unit. The model was refined with CNS (Brünger et al., 1998) and several rounds of manual fitting and re-fitting were performed using the program O (Jones et al., 1991), with careful inspection of the 2Fo − Fc, Fo − Fc and OMIT electron-density maps. The final R factor and Rfree were 22.6 and 26.3%, respectively, at 2.2 Å resolution. In the final structure, four residues (residues 1–4) in the N-terminal region and five residues in the C-terminal region (residues 109–113) were absent owing to poor electron density in these regions. The stereochemistry of the GK0453 structure was good as assessed by MolProbity (Chen et al., 2010). The structure was deposited in the PDB under accession code 2yxy. The refinement statistics are summarized in Table 1.
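The unit-cell parameters and the single molecule in the asymmetric unit imply a Matthews coefficient and an approximate solvent content, neither of which is quoted in the text. The following sketch estimates both under stated assumptions: a tetragonal cell volume of a²c, eight asymmetric units per cell for space group P4₃2₁2, the approximate molecular mass of 13 kDa given in the Introduction, and the conventional approximation solvent fraction ≈ 1 − 1.23/V_M. The function names are hypothetical.

```python
def tetragonal_cell_volume(a: float, c: float) -> float:
    """Volume (A^3) of a tetragonal unit cell with a = b and all angles 90 degrees."""
    return a * a * c


def matthews_coefficient(cell_volume: float, asu_per_cell: int,
                         molecules_per_asu: int, mass_da: float) -> float:
    """Matthews coefficient V_M in A^3 per Da."""
    return cell_volume / (asu_per_cell * molecules_per_asu * mass_da)


def solvent_fraction(v_m: float) -> float:
    """Approximate solvent fraction, using the conventional constant 1.23 A^3/Da."""
    return 1.0 - 1.23 / v_m


A, C = 75.69, 64.18   # unit-cell parameters from the text (A)
ASU_PER_CELL = 8      # assumption: general positions in P4(3)2(1)2
MASS_DA = 13000.0     # approximate mass quoted in the Introduction

volume = tetragonal_cell_volume(A, C)
v_m = matthews_coefficient(volume, ASU_PER_CELL, 1, MASS_DA)
print(f"cell volume  = {volume:.0f} A^3")
print(f"V_M          = {v_m:.2f} A^3/Da")
print(f"solvent      = {100 * solvent_fraction(v_m):.0f} %")
```

With these inputs the estimate is a V_M of about 3.5 Å³ Da⁻¹, that is roughly 65% solvent; the exact value depends on the mass of the construct that actually crystallized.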
Figure 1
The structure of GK0453 from G. kaustophilus. (a) Crystals of the GK0453 protein. (b) Cartoon representation of the tertiary structure of GK0453 coloured in a rainbow ramp from blue at the N-terminus to red at the C-terminus. All figures were produced with PyMOL (Schrödinger) unless mentioned otherwise.
Table 1
Summary of data-collection and refinement statistics.
Values in parentheses are for the highest resolution shell.

Data collection
  Source                                  SRS PX10.1
  Wavelength (Å)                          1.117
  Space group                             P4₃2₁2
  Unit-cell parameters (Å)                a = b = 75.7, c = 64.2
  Resolution (Å)                          20.0–2.2
  Completeness (%)                        99.8 (99.6)
  Multiplicity                            10.3 (10.6)
  Rmerge† (%)                             7.8 (27.9)
Refinement statistics
  No. of molecules in asymmetric unit     1
  Resolution limits (Å)                   20.0–2.2
  σ cutoff                                0
  No. of reflections                      9710
  R factor‡/Rfree§ (%)                    22.6/26.3
  No. of protein residues                 104
  No. of water molecules                  170
  R.m.s. deviations
    Bond lengths (Å)                      0.011
    Bond angles (°)                       1.4

† $R_{\mathrm{merge}} = \sum_{hkl}\sum_{i} |I_i(hkl) - \langle I(hkl)\rangle| / \sum_{hkl}\sum_{i} I_i(hkl)$. ‡ $R = \sum_{hkl} \bigl||F_{\mathrm{obs}}| - |F_{\mathrm{calc}}|\bigr| / \sum_{hkl} |F_{\mathrm{obs}}|$, where $F_{\mathrm{obs}}$ and $F_{\mathrm{calc}}$ are the observed and calculated structure factors, respectively. § Rfree was calculated with 5% of data that were omitted from refinement.
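The two residuals defined in the footnotes to Table 1 can be written out directly. The sketch below is a minimal illustration on invented toy numbers, not the procedure used by HKL-2000 or CNS; the function names, the dictionary layout (one list of repeated intensity measurements per reflection) and all values are assumptions made for the example.

```python
def r_factor(f_obs, f_calc):
    """R = sum | |Fobs| - |Fcalc| | / sum |Fobs| over all reflections."""
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    return numerator / denominator


def r_merge(observations):
    """Rmerge = sum_hkl sum_i |I_i - <I>| / sum_hkl sum_i I_i.

    `observations` maps each reflection index (h, k, l) to the list of its
    repeated intensity measurements I_i(hkl).
    """
    numerator = 0.0
    denominator = 0.0
    for intensities in observations.values():
        mean_intensity = sum(intensities) / len(intensities)
        numerator += sum(abs(i - mean_intensity) for i in intensities)
        denominator += sum(intensities)
    return numerator / denominator


# Toy data, invented purely for illustration.
toy_f_obs = [100.0, 50.0, 25.0]
toy_f_calc = [90.0, 55.0, 30.0]
toy_observations = {(1, 0, 0): [100.0, 110.0], (0, 1, 1): [50.0, 40.0, 45.0]}

print(f"R      = {r_factor(toy_f_obs, toy_f_calc):.3f}")
print(f"Rmerge = {r_merge(toy_observations):.3f}")
```

Rfree is the same expression as R, evaluated only over the test set of reflections (5% of the data here) that was excluded from refinement.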
3. Results and discussion
3.1. Overall structure
The overall tertiary structure of G. kaustophilus GK0453 consists of two small domains at the N- and C-terminal regions (Fig. 1b). The N-terminal region contains a helix–turn–helix motif (α1, Lys14–Met34; α2, Val37–Tyr53). The C-terminal domain possesses a β-barrel-like structure with four β-strands (β1, Glu64–Ile68; β2, Ala71–Lys82; β3, Phe85–Arg90; β4, Glu98–Pro101). A long loop containing a 3₁₀-helix (Pro57–Asp59) connects the N- and C-terminal domains (Figs. 1b and 2a).
3.2. Structure comparison and functional prediction
A DALI (Holm & Rosenström, 2010) search was performed for the GK0453 structure to identify structural homologues within the RCSB PDB. The search revealed that the GK0453 structure is very similar to that of the hypothetical protein YfhH (60.8% sequence identity; PDB entry 1sf9). Superimposition of the GK0453 structure on the YfhH structure yielded a Z-score of 15.4 and an r.m.s.d. of 1.8 Å for 101 Cα atoms (Figs. 2a and 2b). All other results from the search showed that the structural similarity occurred within the distinct domains of either the 44-amino-acid N-terminal region or the 54-amino-acid C-terminal region. The N-terminal region of GK0453 is structurally homologous to the C-terminal domain of UvrB (PDB entry 1qoj; 19% identity; Z-score of 7.3 and r.m.s.d. of 0.6 Å for 43 Cα atoms; Sohi et al., 2000) and to the minimal Rab-binding domain of rabenosyn-5 (PDB entry 1z0j; 23% identity; Z-score of 6.9 and r.m.s.d. of 1.7 Å for 48 Cα atoms; Eathiraj et al., 2005) (Fig. 2c). The UvrB C-terminal domain interacts with the UvrC C-terminal domain during excision repair in E. coli, and the UvrBC complex is part of the UvrABC endonuclease system, which catalyzes DNA damage repair (Sohi et al., 2000). The other protein family, including rabenosyn-5, selectively recognizes distinct subunits of Rab GTPases exclusively through interactions with the switch and inter-switch regions in the helix–turn–helix motif (Eathiraj et al., 2005). A similar structural feature is also observed in several PDB structures of proteins that interact with 23S ribosomal RNA. For example, the ribosomal protein L29 (PDB entry 2gya, chain W; Mitra et al., 2006) yielded 20% identity, a Z-score of 7.2 and an r.m.s.d. of 1.7 Å for 50 Cα atoms. This ribosomal subunit protein, which contains a helix–turn–helix motif, extensively interacts with the 23S ribosomal RNA. Hence, we speculated from this analysis that the N-terminal region of GK0453 may be involved in a protein–protein or a protein–nucleic acid interaction.
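The r.m.s.d. values quoted for each comparison measure the root-mean-square distance between equivalent Cα atoms after the two structures have been superimposed. The sketch below shows only this final step, assuming that the atom pairing and the optimal superposition have already been carried out (as DALI does internally); the function name and the toy coordinates are invented for illustration.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation (A) between two equal-length lists of
    already-superimposed (x, y, z) coordinates of equivalent C-alpha atoms."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared_sum = sum(
        (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
        for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared_sum / len(coords_a))


# Two toy atom pairs, each displaced by 1 A along z.
toy_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
toy_b = [(0.0, 0.0, 1.0), (1.0, 0.0, -1.0)]
print(f"r.m.s.d. = {rmsd(toy_a, toy_b):.2f} A")
```

Because the value depends on how many atom pairs are included, an r.m.s.d. is only meaningful alongside the number of aligned Cα atoms, which is why both are reported together in the text.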
In contrast, the C-terminal domain of GK0453 is structurally homologous to those of the MRG15 chromodomain (PDB entry 2f5k; 22% sequence identity; Z-score of 6.7 and r.m.s.d. of 2.3 Å for 54 Cα atoms; Zhang et al., 2006), the nuclear protein KIN17 (PDB entry 2ckk, chain A; 16% sequence identity; Z-score of 6.4 and r.m.s.d. of 2.2 Å for 52 Cα atoms; le Maire et al., 2006) and type II R-plasmid-encoded R67 dihydrofolate reductase (R67 DHFR; PDB entry 2rh2, chain A; 12% identity; Z-score of 6.5 and r.m.s.d. of 2.7 Å for 51 Cα atoms; Krahn et al., 2007) (Fig. 2d).
The R67 DHFR protein is an NADPH-dependent enzyme that catalyzes the reduction of dihydrofolate (DHF) to tetrahydrofolate (THF). THF is essential for the synthesis of thymidylate, purine nucleosides, methionine and other metabolic intermediates (Krahn et al., 2007). Since the functional tetramerization and the critical residues for enzymatic function are absent in GK0453, it is unlikely that GK0453 possesses an activity similar to that of DHFR. The human KIN17 protein is an essential nuclear component that plays a critical role in maintaining the integrity of the human global genome-repair machinery. The SH3-like β-barrel domain of KIN17 has been shown to interact with RNA (le Maire et al., 2006). As GK0453 possesses a hydrophobic environment in the corresponding region, it is unlikely that GK0453 interacts with RNA or DNA through this region (Fig. 3a).
Figure 2
Structural comparisons of GK0453. (a) Sequence alignment of GK0453 (Q5L2U2_GEOKA) with the hypothetical protein YfhH (YFHH_BACSU) and representative structurally similar proteins UvrB (UvrB_Ecoli; amino acids 628–673) and the MRG15 chromodomain protein (MO4L1_HUMAN; amino acids 6–65) corresponding to the N- and C-terminal regions of GK0453, respectively. The secondary-structure elements of GK0453 are indicated above the alignment and residues that are similar between GK0453 and YfhH are coloured red. The figure was generated by ESPript (Gouet et al., 1999). Superimpositions are shown of (b) GK0453 (pink) on the YfhH protein (green), (c) the N-terminal region of GK0453 on the C-terminal domain of UvrB (light blue) and on the Rab-binding domain of rabenosyn-5 (yellow) and (d) the C-terminal region of GK0453 on the MRG15 chromodomain (cyan) and on the type II dihydrofolate reductase DHFR (orange). The methylated histone-tail recognizing residues Tyr26, Tyr46 and Trp49 in the MRG15 chromodomain are depicted by sticks.
The MRG15 chromodomain participates in chromatin remodelling and transcription regulation by interacting with the methylated histone tail of the nucleosome (Zhang et al., 2006; Steiner et al., 2002). The β-barrel core forms a hydrophobic pocket containing three conserved residues, Tyr26, Tyr46 and Trp49, as a potential binding site for interaction with the methylated histone tail (Fig. 2d). The domain bearing these residues, which are responsible for recognizing the methylated histone tail, is absent in GK0453; however, the GK0453 and MRG15 chromodomain structures both possess similar hydrophobic environments (Fig. 3). The structural analysis thus suggested that the β-barrel domain of GK0453 may also be involved in protein–protein interactions with an unknown function.
In conclusion, the crystal structure of the hypothetical protein GK0453 revealed two small domains: a helix–turn–helix motif at the N-terminal region and an SH3-like β-barrel structure at the C-terminal region. Based on structural comparisons, we speculate that the GK0453 protein may simultaneously interact with two proteins or with a protein and a nucleic acid to exert its unknown function.
We thank Ms Nagisa Takemoto and Drs Yoshihiro Agari and Akeo
Shinkai for their assistance with sample preparation. This work was
supported in part by the RIKEN Structural Genomics/Proteomics
Initiative (RSGI), the National Project on Protein Structural and
Functional Analyses, the X-ray Free Electron Laser Priority Strategy
Program (to YB) and a Grant-in-Aid for Scientific Research
(23651126 to YB) from the Ministry of Education, Culture, Sports,
Science and Technology (MEXT) of Japan. This work was also
supported by the Synchrotron Radiation Department at the Science
and Technology Facilities Council, Daresbury Laboratory UK and
by beamline 10.1 at the Synchrotron Radiation Source, which was
supported by Biotechnology and Biological Sciences Research
Council Grant BB/E001971 (to SSH and RWS).
References
Bateman, A., Birney, E., Cerruti, L., Durbin, R., Etwiller, L., Eddy, S. R., Griffiths-Jones, S., Howe, K. L., Marshall, M. & Sonnhammer, E. L. (2002). Nucleic Acids Res. 30, 276–280.
Brünger, A. T., Adams, P. D., Clore, G. M., DeLano, W. L., Gros, P., Grosse-Kunstleve, R. W., Jiang, J.-S., Kuszewski, J., Nilges, M., Pannu, N. S., Read, R. J., Rice, L. M., Simonson, T. & Warren, G. L. (1998). Acta Cryst. D54, 905–921.
Chen, V. B., Arendall, W. B., Headd, J. J., Keedy, D. A., Immormino, R. M., Kapral, G. J., Murray, L. W., Richardson, J. S. & Richardson, D. C. (2010). Acta Cryst. D66, 12–21.
Eathiraj, S., Pan, X., Ritacco, C. & Lambright, D. G. (2005). Nature (London), 436, 415–419.
Gouet, P., Courcelle, E., Stuart, D. I. & Métoz, F. (1999). Bioinformatics, 15, 305–308.
Holm, L. & Rosenström, P. (2010). Nucleic Acids Res. 38, W545–W549.
Jones, T. A., Zou, J.-Y., Cowan, S. W. & Kjeldgaard, M. (1991). Acta Cryst. A47, 110–119.
Krahn, J. M., Jackson, M. R., DeRose, E. F., Howell, E. E. & London, R. E. (2007). Biochemistry, 46, 14878–14888.
Maire, A. le, Schiltz, M., Stura, E. A., Pinon-Lataillade, G., Couprie, J., Moutiez, M., Gondry, M., Angulo, J. F. & Zinn-Justin, S. (2006). J. Mol. Biol. 364, 764–776.
Mitra, K., Schaffitzel, C., Fabiola, F., Chapman, M. S., Ban, N. & Frank, J. (2006). Mol. Cell, 22, 533–543.
Sohi, M., Alexandrovich, A., Moolenaar, G., Visse, R., Goosen, N., Vernede, X., Fontecilla-Camps, J. C., Champness, J. & Sanderson, M. R. (2000). FEBS Lett. 465, 161–164.
Steiner, T., Kaiser, J. T., Marinković, S., Huber, R. & Wahl, M. C. (2002). EMBO J. 21, 4641–4653.
Takami, H., Inoue, A., Fuji, F. & Horikoshi, K. (1997). FEMS Microbiol. Lett. 152, 279–285.
Takami, H., Nishi, S., Lu, J., Shimamura, S. & Takaki, Y. (2004). Extremophiles, 8, 351–356.
Winn, M. D. et al. (2011). Acta Cryst. D67, 235–242.
Zhang, P., Du, J., Sun, B., Dong, X., Xu, G., Zhou, J., Huang, Q., Liu, Q., Hao, Q. & Ding, J. (2006). Nucleic Acids Res. 34, 6621–6628.
Figure 3
The electrostatic surface potentials of (a) the GK0453 protein and (b) the MRG15 chromodomain protein. The arrows indicate the similar hydrophobic environments present in both GK0453 and MRG15. The surface is coloured red and blue for potential values below −5 kBT and above +5 kBT, respectively, where kB is the Boltzmann constant and T is room temperature.
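For readers more used to potentials in millivolts, the colouring thresholds in the Figure 3 caption can be converted by dividing kBT by the elementary charge. The sketch below is an illustrative conversion only: the caption says "room temperature" without giving a value, so 298.15 K is an assumption, and the function name is hypothetical.

```python
BOLTZMANN_J_PER_K = 1.380649e-23       # exact SI value of the Boltzmann constant
ELEMENTARY_CHARGE_C = 1.602176634e-19  # exact SI value of the elementary charge


def kt_over_e_mv(temperature_k: float) -> float:
    """Thermal voltage kB*T/e in millivolts at the given temperature."""
    return BOLTZMANN_J_PER_K * temperature_k / ELEMENTARY_CHARGE_C * 1000.0


ROOM_TEMPERATURE_K = 298.15  # assumption: not stated in the caption

unit_mv = kt_over_e_mv(ROOM_TEMPERATURE_K)
print(f"1 kBT/e = {unit_mv:.2f} mV")
print(f"colour thresholds: -{5 * unit_mv:.0f} mV and +{5 * unit_mv:.0f} mV")
```

At this assumed temperature one kBT per unit charge is close to 25.7 mV, so the ±5 kBT thresholds correspond to roughly ±128 mV.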