Article (PDF available)

Loops in Globular Proteins: A Novel Category of Secondary Structure

Abstract

The protein loop, a novel category of nonregular secondary structure, is a segment of contiguous polypeptide chain that traces a "loop-shaped" path in three-dimensional space; the main chain of an idealized loop resembles a Greek omega (Ω). A systematic study was made of 67 proteins of known structure revealing 270 omega loops. Although such loops are typically regarded as "random coil," they are, in fact, highly compact substructures and may also be independent folding units. Loops are almost invariably situated at the protein surface where they are poised to assume important roles in molecular function and biological recognition. They are often observed to be modules of evolutionary exchange and are also natural candidates for bioengineering studies.
... The copyright holder for this preprint this version posted January 12, 2022. ; https://doi.org/10.1101/2022.01.11.475920 doi: bioRxiv preprint (76) observed as stable intermediates for sv10. The structures were extracted from simulations performed at 340 K and were drawn using VMD (73). ...
... peak of 2.7 Å per residue, and the other peaked at 3.2 Å per residue. This implies that the blockier sequence populates a metastable state, which for sv10 corresponds to Ω-loop-like structures (76) that are shown in Fig. 4D. These metastable structures are stabilized by electrostatic attractions between the N-terminal Lys patch and C-terminal Glu patch. ...
Preprint
Full-text available
The most commonly occurring intrinsically disordered proteins (IDPs) are polyampholytes, which are defined by the duality of low net charge per residue and high fractions of charged residues. Recent experiments have uncovered surprises regarding sequence-ensemble relationships of model polyampholytic IDPs. These include differences in conformational preferences for sequences with lysine vs. arginine, and the suggestion that well-mixed sequences either form globules or conformations with ensemble averages that are reminiscent of ideal chains wherein intra-chain and chain-solvent interactions are counterbalanced. Here, we explain these observations by analyzing results from atomistic simulations. We find that polyampholytic IDPs generally sample two distinct stable states, namely globules and self-avoiding walks. Globules are favored by electrostatic attractions between oppositely charged residues, whereas self-avoiding walks are favored by favorable free energies of hydration of charged residues. We find sequence-specific temperatures of bistability at which globules and self-avoiding walks can coexist. At these temperatures, ensemble averages over coexisting states give rise to statistics that resemble ideal chains without there being an actual counterbalancing of intra-chain and chain-solvent interactions. At equivalent temperatures, arginine-rich sequences tilt the preference toward globular conformations whereas lysine-rich sequences tilt the preference toward self-avoiding walks. This stems from intrinsic differences in free energies of hydration between arginine and lysine. We also identify differences between aspartate and glutamate containing sequences, whereby the shorter aspartate sidechain engenders preferences for metastable, necklace-like conformations. Finally, although segregation of oppositely charged residues within the linear sequence maintains the overall two-state behavior, compact states are highly favored by such systems. 
Significance Statement Intrinsically disordered regions (IDRs) of proteins, when tethered to folded domains, function either as flexible tails or as linkers between domains. Most IDRs are polyampholytes that comprise a mixture of oppositely charged residues. Recent measurements of tethered polyampholytes revealed several surprises including the tendency of arginine- and lysine-rich sequences to behave very differently from one another. Using computer simulations, we show that the differences between arginine- and lysine-rich sequences are determined primarily by differences in free energies of hydration. Further, we find that the interplay between electrostatic attractions and favorable free energies of hydration creates distinct stable states for polyampholytic IDRs. These findings have implications for switch-like transitions and the regulation of effective concentrations of interaction motifs by IDRs.
... As a rule, they are located on the surface of globular proteins, connect membrane α-helices on the cytoplasmic or extracellular surface, and are often involved in recognition processes [56]. On average, a protein molecule contains around four Ω-loops, the distance between the ends is less than the α-α carbon separation in a loop, the twisting angles of the main chain are not repeated, and there are fewer hydrogen bonds of the main chain [57]. The hydrogen bond in the main chain of the loop is irregular, which favors the packing of side chains within long loops [51]. ...
... To calculate the chirality of the Ω-loops (Figure 5), the data in [57] were utilized. The chirality calculation data for 190 Ω-loops are presented on the chirality map (Figure 12) and in Table A3. ...
Article
Full-text available
In this study we consider the features of spatial-structure formation in proteins and their application in bioengineering. Methods for the quantitative assessment of the chirality of regular helical and irregular structures of proteins are presented. The features of self-assembly of phenylalanine (F) into peptide nanotubes (PNT), which form helices of different chirality, are also analyzed. A method is proposed for calculating the magnitude and sign of the chirality of helix-like peptide nanotubes using a sequence of vectors for the dipole moments of individual peptides.
... Their hydrolytic activity varies from narrow-spectrum to carbapenem-hydrolysing enzymes and depends on a two-step process involving the catalytic Ser70 residue (as defined in the consensus amino acid numbering scheme for Class A enzymes [32]), a coordinated water molecule, as well as lysine and glutamic acid residues that are part of the active site (see Fig. 2a for a detailed description of β-lactam hydrolysis by serine β-lactamases like Class A enzymes). The catalytic pocket in proteins from this class is framed by an Ω-loop (Fig. 2b; shown in red), a non-regular unit of secondary structure which traces a loop-shaped path in three-dimensional space and is found at the surface of many globular proteins [33,34]. In β-lactamases in particular, the Ω-loop contains a catalytic glutamic acid that is linked to its neighbouring residue by a highly conserved cis peptide bond [35–37]. ...
Article
Full-text available
The discovery of penicillin by Alexander Fleming marked a new era for modern medicine, allowing not only the treatment of infectious diseases, but also the safe performance of life-saving interventions, like surgery and chemotherapy. Unfortunately, resistance against penicillin, as well as more complex β-lactam antibiotics, has rapidly emerged since the introduction of these drugs in the clinic, and is largely driven by a single type of extra-cytoplasmic proteins, hydrolytic enzymes called β-lactamases. While the structures, biochemistry and epidemiology of these resistance determinants have been extensively characterized, their biogenesis, a complex process including multiple steps and involving several fundamental biochemical pathways, is rarely discussed. In this review, we provide a comprehensive overview of the journey of β-lactamases, from the moment they exit the ribosomal channel until they reach their final cellular destination as folded and active enzymes.
... However, the most significant changes are observed in the tip of loop-2 [22]. This fragment adopts the configuration of an omega loop, a widespread structural motif of globular protein secondary structure [43,44]. ...
Article
Full-text available
Cobra cytotoxins (CTs) belong to the three-fingered protein family and possess membrane activity. Here, we studied cytotoxin 13 from Naja naja cobra venom (CT13Nn). For the first time, a spatial model of CT13Nn with both “water” and “membrane” conformations of the central loop (loop-2) was determined by X-ray crystallography. The “water” conformation of the loop was frequently observed. It was similar to the structure of loop-2 of numerous CTs, determined either by NMR spectroscopy in aqueous solution or by the X-ray method. The “membrane” conformation is a rare one and, to date, has only been observed by NMR for a single cytotoxin, cytotoxin 1 from N. oxiana (CT1No), in a detergent micelle. Both CT13Nn and CT1No are S-type CTs. Membrane binding of these CTs probably involves an additional step: the conformational transformation of loop-2. To confirm this suggestion, we conducted molecular dynamics simulations of both CT1No and CT13Nn in the Highly Mimetic Membrane Model of palmitoyloleoylphosphatidylglycerol, starting with their “water” NMR models. We found that both toxins transform the “water” conformation of loop-2 into the “membrane” one during the insertion process. This supports the hypothesis that the S-type CTs, unlike their P-type counterparts, require conformational adaptation of loop-2 during interaction with lipid membranes.
... However, two other AA that are not in the universal genetic code can also be found in some proteins: selenocysteine in all three domains of life and pyrrolysine in methanogenic archaea and some bacteria (Rother and Krzycki, 2010). Secondary structure corresponds to local, defined structural organization in segments of a protein. The α-helix and β-sheet are the most common secondary structures, but β-turns (Hutchinson and Thornton, 1994) and Ω-loops (Leszczynski and Rose, 1986; Fetrow, 1995) are also frequent. Tight turns and flexible loops are usually found between helices and sheets. ...
Article
Full-text available
The recent discovery of extrasolar Earth-like planets that orbit in the habitable zone of their system, and the latest clues of the presence of liquid water in the subsurface of Mars and in the subglacial oceans of Jupiter's and Saturn's moons, have reopened debates about habitability and the limits of life. Although liquid water, widely accepted as an absolute requirement for terrestrial life, may be present in other bodies of the solar system or elsewhere, physical and chemical conditions, such as temperature, pressure, and salinity, may limit this habitability. However, extremophilic microorganisms found in various extreme terrestrial environments are adapted to thrive in permanently extreme ranges of physicochemical conditions. This review first describes promising environments for life in the Solar System and the microorganisms that inhabit similar environments on the Earth. The effects of extreme temperatures, salt, and hydrostatic pressure conditions on biomolecules will be explained in some detail, and recent advances in understanding biophysical and structural adaptation strategies allowing microorganisms to cope with extreme physicochemical conditions are reviewed to discuss promising environments for life in the Solar System in terms of habitability.
Article
Full-text available
Significance Intrinsically disordered regions (IDRs) of proteins, when tethered to folded domains, function either as flexible tails or as linkers between domains. Most IDRs are polyampholytes that comprise a mixture of oppositely charged residues. Recent measurements of tethered polyampholytes showed the tendency of arginine- and lysine-rich sequences to behave very differently from one another. Using computer simulations, we show that these differences are determined by differences in free energies of hydration, steric volumes, and other considerations. Further, the interplay between electrostatic attractions and favorable free energies of hydration creates distinct stable states for polyampholytic IDRs. These findings have implications for switch-like transitions and the regulation of effective concentrations of interaction motifs by IDRs.
Article
In the apoptotic pathway, the interaction of cytochrome c (Cytc) with cardiolipin in vivo is a key process that induces the peroxidase activity of Cytc and triggers the release of Cytc from the inner mitochondria into the cytosol. The peroxidase-active form of Cytc arises from local conformational changes that support the opening of the heme crevice and the loss of the axial ligation between Met80 and the heme Fe. Structural adjustments at the Ω-loop segments of Cytc are required for this process. To study the role of the distal Ω-loop segment comprising residues 71–85 in human Cytc (hCytc), we investigated a cysteine mutation at Pro76, one of the highly conserved residues in this loop. The effect of the P76C mutation was explored by combining experimental characterization with molecular dynamics (MD) simulations. The peroxidase activity of the P76C mutant was found to be increased ∼13-fold relative to the wild type. Experimental data on global denaturation, alkaline transition, heme bleaching, and spin-labeling electron spin resonance were in good agreement with the enhancement of peroxidase activity. The MD results for hCytc in the hexacoordinate form suggest that the important changes in the P76C mutant arise from unfolding of the central Ω-loop (residues 40–57) and weakening of the H-bond between Tyr67 and Met80. The experimental data implied that the P76C mutant tends to be in equilibrium between the pentacoordinate and hexacoordinate forms; the MD and experimental results are complementary and were used to support the proposed mechanisms for the peroxidase-active form of hCytc.
Article
Full-text available
The structure of crystalline carp muscle calcium-binding protein (parvalbumin) has been determined by x-ray diffraction techniques to nominal 1.85-Å resolution. Isomorphous and anomalous scattering data were measured for three heavy atom derivatives, 3-chloromercuri-2-methoxypropyl urea, mercury bromide, and ethyl mercury chloride, to 2.0-Å resolution using precession photography. As described in Paper III in this series the 2.0-Å phases were refined and the 2.0- to 1.85-Å phases were determined by use of the tangent formula. The electron density map is interpreted in terms of the 108 amino acid sequence described in Paper I in this series. A calcium ion is bound in the loop between helix C and helix D and a second calcium is bound in the EF loop. The entire CD region is related to helix E, the EF loop, and the terminal helix F by an approximate intramolecular 2-fold axis. Although it does not bind calcium, the AB region has a structure similar to the CD and EF regions and appears to have resulted from a gene triplication. The molecule is generally spherical with a well defined hydrophobic core, one-seventh of its total volume, composed of side chains of phenylalanine, isoleucine, leucine, and valine. All of the polar side chains are at the surface except those associated with calcium binding and with an invariant internal salt bridge between arginine-75 and glutamic acid-81.
Article
Full-text available
The structure of bovine erythrocyte Cu, Zn superoxide dismutase has been determined to 2 Å resolution using only the larger structure factors beyond 4 Å. The enzyme crystallizes in space group C2 with two dimeric enzyme molecules per asymmetric unit. All four crystallographically independent subunits were fitted separately to the electron density map at 2 Å resolution on the University of North Carolina GRIP-75 molecular graphics system. Atomic co-ordinates were refined using the Hendrickson & Konnert (1980) program for stereochemically restrained refinement against structure factors, which allowed the use of non-crystallographic symmetry. The crystallographic residual error for the refined model was 25.5% with a root-mean-square deviation of 0.03 Å from ideal bond lengths and an average atomic temperature factor of 12 Å². Each enzyme subunit is composed primarily of eight antiparallel β strands that form a flattened cylinder, plus three external loops. The β barrel is asymmetrical and can be viewed as having two distinct sides; β strands 5 to 8 are shorter with fewer hydrogen bonds, less regular side-chain alternation, and greater twist than strands 1 to 4. The main-chain hydrogen bonds primarily link β strand residues; side-chain to main-chain hydrogen bonds are extensively involved in the formation of tight turns, which form a major structural element of the three loops. The largest loop includes both a disulfide region and a Zn-liganding region, each of which resembles one of the other two loops in overall structure. The second largest loop includes a short section of α helix. The smallest loop forms a Greek key connection across one end of the β barrel. The single disulfide bond, which forms a left-handed spiral, covalently joins the largest loop to the beginning of β strand 8. Symmetrically related β bulge pairs fold the two large loops back against the external surface of the β barrel to surround the active channel.
The active site Cu(II) and Zn(II) lie 6.3 Å apart at the bottom of this long channel; the Zn is buried, while the Cu is solvent-accessible. The side-chain of His61 forms a bridge between the Cu and Zn and is coplanar with them within the current accuracy of the data. The Cu ligands ND1 of His44 and NE2 of His46, −61 and −118 show an uneven tetrahedral distortion from a square plane. The Cu has a fifth axial coordination position exposed to solvent. Zn ligands ND1 of His61, −69 and −78 and OD1 of Asp81 show tetrahedral geometry with a strong distortion toward a trigonal pyramid having the buried Asp81 at the apex. Both the side-chains and main chains of the metal-liganding residues are stabilized in their orientation by a complex network of hydrogen bonds.
Article
For a successful analysis of the relation between amino acid sequence and protein structure, an unambiguous and physically meaningful definition of secondary structure is essential. We have developed a set of simple and physically motivated criteria for secondary structure, programmed as a pattern-recognition process of hydrogen-bonded and geometrical features extracted from x-ray coordinates. Cooperative secondary structure is recognized as repeats of the elementary hydrogen-bonding patterns “turn” and “bridge.” Repeating turns are “helices,” repeating bridges are “ladders,” connected ladders are “sheets.” Geometric structure is defined in terms of the concepts torsion and curvature of differential geometry. Local chain “chirality” is the torsional handedness of four consecutive Cα positions and is positive for right-handed helices and negative for ideal twisted β-sheets. Curved pieces are defined as “bends.” Solvent “exposure” is given as the number of water molecules in possible contact with a residue. The end result is a compilation of the primary structure, including SS bonds, secondary structure, and solvent exposure of 62 different globular proteins. The presentation is in linear form: strip graphs for an overall view and strip tables for the details of each of 10,925 residues. The dictionary is also available in computer-readable form for protein structure prediction work.
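The chirality definition in this abstract can be illustrated with a short Python sketch. It is not the authors' program: it simply evaluates the signed dihedral angle of four consecutive Cα positions with the standard atan2 formula. The helper names (`dihedral_deg`, `ideal_helix`) and the idealized helix geometry (2.3 Å radius, 1.5 Å rise, 100° per residue) are assumptions introduced here for illustration.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) defined by four points, in (-180, 180]."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))

def ideal_helix(n, radius=2.3, rise=1.5, turn_deg=100.0, handed=1):
    """Hypothetical idealized C-alpha trace; handed=+1 right-handed, -1 left-handed."""
    pts = []
    for k in range(n):
        t = math.radians(turn_deg * k)
        pts.append((radius * math.cos(t), radius * math.sin(t), handed * rise * k))
    return pts

right = ideal_helix(4)
left = ideal_helix(4, handed=-1)
print(dihedral_deg(*right) > 0)  # right-handed trace: positive chirality
print(dihedral_deg(*left) < 0)   # mirror image: negative chirality
```

Sliding this four-residue window along a real Cα trace would give one signed value per position; how such values are binned into secondary-structure classes is outside what the abstract specifies.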
Article
An improved cube method has been developed for calculating the intensity of diffuse x-ray scattering of macromolecules in solution using a certain set of their atomic coordinates. The technique is based on the ideas of B. Lee and F. M. Richards [(1971) J. Mol. Biol. 55, 374–400] and Richards [(1977) Annu. Rev. Biophys. Bioeng. 6, 151–176] on the possibility of estimating the molecular and accessible surface of a particle by “rolling” a sphere, simulating a water molecule, on its molecular surface. It is shown that this technique is more advantageous than earlier versions of the cube methods. The improved technique for calculating scattering curves was utilized for several globular proteins, and for the first time, reliable scattering curves were obtained for protein-“bound” water complexes. In the case of globular proteins and tRNA, this technique has permitted a strict evaluation of their accessible surfaces, their volumes, and, apparently for the first time, their complete molecular surfaces.
Article
A brief review is given of the use of molecular surface area in estimations of hydrophobic forces and of their influence on protein structure. Molecular area can be used as an estimate of the free energy of transfer of a solute between solvents of differing polarity. Area changes that occur on forming secondary and tertiary structural units are examined including the possible uses of such estimates in algorithms for folding peptide chains. In 1975 Lumry and Rosenberg introduced the concept of mobile defects in discussing protein dynamics with particular reference to hydrogen exchange experiments (25). This idea is examined quantitatively in this paper. A first order approach is taken by assuming volume changes to occur by isometric expansion and contraction of the small cavities that occur as packing defects in the structures of all proteins. The positions and mean volume of these defects can be derived from the known X-ray structure using the Voronoi construction. The volume fluctuations of these cavities are assumed to follow a normal distribution leading to a probability of expansion to any preset value, vo. The standard deviation of the fluctuation, σ, is related to the mean isothermal compressibility of the protein and is a characteristic of that particular molecule (8). The parameter vo is related to the process being considered and should be the same for all proteins. For hydrogen exchange it should be related to the volume of a water molecule. For fluorescence quenching it would reflect the molecular volume of the quencher. The theory has been applied to myoglobin and pancreatic trypsin inhibitor. Reasonable agreement with hydrogen exchange data for the slowly exchanging amide protons can be obtained, but there is difficulty with the rapidly exchanging protons and access to the protein surface.
No explicit account has yet been taken of the changes in exchange rate due to primary or secondary structure, factors which will particularly affect the surface positions.
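The normal-distribution assumption in this abstract can be made concrete with a small Python sketch. It assumes, as one plausible reading, that the "probability of expansion to any preset value" is the upper-tail probability that a cavity volume with a given mean and standard deviation σ reaches or exceeds the threshold; the function name `expansion_probability` and the numbers in the example are hypothetical and are not taken from the paper.

```python
import math

def expansion_probability(v0, mean_volume, sigma):
    """Upper-tail probability P(V >= v0) for V ~ Normal(mean_volume, sigma).

    Assumed reading of the abstract: a cavity 'expands to v0' when its
    fluctuating volume reaches or exceeds the preset threshold v0.
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (v0 - mean_volume) / sigma
    return 0.5 * math.erfc(z / math.sqrt(2.0))

# Hypothetical numbers (cubic angstroms), chosen only to show the trend:
# the further the threshold lies above the mean cavity volume, the rarer the event.
for threshold in (10.0, 15.0, 20.0, 30.0):
    print(threshold, expansion_probability(threshold, mean_volume=10.0, sigma=5.0))
```

Under this reading, a single threshold shared by all proteins and a protein-specific σ are the only inputs besides the mean cavity volume, which matches the roles the abstract assigns to the two parameters.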
Article
The X-ray atomic co-ordinates from 29 proteins of known sequence and structure were utilized to elucidate 459 β-turns in regions of chain reversals. Tetrapeptides whose Cα(i)–Cα(i + 3) distances were below 7 Å and not in a helical region were characterized as β-turns. In addition, β-turns were considered to have hydrogen bonding if their computed O(i)–N(i + 3) distances were ≤3.5 Å. The torsion angles of 26 proteins containing 421 β-turns were examined and classified into 11 bend types based on the (φ, ψ) dihedral angles of the i + 1 and i + 2 bend residues. The average frequency of β-turns is 32% as compared to the 38% helices and 20% β-sheets in the 29 proteins. The most frequently occurring bend residues are Asn, Cys, Asp in the first position, Pro, Ser, Lys in the second position, Asn, Asp, Gly in the third position, and Trp, Gly, Tyr in the fourth position. Residues with the highest β-turn potential in all four positions are Pro, Gly, Asn, Asp, and Ser with the most hydrophobic residues (i.e. Val, Ile, and Leu) showing the lowest bend potential. However, in the region just beyond the β-turns, hydrophobic residues occur with greater frequency than do hydrophilic residues. An environmental analysis of β-turn neighboring residues shows that reverse chain folding is stabilized by anti-parallel β-sheets as well as helix-helix and α-β interactions. The β-turn potentials at the 12 positions adjacent to and including the bend were plotted for the 20 amino acids and showed dramatic positional preferences, which may be classified according to the nature of the side-chains. An examination of the 27 β-turns in elastase showed that 21 were found in identical positions as those in α-chymotrypsin. However, only 37 of the 84 bend residues were conserved, indicating that structural similarity may persist despite differences in sequence homology.
A survey of residues occupying bend types I′, II′ and III′ showed that Gly appeared most frequently in the third position in bend types I′ and III′ as well as in the second position in bend types II′ and III′. Fourteen hydrogen-bonded type II bends were found without a Gly at the third position, contrary to the energy calculations. Eight type VI bends with a cis Pro at the third position were also elucidated.
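The distance criteria stated in this abstract translate directly into a short Python sketch. This is an illustration under stated assumptions rather than the authors' procedure: the function name `find_beta_turns` is hypothetical, coordinates are plain (x, y, z) tuples in Å, and a tetrapeptide is treated as lying "in a helical region" only when all four of its residues are flagged helical, which is one possible interpretation.

```python
import math

CA_CUTOFF = 7.0     # angstroms: C-alpha(i) to C-alpha(i+3) must be below this
HBOND_CUTOFF = 3.5  # angstroms: O(i) to N(i+3) at or below this counts as H-bonded

def find_beta_turns(ca, helical, o=None, n=None):
    """Return (start_index, hydrogen_bonded) for each candidate beta-turn.

    ca      : list of C-alpha coordinates, one per residue
    helical : list of bools, True where a residue is assigned to a helix
    o, n    : optional lists of carbonyl O and amide N coordinates;
              hydrogen_bonded is None when they are not supplied
    """
    turns = []
    for i in range(len(ca) - 3):
        if all(helical[i:i + 4]):  # assumed meaning of "in a helical region"
            continue
        if math.dist(ca[i], ca[i + 3]) < CA_CUTOFF:
            bonded = None
            if o is not None and n is not None:
                bonded = math.dist(o[i], n[i + 3]) <= HBOND_CUTOFF
            turns.append((i, bonded))
    return turns

# Toy hairpin: two antiparallel stretches joined by a tight reversal at residues 2-5.
ca = [(-7.6, 0.0, 0.0), (-3.8, 0.0, 0.0), (0.0, 0.0, 0.0), (3.8, 0.0, 0.0),
      (3.8, 3.8, 0.0), (0.0, 3.8, 0.0), (-3.8, 3.8, 0.0), (-7.6, 3.8, 0.0)]
print(find_beta_turns(ca, [False] * len(ca)))  # [(2, None)]
```

Classifying each hit into the bend types mentioned in the abstract would additionally require the (φ, ψ) angles of residues i + 1 and i + 2, which this sketch does not compute.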
Article
The three-dimensional crystal structure of bovine trypsinogen at approximately pH 7.5 was initially solved at 2.6 Å resolution using the multiple isomorphous replacement method. Preliminary refinement cycles of the atomic coordinates of trypsinogen have been carried out first to a resolution of 2.1 Å, and later to 1.9 Å, using constrained difference Fourier refinement. During the process, structure factors Fc and φc were calculated from the trypsinogen structure and final interpretation was based on an electron-density map computed with terms (2 Fo - Fc) and phases φc at a resolution of 1.9 Å. Crystals of trypsinogen grown from ethanol-water mixtures are trigonal with space group P3121, and cell dimension a = 55.17 Å and c = 109.25 Å. The structure is compared with the bovine diisopropylphosphoryltrypsin structure at approximately pH 7.2, originally determined from orthorhombic crystals by Stroud et al. (Stroud, R.M., Kay, L.M., and Dickerson, R.E. (1971), Cold Spring Harbor Symp. Quant. Biol. 36, 125-140; Stroud, R.M., Kay, L.M., and Dickerson, R.E. (1974), J. Mol. Biol. 83, 185-208), and later refined at 1.5 Å resolution by Chambers and Stroud (Chambers, J.L., and Stroud, R.M. (1976), Acta Crystallogr. (in press)). At lower pH, 4.0-5.5 diogen, with cell dimensions a = 55.05 Å and c = 109.45 Å. This finding was used in the solution of the six trypsinogen heavy-atom derivatives prior to isomorphous phase analysis, and as a further basis of comparison between trypsinogen and the low pH trypsin structure. There are small differences between the two diisopropylphosphoryltrypsin structures. Bovine trypsinogen has a large and accessible cavity at the site where the native enzyme binds specific side chains of a substrate. The conformation and stability of the binding site differ from that found in trypsin at approximately pH 7.5, and from that in the low pH form of diisopropylphosphoryltrypsin.
The catalytic site containing Asp-102, His-57, and Ser-195 is similar to that found in trypsin and contains a similar hydrogen-bonded network. The carboxyl group of Asp-194, which is salt bridged to the amino terminal of Ile-16 in native trypsin or other serine proteases, is apparently hydrogen bonded to internal solvent molecules in a loosely organized part of the zymogen structure. The unusually charged N-terminal hexapeptide of trypsinogen, whose removal leads to activation of the zymogen, lies on the outside surface of the molecule. There are significant structural changes which accompany activation in neighboring regions, which include residues 142-152, 215-550, 188A-195. The NH group of Gly-193, normally involved in stabilization of reaction intermediates (Steitz, T.A., Henderson, R., and Blow, D.M. (1969), J. Mol. Biol. 46, 337-348; Henderson, R. (1970), J. Mol. Biol. 54, 341-354; Robertus, J.D., Kraut, J., Alden, R.A., and Birktoft, J.J. (1972), Biochemistry 11, 4293-4303) in the enzyme, is moved 1.9 Å away from its position in trypsin...
Article
In a previous paper [Levitt, M., and Greer, J. (1977), J. Mol. Biol. 114, 181–239], an objective compilation of the secondary-structure regions in more than 50 different globular proteins was produced automatically. In the present paper, these assignments of secondary structure are analyzed to give the frequency of occurrence of the 20 naturally occurring amino acids in alpha helix, beta sheet, and reverse-turn secondary structure. Nineteen of these amino acids have a weak but statistically significant preference for only one type of secondary structure. These preferences correlate well with the chemical structure of the particular amino acids, giving a more objective classification of the conformational properties of amino acids than was available before.