Identification of phosphosites that alter protein thermal stability
Ian R. Smith, Kyle N. Hess, Anna A. Bakhtina, Anthony S. Valente, Ricard A. Rodríguez-Mias, and Judit Villén*
Department of Genome Sciences, University of Washington, Seattle WA, USA
*Corresponding author, email: jvillen@uw.edu
ABSTRACT
Proteomics has enabled the cataloguing of hundreds of thousands of protein phosphorylation sites [1]; however,
we lack methods to systematically annotate their function. Phosphorylation has numerous
biological functions, yet biochemically all involve changes in protein structure and interactions.
These biochemical changes can be recapitulated by measuring the difference in stability
between the protein and the phosphoprotein. Building on recent work, we present a method to
infer phosphosite functionality by reliably measuring such differences at the proteomic scale.
MAIN TEXT
Recently, Huang et al. [2] developed the Hotspot Thermal Profiling (HTP) method to identify
phosphosites that alter protein thermal stability, reporting 719 out of 2,883 (25%) phosphosites
with significant effects. The reported melting temperatures (Tm) for phosphopeptides correlated
poorly with the Tm of their corresponding proteins (R² = 0.18) (Fig. 1a), implying that many
phosphosites function by structurally reshaping the proteome. However, the low Tm
reproducibility between replicates (Supplementary Fig. 1a) suggests that this conclusion may be
due to technical variation (Supplementary Discussion). The HTP workflow consists of
phosphopeptide enrichment followed by separate isotopic labeling and mass spectrometric
analysis to derive Tm values for phosphopeptides and proteins, respectively. Because
phosphopeptide samples also contained unmodified peptides, which are expected to have the
same Tm as the protein, we can use these peptides to assess technical variation between the
two samples. Disconcertingly, our re-analysis revealed that 626 out of 3,074 (20%) of the
co-enriched unmodified peptides had significant stability effects, almost the same percentage as
phosphopeptides (22%, 596 out of 2,656 in our re-analysis) (Fig. 1b, Dataset S1). Additionally,
the Tm correlation of these peptides with their protein Tm was as low (R² = 0.18) as the
correlation between phosphopeptides and proteins (Supplementary Fig. 1b). In the absence of a
biological explanation, this suggests that the independent labeling and mass spectrometric
analysis of peptide and phosphopeptide samples could have introduced substantial technical
error that precludes the comparison, and perhaps that the reported hits arise from a lack of
stringency in the applied statistical analysis (Supplementary Discussion).
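The comparison of hit rates above is simple arithmetic, and the following minimal Python sketch reproduces it from the counts quoted in this paragraph. The function name is illustrative, and the reading of the unmodified-peptide hit rate as a rough background rate rests on the stated assumption that unmodified peptides share the Tm of their protein.

```python
def hit_rate(significant: int, total: int) -> float:
    """Fraction of tested peptides that were called significant."""
    return significant / total

# Counts quoted in the text (re-analysis of the Huang et al. data).
phospho_rate = hit_rate(596, 2656)      # phosphopeptide isoforms
unmodified_rate = hit_rate(626, 3074)   # co-enriched unmodified peptides

# Assumption: unmodified peptides should match their protein, so their hit rate
# can be read as a rough estimate of the background (technical) hit rate.
print(f"phosphopeptides:     {phospho_rate:.1%}")
print(f"unmodified peptides: {unmodified_rate:.1%}")
print(f"ratio:               {phospho_rate / unmodified_rate:.2f}")
```

A ratio close to 1 indicates that phosphopeptides are called significant barely more often than peptides that carry no modification at all.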
bioRxiv preprint first posted online Jan. 15, 2020; doi: http://dx.doi.org/10.1101/2020.01.14.904300. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.
To minimize technical noise derived from sample preparation, peptide samples should be
labeled and mixed prior to phosphopeptide enrichment (Supplementary Discussion and
accompanying manuscript [3]). Because scaling up isobaric chemical labeling increases reagent
costs substantially, we have developed an alternative approach to identify phosphosites that
alter thermal stability, which we call Dali (Fig. 1c). Dali applies the Proteome Integral Stability
Alteration (PISA) method [4], a simplified version of thermal proteome profiling [5], in which the
soluble protein fractions from the different temperature points are combined to provide an
estimate of the area under the protein melting curve. To reliably compare phosphopeptides to
proteins, we normalize each measurement to a 30°C-treated proteome reference that is labeled
with heavy lysine, obtaining a relative stability (Rs) measurement for phosphopeptides and
proteins. This 30°C reference is mixed with the temperature gradient-treated samples prior to
protein digestion, and it is present during phosphopeptide enrichment and mass spectrometry (MS)
measurement of peptides and phosphopeptides.
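To illustrate why pooling the soluble fractions from all temperature points tracks the area under the melting curve, the following minimal Python sketch simulates the idea. It assumes a simple logistic melting curve with hypothetical Tm and slope values (an assumption made for illustration only, not a model used in this work) and takes the eight gradient temperatures listed in the Methods. It ignores the mixing ratio and the normalization steps of the actual workflow.

```python
import math

# Gradient temperatures (°C) listed in the Methods of this manuscript.
TEMPERATURES = [45.6, 46.8, 48.3, 50.0, 52.0, 53.6, 54.9, 57.0]

def soluble_fraction(temp_c: float, tm: float, slope: float = 1.0) -> float:
    """Hypothetical logistic melting curve: fraction of protein still soluble."""
    return 1.0 / (1.0 + math.exp((temp_c - tm) / slope))

def pooled_signal(tm: float) -> float:
    """Mean soluble fraction across the gradient, mimicking equal-volume pooling."""
    return sum(soluble_fraction(t, tm) for t in TEMPERATURES) / len(TEMPERATURES)

def relative_stability(tm: float) -> float:
    """log2 of the pooled gradient signal over a 30 °C reference of the same protein."""
    reference = soluble_fraction(30.0, tm)
    return math.log2(pooled_signal(tm) / reference)

for tm in (48.0, 51.0, 54.0):
    print(f"Tm = {tm:.1f} °C -> Rs = {relative_stability(tm):.2f}")
```

Under these assumptions a more thermostable protein (higher Tm) retains more soluble signal in the pool and therefore yields a higher Rs, which is the property the pooled measurement relies on.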
We applied Dali to the S. cerevisiae proteome and obtained reproducible Rs
measurements for proteins (average R² = 0.76) and phosphopeptides (average R² = 0.65)
(Supplementary Fig. 1a). In contrast to the Huang et al. dataset, we find that the stability of
phosphopeptides correlates well with the stability of their respective proteins (R² = 0.79 for mean
Rs comparisons) (Fig. 1d), suggesting that most phosphosites do not alter protein stability, as
also observed by Potel et al. [3]. As expected, the stability of non-modified peptides present in the
phosphopeptide-enriched samples also correlated well with their proteins (R² = 0.90 for mean Rs
comparisons), indicating that Rs measurements in the phosphopeptide samples and protein
samples can be reliably compared (Supplementary Fig. 1c). Finally, our analysis yielded 71
phosphopeptide isoforms out of 2,345 (3%) with significantly different thermal stability than the
unmodified protein (Fig. 1e, Dataset S2). We detected several non-modified peptides with
significant differences in stability, yet this set constituted a much smaller fraction than found in
Huang et al. (Dataset S3). Many of these peptides (7 out of 16) were cases where the protein is
known to be post-translationally processed via cleavage (e.g. RPS31 [6]) or splicing (e.g. VMA1 [7]),
resulting in proteins and/or proteoforms of different thermal stability, as our method measured
(Supplementary Fig. 2).
Among phosphosites that decreased protein thermal stability, we identified four sites
located at protein interfaces (Ser56 on PUP2, Ser59 on ARO8, Ser79 on TPI1 and Ser201 on
GAPDH) (Fig. 2a, Supplementary Fig. 3) that may act by disrupting protein-protein interactions.
For example, PUP2 is the alpha 5 subunit of the 20S proteasome, and Ser56 is a known Cdc28
substrate [8] located at the protein interaction interface with PRE6, the 20S proteasome alpha 4
subunit (Fig. 2a). The stability measured for the phosphopeptide spanning Ser56 is significantly
lower than the stability of PUP2, which is similar to other proteins in the 20S proteasome,
suggesting Ser56 phosphorylation may dissociate PUP2 from the 20S proteasome.
We identified stabilizing phosphosites that may play a role in the protein translation
process. For example, we found that phosphorylation at Ser38 on ribosomal protein RPL12/uL11
significantly increased protein stability (Fig. 2b). This phosphosite is an evolutionarily-conserved
Cdc28 substrate [8] that is regulated during the cell cycle [9] and has been reported to be depleted in
polysomes and to influence mitotic translation [10]. Because of the location of RPL12 at the
ribosome P-stalk and the proximity of residue Ser38 to elongation factor 2 (EF-2), we hypothesize
that this phosphosite may modulate the interaction with EF-2 to aid ribosomal translocation during
protein synthesis, and that the change in conformation and binding may stabilize RPL12. We also
identified a stabilizing phosphorylation on NEW1 at Thr1191 (delta Rs = 1.23) (Fig. 2c). NEW1 is a
translation factor that binds to the ribosome at a position analogous to eEF3 and fine-tunes the
efficiency of translation termination [11]. The identified phosphosite fits the CK2 consensus motif,
is located within the acidic C-terminal sequence of NEW1, and is highly conserved. A T1191A mutant
has growth defects [12], suggesting that phosphorylation is important for NEW1 function.
We were able to measure many key glycolysis proteins, identifying phosphosites that
may modulate enzyme kinetics. For example, we measured the stability for six phosphosites on
PGK1, of which only Thr331 showed significantly decreased stability (Fig. 2d). This observation
agrees with the predicted stability effects of phosphomimetic substitutions on PGK1 [13] (Fig. 2d).
We identified a stabilizing phosphosite at Ser149 in the three GAPDH isozymes TDH1, TDH2
and TDH3 (Fig. 2e). Ser149 is adjacent to catalytic Cys150 and to the binding sites of substrates
glyceraldehyde-3-phosphate (G3P) and inorganic phosphate. Interestingly, Ser149
phosphorylation would occupy the inorganic phosphate binding site (Fig. 2e). Additionally, it has
been recently reported that a TDH3 S149A mutant exhibits a growth defect with doxorubicin
compared to wild-type and decreases TDH3 activity to a greater extent than a TDH3
knockout [14]. Our results raise the possibility that S149 phosphorylation may increase the stability
of apo-GAPDH, the GAPDH-G3P reaction intermediate and aid phosphate transfer by
enhancing product release.
In this communication, we have outlined a novel proteomic method that enables robust
thermal stability comparison between proteins and phosphorylated proteoforms. Our method
identified 3% of the measured phosphosites in the S. cerevisiae proteome that significantly changed protein
melting behavior, with several examples potentially altering protein conformation and
interactions. Additional experiments will be needed to precisely characterize the function of
these phosphosites. One limitation of this method is that the sensitivity to detect changes in
stability is lower for proteins with extreme (low or high) melting temperature, which can be
circumvented by performing the experiment using different temperature gradients. Our method
can be extended to other model organisms and cell culture systems, as well as to other
post-translational modifications, expanding the proteomic toolkit to functionally annotate dynamic
protein modifications at scale.
METHODS
Yeast strains
All yeast experiments were conducted on the Saccharomyces cerevisiae haploid strain BY4741
(MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0), a direct descendant from FY2, which is itself a direct
descendant of S288C.
S. cerevisiae growth, stable isotope labeling, and cell harvest
Two overnight yeast cultures were grown at 30°C in synthetic complete media (SCM) containing
6.7g/L yeast nitrogen base, 2g/L of synthetic complete mix minus lysine, 2% glucose, and
supplemented with either regular lysine (light culture) or ²H₄-lysine (heavy culture) at 0.872 mM
final concentration. These cultures were used to seed three 50mL cultures each of light and
heavy at OD600 0.15, which were grown at 30°C, and 45mL were harvested at OD600 ~ 1 by
centrifugation at 7,000 x g for 10min. Yeast pellets were washed by resuspension in 1.5mL
ice-cold sterile water and centrifugation in 2mL screw cap tubes at 21,000 x g for 10min, and
then snap-frozen in liquid nitrogen and stored at -80°C.
Cell lysis and protein extract temperature treatment
Frozen yeast cell pellets were resuspended in 700μL of non-denaturing lysis buffer (50mM
HEPES pH 7.0, 75mM NaCl) containing 0.5X protease inhibitors (Pierce) and phosphatase
inhibitors (50mM β-glycerophosphate, 10mM sodium pyrophosphate, 50mM of NaF, 1mM
sodium orthovanadate) on ice. Cells were lysed by bead beating with 0.5mm zirconia/silica
beads for 4 cycles of 60sec of mechanical agitation followed by 90sec rest on ice. Lysates were
clarified by sequential centrifugation, first at 1,200 x g for 1min to remove the beads and then at
21,000 x g for 10min at 4°C to remove cell debris. To bring all protein extracts to the same
concentration, extract volumes were adjusted to 1 OD600 unit from a 45mL culture in 1mL.
Each cell extract was aliquoted into 2 strips of 8 PCR tubes each (1x8 for the temperature
gradient and 1x8 for the 30°C control), dispensing 50μL of protein extract per tube. All samples
were initially equilibrated to 30°C for 5min. Temperature gradient samples were subjected to
45.6°C, 46.8°C, 48.3°C, 50°C, 52°C, 53.6°C, 54.9°C, and 57°C, one tube at each temperature,
for 5min. In parallel, controls were subjected to an additional 30°C temperature treatment for
5min. All samples were cooled down to room temperature for 10min. For each replicate,
temperature gradient samples were all pooled into one tube and 30°C controls were pooled into a
separate tube prior to centrifugation at 21,000 x g for 30min at 4°C. The soluble protein fractions
for the temperature gradient and 30°C controls were combined 2:1, three replicates with the
temperature gradient labeled heavy and the 30°C controls labeled light, and three additional
replicates with the labels swapped. We generated additional controls where heavy and light 30°C
controls were combined to assess potential differences in protein expression due to the different
labeling. Protein concentration was measured by the BCA assay.
Protein reduction, alkylation, LysC digestion, and desalting
Samples were diluted 2-fold with a buffer containing 8M urea, 50mM HEPES pH 8.9, 75mM
NaCl, 1mM sodium orthovanadate, 50mM β-glycerophosphate, 10mM sodium pyrophosphate,
50mM NaF. Protein samples were subjected to reduction with 7.5mM dithiothreitol (DTT) for
30min at 55°C and alkylation with iodoacetamide (22.5mM) for 30min at room temperature in the
dark with agitation. The alkylation reaction was quenched with an additional 7.5mM DTT at room
temperature for 30min with agitation. The pH was adjusted to 8.5 with 1M Tris pH 8.9. Lysyl
endopeptidase (LysC; Wako Chemicals) was added at a 1:100 enzyme to protein ratio and
protein samples were incubated overnight with agitation at room temperature. LysC digestion
was quenched by addition of trifluoroacetic acid (TFA) to a final concentration of 1% and pH ~2-3
and the digests were stored at -80°C.
Peptide samples were desalted by solid-phase extraction over 50mg Sep-Pak tC18 cartridges
(Waters). Packing material was washed with 1mL methanol, 3 x 1mL 100% acetonitrile, 1mL
70% acetonitrile, 0.25% acetic acid, 1mL 40% acetonitrile, 0.5% acetic acid, and equilibrated
with 3 x 1mL 0.1% TFA. Peptides were then loaded by gravity twice, washed with 3 x 1mL 0.1%
TFA and 1mL 0.5% acetic acid. Peptides were eluted with 600μL of 40% acetonitrile, 0.5%
acetic acid and 400μL 70% acetonitrile, 0.25% acetic acid, and aliquoted as follows: 40μg for
high-pH reversed-phase fractionation, 200μg for Fe³⁺-IMAC phosphopeptide enrichment, and
10μg for preliminary LC-MS/MS analysis to assess sample quality. All samples were dried by
vacuum centrifugation and stored at -80°C.
High-pH reversed-phase fractionation
Peptides were fractionated by high-pH reversed-phase fractionation on a 200μL pipette tip
packed with 4 layers of SDB-XC material (Empore). The material was washed with 50μL
methanol, 50μL 80% acetonitrile, 20mM ammonium formate, and 3 X 50μL 20mM ammonium
formate. Peptides (40μg) were solubilized in 40μL of 5% acetonitrile, 20 mM ammonium
formate, loaded onto the SDB-XC tip, and the flow-through was collected in a mass
spectrometer vial (fraction 1). Peptide fractions 2-5 were obtained by step elution with 40μL of
20mM ammonium formate in 10%, 15%, 20%, and 80% acetonitrile and collection in mass
spectrometry vials. Peptide fractions were dried by vacuum centrifugation, solubilized in 3%
acetonitrile, 4% formic acid, and ~1μg of each fraction was analyzed by LC-MS/MS.
Fe³⁺-NTA IMAC phosphopeptide enrichment
Phosphopeptide enrichment was conducted by immobilized iron cation affinity chromatography
in batch mode and automated in a 96-well format on a KingFisher magnetic particle processor
as we described [15]. For each sample, ~200μg peptides were solubilized in 70μL 0.1% TFA,
80% acetonitrile and incubated with 80μL of a 5% slurry of magnetic Fe-NTA beads (Cube
Biotech) in the same solvent for 30min. Beads were washed three times with 150μL 0.1% TFA,
80% acetonitrile and phosphopeptides were eluted with 50μL 50% acetonitrile, 0.37M
ammonium hydroxide. Eluates were acidified with 30μL 10% formic acid, 75% acetonitrile and
filtered over two-layer C18 extraction disks (Empore) packed in 200μL pipette tip, which had
been previously conditioned with 50μL 100% methanol, 50μL 100% acetonitrile and 50μL 70%
acetonitrile, 0.25% acetic acid. Filtered peptides were collected in a mass spectrometer vial and
the peptides in the extraction disk were further eluted with 50μL 70% acetonitrile, 0.25% acetic
acid and collected into the same mass spectrometry vial. Phosphopeptide-enriched samples
were dried by vacuum centrifugation, solubilized in 3% acetonitrile, 4% formic acid, and one third
of each sample was analyzed by LC-MS/MS.
Liquid chromatography coupled to tandem mass spectrometry
Peptide samples were analyzed by nLC-MS/MS on a nanoAcquity UPLC (Waters) coupled to
an Orbitrap Fusion Lumos Tribrid mass spectrometer (Thermo Fisher, San Jose, USA).
Samples were loaded on a 100μm x 3-cm trap column packed with 3μm C18 beads (Dr.
Maisch), separated on a 100μm x 30-cm capillary analytical column, packed with 1.9μm C18
beads (Dr. Maisch) and set at 50°C, using a 90-min reversed-phase gradient of acetonitrile in
0.125% formic acid, and online analyzed by mass spectrometry using data-dependent
acquisition. Each cycle consisted of 3 sec where one full MS1 scan was acquired on the
orbitrap at 120,000 resolution from 300 to 1575 m/z using an AGC of 7e5 and maximum
injection time of 50ms followed by MS/MS dependent scans on most intense precursor m/z ions
(only considering z = 2 to 5) until exhausting the 3sec cycle time, using 1.6 m/z isolation
window, HCD fragmentation at 28 normalized collision energy, and acquired at 15,000 resolution
on the orbitrap with an AGC of 5e4 (peptide samples) and AGC of 1e5 (phosphopeptide
samples) with a maximum injection time of 22ms. Dynamic exclusion was enabled to exclude
fragmented precursors from repeated MS/MS selection for 30sec. To increase coverage,
phosphopeptide samples were injected twice, and the data from the two technical replicates
were combined.
Database searching, peptide quantification, phosphosite localization, and Rs calculation
MS data files for proteome samples were analyzed with MaxQuant [16] (v.1.6.7.0) to obtain peptide
identifications and quantifications, using the following parameters: S. cerevisiae protein sequence
database downloaded from SGD in July 2014, LysC enzyme specificity (cleavage C-terminal to K),
maximum of 2 missed cleavages, mass tolerance of 20ppm for MS1 and 20ppm for MS2, fixed
modification of carbamidomethyl on cysteines, variable modifications of oxidation on
methionines and acetylation on protein N-termini. Lysine residues were only allowed to be all
light or all ²H₄-Lys within the same peptide. Phosphoproteome samples were processed in
MaxQuant similarly as above, with additional variable modification of phosphorylation on serine,
threonine, and tyrosine residues. All searches were combined for MaxQuant filtering set to 1%
FDR at the level of peptide spectral matches and protein.
Quantification values for heavy and light peptide features were extracted from the evidence.txt
file. Quantification values for features corresponding to the same peptide sequence (e.g. same
peptide identified at multiple charge states or fractions) were summed up. Phosphopeptide
quantification features were aggregated to the phosphopeptide isoform level by summing
features corresponding to the same peptide sequence (e.g. same peptide identified at multiple
charge states or replicate injections) as well as overlapping peptide sequences sharing the
same combination of modifications. For each phosphopeptide isoform we required the
maximum localization probability to be greater than 75% for at least one site.
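The aggregation and localization filter described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the record layout and field names (`sequence`, `light`, `heavy`, `loc_prob`) and all values are hypothetical and are not MaxQuant's column names, and the merging of overlapping peptide sequences that share the same modifications is not shown.

```python
from collections import defaultdict

# Hypothetical feature records; field names and values are illustrative only.
features = [
    {"sequence": "AAS(ph)PEK", "charge": 2, "light": 1.0e6, "heavy": 4.0e5, "loc_prob": 0.98},
    {"sequence": "AAS(ph)PEK", "charge": 3, "light": 5.0e5, "heavy": 2.0e5, "loc_prob": 0.80},
    {"sequence": "GS(ph)TLK", "charge": 2, "light": 3.0e5, "heavy": 1.0e5, "loc_prob": 0.60},
]

def aggregate(records, min_loc_prob=0.75):
    """Sum light and heavy intensities per modified sequence, then keep only
    isoforms whose best localization probability exceeds the threshold."""
    totals = defaultdict(lambda: {"light": 0.0, "heavy": 0.0, "best_loc_prob": 0.0})
    for rec in records:
        entry = totals[rec["sequence"]]
        entry["light"] += rec["light"]
        entry["heavy"] += rec["heavy"]
        entry["best_loc_prob"] = max(entry["best_loc_prob"], rec["loc_prob"])
    return {seq: e for seq, e in totals.items() if e["best_loc_prob"] > min_loc_prob}

isoforms = aggregate(features)
for seq, entry in isoforms.items():
    print(seq, entry["light"], entry["heavy"])
```

Summing before taking ratios lets the more intense features dominate the isoform-level value, which is one common reason for aggregating intensities rather than averaging per-feature ratios.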
Rs values were calculated as log2 ratios of the quantification value for the temperature
gradient-treated sample divided by the respective quantification for the 30°C control. Peptide Rs
distributions were median normalized to 0, and the same correction value derived for each
replicate was applied to normalize the corresponding phosphopeptide isoform Rs distributions.
Peptides and phosphopeptide isoforms with the 5% highest Rs standard deviation across
replicates were excluded from the analysis. Protein Rs values were calculated as the median of
peptide Rs for that protein, requiring a minimum of 2 peptides per protein, and each peptide
observed in at least two replicates.
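As a worked example of the Rs calculation for a single replicate, the following minimal Python sketch computes log2 ratios, derives the median correction from the unmodified peptides, reuses it for the phosphopeptide isoforms, and takes the per-protein median. All names and intensity values are invented for illustration, and the variance filter and replicate requirements described above are omitted.

```python
import math
from collections import defaultdict
from statistics import median

def log2_ratios(gradient, control):
    """log2(gradient / control) for every entry quantified in both channels."""
    return {k: math.log2(gradient[k] / control[k])
            for k in gradient
            if k in control and gradient[k] > 0 and control[k] > 0}

def protein_rs(peptide_rs, peptide_to_protein, min_peptides=2):
    """Median peptide Rs per protein, requiring at least min_peptides peptides."""
    grouped = defaultdict(list)
    for pep, value in peptide_rs.items():
        grouped[peptide_to_protein[pep]].append(value)
    return {prot: median(vals) for prot, vals in grouped.items()
            if len(vals) >= min_peptides}

# Toy intensities for one replicate (all values invented for illustration).
pep_gradient = {"PEPA": 80.0, "PEPB": 40.0, "PEPC": 20.0}
pep_control = {"PEPA": 80.0, "PEPB": 80.0, "PEPC": 80.0}
phos_gradient = {"PEPB(ph)": 10.0}
phos_control = {"PEPB(ph)": 80.0}
peptide_to_protein = {"PEPA": "PROT1", "PEPB": "PROT1", "PEPC": "PROT1"}

raw_pep = log2_ratios(pep_gradient, pep_control)
offset = median(raw_pep.values())  # replicate-specific correction value
pep_rs = {k: v - offset for k, v in raw_pep.items()}
# The same offset is reused for the phosphopeptide isoforms of this replicate.
phos_rs = {k: v - offset
           for k, v in log2_ratios(phos_gradient, phos_control).items()}
prot_rs = protein_rs(pep_rs, peptide_to_protein)
print(pep_rs, phos_rs, prot_rs)
```

Deriving the offset from the unmodified peptides and applying it unchanged to the phosphopeptides keeps both measurements on a common scale, so a difference between a phosphopeptide isoform and its protein is not removed by normalization.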
To identify phosphopeptide isoforms that have different Rs than their unmodified protein
counterpart, we performed a t-test comparing phosphopeptide isoform Rs values (n=6)
to protein Rs values (n=6), assuming unequal variance. Phosphopeptide isoforms
and protein counterparts were required to be observed in at least two replicates. P-values were
corrected for multiple hypothesis testing using the Benjamini-Hochberg method [17]. All data
analysis was conducted using R.
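The two statistical steps named above can be sketched as follows. This is a minimal Python illustration, not the R code used for the analysis: it computes the unequal-variance (Welch) t statistic with its Welch-Satterthwaite degrees of freedom, and Benjamini-Hochberg adjusted p-values. Converting the t statistic to a p-value needs the t-distribution and is left to a statistics library (for example SciPy's `ttest_ind` with `equal_var=False`). The example p-values are invented.

```python
import math
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values (q-values), in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotonic.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Invented p-values for four hypothetical phosphopeptide isoforms.
q_values = benjamini_hochberg([0.01, 0.04, 0.03, 0.20])
significant = [q < 0.05 for q in q_values]
print(q_values, significant)
```

In this toy case three raw p-values fall below 0.05 but only one q-value does, which shows how the correction tightens the hit list when many isoforms are tested at once.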
Structure visualization and bioinformatics
Protein complex annotations were extracted from the CYC2008 resource [18]. Protein structure
coordinates were downloaded from PDB and visualized and manipulated with PyMOL [19]. For
PUP2 interface analysis, we extracted the 20S proteasome protein structure from PDB 1RYP [20].
Protein interface structures for ARO8, TPI1, and GAPDH were extracted from PDB (4JE5 [21],
1NEY [22], 3PYM [23], respectively). To assess the stabilizing effect of S149 phosphorylation at the
catalytic site of GAPDH, we aligned crystal structures of GAPDH with bound G3P (1NQO [24])
and inorganic phosphates (1GYP [25]) to a NAD-bound yeast GAPDH structure (3PYM [23]). Data on
sequence conservation, protein interfaces, and predicted stability effects of mutations (ΔΔGpred)
were obtained from the mutfunc resource [13].
Reanalysis of the Huang et al. data
Supplementary data from Huang et al. [2] was used to calculate the correlation between Tm for
phosphopeptides and proteins and learn about their statistical parameters. For data re-analysis,
all MS files from the study were downloaded from the MassIVE data repository (dataset identifier:
MSV000083786), converted to open format mzXML files, and database searched with Comet [26]
(v.2018.01.4) to obtain peptide and phosphopeptide identifications, with the exception of
“Bulk_6_2”, which failed to convert. Database search parameters were: human protein sequence
database from UniProt (UP000005640), mass tolerance of 50 ppm for precursor m/z and 0.2 Da
for fragment ions, trypsin enzyme specificity (cleavage C-terminal to K, R, except for KP, RP),
maximum of 2 missed cleavages, fixed modification of carbamidomethyl on cysteines and
TMT10 (+229.1629) on lysines and peptide N-termini, and variable modifications of oxidation on
methionines and acetylation on protein N-termini. Phosphorylation samples included variable
modification of phosphorylation at serine, threonine, and tyrosine. Search results were filtered to
a 1% PSM-level FDR with Percolator [27]. Phosphosite localization was conducted with an in-house
C++ implementation of Ascore [28] and sites with Ascore > 13 were considered confidently
localized (P < 0.05). TMT10 reporter ion intensities were extracted from MS/MS scans using
in-house TMT quantification software.
We attempted to replicate the analysis by Huang et al. by following the method description
provided in their manuscript. Biological and technical replicates were treated equally. For each
replicate, TMT reporter ion intensities for all peptide spectral matches from the proteome files
were summed to the protein level, and TMT reporter ion intensities for phosphopeptide spectral
matches were summed to the phosphopeptide isoform level. In addition, we used the same
strategy to aggregate to peptide-level the TMT signals for PSMs mapping to the same
unmodified peptide observed in the phosphopeptide-enriched samples. We used the
TPP package in R to fit melting curves for proteins, phosphopeptide isoforms, and unmodified
peptides in the phosphopeptide-enriched samples. To recapitulate the reported results we had to
conduct the fitting for all samples together (Supplementary Discussion). Melting curves were
filtered for fitting R² > 0.8. T-tests were conducted by comparing Tm values for phosphopeptide
isoforms or unmodified peptides observed in the phosphopeptide-enriched samples to the
unmodified protein Tm values, assuming equal variances and without multiple hypothesis
correction (as implemented by Huang et al.). Of note, our reanalysis revealed that one of the
phosphoproteome technical injections for biological replicate 5 was instead a repeated MS
analysis of biological replicate 4.
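For readers unfamiliar with melting-curve fitting, the following minimal Python sketch shows how a Tm and a fit-quality R² can be derived from a fitted curve. It assumes the three-parameter sigmoid commonly used in thermal proteome profiling, which is believed to match the model of the TPP package but should be treated as an assumption here. The parameter values, temperature series, and "observed" values are hypothetical, and no curve fitting is performed.

```python
import math

def melt_curve(temp_c, a, b, plateau):
    """Three-parameter sigmoid often used for thermal proteome profiling fits."""
    return (1.0 - plateau) / (1.0 + math.exp(-(a / temp_c - b))) + plateau

def melting_temperature(a, b, plateau):
    """Temperature at which the curve crosses 0.5 (requires plateau < 0.5)."""
    return a / (b + math.log(1.0 - 2.0 * plateau))

def r_squared(observed, predicted):
    """Coefficient of determination, used here as the curve-quality filter."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

# Hypothetical fitted parameters and a hypothetical ten-point temperature series (°C).
a, b, plateau = 550.0, 10.0, 0.1
temperatures = [37, 41, 44, 47, 50, 53, 56, 59, 63, 67]
tm = melting_temperature(a, b, plateau)
fitted = [melt_curve(t, a, b, plateau) for t in temperatures]
# Invented "observed" values: the fitted curve with small alternating deviations.
observed = [f + (0.02 if i % 2 == 0 else -0.02) for i, f in enumerate(fitted)]
keep_curve = r_squared(observed, fitted) > 0.8
print(f"Tm = {tm:.1f} °C, R^2 = {r_squared(observed, fitted):.3f}, keep = {keep_curve}")
```

The closed-form Tm follows from setting the curve equal to 0.5 and solving for temperature, so it is only defined when the fitted plateau lies below 0.5.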
Data availability
The mass spectrometry proteomics data generated for this manuscript have been deposited to
the ProteomeXchange Consortium via the PRIDE partner repository with the dataset identifier
PXD016750. Reviewer username is reviewer30568@ebi.ac.uk; password 7aQ1gAbU.
ACKNOWLEDGEMENTS
We thank members of the Villén lab for scientific discussions, in particular Bianca Ruiz, Mario
Leutert, and Alex Hogrebe. We thank Ariadna Llovet Soto and Jimmy Eng for software
developments on the data analysis pipeline. I.R.S. and K.N.H. were supported by NIH training
grant T32HG000035. A.S.V. was supported by NIH training grant T32LM012419. Most of this
work was supported by NIH grant R35GM119536 to J.V. The Villén lab is additionally supported
by NIH grants R01AG056359, R01NS098329, and RM1 HG010461, Human Frontiers Science
Program grant RGP0034/2018, a research program grant from the W.M. Keck Foundation, and
the University of Washington Proteome Resource UWPR95794.
AUTHOR CONTRIBUTIONS
I.R.S., K.N.H., R.A.R.-M., and J.V. conceived the study and designed the experiments. I.R.S.
conducted the experiments with advice from K.N.H., R.A.R.-M. and J.V., and assistance from
A.A.B. I.R.S. analyzed the data with advice from R.A.R.-M. and A.S.V. J.V. supervised the
study. I.R.S. and J.V. wrote the paper and all authors edited it.
COMPETING FINANCIAL INTERESTS
The authors declare no competing interests.
FIGURE LEGENDS
Figure 1. Most phosphosites have little effect on protein stability. a, Scatter plot and
Pearson correlation between Tm of phosphopeptide isoforms (n=10) and Tm of the corresponding
protein (n=11). Values were obtained from the Huang et al. supplementary dataset, using mean
Tm values. Significant phosphopeptide isoforms in blue. b, Volcano plots showing differences in
protein thermal stability (Tm) between phosphopeptide isoforms (n=10, left panel) or unmodified
peptides observed in the phosphopeptide-enriched sample (n=10, right panel) and their
corresponding protein (n=10) (t-test, significant hits at p-value < 0.05 in blue), from Huang et al.
data reanalysis. c, Dali workflow depicting SILAC labeling of yeast, the gradient temperature
treatment of the protein extract, the inclusion of a 30°C control, the quantification of soluble
protein by mass spectrometry, and the calculation of relative stability (Rs). d, Scatter plot and
Pearson correlation as in (a) using Dali’s Rs data. e, Volcano plots as in (b) with x-axis showing
Rs values. Here n=6, p-values were Benjamini-Hochberg corrected, and significant hits are those
with q-value < 0.05. Significant hits in blue, significant phosphopeptide isoforms found in proteins
with known cleavage or splicing events are in orange.
Figure 2. Examples of phosphosites that alter protein thermal stability. a, Boxplot of Rs
values and distributions for the phosphopeptide containing PUP2 Ser56, PUP2 protein, and all
the proteins in the 20S proteasome. Shown at the right is the structure of the 20S proteasome
with PUP2 in blue, Ser56 in red, and PRE6 in grey. b, Rs values for RPL12 and RPL12 S38
phosphopeptide isoform. c, Rs values for NEW1 and phosphopeptide isoform containing NEW1
T1191. d, Rs values for PGK1 and all measured PGK1 phosphopeptide isoforms, with
significantly destabilizing phosphosite S331 shown in red. ΔΔGpred for all glutamic acid
phosphomimetic substitutions were obtained from mutfunc, with ΔΔGpred > 2 considered likely
destabilizing. e, GAPDH S149 phosphopeptide is shared across all GAPDH paralogs (TDH1,
TDH2, and TDH3). Boxplot shows Rs values and distributions for peptides unique to one isoform
(TDH1, TDH2, TDH3), peptides shared among all GAPDH isoforms (all), all peptides for TDH3,
and the S149 phosphopeptide isoform. Bottom panel shows localization of S149 on GAPDH
structure near the binding site of the enzyme substrate. All boxplots show results from 6
biological replicates.
SUPPLEMENTARY FIGURE LEGENDS
Supplementary Figure 1. Reproducibility and robustness of Dali compared to HTP. a,
Boxplot depicting pairwise Pearson correlations between biological and technical replicates for
the HTP (Tm values) and Dali (Rs values) approaches. b, Scatter plot and Pearson correlation
between the mean Tm for unmodified peptides observed in the phosphopeptide-enriched
samples (n=10) and the mean Tm for their corresponding proteins (n=11). Results are from our
re-analysis of the Huang et al. data. c, Scatter plot and Pearson correlation as in (b)
with Rs values obtained from the Dali method (n=6).
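The pairwise replicate correlations summarized in panel a can be illustrated with a short Python sketch. It assumes each replicate is a list of Rs (or Tm) values over the same peptides in the same order; the helper names `pearson` and `pairwise_correlations` and all numbers are hypothetical and are not taken from the study.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def pairwise_correlations(replicates):
    """Return {(name_i, name_j): r} for every unordered pair of replicates."""
    return {
        (a, b): pearson(replicates[a], replicates[b])
        for a, b in combinations(sorted(replicates), 2)
    }


# Made-up Rs values for four peptides measured in three replicates.
replicates = {
    "rep1": [0.10, -0.40, 0.85, 0.30],
    "rep2": [0.12, -0.35, 0.80, 0.28],
    "rep3": [0.05, -0.50, 0.90, 0.35],
}
for pair, r in pairwise_correlations(replicates).items():
    print(pair, round(r, 3))
```

The resulting set of coefficients, one per replicate pair, is the kind of distribution a boxplot such as panel a would display.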
Supplementary Figure 2. Examples of significant hits on proteins that undergo
post-translational splicing or cleavage. a, Rs values for observed VMA1 unmodified peptides
identified in phosphopeptide-enriched samples and proteome samples, displayed across the
length of VMA1. Spliced products from amino acids 2-283 and 738-1031 are joined to generate
the V-type proton ATPase catalytic subunit A proteoform, excising the 284-737 segment⁷.
Peptides derived from the proteome samples are colored in gray, and significant unmodified
peptides found in the phosphopeptide-enriched sample are in red. b, Similar plot to (a) for
RPS31, which is cleaved to generate ubiquitin (amino acids 1-76) and 40S ribosomal
protein S31 (amino acids 77-152)⁶.
Supplementary Figure 3. Examples of phosphosites that alter protein thermal stability
and are located at protein interfaces. Rs boxplots for a, ARO8 S59, b, TPI1 S79, and c,
GAPDH S201 phosphopeptide isoforms and their protein counterparts. ARO8 S59, TPI1 S79,
and GAPDH S201 reside at dimerization interfaces as shown in the structures to the right (PDB
accession: 4JE5, 1NEY, and 3PYM, respectively). Phosphomimetic mutations ARO8 S59E and
TPI1 S79E are predicted to disrupt protein interfaces (ΔΔGpred = 3.78 and ΔΔGpred = 8.04,
respectively). Additionally, the TPI1 S79E mutation is predicted to alter protein conformational
stability (ΔΔGpred = 2.39). ΔΔGpred > 2 is predicted to be destabilizing¹².
Supplementary Figure 4. Phosphosites significantly altering protein thermal stability
using two different statistical settings. Volcano plots showing the ΔTm between the mean
phosphopeptide isoform and its mean protein counterpart on the x-axis, and the t-test probability
on the y-axis. a, The Huang et al. implementation shows a p-value because multiple hypothesis
correction was not applied. Significant phosphopeptide isoforms (blue) are defined by
p-value < 0.05. b, Our proposed analysis consolidates data from MS reanalysis prior to
statistical testing, which is performed assuming unequal variances between phosphopeptide
isoforms and proteins, and corrects p-values for multiple hypothesis testing. Significant
phosphopeptide isoforms (blue) are defined by q-value < 0.05.
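A t-test that assumes unequal variances is commonly known as Welch's t-test. The sketch below, in Python, computes only the Welch statistic and its approximate degrees of freedom for one phosphopeptide isoform against its protein. It is an illustration under stated assumptions, not the authors' code: the function name `welch_t` and the Tm values are hypothetical, and converting the statistic to a p-value requires the t-distribution (for example via `scipy.stats.ttest_ind` with `equal_var=False`), which is omitted here.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t, df) for an unequal-variance two-sample comparison."""
    na, nb = len(sample_a), len(sample_b)
    # Squared standard error of each group mean (sample variance / n).
    se2_a = variance(sample_a) / na
    se2_b = variance(sample_b) / nb
    t = (mean(sample_a) - mean(sample_b)) / sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (na - 1) + se2_b ** 2 / (nb - 1))
    return t, df


# Made-up Tm values (ºC) for one phosphopeptide isoform and its protein.
phospho_tm = [52.1, 51.8, 52.6, 52.3]
protein_tm = [50.2, 50.9, 49.7, 50.5, 50.1]
t, df = welch_t(phospho_tm, protein_tm)
print(f"t = {t:.2f}, df = {df:.1f}")
```

Because the two groups keep separate variance estimates, the degrees of freedom are generally non-integer and fall between the smaller group's n - 1 and the pooled n_a + n_b - 2.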
REFERENCES
1. Hornbeck, P. V. et al. PhosphoSitePlus, 2014: mutations, PTMs and recalibrations. Nucleic Acids Res. 43, D512–D520 (2015).
2. Huang, J. X. et al. High throughput discovery of functional protein modifications by Hotspot Thermal Profiling. Nat. Methods 16, 894–901 (2019).
3. Potel, C., Kurzawa, N., Beecher, I., Mateus, A. & Savitski, M. M. Impact of phosphorylation on thermal stability of proteins. bioRxiv (2020).
4. Gaetani, M. et al. Proteome Integral Solubility Alteration: A High-Throughput Proteomics Assay for Target Deconvolution. J. Proteome Res. 18, 4027–4037 (2019).
5. Savitski, M. M. et al. Tracking cancer drugs in living cells by thermal profiling of the proteome. Science 346 (2014).
6. Finley, D., Bartel, B. & Varshavsky, A. The tails of ubiquitin precursors are ribosomal proteins whose fusion to ubiquitin facilitates ribosome biogenesis. Nature 338, 394–401 (1989).
7. Kane, P. M. et al. Protein splicing converts the yeast TFP1 gene product to the 69-kD subunit of the vacuolar H(+)-adenosine triphosphatase. Science 250, 651–657 (1990).
8. Holt, L. J. et al. Global Analysis of Cdk1 Substrate Phosphorylation Sites Provides Insights into Evolution. Science 325, 1682–1686 (2009).
9. Dephoure, N. et al. A quantitative atlas of mitotic phosphorylation. Proc. Natl. Acad. Sci. 105, 10762–10767 (2008).
10. Imami, K. et al. Phosphorylation of the Ribosomal Protein RPL12/uL11 Affects Translation during Mitosis. Mol. Cell 72, 84-98.e9 (2018).
11. Kasari, V. et al. A role for the Saccharomyces cerevisiae ABCF protein New1 in translation termination/recycling. Nucleic Acids Res. 47, 8807–8820 (2019).
12. Viéitez, C. et al. Towards a systematic map of the functional role of protein phosphorylation. bioRxiv 872770 (2019) doi:10.1101/872770.
13. Wagih, O. et al. A resource of variant effect predictions of single nucleotide variants in model organisms. Mol. Syst. Biol. 14, e8430 (2018).
14. Ochoa, D. et al. The functional landscape of the human phosphoproteome. Nat. Biotechnol. (2019) doi:10.1038/s41587-019-0344-3.
15. Leutert, M., Rodriguez-Mias, R. A., Fukuda, N. K. & Villén, J. R2-P2 rapid-robotic phosphoproteomics enables multidimensional cell signaling studies. Mol. Syst. Biol. 15, e9021 (2019).
16. Cox, J. & Mann, M. MaxQuant enables high peptide identification rates, individualized p.p.b.-range mass accuracies and proteome-wide protein quantification. Nat. Biotechnol. 26, 1367–1372 (2008).
17. Benjamini, Y. & Hochberg, Y. Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing. J. R. Stat. Soc. Ser. B Methodol. 57, 289–300 (1995).
18. Pu, S., Wong, J., Turner, B., Cho, E. & Wodak, S. J. Up-to-date catalogues of yeast protein complexes. Nucleic Acids Res. 37, 825–831 (2009).
19. The PyMOL Molecular Graphics System, Version 1.2r3pre, Schrödinger, LLC.
20. Groll, M. et al. Structure of 20S proteasome from yeast at 2.4 Å resolution. Nature 386, 463–471 (1997).
21. Bulfer, S. L., Brunzelle, J. S. & Trievel, R. C. Crystal structure of Saccharomyces cerevisiae Aro8, a putative α-aminoadipate aminotransferase. Protein Sci. 22, 1417–1424 (2013).
22. Jogl, G., Rozovsky, S., McDermott, A. E. & Tong, L. Optimal alignment for enzymatic proton transfer: Structure of the Michaelis complex of triosephosphate isomerase at 1.2-Å resolution. Proc. Natl. Acad. Sci. 100, 50–55 (2003).
23. Garcia-Saez, I., Kozielski, F., Job, D. & Boscheron, C. Structure of GAPDH 3 from S. cerevisiae at 2.0 Å resolution. Submitt. PDB Data Bank (2010).
24. Didierjean, C. et al. Crystal Structure of Two Ternary Complexes of Phosphorylating Glyceraldehyde-3-phosphate Dehydrogenase from Bacillus stearothermophilus with NAD and d-Glyceraldehyde 3-Phosphate. J. Biol. Chem. 278, 12968–12976 (2003).
25. Kim, H., Feil, I. K., Verlinde, C. L. M. J., Petra, P. H. & Hol, W. G. J. Crystal Structure of Glycosomal Glyceraldehyde-3-phosphate Dehydrogenase from Leishmania mexicana: Implications for Structure-Based Drug Design and a New Position for the Inorganic Phosphate Binding Site. Biochemistry 34, 14975–14986 (1995).
26. Eng, J. K., Jahan, T. A. & Hoopmann, M. R. Comet: an open-source MS/MS sequence database search tool. Proteomics 13, 22–24 (2013).
27. Käll, L., Canterbury, J. D., Weston, J., Noble, W. S. & MacCoss, M. J. Semi-supervised learning for peptide identification from shotgun proteomics datasets. Nat. Methods 4, 923–925 (2007).
28. Beausoleil, S. A., Villén, J., Gerber, S. A., Rush, J. & Gygi, S. P. A probability-based approach for high-throughput protein phosphorylation analysis and site localization. Nat. Biotechnol. 24, 1285–1292 (2006).
Figure 1 (panels a-e; graphic not reproduced). Scatter plot of phosphoisoform Tm (ºC) against protein Tm (ºC), R2 = 0.18; scatter plot of phosphoisoform Rs against protein Rs, R2 = 0.79; workflow schematic (S. cerevisiae cultures labeled with light or heavy Lys, cell lysis, temperature gradient from 45.6ºC to 57ºC alongside a 30ºC control, pooling of soluble fractions, MS analysis of soluble protein); volcano plots of -log10(p-value) against ΔTm (ºC) and of -log10(q-value) against ΔRs, each for phosphopeptide isoforms and unmodified peptides.
Figure 2 (panels a-e; graphic not reproduced). Rs boxplots for the 20S proteasome, PUP2, and PUP2 S56; RPL12 protein and S38; NEW1 protein and T1191; PGK1 protein and phosphosites S36, S203, T331, T392, S397, and S413; and GAPDH peptides unique to one isoform (TDH1, TDH2, TDH3), shared peptides (All), TDH3, and S149. ΔΔG for phosphomimetic mutants S36E, T203E, T331E, T392E, S397E, and S413E. Structures: 20S proteasome with PUP2 (α5), PRE6 (α4), and S56; GAPDH with NADH, C150, S149, H176, and Pi.
Supplementary Figure 1 (panels a-c; graphic not reproduced). Boxplot of replicate correlations (0.00 to 1.00) for phosphoisoforms and protein, for biological (Bio) and technical (Tech) replicates, this study and Huang et al.; scatter plot against protein Tm (ºC), R2 = 0.18; scatter plot against protein Rs, R2 = 0.90.
Supplementary Figure 2 (panels a, b; graphic not reproduced). Rs plotted against protein sequence position for VMA1 and for RPS31 (ubiquitin and RPS31 segments).
Supplementary Figure 3 (panels a-c; graphic not reproduced). Rs boxplots for ARO8 protein and S59; TPI1 protein and S79; and GAPDH peptides unique to one isoform (TDH1, TDH2, TDH3), shared peptides (All), TDH3, and S201; with structures marking ARO8 S59, TPI1 S79, and TDH3 S201.
Supplementary Figure 4 (panels a, b; graphic not reproduced). Volcano plots of -log10(p-value) and -log10(q-value) against ΔTm (phosphoisoform-protein) (ºC).
... In should be noted that recent findings from two independent groups 48,49 raised doubts in the extent of stability-altering phosphorylation postulated by Huang et al. 5 , indicating that the percentage of phosphorylation events leading to shift in protein stability is lower than expected. ...
... The AKT1 samples were resuspended in 20 mM ammonium hydroxide and separated into 96 fractions on an XBrigde BEH C18 2.1 × 150 mm column (Waters; Cat#186003023), using a Dionex Ultimate 3000 2DLC system (Thermo Scientific) over a 48 Fractions were then concatenated into 24 samples in sequential order (e.g. 1,25,49,73). ...
Article
Full-text available
Despite the immense importance of enzyme–substrate reactions, there is a lack of general and unbiased tools for identifying and prioritizing substrate proteins that are modified by the enzyme on the structural level. Here we describe a high-throughput unbiased proteomics method called System-wide Identification and prioritization of Enzyme Substrates by Thermal Analysis (SIESTA). The approach assumes that the enzymatic post-translational modification of substrate proteins is likely to change their thermal stability. In our proof-of-concept stu- dies, SIESTA successfully identifies several known and novel substrate candidates for sele- noprotein thioredoxin reductase 1, protein kinase B (AKT1) and poly-(ADP-ribose) polymerase-10 systems. Wider application of SIESTA can enhance our understanding of the role of enzymes in homeostasis and disease, opening opportunities to investigate the effect of post-translational modifications on signal transduction and facilitate drug discovery.
... Recently, thermal proteome profiling (TPP) has recently been developed 11 , which couples the cellular thermal shift assay (CETSA) 16 with multiplexed quantitative proteomics 17 . Protein thermal stability offers insights into protein state in situ 18 , because it reflects interactions with metabolites 19 , other proteins 15,20,21 and nucleic acids 21 , as well as the post-translational make-up [22][23][24] . Here, we combine reverse genetics with TPP in Escherichia coli to profile the effect of genetic perturbations on protein abundance and thermal stability. ...
... Having the largest perturbation dataset for TPP, we also investigated the underlying reasons for why proteins change melting behaviour in living cells. It has previously been observed that protein thermal stability can be affected by drug 11,20,25,39 , nucleic acid 21 and metabolite 19 binding, as well as protein interactions 15,20,21 and post-translational modifications [22][23][24] . Here, we show that protein thermal stability can also be affected by levels of cofactors and metabolites that directly bind the protein. ...
Article
Full-text available
Recent developments in high-throughput reverse genetics1,2 have revolutionized our ability to map gene function and interactions3–6. The power of these approaches depends on their ability to identify functionally associated genes, which elicit similar phenotypic changes across several perturbations (chemical, environmental or genetic) when knocked out7–9. However, owing to the large number of perturbations, these approaches have been limited to growth or morphological readouts¹⁰. Here we use a high-content biochemical readout, thermal proteome profiling¹¹, to measure the proteome-wide protein abundance and thermal stability in response to 121 genetic perturbations in Escherichia coli. We show that thermal stability, and therefore the state and interactions of essential proteins, is commonly modulated, raising the possibility of studying a protein group that is particularly inaccessible to genetics. We find that functionally associated proteins have coordinated changes in abundance and thermal stability across perturbations, owing to their co-regulation and physical interactions (with proteins, metabolites or cofactors). Finally, we provide mechanistic insights into previously determined growth phenotypes¹² that go beyond the deleted gene. These data represent a rich resource for inferring protein functions and interactions.
... Experimental approaches for high-throughput identification of protein phosphosite functions have been largely lacking. More recently, a promising approach has been to use a modified thermal proteome profiling (TPP) approach where the TPP protocol is conducted in conjunction with phosphoproteomics to identify functional phosphosites [18][19][20][21] . ...
Article
Protein phosphorylation is a key regulatory mechanism involved in nearly every eukaryotic cellular process. Increasingly sensitive mass spectrometry approaches have identified hundreds of thousands of phosphorylation sites, but the functions of a vast majority of these sites remain unknown, with fewer than 5% of sites currently assigned a function. To increase our understanding of functional protein phosphorylation we developed an approach (phospho-DIFFRAC) for identifying the phosphorylation-dependence of protein assemblies in a systematic manner. A combination of nonspecific protein phosphatase treatment, size-exclusion chromatography, and mass spectrometry allowed us to identify changes in protein interactions after the removal of phosphate modifications. With this approach we were able to identify 316 proteins involved in phosphorylation-sensitive interactions. We recovered known phosphorylation-dependent interactors such as the FACT complex and spliceosome, as well as identified novel interactions such as the tripeptidyl peptidase TPP2 and the supraspliceosome component ZRANB2. More generally, we find phosphorylation-dependent interactors to be strongly enriched for RNA-binding proteins, providing new insight into the role of phosphorylation in RNA binding. By searching directly for phosphorylated amino acid residues in mass spectrometry data, we identified the likely regulatory phosphosites on ZRANB2 and FACT complex subunit SSRP1. This study provides both a method and resource for obtaining a better understanding of the role of phosphorylation in native macromolecular assemblies. All mass spectrometry data are available through PRIDE (accession #PXD021422).
... Experimental approaches for high-throughput identification of protein phosphosite functions have been largely lacking. More recently, a promising approach has been to use a modified thermal proteome profiling (TPP) approach where the TPP protocol is conducted in conjunction with phosphoproteomics to identify functional phosphosites [18][19][20][21] . This method has proven useful to directly determine whether a phosphosite is likely to be functional or not. ...
Preprint
Full-text available
Protein phosphorylation is a key regulatory mechanism involved in nearly every eukaryotic cellular process. Increasingly sensitive mass spectrometry approaches have identified hundreds of thousands of phosphorylation sites but the functions of a vast majority of these sites remain unknown, with fewer than 5% of sites currently assigned a function. To increase our understanding of functional protein phosphorylation we developed an approach for identifying the phosphorylation-dependence of protein assemblies in a systematic manner. A combination of non-specific protein phosphatase treatment, size-exclusion chromatography, and mass spectrometry allowed us to identify changes in protein interactions after the removal of phosphate modifications. With this approach we were able to identify 316 proteins involved in phosphorylation-sensitive interactions. We recovered known phosphorylation-dependent interactors such as the FACT complex and spliceosome, as well as identified novel interactions such as the tripeptidyl peptidase TPP2 and the supraspliceosome component ZRANB2. More generally, we find phosphorylation-dependent interactors to be strongly enriched for RNA-binding proteins, providing new insight into the role of phosphorylation in RNA binding. By searching directly for phosphorylated amino acid residues in mass spectrometry data, we identified the likely regulatory phosphosites on ZRANB2 and FACT complex subunit SSRP1. This study provides both a method and resource for obtaining a better understanding of the role of phosphorylation in native macromolecular assemblies.
... The functional relevance of phosphosites can also be uncovered with an emerging strategy that combines thermal proteome profiling (TPP) with phosphoproteomics, that is, "Hotspot Thermal Profiling" (HTP) or "phospho-TPP" (Azimi et al., 2018;Huang et al., 2019;Potel et al., 2020;Smith et al., 2020). In TPP, proteins are treated with a series of increasing heat, which denatures the proteins and render them insoluble. ...
Article
Full-text available
Phosphorylation is a form of protein posttranslational modification (PTM) that regulates many biological processes. Whereas phosphoproteomics is a scientific discipline that identifies and quantifies the phosphorylated proteome using mass spectrometry (MS). This task is extremely challenging as ~30% of the human proteome is phosphorylated; and each phosphoprotein may exist as multiple phospho‐isoforms that are present in low abundance and stoichiometry. Hence, phosphopeptide enrichment techniques are indispensable to (phospho)proteomics laboratories. These enrichment methods encompass widely‐adopted techniques such as (i) affinity‐based chromatography; (ii) ion exchange and mixed‐mode chromatography (iii) enrichment with phospho‐specific antibodies and protein domains, and (iv) functionalized polymers and other less common but emerging technologies such as hydroxyapatite chromatography and precipitation with inorganic ions. Here, we review these techniques, their history, continuous development and evaluation. Besides, we outline associating challenges of phosphoproteomics that are linked to experimental design, sample preparation, and proteolytic digestion. In addition, we also discuss about the future outlooks in phosphoproteomics, focusing on elucidating the noncanonical phosphoproteome and deciphering the “dark phosphoproteome”. © 2020 John Wiley & Sons Ltd. Mass Spec Rev
Article
Introduction Protein phosphorylation is a primary mechanism of signal transduction in cellular systems. Isobaric tagging can be used to investigate alterations in phosphorylation events in sample multiplexing experiments where quantification extends across all conditions. As such, innovations in tandem mass tag methods can facilitate the expansion of the depth and breadth of phosphoproteomics analyses. Areas covered This review discusses the current state of tandem mass tag-centric phosphoproteomics and highlights advances in reagent chemistry, instrumentation, data acquisition, and data analysis. We stress that approaches for phosphoproteomic investigations require high-specificity enrichment, sensitive detection, and accurate phosphorylation site localization. Expert opinion Tandem mass tag-centric phosphoproteomics will continue to be an important conduit for our understanding of signal transduction in living organisms. We anticipate that progress in phosphopeptide enrichment methodologies, enhancements in instrumentation and data acquisition technologies, and further refinements in analytical strategies will be key to the discovery of biologically relevant data from phosphoproteomics studies.
Thesis
In the last decades, molecular biology has transformed into a data-rich discipline. This trend is driven by developments in imaging and the continuous increase in available omics technologies which allow for high-throughput profiling of various types of molecules in a given biological system. Classical omics approaches profile the abundance of thousands of cellular biomolecules, e.g., RNAs or proteins. Recently developed assays, such as Thermal Proteome Profiling (TPP), however, can additionally inform on biophysical states of proteins. By choosing the right experimental design or through contextualization of TPP experiments they can reveal small molecule protein engagement, protein-protein interaction (PPI) dynamics or effects of post-translational modifications (PTM). However, while experimental de- signs, reproducibility, amenable organisms and throughput of the TPP assay are being advanced at a fast pace, computational methods for statistical analysis of obtained data are lagging behind. This thesis proposes a suite of computational methods to provide tools for several of the aforementioned application areas of TPP. First, it describes a software package for analysis of TPP experiments in the context of PPIs and suggests a method for detection of differential PPIs across conditions. The application of this method to different TPP datasets revealed significantly changing PPIs during different phases of the human cell cycle and behavior of protein complexes in Escherichia coli within and across cellular compartments. Second, this work addresses a specific experimental TPP setup called 2D-TPP in which thermal stability of proteins is measured as a function of temperature and concentration of a compound of interest to find proteome-wide interactions of the compound. This was done by implementation of a curve-based hypothesis test to analyze data obtained from such experiments with false discovery rate control. 
The method was benchmarked on simulated data and on several real datasets. Application of the software to 2D-TPP datasets profiling epigenetic drugs revealed hitherto unknown off-targets and downstream effects of these drugs. Third, the same computational method was applied to a 2D-TPP dataset profiling ATP and GTP in a crude cell extract. The analysis of these datasets revealed functional roles of ATP in proteome regulation ranging from allosteric binding, over protein complex assembly and condensate formation. Last, a method for analysis of TPP experiments to profile the effect of PTMs is presented. While the application of this method led to the detection of phosphosites known to be involved in protein regulation, it also pointed out sites which appear to be involved in controlling the localization of proteins to membrane-less organelles. Taken together, this thesis introduces and showcases computational methods for different application areas of TPP. The presented methods are implemented as open source software packages to enable long-term availability and access to the broader community.
Article
Full-text available
Abstract Recent developments in proteomics have enabled signaling studies where > 10,000 phosphosites can be routinely identified and quantified. Yet, current analyses are limited in throughput, reproducibility, and robustness, hampering experiments that involve multiple perturbations, such as those needed to map kinase–substrate relationships, capture pathway crosstalks, and network inference analysis. To address these challenges, we introduce rapid‐robotic phosphoproteomics (R2‐P2), an end‐to‐end automated method that uses magnetic particles to process protein extracts to deliver mass spectrometry‐ready phosphopeptides. R2‐P2 is rapid, robust, versatile, and high‐throughput. To showcase the method, we applied it, in combination with data‐independent acquisition mass spectrometry, to study signaling dynamics in the mitogen‐activated protein kinase (MAPK) pathway in yeast. Our results reveal broad and specific signaling events along the mating, the high‐osmolarity glycerol, and the invasive growth branches of the MAPK pathway, with robust phosphorylation of downstream regulatory proteins and transcription factors. Our method facilitates large‐scale signaling studies involving hundreds of perturbations opening the door to systems‐level studies aiming to capture signaling complexity.
Preprint
Full-text available
Phosphorylation is a critical post-translational modification involved in the regulation of almost all cellular processes. However, less than 5% of thousands of recently discovered phosphorylation sites have a known function. Here, we devised a chemical genetic approach to study the functional relevance of phosphorylation in S. cerevisiae . We generated 474 phospho-deficient mutants that, along with the gene deletion library, were screened for fitness in 102 conditions. Of these, 42% exhibited growth phenotypes, suggesting these phosphosites are likely functional. We inferred their function based on the similarity of their growth profiles with that of gene deletions, and validated a subset by thermal proteome profiling and lipidomics. While some phospho-mutants showed loss-of-function phenotypes, a higher fraction exhibited phenotypes not seen in the corresponding gene deletion suggestive of a gain-of-function effect. For phosphosites conserved in humans, the severity of the yeast phenotypes is indicative of their human functional relevance. This study provides a roadmap for functionally characterizing phosphorylation in a systematic manner.
Article
Full-text available
Protein phosphorylation is a key post-translational modification regulating protein function in almost all cellular processes. Although tens of thousands of phosphorylation sites have been identified in human cells, approaches to determine the functional importance of each phosphosite are lacking. Here, we manually curated 112 datasets of phospho-enriched proteins, generated from 104 different human cell types or tissues. We re-analyzed the 6,801 proteomics experiments that passed our quality control criteria, creating a reference phosphoproteome containing 119,809 human phosphosites. To prioritize functional sites, we used machine learning to identify 59 features indicative of proteomic, structural, regulatory or evolutionary relevance and integrate them into a single functional score. Our approach identifies regulatory phosphosites across different molecular mechanisms, processes and diseases, and reveals genetic susceptibilities at a genomic scale. Several regulatory phosphosites were experimentally validated, including identifying a role in neuronal differentiation for phosphosites in SMARCC2, a member of the SWI/SNF chromatin-remodeling complex. Phosphorylation sites are ranked for functional relevance using a comprehensive, high-quality human phosphoproteome.
Article
Full-text available
Mass spectrometry enables global analysis of posttranslationally modified proteoforms from biological samples, yet we still lack methods to systematically predict, or even prioritize, which modification sites may perturb protein function. Here we describe a proteomic method, Hotspot Thermal Profiling, to detect the effects of site-specific protein phosphorylation on the thermal stability of thousands of native proteins in live cells. This massively parallel biophysical assay unveiled shifts in overall protein stability in response to site-specific phosphorylation sites, as well as trends related to protein function and structure. This method can detect intrinsic changes to protein structure as well as extrinsic changes to protein–protein and protein–metabolite interactions resulting from phosphorylation. Finally, we show that functional ‘hotspot’ protein modification sites can be discovered and prioritized for study in a high-throughput and unbiased fashion. This approach is applicable to diverse organisms, cell types and posttranslational modifications.
Article
Full-text available
Translation is controlled by numerous accessory proteins and translation factors. In the yeast Saccharomyces cerevisiae, translation elongation requires an essential elongation factor, the ABCF ATPase eEF3. A closely related protein, New1, is encoded by a non-essential gene with cold sensitivity and ribosome assembly defect knock-out phenotypes. Since the exact molecular function of New1 is unknown, it is unclear if the ribosome assembly defect is direct, i.e. New1 is a bona fide assembly factor, or indirect, for instance due to a defect in protein synthesis. To investigate this, we employed yeast genetics, cryo-electron microscopy (cryo-EM) and ribosome profiling (Ribo-Seq) to interrogate the molecular function of New1. Overexpression of New1 rescues the inviability of a yeast strain lacking the otherwise strictly essential translation factor eEF3. The structure of the ATPase-deficient (EQ2) New1 mutant locked on the 80S ribosome reveals that New1 binds analogously to the ribosome as eEF3. Finally, Ribo-Seq analysis revealed that loss of New1 leads to ribosome queuing upstream of 3'-terminal lysine and arginine codons, including those genes encoding proteins of the cytoplasmic translational machinery. Our results suggest that New1 is a translation factor that fine-tunes the efficiency of translation termination or ribosome recycling.
The effect of single nucleotide variants (SNVs) in coding and noncoding regions is of great interest in genetics. Although many computational methods aim to elucidate the effects of SNVs on cellular mechanisms, it is not straightforward to comprehensively cover different molecular effects. To address this, we compiled and benchmarked sequence and structure-based variant effect predictors and we computed the impact of nearly all possible amino acid and nucleotide variants in the reference genomes of Homo sapiens, Saccharomyces cerevisiae and Escherichia coli. Studied mechanisms include protein stability, interaction interfaces, post-translational modifications and transcription factor binding sites. We apply this resource to the study of natural and disease coding variants. We also show how variant effects can be aggregated to generate protein complex burden scores that uncover protein complex to phenotype associations based on a set of newly generated growth profiles of 93 sequenced S. cerevisiae strains in 43 conditions. This resource is available through mutfunc (www.mutfunc.com), a tool by which users can query precomputed predictions by providing amino acid or nucleotide-level variants. © 2018 The Authors. Published under the terms of the CC BY 4.0 license
PhosphoSitePlus® (PSP, http://www.phosphosite.org/), a knowledgebase dedicated to mammalian post-translational modifications (PTMs), contains over 330 000 non-redundant PTMs, including phospho, acetyl, ubiquityl and methyl groups. Over 95% of the sites are from mass spectrometry (MS) experiments. In order to improve data reliability, early MS data have been reanalyzed, applying a common standard of analysis across over 1 000 000 spectra. Site assignments with P > 0.05 were filtered out. Two new downloads are available from PSP. The 'Regulatory sites' dataset includes curated information about modification sites that regulate downstream cellular processes, molecular functions and protein-protein interactions. The 'PTMVar' dataset, an intersect of missense mutations and PTMs from PSP, identifies over 25 000 PTMVars (PTMs Impacted by Variants) that can rewire signaling pathways. The PTMVar data include missense mutations from UniProtKB, TCGA and other sources that cause over 2000 diseases or syndromes (MIM) and polymorphisms, or are associated with hundreds of cancers. PTMVars include 18 548 phosphorylation sites, 3412 ubiquitylation sites, 2316 acetylation sites, 685 methylation sites and 245 succinylation sites. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
Various agents, including drugs as well as non-molecular stimuli, induce alterations in the physico-chemical properties of proteins in cell lysates, living cells and organisms. These alterations can be probed by applying a stability- and solubility-modifying factor, such as elevated temperature, to a varying degree. As a second dimension of variation, drug concentration or agent intensity/concentration can be used. Compared to standard approaches where curves are fitted to protein solubility data acquired at different temperatures and drug concentrations, the Proteome Integral Solubility Alteration (PISA) assay increases the analysis throughput by one to two orders of magnitude for an unlimited number of factor variation points in such a scheme. The consumption of the compound and biological material decreases in PISA by the same factor. We envision widespread use of the PISA approach in chemical biology and drug development.
Emerging evidence indicates that heterogeneity in ribosome composition can give rise to specialized functions. Until now, research mainly focused on differences in core ribosomal proteins and associated factors. The effect of posttranslational modifications has not been studied systematically. Analyzing ribosome heterogeneity is challenging because individual proteins can be part of different subcomplexes (40S, 60S, 80S, and polysomes). Here we develop polysome proteome profiling to obtain unbiased proteomic maps across ribosomal subcomplexes. Our method combines extensive fractionation by sucrose gradient centrifugation with quantitative mass spectrometry. The high resolution of the profiles allows us to assign proteins to specific subcomplexes. Phosphoproteomics on the fractions reveals that phosphorylation of serine 38 in RPL12/uL11, a known mitotic CDK1 substrate, is strongly depleted in polysomes. Follow-up experiments confirm that RPL12/uL11 phosphorylation regulates the translation of specific subsets of mRNAs during mitosis. Together, our results show that posttranslational modification of ribosomal proteins can regulate translation.
The thermal stability of proteins can be used to assess ligand binding in living cells. We have generalized this concept by determining the thermal profiles of more than 7000 proteins in human cells by means of mass spectrometry. Monitoring the effects of small-molecule ligands on the profiles delineated more than 50 targets for the kinase inhibitor staurosporine. We identified the heme biosynthesis enzyme ferrochelatase as a target of kinase inhibitors and suggest that its inhibition causes the phototoxicity observed with vemurafenib and alectinib. Thermal shifts were also observed for downstream effectors of drug treatment. In live cells, dasatinib induced shifts in BCR-ABL pathway proteins, including CRK/CRKL. Thermal proteome profiling provides an unbiased measure of drug-target engagement and facilitates identification of markers for drug efficacy and toxicity.