Available via license: CC BY 4.0
Identification of phosphosites that alter protein thermal stability
Ian R. Smith, Kyle N. Hess, Anna A. Bakhtina, Anthony S. Valente, Ricard A. Rodríguez-Mias
and Judit Villén
Department of Genome Sciences, University of Washington, Seattle WA, USA
*Corresponding author, email: jvillen@uw.edu
ABSTRACT
Proteomics has enabled the cataloguing of hundreds of thousands of protein phosphorylation
sites [1]; however, we lack methods to systematically annotate their function. Phosphorylation has numerous
biological functions, yet biochemically all involve changes in protein structure and interactions.
These biochemical changes can be recapitulated by measuring the difference in stability
between the protein and the phosphoprotein. Building on recent work, we present a method to
infer phosphosite functionality by reliably measuring such differences at the proteomic scale.
MAIN TEXT
Recently, Huang et al. [2] developed the Hotspot Thermal Profiling (HTP) method to identify
phosphosites that alter protein thermal stability, reporting 719 out of 2,883 (25%) phosphosites
with significant effects. The reported melting temperatures (Tm) for phosphopeptides correlated
poorly with the Tm of their corresponding proteins (R² = 0.18) (Fig. 1a), implying that many
phosphosites function by structurally reshaping the proteome. However, the low Tm
reproducibility between replicates (Supplementary Fig. 1a) suggests that this conclusion may be
due to technical variation (Supplementary Discussion). The HTP workflow consists of
phosphopeptide enrichment followed by separate isotopic labeling and mass spectrometric
analysis to derive Tm values for phosphopeptides and proteins, respectively. Because
phosphopeptide samples also contained unmodified peptides, which are expected to have the
same Tm as the protein, we can use these peptides to assess technical variation between the
two samples. Disconcertingly, our re-analysis revealed that 626 out of 3,074 (20%) of the
co-enriched unmodified peptides had significant stability effects, almost the same percentage as
phosphopeptides (22%, 596 out of 2,656 in our re-analysis) (Fig. 1b, Dataset S1). Additionally,
the correlation between the Tm of these peptides and their protein Tm was as low (R² = 0.18) as
the correlation between phosphopeptides and proteins (Supplementary Fig. 1b). In the absence of a
biological explanation, this suggests that the independent labeling and mass spectrometric
analysis of peptide and phosphopeptide samples could have introduced substantial technical
error that precludes the comparison, and perhaps that the reported hits arise from a lack of
stringency in the applied statistical analysis (Supplementary Discussion).
bioRxiv preprint first posted online Jan. 15, 2020; doi: http://dx.doi.org/10.1101/2020.01.14.904300.
The copyright holder for this preprint (which was not peer-reviewed) is the author/funder, who has
granted bioRxiv a license to display the preprint in perpetuity. It is made available under a
CC-BY 4.0 International license.
To minimize technical noise derived from sample preparation, peptide samples should be
labeled and mixed prior to phosphopeptide enrichment (Supplementary Discussion and
accompanying manuscript [3]). Because scaling up isobaric chemical labeling increases reagent
costs substantially, we have developed an alternative approach to identify phosphosites that
alter thermal stability, which we call Dali (Fig. 1c). Dali applies the Proteome Integral Stability
Alteration (PISA) method [4], a simplified version of thermal proteome profiling [5] in which the
soluble protein fractions from the different temperature points are combined to provide an
estimate of the area under the protein melting curve. To reliably compare phosphopeptides to
proteins, we normalize each measurement to a 30 °C-treated proteome reference that is labeled
with heavy lysine, obtaining a relative stability (Rs) measurement for phosphopeptides and
proteins. This 30 °C reference is mixed with the temperature gradient-treated samples prior to
protein digestion, and it is present during phosphopeptide enrichment and mass spectrometry (MS)
measurement of peptides and phosphopeptides.
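The pooling idea can be illustrated with a toy calculation. The sketch below is not part of the Dali pipeline: it assumes a simple two-state sigmoid melting curve (the function names, the `slope` parameter, and the assumption that the 30 °C reference is fully soluble are all hypothetical) and uses the eight gradient temperatures listed in the Methods. Averaging the soluble fraction over the gradient mimics equal-volume pooling, and the log2 of that average relative to the reference gives an Rs-like value.

```python
import math

# Gradient temperatures (deg C) taken from the Methods of this paper.
GRADIENT_C = [45.6, 46.8, 48.3, 50.0, 52.0, 53.6, 54.9, 57.0]


def soluble_fraction(temp_c, tm_c, slope=1.0):
    """Assumed two-state sigmoid: fraction of protein still soluble at temp_c."""
    return 1.0 / (1.0 + math.exp((temp_c - tm_c) / slope))


def pooled_signal(tm_c, temps=GRADIENT_C, slope=1.0):
    """Mean soluble fraction over the gradient, mimicking equal-volume pooling."""
    return sum(soluble_fraction(t, tm_c, slope) for t in temps) / len(temps)


def relative_stability(tm_c, temps=GRADIENT_C, slope=1.0):
    """log2 of the pooled signal over a 30 deg C reference assumed to equal 1.0."""
    return math.log2(pooled_signal(tm_c, temps, slope))


if __name__ == "__main__":
    for tm in (48.0, 51.0, 54.0, 62.0):
        print(f"Tm = {tm:4.1f} C -> Rs-like value = {relative_stability(tm):.3f}")
```

In this toy model a higher Tm gives a less negative value, and a protein that melts well above the gradient yields a value close to 0 almost regardless of its exact Tm.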
We applied Dali to the S. cerevisiae proteome and obtained reproducible Rs
measurements for proteins (average R² = 0.76) and phosphopeptides (average R² = 0.65)
(Supplementary Fig. 1a). In contrast to the Huang et al. dataset, we find that the stability of
phosphopeptides correlates well with the stability of their respective proteins (R² = 0.79 for mean
Rs comparisons) (Fig. 1d), suggesting that most phosphosites do not alter protein stability, as
also observed by Potel et al. [3]. As expected, the stability of non-modified peptides present in the
phosphopeptide-enriched samples also correlated well with that of their proteins (R² = 0.90 for
mean Rs comparisons), indicating that Rs measurements in the phosphopeptide samples and protein
samples can be reliably compared (Supplementary Fig. 1c). Finally, our analysis yielded 71
phosphopeptide isoforms out of 2,345 (3%) with significantly different thermal stability than the
unmodified protein (Fig. 1e, Dataset S2). We detected several non-modified peptides with
significant differences in stability, yet this set constituted a much smaller fraction than found in
Huang et al. (Dataset S3). Many of these peptides (7 out of 16) were cases where the protein is
known to be post-translationally processed via cleavage (e.g. RPS31 [6]) or splicing (e.g. VMA1 [7]),
resulting in proteins and/or proteoforms of different thermal stability, as our method measured
(Supplementary Fig. 2).
Among phosphosites that decreased protein thermal stability, we identified four sites
located at protein interfaces (Ser56 on PUP2, Ser59 on ARO8, Ser79 on TPI1 and Ser201 on
GAPDH) (Fig. 2a, Supplementary Fig. 3) that may act by disrupting protein-protein interactions.
For example, PUP2 is the alpha 5 subunit of the 20S proteasome, and Ser56 is a known Cdc28
substrate [8] located at the protein interaction interface with PRE6, the 20S proteasome alpha 4
subunit (Fig. 2a). The stability measured for the phosphopeptide spanning Ser56 is significantly
lower than the stability of PUP2, which is similar to other proteins in the 20S proteasome,
suggesting Ser56 phosphorylation may dissociate PUP2 from the 20S proteasome.
We identified stabilizing phosphosites that may play a role in the protein translation
process. For example, we found that phosphorylation at Ser38 on ribosomal protein RPL12/uL11
significantly increased protein stability (Fig. 2b). This phosphosite is an evolutionarily-conserved
Cdc28 substrate [8] that is regulated during the cell cycle [9] and has been reported to be
depleted in polysomes and to influence mitotic translation [10]. Given the location of RPL12 at the
ribosome P-stalk and the proximity of residue Ser38 to elongation factor 2 (EF-2), we hypothesize
that this phosphosite may modulate the interaction with EF-2 to aid ribosomal translocation during
protein synthesis, and that the change in conformation and binding may stabilize RPL12. We also
identified a stabilizing phosphorylation on NEW1 at Thr1191 (ΔRs = 1.23) (Fig. 2c). NEW1 is a
translation factor that binds to the ribosome at a position analogous to eEF3 and fine-tunes the
efficiency of translation termination [11]. The identified phosphosite fits the CK2 consensus
motif, is located within the acidic C-terminal sequence of NEW1, and is highly conserved. A T1191A
mutant has growth defects [12], suggesting that phosphorylation is important for NEW1 function.
We measured many key glycolytic proteins and identified phosphosites that
may modulate enzyme kinetics. For example, we measured the stability of six phosphosites on
PGK1, of which only Thr331 showed significantly decreased stability (Fig. 2d). This observation
agrees with the predicted stability effects of phosphomimetic substitutions on PGK1 [13] (Fig. 2d).
We identified a stabilizing phosphosite at Ser149 in the three GAPDH isozymes TDH1, TDH2
and TDH3 (Fig. 2e). Ser149 is adjacent to the catalytic Cys150 and to the binding sites of the
substrates glyceraldehyde-3-phosphate (G3P) and inorganic phosphate. Interestingly, a
phosphorylated Ser149 would occupy the inorganic phosphate binding site (Fig. 2e). Additionally,
it has recently been reported that a TDH3 S149A mutant exhibits a growth defect with doxorubicin
compared to wild-type and decreases TDH3 activity to a greater extent than a TDH3
knockout [14]. Our results raise the possibility that S149 phosphorylation may increase the
stability of apo-GAPDH and of the GAPDH-G3P reaction intermediate, and aid phosphate transfer by
enhancing product release.
In this communication, we have outlined a novel proteomic method that enables robust
thermal stability comparison between proteins and phosphorylated proteoforms. Our method
identified 3% of phosphosites in the S. cerevisiae proteome as significantly changing protein
melting behavior, with several examples potentially altering protein conformation and
interactions. Additional experiments will be needed to precisely characterize the function of
these phosphosites. One limitation of this method is that the sensitivity to detect changes in
stability is lower for proteins with extreme (low or high) melting temperatures, which can be
circumvented by performing the experiment using different temperature gradients. Our method
can be extended to other model organisms and cell culture systems, as well as to other
post-translational modifications, expanding the proteomic toolkit to functionally annotate dynamic
protein modifications at scale.
METHODS
Yeast strains
All yeast experiments were conducted on the Saccharomyces cerevisiae haploid strain BY4741
(MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0), a direct descendant of FY2, which is itself a direct
descendant of S288C.
S. cerevisiae growth, stable isotope labeling, and cell harvest
Two overnight yeast cultures were grown at 30 °C in synthetic complete media (SCM) containing
6.7g/L yeast nitrogen base, 2g/L of synthetic complete mix minus lysine, and 2% glucose,
supplemented with either regular lysine (light culture) or ²H₄-lysine (heavy culture) at 0.872 mM
final concentration. These cultures were used to seed three 50mL cultures each of light and
heavy at OD600 0.15, which were grown at 30 °C; 45mL of each culture was harvested at OD600 ~1 by
centrifugation at 7,000 x g for 10min. Yeast pellets were washed by resuspension in 1.5mL
ice-cold sterile water and centrifugation in 2mL screw cap tubes at 21,000 x g for 10min, and
then snap-frozen in liquid nitrogen and stored at -80 °C.
Cell lysis and protein extract temperature treatment
Frozen yeast cell pellets were resuspended in 700μL of non-denaturing lysis buffer (50mM
HEPES pH 7.0, 75mM NaCl) containing 0.5X protease inhibitors (Pierce) and phosphatase
inhibitors (50mM β-glycerophosphate, 10mM sodium pyrophosphate, 50mM of NaF, 1mM
sodium orthovanadate) on ice. Cells were lysed by bead beating with 0.5mm zirconia/silica
beads for 4 cycles of 60sec of mechanical agitation followed by 90sec rest on ice. Lysates were
clarified by sequential centrifugation, first at 1,200 x g for 1min to remove the beads and then at
21,000 x g for 10min at 4 °C to remove cell debris. To bring all protein extracts to the same
concentration, extract volumes were adjusted to 1 OD600 unit from a 45mL culture in 1mL.
Each cell extract was aliquoted into 2 strips of 8 PCR tubes each (1x8 for the temperature
gradient and 1x8 for the 30 °C control), dispensing 50μL of protein extract per tube. All samples
were initially equilibrated to 30 °C for 5 min. Temperature gradient samples were subjected to
45.6 °C, 46.8 °C, 48.3 °C, 50 °C, 52 °C, 53.6 °C, 54.9 °C, and 57 °C, one tube per temperature,
for 5min. In parallel, controls were subjected to an additional 30 °C temperature treatment for
5min. All samples were cooled down to room temperature for 10min. For each replicate, temperature
gradient samples were all pooled into one tube and 30 °C controls were pooled into a separate
tube prior to centrifugation at 21,000 x g for 30min at 4 °C. The soluble protein fractions for the
temperature gradient and 30 °C controls were combined 2:1, three replicates with the
temperature gradient labeled heavy and the 30 °C controls labeled light, and three additional
replicates with the labels swapped. We generated additional controls where heavy and light 30 °C
controls were combined to assess potential differences in protein expression due to the different
labeling. Protein concentration was measured by the BCA assay.
Protein reduction, alkylation, LysC digestion, and desalting
Samples were diluted 2-fold with a buffer containing 8M urea, 50mM HEPES pH 8.9, 75mM
NaCl, 1mM sodium orthovanadate, 50mM β-glycerophosphate, 10mM sodium pyrophosphate,
50mM NaF. Protein samples were subjected to reduction with 7.5mM dithiothreitol (DTT) for
30min at 55 °C and alkylation with 22.5mM iodoacetamide for 30min at room temperature in the
dark with agitation. The alkylation reaction was quenched with an additional 7.5mM DTT at room
temperature for 30min with agitation. The pH was adjusted to 8.5 with 1M Tris pH 8.9. Lysyl
endopeptidase (LysC; Wako Chemicals) was added at a 1:100 enzyme to protein ratio and
protein samples were incubated overnight with agitation at room temperature. LysC digestion
was quenched by addition of trifluoroacetic acid (TFA) to a final concentration of 1% (pH ~2-3),
and the digests were stored at -80 °C.
Peptide samples were desalted by solid-phase extraction over 50mg Sep-Pak tC18 cartridges
(Waters). Packing material was washed with 1mL methanol; 3 x 1mL 100% acetonitrile; 1mL
70% acetonitrile, 0.25% acetic acid; and 1mL 40% acetonitrile, 0.5% acetic acid; and then
equilibrated with 3 x 1mL 0.1% TFA. Peptides were then loaded by gravity twice and washed with
3 x 1mL 0.1% TFA and 1mL 0.5% acetic acid. Peptides were eluted with 600μL of 40% acetonitrile,
0.5% acetic acid and 400μL of 70% acetonitrile, 0.25% acetic acid, and aliquoted as follows: 40μg
for high-pH reversed-phase fractionation, 200μg for Fe³⁺-IMAC phosphopeptide enrichment, and
10μg for preliminary LC-MS/MS analysis to assess sample quality. All samples were dried by
vacuum centrifugation and stored at -80 °C.
High-pH reversed-phase fractionation
Peptides were fractionated by high-pH reversed-phase fractionation on a 200μL pipette tip
packed with 4 layers of SDB-XC material (Empore). The material was washed with 50μL
methanol, 50μL 80% acetonitrile, 20mM ammonium formate, and 3 x 50μL 20mM ammonium
formate. Peptides (40μg) were solubilized in 40μL of 5% acetonitrile, 20mM ammonium
formate, loaded onto the SDB-XC tip, and the flow-through was collected in a mass
spectrometer vial (fraction 1). Peptide fractions 2-5 were obtained by step elution with 40μL of
20mM ammonium formate in 10%, 15%, 20%, and 80% acetonitrile and collection in mass
spectrometry vials. Peptide fractions were dried by vacuum centrifugation, solubilized in 3%
acetonitrile, 4% formic acid, and ~1μg of each fraction was analyzed by LC-MS/MS.
Fe³⁺-NTA IMAC phosphopeptide enrichment
Phosphopeptide enrichment was conducted by immobilized iron cation affinity chromatography
in batch mode and automated in a 96-well format on a KingFisher magnetic particle processor
as we described [15]. For each sample, ~200μg peptides were solubilized in 70μL 0.1% TFA,
80% acetonitrile and incubated with 80μL of a 5% slurry of magnetic Fe-NTA beads (Cube
Biotech) in the same solvent for 30min. Beads were washed three times with 150μL 0.1% TFA,
80% acetonitrile and phosphopeptides were eluted with 50μL 50% acetonitrile, 0.37M
ammonium hydroxide. Eluates were acidified with 30μL 10% formic acid, 75% acetonitrile and
filtered over two-layer C18 extraction disks (Empore) packed in 200μL pipette tip, which had
been previously conditioned with 50μL 100% methanol, 50μL 100% acetonitrile and 50μL 70%
acetonitrile, 0.25% acetic acid. Filtered peptides were collected in a mass spectrometer vial and
the peptides in the extraction disk were further eluted with 50μL 70% acetonitrile, 0.25% acetic
acid and collected into the same mass spectrometry vial. Phosphopeptide-enriched samples
were dried by vacuum centrifugation, solubilized in 3% acetonitrile, 4% formic acid, and one third
of each sample was analyzed by LC-MS/MS.
Liquid chromatography coupled to tandem mass spectrometry
Peptide samples were analyzed by nLC-MS/MS on a nanoAcquity UPLC (Waters) coupled to
an Orbitrap Fusion Lumos Tribrid mass spectrometer (Thermo Fisher, San Jose, USA).
Samples were loaded on a 100μm x 3-cm trap column packed with 3μm C18 beads (Dr.
Maisch), separated on a 100μm x 30-cm capillary analytical column packed with 1.9μm C18
beads (Dr. Maisch) and held at 50 °C, using a 90-min reversed-phase gradient of acetonitrile in
0.125% formic acid, and analyzed online by mass spectrometry using data-dependent
acquisition. Each 3sec cycle consisted of one full MS1 scan acquired on the orbitrap at 120,000
resolution from 300 to 1575 m/z, using an AGC target of 7e5 and a maximum injection time of 50ms,
followed by data-dependent MS/MS scans on the most intense precursor ions (only considering
z = 2 to 5) until the 3sec cycle time was exhausted. MS/MS scans used a 1.6 m/z isolation
window and HCD fragmentation at 28 normalized collision energy, and were acquired at 15,000
resolution on the orbitrap with an AGC target of 5e4 (peptide samples) or 1e5 (phosphopeptide
samples) and a maximum injection time of 22ms. Dynamic exclusion was enabled to exclude
fragmented precursors from repeated MS/MS selection for 30sec. To increase coverage,
phosphopeptide samples were injected twice, and the data from the two technical replicates
were combined.
Database searching, peptide quantification, phosphosite localization, and Rs calculation
MS data files for proteome samples were analyzed with MaxQuant [16] (v.1.6.7.0) to obtain peptide
identifications and quantifications, using the following parameters: S. cerevisiae protein sequence
database downloaded from SGD in July 2014, LysC enzyme specificity (cleavage C-terminal to K),
maximum of 2 missed cleavages, mass tolerance of 20ppm for MS1 and 20ppm for MS2, fixed
modification of carbamidomethyl on cysteines, and variable modifications of oxidation on
methionines and acetylation on protein N-termini. Lysine residues were only allowed to be all
light or all ²H₄-Lys within the same peptide. Phosphoproteome samples were processed in
MaxQuant as above, with the additional variable modification of phosphorylation on serine,
threonine, and tyrosine residues. All searches were combined for MaxQuant filtering, set to 1%
FDR at the level of peptide spectral matches and proteins.
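To make the enzyme specificity and missed-cleavage settings concrete, the following sketch performs an in-silico digestion. It is an illustration only and is not how MaxQuant or Comet enumerate peptides internally; the function name `digest` and its parameters are hypothetical. Cleavage is placed C-terminal to the listed residues, optionally blocked when the next residue is in `no_cleave_before`, and up to `max_missed` missed cleavages are allowed.

```python
def digest(sequence, cleave_after="K", max_missed=2, no_cleave_before=""):
    """Return peptides obtained by cleaving C-terminal to residues in cleave_after."""
    if not sequence:
        return []
    # Collect cleavage positions as indices just after each cleavage residue.
    sites = [0]
    for i, residue in enumerate(sequence[:-1]):
        blocked = sequence[i + 1] in no_cleave_before
        if residue in cleave_after and not blocked:
            sites.append(i + 1)
    sites.append(len(sequence))

    peptides = []
    for start_idx in range(len(sites) - 1):
        # Join up to max_missed + 1 consecutive fragments.
        last = min(start_idx + max_missed + 1, len(sites) - 1)
        for end_idx in range(start_idx + 1, last + 1):
            peptides.append(sequence[sites[start_idx]:sites[end_idx]])
    return peptides
```

With the defaults this mimics a LysC-like rule (cleave after K, up to 2 missed cleavages); calling it with `cleave_after="KR"` and `no_cleave_before="P"` approximates the trypsin rule with the KP/RP exception.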
Quantification values for heavy and light peptide features were extracted from the evidence.txt
file. Quantification values for features corresponding to the same peptide sequence (e.g. same
peptide identified at multiple charge states or fractions) were summed up. Phosphopeptide
quantification features were aggregated to the phosphopeptide isoform level by summing
features corresponding to the same peptide sequence (e.g. same peptide identified at multiple
charge states or replicate injections) as well as overlapping peptide sequences sharing the
same combination of modifications. For each phosphopeptide isoform we required the
maximum localization probability to be greater than 75% for at least one site.
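The aggregation and localization filter described above can be sketched as follows. This is a simplified illustration, not the authors' code: the dictionary keys (`isoform`, `heavy`, `light`, `loc_probs`) are hypothetical, the isoform label is assumed to already encode the peptide sequence and its modifications, and the merging of overlapping peptide sequences is not handled. The filter is interpreted here as keeping an isoform when the best site localization probability seen across its features exceeds 0.75.

```python
from collections import defaultdict


def aggregate_isoforms(features, min_loc_prob=0.75):
    """Sum heavy/light intensities per isoform and apply a localization filter.

    Each feature is a dict with hypothetical keys: 'isoform' (hashable label),
    'heavy', 'light', and 'loc_probs' (per-site localization probabilities).
    """
    sums = defaultdict(lambda: {"heavy": 0.0, "light": 0.0, "best_loc": 0.0})
    for feature in features:
        entry = sums[feature["isoform"]]
        entry["heavy"] += feature["heavy"]
        entry["light"] += feature["light"]
        entry["best_loc"] = max([entry["best_loc"]] + list(feature["loc_probs"]))
    # Keep isoforms whose best-localized site exceeds the threshold.
    return {k: v for k, v in sums.items() if v["best_loc"] > min_loc_prob}
```

Summing intensities before taking ratios, rather than averaging per-feature ratios, lets the more intense features dominate the isoform-level ratio.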
Rs values were calculated as log2 ratios of the quantification value for the temperature
gradient-treated sample divided by the respective quantification value for the 30 °C control.
Peptide Rs distributions were median normalized to 0, and the same correction value derived for
each replicate was applied to normalize the corresponding phosphopeptide isoform Rs distributions.
Peptides and phosphopeptide isoforms with the 5% highest Rs standard deviation across replicates
were excluded from the analysis. Protein Rs values were calculated as the median of peptide Rs for
that protein, requiring a minimum of 2 peptides per protein, with each peptide observed in at least
two replicates.
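The Rs calculation can be sketched for a single replicate as below. This is a minimal illustration under stated assumptions rather than the analysis code used for the paper (which was written in R): the function names are hypothetical, the `gradient_is_heavy` flag is our way of handling the label-swap replicates, and the variance filter and replicate-count requirements are omitted.

```python
import math
from statistics import median


def log2_ratio(heavy, light, gradient_is_heavy=True):
    """log2(gradient / 30 C control); the ratio is inverted for label-swap replicates."""
    ratio = heavy / light if gradient_is_heavy else light / heavy
    return math.log2(ratio)


def median_center(values):
    """Shift values so that their median is 0; also return the shift applied."""
    shift = median(values)
    return [v - shift for v in values], shift


def protein_rs(peptide_rs, min_peptides=2):
    """Median of peptide Rs values, or None when too few peptides are present."""
    if len(peptide_rs) < min_peptides:
        return None
    return median(peptide_rs)
```

In use, the shift returned by `median_center` for the unmodified peptides of a replicate would be subtracted from the phosphopeptide isoform values of that same replicate, so that both data types share one normalization.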
To identify phosphopeptide isoforms that have a different Rs than their unmodified protein
counterpart, we performed a t-test comparing phosphopeptide isoform Rs values (n=6)
with protein Rs values (n=6), assuming unequal variance. Phosphopeptide isoforms
and protein counterparts were required to be observed in at least two replicates. P-values were
corrected for multiple hypothesis testing using the Benjamini-Hochberg method [17]. All data
analysis was conducted using R.
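For readers who want to see the two statistical steps spelled out, the sketch below implements an unequal-variance (Welch) t-test and the Benjamini-Hochberg adjustment using only the Python standard library. It is an illustration, not the R code used for the paper; the function names are hypothetical, the p-value is obtained by simple numerical integration of the Student t density (an implementation choice made here to avoid external dependencies), and groups with zero variance are not handled.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


def t_two_sided_p(t, df, steps=2000):
    """Two-sided p-value via Simpson integration of the Student t density."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))

    def pdf(x):
        return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

    upper = abs(t)
    h = upper / steps
    total = pdf(0.0) + pdf(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    central_mass = 2 * total * h / 3  # probability that |T| <= |t|
    return max(0.0, 1.0 - central_mass)


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values (q-values), in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted
```

Each phosphopeptide isoform would be compared with its protein via `welch_t` and `t_two_sided_p`, and the resulting p-values passed together to `benjamini_hochberg`.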
Structure visualization and bioinformatics
Protein complex annotations were extracted from the CYC2008 resource [18]. Protein structure
coordinates were downloaded from PDB and visualized and manipulated with PyMOL [19]. For
PUP2 interface analysis, we extracted the 20S proteasome protein structure from PDB 1RYP [20].
Protein interface structures for ARO8, TPI1, and GAPDH were extracted from PDB (4JE5 [21],
1NEY [22], and 3PYM [23], respectively). To assess the stabilizing effect of S149 phosphorylation
at the catalytic site of GAPDH, we aligned crystal structures of GAPDH with bound G3P (1NQO [24])
and inorganic phosphates (1GYP [25]) to a NAD-bound yeast GAPDH structure (3PYM [23]). Data on
sequence conservation, protein interfaces, and predicted stability effects of mutations (ΔΔGpred)
were obtained from the mutfunc resource [13].
Reanalysis of the Huang et al. data
Supplementary data from Huang et al. [2] were used to calculate the correlation between Tm for
phosphopeptides and proteins and to examine their statistical parameters. For data re-analysis,
all MS files from the study were downloaded from the MassIVE data repository (dataset identifier:
MSV000083786), converted to open-format mzXML files, and database searched with Comet [26]
(v.2018.01.4) to obtain peptide and phosphopeptide identifications, with the exception of
“Bulk_6_2”, which failed to convert. Database search parameters were: human protein sequence
database from UniProt (UP000005640), mass tolerance of 50 ppm for precursor m/z and 0.2 Da
for fragment ions, trypsin enzyme specificity (cleavage C-terminal to K and R, except for KP and
RP), maximum of 2 missed cleavages, fixed modifications of carbamidomethyl on cysteines and
TMT10 (+229.1629) on lysines and peptide N-termini, and variable modifications of oxidation on
methionines and acetylation on protein N-termini. Phosphorylation samples included variable
modification of phosphorylation at serine, threonine, and tyrosine. Search results were filtered to
a 1% FDR at the PSM level with Percolator [27]. Phosphosite localization was conducted with an
in-house C++ implementation of Ascore [28], and sites with Ascore > 13 were considered confidently
localized (P < 0.05). TMT10 reporter ion intensities were extracted from MS/MS scans using
in-house TMT quantification software.
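The correspondence between the Ascore cutoff and the quoted probability can be checked with a short calculation. The sketch assumes the usual -10·log10(P) scaling of the score; the helper names are hypothetical and this is not the in-house Ascore implementation.

```python
import math


def ascore_to_p(ascore):
    """Probability implied by a score on the assumed -10*log10(P) scale."""
    return 10 ** (-ascore / 10)


def p_to_ascore(p):
    """Score on the assumed -10*log10(P) scale for a given probability."""
    return -10 * math.log10(p)
```

Under this assumption, a score of 13 corresponds to a probability of about 0.05, which matches the threshold stated above.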
We attempted to replicate the analysis by Huang et al. by following the method description
provided in their manuscript. Biological and technical replicates were treated equally. For each
replicate, TMT reporter ion intensities for all peptide spectral matches from the proteome files
were summed to the protein level, and TMT reporter ion intensities for phosphopeptide spectral
matches were summed to the phosphopeptide isoform level. In addition, we used the same
strategy to aggregate to peptide-level the TMT signals for PSMs mapping to the same
unmodified peptide observed in the phosphopeptide-enriched samples. We used the
TPP package in R to fit melting curves for proteins, phosphopeptide isoforms, and unmodified
peptides in the phosphorylation-enriched sample. To recapitulate the reported results we had to
conduct the fitting for all samples together (Supplementary Discussion). Melting curves were
filtered for fitting R² > 0.8. T-tests were conducted by comparing Tm values for phosphopeptide
isoforms or unmodified peptides observed in the phosphopeptide-enriched samples to the
unmodified protein Tm values, assuming equal variances and without multiple hypothesis
correction (as implemented by Huang et al.). Of note, our reanalysis revealed that one of the
phosphoproteome technical injections for biological replicate 5 was instead a repeated MS
analysis of biological replicate 4.
Data availability
The mass spectrometry proteomics data generated for this manuscript have been deposited to
the ProteomeXchange Consortium via the PRIDE partner repository with the dataset identifier
PXD016750. Reviewer username is reviewer30568@ebi.ac.uk; password 7aQ1gAbU.
ACKNOWLEDGEMENTS
We thank members of the Villén lab for scientific discussions, in particular Bianca Ruiz, Mario
Leutert, and Alex Hogrebe. We thank Ariadna Llovet Soto and Jimmy Eng for software
developments on the data analysis pipeline. I.R.S. and K.N.H. were supported by NIH training
grant T32HG000035. A.S.V. was supported by NIH training grant T32LM012419. Most of this
work was supported by NIH grant R35GM119536 to J.V. The Villén lab is additionally supported
by NIH grants R01AG056359, R01NS098329, and RM1 HG010461, Human Frontiers Science
Program grant RGP0034/2018, a research program grant from the W.M. Keck Foundation, and
the University of Washington Proteome Resource UWPR95794.
AUTHOR CONTRIBUTIONS
I.R.S., K.N.H., R.A.R.-M., and J.V. conceived the study and designed the experiments. I.R.S.
conducted the experiments with advice from K.N.H., R.A.R.-M. and J.V., and assistance from
A.A.B. I.R.S. analyzed the data with advice from R.A.R.-M. and A.S.V. J.V. supervised the
study. I.R.S. and J.V. wrote the paper and all authors edited it.
COMPETING FINANCIAL INTERESTS
The authors declare no competing interests.
FIGURE LEGENDS
Figure 1. Most phosphosites have little effect on protein stability. a, Scatter plot and
Pearson correlation between Tm of phosphopeptide isoforms (n=10) and Tm of the corresponding
protein (n=11). Values were obtained from the Huang et al. supplementary dataset, using mean
Tm values. Significant phosphopeptide isoforms in blue. b, Volcano plots showing differences in
protein thermal stability (Tm) between phosphopeptide isoforms (n=10, left panel) or unmodified
peptides observed in the phosphopeptide-enriched sample (n=10, right panel) and their
corresponding protein (n=10) (t-test, significant hits at p-value < 0.05 in blue), from Huang et al.
data reanalysis. c, Dali workflow depicting SILAC labeling of yeast, the gradient temperature
treatment of the protein extract, the inclusion of a 30 °C control, the quantification of soluble
protein by mass spectrometry, and the calculation of relative stability (Rs). d, Scatter plot and
Pearson correlation as in (a) using Dali’s Rs data. e, Volcano plots as in (b) with x-axis showing
Rs values. Here n=6, p-values were Benjamini-Hochberg corrected and significant hits are at
q-value < 0.05. Significant hits in blue, significant phosphopeptide isoforms found in proteins with
known cleavage or splicing events are in orange.
Figure 2. Examples of phosphosites that alter protein thermal stability. a, Boxplot of Rs
values and distributions for the phosphopeptide containing PUP2 Ser56, PUP2 protein, and all
the proteins in the 20S proteasome. Shown at the right is the structure of the 20S proteasome
with PUP2 in blue, Ser56 in red, and PRE6 in grey. b, Rs values for RPL12 and RPL12 S38
phosphopeptide isoform. c, Rs values for NEW1 and phosphopeptide isoform containing NEW1
T1191. d, Rs values for PGK1 and all measured PGK1 phosphopeptide isoforms, with
significantly destabilizing phosphosite S331 shown in red. ΔΔGpred for all glutamic acid
phosphomimetic substitutions were obtained from mutfunc, with ΔΔGpred > 2 considered likely
destabilizing. e, GAPDH S149 phosphopeptide is shared across all GAPDH paralogs (TDH1,
TDH2, and TDH3). Boxplot shows Rs values and distributions for peptides unique to one isoform
(TDH1, TDH2, TDH3), peptides shared among all GAPDH isoforms (all), all peptides for TDH3,
and the S149 phosphopeptide isoform. Bottom panel shows localization of S149 on GAPDH
structure near the binding site of the enzyme substrate. All boxplots show results from 6
biological replicates.
SUPPLEMENTARY FIGURE LEGENDS
Supplementary Figure 1. Reproducibility and robustness of Dali compared to HTP. a,
Boxplot depicting pairwise Pearson correlations between biological and technical replicates for
the HTP (Tm values) and Dali (Rs values) approaches. b, Scatter plot and Pearson correlation
between the mean Tm for unmodified peptides observed in the phosphopeptide-enriched
samples (n=10) and the mean Tm for their corresponding proteins (n=11). Results from the
Huang et al. data re-analysis conducted by us. c, Scatter plot and Pearson correlation as in (b)
with Rs values obtained from the Dali method (n=6).
Supplementary Figure 2. Examples of significant hits on proteins that undergo
post-translational splicing or cleavage. a, R_s values for VMA1 unmodified peptides
identified in phosphopeptide-enriched samples and proteome samples, displayed across the
length of VMA1. Spliced products from amino acids 2-283 and 738-1031 are joined to generate
the V-type proton ATPase catalytic subunit A proteoform, excising the 284-737 segment (ref. 7).
Peptides derived from the proteome samples are colored in gray and significant unmodified
peptides found in the phosphopeptide-enriched sample are in red. b, Similar plot to (a) for
RPS31, which is cleaved to generate ubiquitin (amino acids 1-76) and 40S ribosomal
protein S31 (amino acids 77-152) (ref. 6).
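As an illustration of how peptides could be assigned to the processed products named in this legend, the sketch below maps a peptide's residue range onto the stated segment boundaries. Only the boundaries come from the legend; the dictionary layout, the segment labels, and the function name `assign_segment` are hypothetical.

```python
# Segment boundaries as stated in the legend (1-based, inclusive residue numbers).
# Segment labels are illustrative.
SEGMENTS = {
    "VMA1": [
        ("V-ATPase subunit A, N-terminal part", 2, 283),
        ("excised segment", 284, 737),
        ("V-ATPase subunit A, C-terminal part", 738, 1031),
    ],
    "RPS31": [
        ("ubiquitin", 1, 76),
        ("40S ribosomal protein S31", 77, 152),
    ],
}


def assign_segment(protein, start, end):
    """Return the name of the segment that fully contains residues start..end.

    Returns None when the peptide spans a segment boundary or falls outside
    every listed segment.
    """
    for name, seg_start, seg_end in SEGMENTS[protein]:
        if seg_start <= start and end <= seg_end:
            return name
    return None


print(assign_segment("VMA1", 300, 312))   # peptide inside the 284-737 segment
print(assign_segment("RPS31", 80, 90))    # peptide inside the 77-152 segment
print(assign_segment("VMA1", 280, 290))   # spans a boundary, so None
```

Grouping peptides this way makes it possible to compare R_s values per processed product rather than per gene-level protein entry.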
Supplementary Figure 3. Examples of phosphosites that alter protein thermal stability
and are located at protein interfaces. R_s boxplots for a, ARO8 S59, b, TPI1 S79, and c,
GAPDH S201 phosphopeptide isoforms and their protein counterparts. ARO8 S59, TPI1 S79,
and GAPDH S201 reside at dimerization interfaces as shown in the structures to the right (PDB
accession: 4JE5, 1NEY, and 3PYM, respectively). Phosphomimetic mutations ARO8 S59E and
TPI1 S79E are predicted to disrupt protein interfaces (ΔΔG_pred = 3.78 and ΔΔG_pred = 8.04,
respectively). Additionally, the TPI1 S79E mutation is predicted to alter protein conformational
stability (ΔΔG_pred = 2.39). ΔΔG_pred > 2 is predicted to be destabilizing (ref. 12).
Supplementary Figure 4. Phosphosites significantly altering protein thermal stability
using two different statistical settings. Volcano plots showing ΔT_m between the mean
phosphopeptide isoform and its mean protein counterpart on the x-axis, and the t-test
probability on the y-axis. a, The Huang et al. implementation shows a p-value because multiple
hypothesis correction was not applied. Significant phosphopeptide isoforms (blue) are defined
by p-value < 0.05. b, Our proposed analysis consolidates data from MS reanalysis prior to
statistical testing, which is performed assuming unequal variances between phosphopeptide
isoforms and proteins, and corrects p-values for multiple hypothesis testing. Significant
phosphopeptide isoforms (blue) are defined by q-value < 0.05.
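The two statistical ingredients named in panel b, a t-test that does not assume equal variances and a multiple-testing correction, can be sketched as below. This is an illustration under stated assumptions, not the authors' code: the correction is assumed to be Benjamini-Hochberg (which appears in the reference list as ref. 17), the function names are hypothetical, and the conversion from the t statistic to a p-value (which requires a t-distribution CDF, for example from a statistics library) is left out so that the block uses only the standard library.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom.

    Unlike Student's t-test, the two samples are not assumed to share a variance.
    """
    va = variance(a) / len(a)  # squared standard error of mean(a)
    vb = variance(b) / len(b)  # squared standard error of mean(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


def bh_qvalues(pvalues):
    """Benjamini-Hochberg adjusted p-values (q-values), returned in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so q-values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        q[i] = running_min
    return q


# Made-up example: p-values for four hypothetical phosphopeptide isoforms.
pvals = [0.01, 0.04, 0.03, 0.20]
qvals = bh_qvalues(pvals)
significant = [q < 0.05 for q in qvals]
print(qvals, significant)
```

In this made-up example three isoforms pass an uncorrected p-value < 0.05 cutoff but only one passes q-value < 0.05, which illustrates why the two settings in panels a and b can yield different numbers of significant phosphosites.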
REFERENCES
1. Hornbeck, P. V. et al. PhosphoSitePlus, 2014: mutations, PTMs and recalibrations. Nucleic
Acids Res. 43, D512–D520 (2015).
2. Huang, J. X. et al. High throughput discovery of functional protein modifications by Hotspot
Thermal Profiling. Nat. Methods 16, 894–901 (2019).
3. Potel, C., Kurzawa, N., Beecher, I., Mateus, A. & Savitski, M. M. Impact of phosphorylation
on thermal stability of proteins. bioRxiv (2020).
4. Gaetani, M. et al. Proteome Integral Solubility Alteration: A High-Throughput Proteomics
Assay for Target Deconvolution. J. Proteome Res. 18, 4027–4037 (2019).
5. Savitski, M. M. et al. Tracking cancer drugs in living cells by thermal profiling of the
proteome. Science 346, (2014).
6. Finley, D., Bartel, B. & Varshavsky, A. The tails of ubiquitin precursors are ribosomal
proteins whose fusion to ubiquitin facilitates ribosome biogenesis. Nature 338, 394–401
(1989).
7. Kane, P. M. et al. Protein splicing converts the yeast TFP1 gene product to the 69-kD
subunit of the vacuolar H(+)-adenosine triphosphatase. Science 250, 651–657 (1990).
8. Holt, L. J. et al. Global Analysis of Cdk1 Substrate Phosphorylation Sites Provides Insights
into Evolution. Science 325, 1682–1686 (2009).
9. Dephoure, N. et al. A quantitative atlas of mitotic phosphorylation. Proc. Natl. Acad. Sci.
105, 10762–10767 (2008).
10. Imami, K. et al. Phosphorylation of the Ribosomal Protein RPL12/uL11 Affects Translation
during Mitosis. Mol. Cell 72, 84–98.e9 (2018).
11. Kasari, V. et al. A role for the Saccharomyces cerevisiae ABCF protein New1 in translation
termination/recycling. Nucleic Acids Res. 47, 8807–8820 (2019).
12. Viéitez, C. et al. Towards a systematic map of the functional role of protein phosphorylation.
bioRxiv 872770 (2019) doi:10.1101/872770.
13. Wagih, O. et al. A resource of variant effect predictions of single nucleotide variants in model
organisms. Mol. Syst. Biol. 14, e8430 (2018).
14. Ochoa, D. et al. The functional landscape of the human phosphoproteome. Nat. Biotechnol.
(2019) doi:10.1038/s41587-019-0344-3.
15. Leutert, M., Rodriguez-Mias, R. A., Fukuda, N. K. & Villén, J. R2-P2 rapid-robotic
phosphoproteomics enables multidimensional cell signaling studies. Mol. Syst. Biol. 15,
e9021 (2019).
16. Cox, J. & Mann, M. MaxQuant enables high peptide identification rates, individualized
p.p.b.-range mass accuracies and proteome-wide protein quantification. Nat. Biotechnol. 26,
1367–1372 (2008).
17. Benjamini, Y. & Hochberg, Y. Controlling the False Discovery Rate: A Practical and Powerful
Approach to Multiple Testing. J. R. Stat. Soc. Ser. B Methodol. 57, 289–300 (1995).
18. Pu, S., Wong, J., Turner, B., Cho, E. & Wodak, S. J. Up-to-date catalogues of yeast protein
complexes. Nucleic Acids Res. 37, 825–831 (2009).
19. The PyMOL Molecular Graphics System, Version 1.2r3pre, Schrödinger, LLC.
20. Groll, M. et al. Structure of 20S proteasome from yeast at 2.4 Å resolution. Nature 386,
463–471 (1997).
21. Bulfer, S. L., Brunzelle, J. S. & Trievel, R. C. Crystal structure of Saccharomyces cerevisiae
Aro8, a putative α-aminoadipate aminotransferase. Protein Sci. 22, 1417–1424 (2013).
22. Jogl, G., Rozovsky, S., McDermott, A. E. & Tong, L. Optimal alignment for enzymatic proton
transfer: Structure of the Michaelis complex of triosephosphate isomerase at 1.2-Å
resolution. Proc. Natl. Acad. Sci. 100, 50–55 (2003).
23. Garcia-Saez, I., Kozielski, F., Job, D. & Boscheron, C. Structure of GAPDH 3 from S.
cerevisiae at 2.0 Å resolution. Submitt. PDB Data Bank (2010).
24. Didierjean, C. et al. Crystal Structure of Two Ternary Complexes of Phosphorylating
Glyceraldehyde-3-phosphate Dehydrogenase from Bacillus stearothermophilus with NAD
and d-Glyceraldehyde 3-Phosphate. J. Biol. Chem. 278, 12968–12976 (2003).
25. Kim, H., Feil, I. K., Verlinde, C. L. M. J., Petra, P. H. & Hol, W. G. J. Crystal Structure of
Glycosomal Glyceraldehyde-3-phosphate Dehydrogenase from Leishmania mexicana:
Implications for Structure-Based Drug Design and a New Position for the Inorganic
Phosphate Binding Site. Biochemistry 34, 14975–14986 (1995).
26. Eng, J. K., Jahan, T. A. & Hoopmann, M. R. Comet: an open-source MS/MS sequence
database search tool. Proteomics 13, 22–24 (2013).
27. Käll, L., Canterbury, J. D., Weston, J., Noble, W. S. & MacCoss, M. J. Semi-supervised
learning for peptide identification from shotgun proteomics datasets. Nat. Methods 4,
923–925 (2007).
28. Beausoleil, S. A., Villén, J., Gerber, S. A., Rush, J. & Gygi, S. P. A probability-based
approach for high-throughput protein phosphorylation analysis and site localization. Nat.
Biotechnol. 24, 1285–1292 (2006).
Figure 1
[Image omitted; panels a-e. Scatter plot of protein T_m (ºC) versus phosphoisoform T_m (ºC), R² = 0.18. Scatter plot of protein R_s versus phosphoisoform R_s, R² = 0.79. Workflow schematic: S. cerevisiae cultures grown with light or heavy Lys, cell lysis, temperature treatment (45.6ºC to 57ºC gradient versus 30ºC control), pooling of soluble fractions (labeled 2X and 1X), and MS analysis of soluble protein, with R_s defined as a log2 ratio. Volcano plots for phosphopeptide isoforms and unmodified peptides: ΔT_m (ºC) versus -log10(p-value), and ΔR_s versus -log10(q-value).]
Figure 2
[Image omitted; panels a-e. R_s boxplots (-1.5 to 1.5): PUP2 (20S proteasome, PUP2, PUP2 S56); RPL12 (protein, S38); NEW1 (protein, T1191); PGK1 (protein and phosphosites S36, S203, T331, T392, S397, S413); GAPDH (peptides unique to one isoform: TDH1, TDH2, TDH3; shared peptides: All; TDH3; S149). ΔΔG plot for PGK1 phosphomimetic mutants S36E, T203E, T331E, T392E, S397E, S413E. Structure labels: 20S proteasome with PUP2 (α5), PRE6 (α4) and S56; GAPDH with NADH, C150, S149, H176 and Pi.]
Supplementary Figure 1
[Image omitted; panels a-c. Boxplot of replicate correlations (0 to 1) for phosphoisoforms and proteins, grouped as biological (Bio) and technical (Tech) replicates for this study and Huang et al. Scatter plot against protein R_s, R² = 0.90. Scatter plot against protein T_m (ºC), R² = 0.18.]
Supplementary Figure 2
[Image omitted; panels a, b. Values plotted against protein sequence position for VMA1 and for RPS31, with the ubiquitin and RPS31 segments marked.]
Supplementary Figure 3
[Image omitted; panels a-c. Boxplots (-1.5 to 1.5): ARO8 (protein, S59); TPI1 (protein, S79); GAPDH (peptides unique to one isoform: TDH1, TDH2, TDH3; shared peptides: All; TDH3; S201). Structure labels: ARO8 S59, TPI1 S79, TDH3 S201.]
Supplementary Figure 4
[Image omitted; panels a, b. Volcano plots of ΔT_m (phosphoisoform - protein) (ºC), from -10 to 10, versus -log10(p-value) and versus -log10(q-value).]