High Sequence Coverage of Proteins Isolated from
Liquid Separations of Breast Cancer Cells Using
Capillary Electrophoresis-Time-of-Flight MS and
MALDI-TOF MS Mapping
Kan Zhu,² Jeongkwon Kim,²,³ Chul Yoo,² Fred R. Miller,§ and David M. Lubman*,²
Department of Chemistry, The University of Michigan, Ann Arbor, Michigan 48109-1055, and
Barbara Ann Karmanos Cancer Institute, Wayne State University, Detroit, Michigan 48201
A method has been developed for high sequence coverage
analysis of proteins isolated from breast cancer cell lines.
Intact proteins are isolated using multidimensional liquid-
phase separations that permit the collection of individual
protein fractions. Protein digests are then analyzed by
both matrix-assisted laser desorption/ionization time-of-
flight mass spectrometry (MALDI-TOF MS) peptide mass
fingerprinting and by capillary electrophoresis-electro-
spray ionization (CE-ESI)-TOF MS peptide mapping.
These methods can be readily interfaced to the relatively
clean proteins resulting from liquid-phase fractionation
of cell lysates with little sample preparation. Using com-
bined sequence information provided by both mapping
methods, 100% sequence coverage is often obtained for
smaller proteins, while for larger proteins up to 75 kDa,
over 90% coverage can be obtained. Furthermore, an
accurate intact protein MW value (within 150 ppm) can
be obtained from ESI-TOF MS. The intact MW together
with high coverage sequence information provides ac-
curate identification. More notably the high sequence
coverage of CE-ESI-TOF MS together with the MS/MS
information provided by the ion trap/reTOF MS elucidates
posttranslational modifications, sequence changes, trun-
cations, and isoforms that may otherwise go undetected
when standard MALDI-MS peptide fingerprinting is used.
This capability is critical in the analysis of human cancer
cells where large numbers of expressed proteins are
modified, and these modifications may play an important
role in the cancer process.
Proteomic techniques have become important in the study of
disease by providing an integrated view of disease at the protein
level. Identified protein markers may be used for diagnosis and
prognosis of disease (1-3) and as targets for the development of
new drugs (4,5). Two key areas in proteome research are the
development of procedures for separating proteins from complex
mixtures and for generating structural information from the
proteins of interest. One such method for studying
complex protein mixtures involves using 2-D-PAGE to separate
large numbers of proteins from cell lysates where protein
structural information is obtained by analyzing enzymatic digests
of gel spots with matrix-assisted laser desorption/ionization
time-of-flight mass spectrometry (MALDI-TOF MS) (6-9). Proteins
separated by gels are not directly compatible with MALDI analysis. A
number of procedures including spot excision, destaining, enzymatic
digestion, extraction of peptides into solution, and spotting
the sample are required prior to MALDI-TOF analysis (10). A major
drawback of the method is that only a limited coverage of the
protein sequence is obtained from MALDI-TOF MS due to ion
suppression and varying ionization efficiencies for different
peptides (11). The result is that in complex proteomes such a peptide
map may provide incorrect identifications, depending upon the
parameters with which the database is searched. Furthermore,
the limited coverage often does not identify the presence of
posttranslational modifications, which are critical to protein
function and dysfunction (12,13).
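As an illustration of the peptide mass fingerprinting step discussed above, the following minimal Python sketch performs an in silico tryptic digest and computes singly protonated monoisotopic peptide masses of the kind that are matched against MALDI-TOF peaks. The cleavage rule (after K or R, not before P, no missed cleavages), the function names, and the example sequence are assumptions made for illustration and are not taken from this work; the residue masses are standard tabulated monoisotopic values.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # mass of H2O added to the residue sum of a free peptide
PROTON = 1.00728   # mass of the charge-carrying proton for [M+H]+


def tryptic_digest(sequence):
    """Split after K or R unless the next residue is P (no missed cleavages)."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])  # C-terminal peptide
    return peptides


def peptide_mh(peptide):
    """Monoisotopic [M+H]+ mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER + PROTON


# Hypothetical sequence, used only to exercise the functions.
example_protein = "MKTAYIAKQRQISFVK"
for pep in tryptic_digest(example_protein):
    print(pep, round(peptide_mh(pep), 4))
```

A real search would also allow missed cleavages and variable modifications, which is one reason the search parameters influence the identification.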
The development of methods to analyze proteins with high
sequence coverage is essential in the field of proteomics. High
* Corresponding author: (tel) 734-764-1669, (fax) 734-615-8108, (e-mail)
²The University of Michigan.
³Present address: Environmental Molecular Sciences Laboratory, Pacific
Northwest National Laboratory, P.O. Box 999, Richland, WA 99352.
§Wayne State University.
(1) Lawrie, L. C.; Fothergill, J. E.; Murray, G. I. Lancet Oncol. 2001, 2, 270-
(2) Srinivas, P. R.; Verma, M.; Zhao, Y.; Srivastava, S. Clin. Chem. 2002, 48
(3) Petricoin, E. F.; Zoon, K. C.; Kohn, E. C.; Barrett, J. C.; Liotta, L. A. Nat.
Rev. Drug Discovery 2002, 1, 683-695.
(4) Figeys, D. Anal. Chem. 2002, 413A-419A.
(5) Vercoutter-Edouart, A. S.; Lemoine, J.; Le Bourhis, X.; Louis, H.; Boilly, B.;
Nurcombe, V.; Revillion, F.; Peyrat, J. P.; Hondermarck, H. Cancer Res.
2001, 61 (1), 76-80.
(6) Henzel, J. W.; Billeci, T. M.; Stults, J. T.; Wong, S. C.; Grimely, C.; Watanabe,
C. Proc. Natl. Acad. Sci. U.S.A. 1993, 90, 5011-5015.
(7) Yates, J. R. J. Mass Spectrom. 1998, 33, 1-9.
(8) Liang, X.; Bai, J.; Liu, Y. H.; Lubman, D. M. Anal. Chem. 1996, 68, 1012-
(9) Loo, R. R. O.; Stevenson, T. I.; Mitchell, C.; Loo, J. A.; Andrews, P. C. Anal.
Chem. 1996, 68, 1910-1917.
(10) Lahm, H. W.; Langen, H. Electrophoresis 2000, 21, 2105-2114.
(11) Krause, E.; Wenschuh, H.; Jungblut, P. R. Anal. Chem. 1999, 71, 4160-
(12) Minamoto, T.; Buschmann, T.; Habelhah, H.; Matusevich, E.; Tahara, H.;
Boerresen-Dale, A. L.; Harris, C.; Sidransky, D.; Ronai, Z. Oncogene 2001,
(13) Wilkins, M. R.; Gasteiger, E.; Gooley, A. A.; Herbert, B. R.; Molloy, M. P.;
Binz, P. A.; Ou, K.; Sanchez, J. C.; Bairoch, A.; Williams, K. L.;
Hochstrasser, D. F. J. Mol. Biol. 1999, 289, 645-657.
Anal. Chem. 2003, 75, 6209-6217
10.1021/ac0346454 CCC: $25.00
Published on Web 10/02/2003
© 2003 American Chemical Society
Analytical Chemistry, Vol. 75, No. 22, November 15, 2003
sequence coverage eliminates false identifications in
database-searching algorithms and provides a means for identifying the
presence of posttranslational modifications (PTMs). A number of
methods have been developed to achieve high sequence coverage
and to eliminate the use of 2-D gel technology. In recent work by
Yates and co-workers, variant hemoglobins obtained from blood
were digested with three different enzymes and analysis was
performed on the combined enzymatic peptide mixtures with
microcapillary liquid chromatography (LC)-MS/MS (14). Using the
combination of three enzymes, a sequence coverage of >99% was
obtained. An improvement on the method is achieved by combining
three proteolytic peptide mixtures followed by MUDPIT
analysis (15). However, lack of intact molecular weight information
prevents it from predicting PTMs and making decisions if the
modification is not detected. In other work, using complementary
results from ESI and MALDI, Prokai et al. obtained 95+%
sequence coverage for cytolysin proteins purified from sea
anemone Stichodactyla helianthus (16). In all three cases, the use of
high sequence coverage revealed protein variants and modifications.
Smith et al. also achieved 100% sequence coverage for
[...] transform ion cyclotron resonance (17). Another alternative method
used by Kelleher and co-workers has been the ESI-based "top-down"
approach, which has achieved 100% sequence coverage (18).
However, this method has thus far been applied to relatively
simple organisms and for proteins under 40 kDa.
In this work, a method is introduced to achieve high sequence
coverage of proteins isolated from real cellular samples. A
liquid-based 2-D fractionation of proteins from cell lysates produces
purified proteins in the liquid phase. The method has been
introduced in prior work to map cellular proteins as a means to
search for markers of disease (19-21). Greater than 50% of human
proteins were found to be modified; thus, many proteins in the
human lysates were difficult to identify with confidence based upon
the tryptic peptide map alone (22). A major advantage of the 2-D
method is the production of substantial amounts of highly purified
proteins isolated in the liquid phase. As a result, proteins are
digested and directly analyzed by a combination of MS techniques
(capillary electrophoresis-electrospray ionization (CE-ESI)-MS
and MALDI-TOF MS) with minimal sample preparation. CE-ESI-MS
can provide high sequence coverage of protein digests, often
approaching total coverage for proteins under 20 kDa (23,24).
MALDI-TOF MS often provides improved detection for peptides that are
not detected by ESI-MS. The result is that often >90% coverage
can be obtained even for large proteins using a combination of
In the present work, the use of 2-D liquid separations combined
with CE-ESI-MS and MALDI-TOF MS for high sequence coverage
of selected proteins isolated from breast cancer cell lines is
demonstrated. Over a MW range of 4000-70 000, >90% of the
sequence can be obtained from a single tryptic digest. The
presence of proteins in the liquid phase allows direct determination
of an accurate MW value that is used to determine the presence
of PTMs. The MW together with the high sequence coverage
afforded by this method and the use of MS/MS provides a means
to determine the identification and sites of PTMs as well as the
presence of sequence deletions and additions and resulting
isoforms. Different isoforms of lamin A, a truncation of HSP60,
and acetylation of thymosin β4 were distinguished using the
method. These procedures provide structural information essential
in studies of cancer progression and biomarker identification.
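The idea of using the intact MW to flag a modification can be sketched as a comparison between the measured and theoretical masses and a short list of known mass shifts. The Python sketch below is a simplified illustration under stated assumptions: the shift table contains rounded, commonly tabulated average mass changes, only a single modification is considered, and the function name, the shift list, and the example masses are hypothetical rather than taken from this work.

```python
# Approximate average mass shifts (Da) for a few common changes.
# Rounded textbook values, assumed here for illustration only.
PTM_SHIFTS = {
    "acetylation": 42.04,
    "phosphorylation": 79.98,
    "methylation": 14.03,
    "loss of N-terminal Met": -131.19,
}


def candidate_ptms(measured_mw, theoretical_mw, tol_ppm=150.0):
    """Return the single-modification candidates that explain the MW difference.

    The tolerance is the ppm accuracy converted to Da at the theoretical mass.
    """
    tol_da = theoretical_mw * tol_ppm / 1e6
    delta = measured_mw - theoretical_mw
    return [name for name, shift in PTM_SHIFTS.items()
            if abs(delta - shift) <= tol_da]


# Hypothetical masses: a 5 kDa protein measured 42 Da heavier than predicted.
print(candidate_ptms(5042.0, 5000.0))
```

A candidate found this way is only a hypothesis; locating the modified residue still requires peptide-level coverage and MS/MS, as described in the text.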
EXPERIMENTAL SECTION
MCF10Ca1DCL1 Cells and Lysis. MCF10Ca1d clone 1
(CA1d) is a fully malignant human breast cancer line (25). Cells were
grown in monolayer on plastic in DMEM/F12 medium supplemented
with 5% horse serum and 10 mM
N-2-hydroxyethylpiperazine-N′-2-ethanesulfonic acid (HEPES).
Adherent cells were harvested in log phase (∼75-80% confluence).
The growth medium was aspirated, and the cells were gently washed
with sterile PBS buffer, then scraped with a rubber policeman, and
stored at -80 °C. The cell pellets were lysed by adding 1.5 mL of
lysis buffer containing 8 M urea, 2 M thiourea, 0.5% (w/v)
n-octyl β-D-galactopyranoside, 50 mM dithiothreitol, 10 mM
phenylmethanesulfonyl fluoride, and 10% (v/v) glycerol, vortexed
for 2 min, and then left at room temperature for 1 h. After the
sample was lysed, the resulting mixture was centrifuged at
15 000 rpm for 20 min. The supernatant
was collected, and Bradford assays (Bio-Rad, Hercules, CA) were
performed to quantify the amount of protein in the lysate. The
supernatant was removed, diluted to 18 mL with the isoelectric
focusing (IEF) running buffer, and introduced into the mini-Rotofor
(Bio-Rad, Hercules, CA) separation chamber for fractionation.
All chemicals were obtained from Sigma Chemical Co. (St.
Louis, MO) unless specified otherwise.
Liquid-Phase Isoelectric Focusing. The Bio-Rad Mini-Rotofor
was used to separate the cell extract by IEF in the first
dimension. Cell extract was mixed with IEF running buffer
composed of 8 M urea, 2 M thiourea, and 2% pH 3-10 Biolyte
(Bio-Rad). The Rotofor cell was loaded with 18 mL of the mixture
and the separation controlled at a constant 12 W for 3.5 h.
Separated pI fractions were harvested into 20 tubes and stored at
-80 °C until further analysis. pH measurements were taken for each
fraction using an Orion pH meter (model 250A, Allometrics, Baton
Rouge, LA) and Accumet combination electrode (Fischer, Pitts-
Nonporous (NPS) Reversed-Phase Liquid Chromatography.
Briefly, NPS-RP HPLC separation is performed at a flow rate
(14) Gatlin, C. L.; Eng, J. K.; Cross, S. T.; Detter, J. C.; Yates, J. R., III. Anal.
Chem. 2000, 72, 757-763.
(15) MacCoss, M. J.; McDonald, W. H.; Saraf, A.; Sadygov, R.; Clark, J. M.; Tasto,
J. J.; Gould, K. L.; Wolters, D.; Washburn, M.; Weiss, A.; Clark, J. I.; Yates,
J. R., III. Proc. Natl. Acad. Sci. U.S.A. 2002, 99 (12), 7900-7905.
(16) Stevens, S. M., Jr.; Kem, W. R.; Prokai, L. Rapid Commun. Mass Spectrom.
2002, 16, 2094-2101.
(17) Bruce, J. E.; Anderson, G. A.; Wen, J.; Harkewicz, R.; Smith, R. D. Anal.
Chem. 1999, 71, 2595-2599.
(18) Forbes, A. J.; Mazur, M. T.; Patel, H. M.; Walsh, C. T.; Kelleher, N. L.
Proteomics 2001, 1, 927-933.
(19) Wall, D. B.; Kachman, M. T.; Gong, S.; Hinderer, R.; Parus, S.; Misek, D.
E.; Hanash, S. M.; Lubman, D. M. Anal. Chem. 2000, 72, 1099-1111.
(20) Kachman, M. T.; Wang, H. X.; Schwartz, D. R.; Cho, K. R.; Lubman, D. M.
Anal. Chem. 2002, 74 (8), 1779-1791.
(21) Chong, B. E.; Hamler, R. L.; Lubman, D. M.; Ethier, S. P.; Rosenspire, A. J.;
Miller, F. R. Anal. Chem. 2001, 73 (6), 1219-1227.
(22) Wall, D. B.; Kachman, M. T.; Gong, S. Y.; Parus, S. J.; Long, M. W.; Lubman,
D. M. Rapid Commun. Mass Spectrom. 2001, 15 (18), 1649-1661.
(23) Jin, X. Y.; Kim, J.; Parus, S.; Lubman, D. M.; Zand, R. Anal. Chem. 1999,
(24) Cao, P.; Moini, M. Rapid Commun. Mass Spectrom. 1998, 12, 864-870.
(25) Santner, S. J.; Dawson, P. J.; Tait, L.; Soule, H. D.; Eliason, J.; Mohamed, A.
N.; Wolman, S. R.; Heppner, G. H.; Miller, F. R. Breast Cancer Res. Treat.
2002, 65, 101-110.
measure tryptic digest peptides (Table 2). The MW value of lamin
A and the sequence coverage indicate that there are no modifica-
tions present. In the case of the other isoforms, there is a small
difference between the measured and theoretical MW values
detected. The coverage of the sequence is such that no obvious
modifications have been found. The difference between the
measured and predicted MW values may be partially due to the
low signal of the intact proteins as measured by ESI-TOF MS.
The poor S/N results in a decreased accuracy of the MW
measurement, which may exceed the expected 150 ppm and may
be on the order of 500 ppm or greater.
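To make the ppm figures above concrete, the short Python sketch below converts between a relative mass error in ppm and an absolute error in Da. The function names and the example masses are assumptions chosen for illustration; only the 150 and 500 ppm values come from the text.

```python
def ppm_error(measured_mw, theoretical_mw):
    """Signed relative mass error in parts per million."""
    return (measured_mw - theoretical_mw) / theoretical_mw * 1e6


def ppm_to_da(mw, ppm):
    """Absolute mass window (Da) that corresponds to a ppm tolerance at a given MW."""
    return mw * ppm / 1e6


def within_tolerance(measured_mw, theoretical_mw, tol_ppm=150.0):
    """True when the measured MW agrees with theory within the ppm tolerance."""
    return abs(ppm_error(measured_mw, theoretical_mw)) <= tol_ppm


# Hypothetical 70 kDa protein: compare the two accuracy levels quoted in the text.
for tol in (150.0, 500.0):
    print(tol, "ppm at 70 000 Da is", ppm_to_da(70000.0, tol), "Da")
```

At 70 kDa, 150 ppm corresponds to 10.5 Da and 500 ppm to 35 Da, so a degraded measurement of a large protein cannot by itself confirm or exclude a small mass shift.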
In this work, high sequence coverage of proteins has been
obtained by extraction of proteins from whole-cell lysates using
2-D liquid fractionation, which can be directly interfaced for
analysis of tryptic digests to CE-TOF MS and MALDI-TOF MS.
The 2-D liquid separation method has the advantage of producing
purified proteins in the liquid phase that can be analyzed. The
proteins in the liquid phase can be digested and analyzed directly
by CE-TOF MS, which provides high sequence coverage, especially
for proteins under 20 kDa where the coverage may reach
100%. The combination of CE-TOF MS and MALDI-TOF MS can
often provide over 90% sequence coverage even for larger proteins
of over 70 kDa. The use of high sequence coverage is essential
for correct identification of proteins in human cells where
sequence homology between proteins may provide inaccurate
identifications using present databases. In addition, many proteins
in human cells are modified and high sequence coverage is
required to identify the presence and position of modifications.
Accurate intact protein MW determination by ESI-TOF
MS can be used to further identify the presence of modifications.
In this work, the method has been used to identify the presence
of lamin isoforms and the various modifications that may exist
among them. The method will be essential in cancer research
for identifying the presence of up- and downregulated proteins
that are differentially expressed in cells, the identity of those
proteins, and whether those proteins are modified. Even more
essential will be the ability to determine whether those modifica-
tions change as a function of cancer progression.
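The way two peptide maps combine into a single coverage figure can be sketched as the union of the residue positions covered by the peptides from each method. The following Python sketch is a minimal illustration, assuming exact, unmodified peptide matches; the function names, the toy protein, and the peptide lists are hypothetical and do not represent data from this work.

```python
def covered_positions(protein, peptides):
    """Set of 0-based residue indices covered by any exact peptide match."""
    covered = set()
    for pep in peptides:
        start = protein.find(pep)
        while start != -1:
            covered.update(range(start, start + len(pep)))
            start = protein.find(pep, start + 1)
    return covered


def coverage_percent(protein, *peptide_lists):
    """Percent of residues covered by the union of one or more peptide lists."""
    covered = set()
    for peptides in peptide_lists:
        covered |= covered_positions(protein, peptides)
    return 100.0 * len(covered) / len(protein)


# Hypothetical protein and peptide identifications from two methods.
protein = "MKTAYIAKQRQISFVK"
maldi_peptides = ["TAYIAK"]
ce_peptides = ["MK", "QISFVK"]

print(coverage_percent(protein, maldi_peptides))               # MALDI alone
print(coverage_percent(protein, ce_peptides))                  # CE-MS alone
print(coverage_percent(protein, maldi_peptides, ce_peptides))  # combined
```

Taking the union rather than the sum avoids double counting residues that both methods detect.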
This work was supported in part by the National Science
Foundation under Grant DBI 9987220 (D.M.L.), the National
Institutes of Health under Grant R01 GM 49500 (D.M.L.), and
the National Cancer Institute under Grants R21CA83808 (D.M.L.,
F.R.M.) and R01CA90503 (F.R.M., D.M.L.). Support was also
generously provided by Eprogen, Inc. The MALDI-TOF MS
instrument used in this work was funded by the National Science
Foundation under Grant DBI 99874. We thank Kimberly Schneider
for a critical reading of the manuscript.
Received for review June 13, 2003. Accepted August 25,
Table 3. Unique Tryptic Peptides of Lamin A, Lamin Adelta10, and Lamin C Detected by MALDI and CE-MS
protein          mass        unique peptides
lamin A          1988.0957   (528-545)
lamin Adelta10               TALINSTGEGSHCSSSGDPAEYNLRSR
lamin C                      SVTVVEDDEDEDGDDLLHHHHVSGSRR
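The notion of a peptide that is unique to one isoform can be expressed as a set difference between the peptide lists of the isoforms. The Python sketch below is an illustrative assumption rather than the procedure used in this work: the function name is hypothetical, the shared peptide "SHAREDPEPTIDEK" is invented to exercise the logic, and only the two longer sequences are copied from Table 3.

```python
def unique_peptides(peptides_by_isoform):
    """Map each isoform to the peptides that appear in no other isoform."""
    unique = {}
    for name, peptides in peptides_by_isoform.items():
        others = set()
        for other_name, other_peptides in peptides_by_isoform.items():
            if other_name != name:
                others |= set(other_peptides)
        unique[name] = sorted(set(peptides) - others)
    return unique


# "SHAREDPEPTIDEK" is a made-up peptide common to both isoforms.
observed = {
    "lamin Adelta10": ["SHAREDPEPTIDEK", "TALINSTGEGSHCSSSGDPAEYNLRSR"],
    "lamin C": ["SHAREDPEPTIDEK", "SVTVVEDDEDEDGDDLLHHHHVSGSRR"],
}
for isoform, peptides in unique_peptides(observed).items():
    print(isoform, peptides)
```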
Figure 8. MALDI-TOF spectrum of (A) lamin Adelta10, (B) lamin A, and (C) lamin C tryptic digest. Peaks circled are unique peaks for each