Article

Generation of minimal protein identifiers of proteins from two-dimensional gels and recombinant proteins


Abstract

We describe the technical feasibility of, and a methodology for, characterizing a protein by a minimal set of structural information generated by matrix-assisted laser desorption/ionization (MALDI) mass spectrometry, termed a "minimal protein identifier" (MPI). MPIs can be determined for proteins from two-dimensional gels and for recombinant proteins, and can be used to compare and identify proteins from these sources.


... Proteins were separated by two-dimensional gel electrophoresis. 2D gels showed characteristic expression patterns, in which intensive protein spots were obtained in the 20-30 kDa region at basic (8-9) pH and in the 40-90 kDa region at acidic (4-6) pH. The separated protein spots, mainly derived from cytotoxins, were digested by trypsin, and the peptides were analyzed by MALDI-TOF mass spectrometry and/or sequenced by nanoESI-MS/MS or MALDI-MS/MS. ...
... We have demonstrated cell-free protein expression in nanowell plates in chip format and have monitored the enzymatic activity of those in vitro expressed proteins [7]. We have previously demonstrated the use of proteins in proteomics [8] and in large-scale selection systems [9]. Our second focus is to apply this technology to study interactions, such as peptide-protein interactions, and to elucidate the specificity and cross-reactivity of binders, such as monoclonal antibodies, and to profile the antibody repertoire in body fluids, such as human serum. ...
Article
External quality assessment schemes (EQAS) are evaluated either by method-dependent consensus values or by method-independent target values determined with reference measurement procedures. Reference measurement procedures yield values with high trueness and precision and a defined uncertainty of measurement. Their great advantage is that they promote and improve the comparability of results between medical laboratories, with the aim of establishing the same reference interval for an analyte in all laboratories. At the same time, manufacturers are motivated to calibrate their analytical systems so that the best comparability of values in the external quality assessment schemes is achieved. The European regulation in this field is the "In-vitro-diagnostic medical device directive". This directive gives external quality assessment schemes an additional role: vigilance of the market. The vigilance procedure is intended to register any incident in the diagnostic system and to inform the manufacturers and the competent authorities. To perform this vigilance procedure appropriately, a European standard has been developed (EN 14136). The interaction between EQAS organizers, manufacturers, and routine laboratories will improve the quality of measurements in medical laboratories for the benefit of patients. One typical reference measurement procedure, for determining the concentration of digoxin and digitoxin in EQA samples, will be presented. Reference measurement procedures have already proved their significance in epidemiological studies.
Article
Aim: Development of an analytical method for the comparative analysis of serum samples for biomarker discovery in cervical cancer diagnosis. Introduction: Cervical carcinoma is the second most frequent carcinoma in women worldwide, while in developing countries it is the most frequent carcinoma in women. To perform longitudinal and cross-sectional comparative analyses of samples from a patient serum bank, we have developed a methodology for the comparative analysis of depleted, trypsin-digested serum samples by LC-MS followed by data pre-processing and multivariate statistics. Results: The analytical method was optimized from sample preparation (clotting time, various methods for depleting the most abundant proteins) to the final LC-MS analysis in order to lower the within-sample variability. The reproducibility of the overall procedure was further improved by using horse heart cytochrome c as an internal standard added to the sample prior to sample preparation. The LC-MS data obtained were pre-processed prior to statistical analysis by aligning the retention times and normalizing the intensities of the chromatograms using specific internal standard peaks, before selecting the most "information rich" m/z traces. The selected chromatographic traces, which contained a significantly decreased level of spikes and noise, were subjected to unsupervised multivariate statistics (principal component analysis). In an effort to validate the methodology, we spiked various amounts of horse heart cytochrome c into the original serum and found that samples containing the internal standard were discriminated from the non-spiked samples down to a level of 1 pmol in 20 µl serum. Perspectives: In order to further improve the sensitivity of the overall method, we performed comparative studies using an on-line nanoLC-MALDI spotter set-up combined with MALDI-TOF/(TOF)-MS.
Using the described methodology, we are presently performing a comparative, longitudinal study with samples from early and late stage cervical cancer patients prior to and after treatment. Furthermore, we are comparing samples from patients with a positive or negative prognosis in order to define new discriminatory biomarkers or biomarker patterns.
... Hence, the FDR is more error-prone in case of mixed host-pathogen samples where two organisms are combined. One of the most suitable methods to bypass this limitation is the so-called spectra-to-spectra search [26]. Here the number of database entries is significantly reduced. ...
Article
Streptococcus pneumoniae (pneumococci) is a leading cause of severe bacterial meningitis in many countries worldwide. To characterize the repertoire of fitness and virulence factors predominantly expressed during meningitis we performed niche-specific analysis of the in vivo proteome in a mouse meningitis model, in which bacteria are directly inoculated into the cerebrospinal fluid (CSF) cisterna magna. We generated a comprehensive mass spectrometry (MS) spectra library enabling bacterial proteome analysis even in the presence of eukaryotic proteins. We recovered 200,000 pneumococci from CSF obtained from meningitis mice and by MS we identified 685 pneumococci proteins in samples from in vitro filter controls and 249 in CSF isolates. Strikingly, the regulatory two-component system ComDE and substrate-binding protein AliB of the oligopeptide transporter system were exclusively detected in pneumococci recovered from the CSF. In the mouse meningitis model, AliB-, ComDE-, or AliB-ComDE-deficiency resulted in attenuated meningeal inflammation and disease severity when compared to wild-type pneumococci indicating the crucial role of ComDE and AliB in pneumococcal meningitis. In conclusion, we show here mechanisms of pneumococcal adaptation to a defined host compartment by a proteome-based approach. Further, this study provides the basis of a promising strategy for the identification of protein antigens critical for invasive disease caused by pneumococci and other meningeal pathogens.
... • Fixing solution: 50% ethanol, 10% acetic acid. Next, the distribution of GroEL in the gel was analyzed further by searching the peak lists for "GroEL-typical" masses (see also [86] The IgG affinity-purified by means of NC particles were bound to Protein G Sepharose and ...
Thesis
The stomach-colonizing bacterium Helicobacter pylori is one of the most widespread infectious agents. Although the infection usually remains asymptomatic for life, H. pylori can cause severe disease in some people, up to and including gastric carcinoma. The aims of this work were to find gastric carcinoma-associated antigens for a diagnostic test and to develop methods for investigating spot compositions by MALDI-TOF/TOF mass spectrometry. In the first part of the doctoral work, the antigen recognition patterns of 30 gastric adenocarcinoma patients were compared with those of 30 duodenal ulcer patients using high-resolution two-dimensional immunoblots of H. pylori lysate. This focused comparison is well suited to the question, because both diseases are caused by this bacterium but only very rarely occur together. Univariate statistical analyses identified 14 spots correlated with gastric carcinoma (p
... A large number of cDNA clones have to be expressed simultaneously with the appropriate vector system [1,128]. Gene expression data from DNA arrays and/or protein expression profiling data from 2D electrophoresis and MS can be correlated with data from protein chips [488,489]. DNA sequence information and protein expression levels can also be linked. ...
Article
Technological advances in miniaturization have found a niche in biology and signal the beginning of a new revolution. Most of the attention and advances have concerned DNA chips, yet a lot of progress is being made in the use of other biomolecules and cells. A variety of reviews have each covered only particular aspects and technologies, while leading to the shared terminology of "biochips." This review provides a basic introduction and an in-depth survey of the different technologies and applications involving the use of non-DNA molecules such as proteins and cells. The review focuses on microarrays and microfluidics, but also describes some cellular systems (studies involving patterning and sensor chips) and nanotechnology. The principles of each technology, including the parameters involved in biochip design and operation, are outlined. A discussion of the different biological and biomedical applications illustrates the significance of biochips in biotechnology.
Article
Completely revised and updated, this text provides an easy-to-read guide to the concept of mass spectrometry and demonstrates its potential and limitations. Written by internationally recognised experts and utilising "real life" examples of analyses and applications, the book presents real cases of qualitative and quantitative applications of mass spectrometry. Unlike other mass spectrometry texts, this comprehensive reference provides systematic descriptions of the various types of mass analysers and ionisation, along with corresponding strategies for interpretation of data. The book concludes with a comprehensive list of 3000 references. This multi-disciplined text covers the fundamentals as well as recent advances in this topic, providing need-to-know information for researchers in many disciplines, including pharmaceutical, environmental and biomedical analysis, who are utilizing mass spectrometry.
Article
Secreted proteins of bacteria are preferentially capable of interacting with host cells and are therefore of special biological and medical interest. Narrow pH range 2-DE and MALDI-TOFTOF-MS combine high-resolution protein separation with highly sensitive identification of proteins. Secreted proteins of Mycobacterium tuberculosis were separated at the protein species level, distinguishing different protein species of one protein. We focused on the pI range 4.0-4.7 and the Mr range 6-20kDa of the 2-DE pattern. Out of 128 analyzed spots, 121 were identified resulting in 33 different proteins with 277 different protein species, accumulating in a mean of 8.4 protein species per protein. Overrepresentation was found for the protein classes "virulence, detoxification, adaption", "information pathways", "cell wall and cell processes", and "intermediary metabolism and respiration". Thus far, 15 protein species of the ESX-1 family are characterized with 100% sequence coverage. More automated 2-DE procedures and more sensitive identification techniques are required for complete characterization of all of the protein species even in highly enriched samples, such as culture filtrates. Only then the functional level of proteomics will be achieved and potential biomarkers can be postulated at the molecular level. Proteomics is dominated by bottom-up approaches largely ignoring protein speciation. A prerequisite to reach the protein species level is to obtain 100% sequence coverage, which is a major challenge in proteomics. Here we show complete sequence information with a 2-DE-MS approach for 15 protein species. Acetylation of the N-terminus of ESAT-6 inhibits interaction with CFP-10, with direct consequences for pathogen-host interaction. This article is part of a Special Issue entitled: (Trends in Microbial Proteomics).
Article
Large-scale and high-throughput approaches increasingly play an essential role in the study of biological systems, which are inherently highly complex and therefore need to be examined by such extensive methods to obtain information about large genomic and proteomic networks. In plant biology, this aim is strongly supported by the accessibility of the complete genome sequence of the model plant Arabidopsis thaliana. This brief review focuses on the basics and the state of the art of these high-throughput technologies and their application to plant proteomics. It describes protein microarrays, the use of antibodies, 2-DE and MS methods, and the yeast two-hybrid system, which are emerging as the major technologies for plant proteomics.
Article
Peptide mass fingerprinting (PMF) is a powerful tool for identification of proteins separated by two-dimensional electrophoresis (2-DE). With the increase in sensitivity of peptide mass determination it becomes obvious that even spots looking well separated on a 2-DE gel may consist of several proteins. As a result the number of mass peaks in PMFs increased dramatically leaving many unassigned after a first database search. A number of these are caused by experiment-specific contaminants or by neighbor spots, as well as by additional proteins or post-translational modifications. To understand the complete protein composition of a spot we suggest an iterative procedure based on large numbers of PMFs, exemplified by PMFs of 480 Helicobacter pylori protein spots. Three key iterations were applied: (1) Elimination of contaminant mass peaks determined by MS-Screener (a software developed for this purpose) followed by reanalysis; (2) neighbor spot mass peak determination by cluster analysis, elimination from the peak list and repeated search; (3) re-evaluation of contaminant peaks. The quality of the identification was improved and spots previously unidentified were assigned to proteins. Eight additional spots were identified with this procedure, increasing the total number of identified spots to 455.
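The first iteration described above, removing contaminant mass peaks before repeating the search, can be illustrated with a simple frequency heuristic. The sketch below is an assumption made for illustration and is not the MS-Screener algorithm, whose details are not given here: a mass that recurs in a large fraction of peak lists from different spots is treated as a likely contaminant. The function names, the bin width, the threshold, and all masses are hypothetical.

```python
from collections import Counter


def mass_bin(mass, bin_width=0.1):
    """Map a mass to an integer bin so that nearly equal masses compare equal.

    Simplification: two masses close to a bin edge can land in different bins.
    """
    return round(mass / bin_width)


def find_contaminant_bins(peak_lists, min_fraction=0.75, bin_width=0.1):
    """Return bins that occur in at least `min_fraction` of all peak lists."""
    counts = Counter()
    for peaks in peak_lists:
        # Count each bin once per peak list, however often it occurs there.
        counts.update({mass_bin(m, bin_width) for m in peaks})
    needed = min_fraction * len(peak_lists)
    return {b for b, n in counts.items() if n >= needed}


def remove_contaminants(peak_lists, contaminant_bins, bin_width=0.1):
    """Drop every peak whose bin was flagged as a contaminant."""
    return [
        [m for m in peaks if mass_bin(m, bin_width) not in contaminant_bins]
        for peaks in peak_lists
    ]


# Invented peak lists for four spots; 842.5 and 2211.1 recur across spots.
example_peak_lists = [
    [842.51, 1045.56, 2211.10],
    [842.51, 1300.70, 2211.10],
    [842.51, 1500.80, 2211.11],
    [842.50, 999.99],
]
flagged = find_contaminant_bins(example_peak_lists)
cleaned = remove_contaminants(example_peak_lists, flagged)
```

The cleaned peak lists would then be submitted to a second database search; the later iterations (neighbor spot peaks, re-evaluation of contaminants) are not covered by this sketch.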
Article
There is burgeoning interest in protein microarrays, but a source of thousands of nonredundant, purified proteins was not previously available. Here we show a glass chip containing 2413 nonredundant purified human fusion proteins on a polymer surface, where densities up to 1600 proteins/cm² on a microscope slide can be realized. In addition, the polymer coating of the glass slide enables screening of protein interactions under nondenaturing conditions. Such screenings require only 200-µl sample volumes, illustrating their potential for high-throughput applications. Here we demonstrate two applications: the characterization of antibody binding, specificity, and cross-reactivity; and profiling the antibody repertoire in body fluids, such as serum from patients with autoimmune diseases. For the first application, we have incubated these protein chips with anti-RGSHis6, anti-GAPDH, and anti-HSP90β antibodies. In an initial proof-of-principle study for the second application, we have screened serum from alopecia and arthritis patients. With analysis of large sample numbers, identification of disease-associated proteins to generate novel diagnostic markers may be possible.
Article
Protein identification by matrix-assisted laser desorption/ionization mass-spectrometry peptide mass fingerprinting (MALDI-MS PMF) represents a cornerstone of proteomics. However, it often fails to identify low-molecular-mass proteins, protein fragments, and protein mixtures reliably. To overcome these limitations, PMF can be complemented by tandem mass spectrometry and other search strategies for unambiguous protein identification. The present study explores the advantages of using a MALDI-MS-based approach, designated minimal protein identifier (MPI) approach, for protein identification. This is illustrated for culture supernatant (CSN) proteins of Mycobacterium tuberculosis H37Rv after separation by two-dimensional gel electrophoresis (2-DE). The MPI approach takes into consideration that proteins yield characteristic peptides upon proteolytic cleavage. In this study, peptide mixtures derived from tryptic protein cleavage were analyzed by MALDI-MS and the resulting spectra were compared with template spectra of previously identified counterparts. The MPI approach allowed protein identification by few protein-specific signature peptide masses and revealed truncated variants of mycobacterial elongation factor EF-Tu, previously not identified by PMF. Furthermore, the MPI approach can be employed to track proteins in 2-DE gels, as demonstrated for the 14 kDa antigen, the 10 kDa chaperone, and the conserved hypothetical protein Rv0569 of M. tuberculosis H37Rv. Furthermore, it is shown that the power of the MPI approach strongly depends on distinct factors, most notably on the complexity of the proteome analyzed and accuracy of the mass spectrometer used for peptide mass determination.
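The idea of identifying a protein from a few protein-specific signature peptide masses can be sketched as follows. This is a minimal illustration under stated assumptions, not the published MPI procedure: each template is reduced to a short list of signature masses, and a spectrum is assigned to a template when a required fraction of those masses is found within a mass tolerance. The function names, the tolerance, and all masses are hypothetical.

```python
def contains_mass(peaks, target, tol):
    """True if any measured peak lies within `tol` of the target mass."""
    return any(abs(p - target) <= tol for p in peaks)


def identify_by_signature(peaks, templates, tol=0.1, min_fraction=1.0):
    """Return the names of all templates whose signature masses are present.

    `templates` maps a protein name to its list of signature peptide masses.
    `min_fraction` is the share of signature masses that must be found.
    """
    hits = []
    for name, signature in templates.items():
        if not signature:
            continue  # an empty signature cannot identify anything
        found = sum(contains_mass(peaks, m, tol) for m in signature)
        if found / len(signature) >= min_fraction:
            hits.append(name)
    return sorted(hits)


# Invented signature masses and an invented measured peak list.
example_templates = {
    "protein_X": [1045.56, 1479.80, 2211.10],
    "protein_Y": [927.49, 1639.94],
}
example_peaks = [842.51, 1045.60, 1479.75, 2211.12, 1639.90]

strict_hits = identify_by_signature(example_peaks, example_templates)
relaxed_hits = identify_by_signature(
    example_peaks, example_templates, min_fraction=0.5
)
```

The tolerance parameter reflects the dependence on mass accuracy noted in the abstract: a wider tolerance, or a more complex proteome, makes chance matches to a short signature more likely.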
Article
The mouse is the premier genetic model organism for the study of disease and development. We describe the establishment of a mouse T helper cell type 1 (T(H)1) protein expression library that provides direct access to thousands of recombinant mouse proteins, in particular those associated with immune responses. The advantage of a system based on the combination of large cDNA expression libraries with microarray technology is the direct connection of the DNA sequence information from a particular clone to its recombinant, expressed protein. We have generated a mouse T(H)1 expression cDNA library and used protein arrays of this library to characterize the specificity and cross-reactivity of antibodies. Additionally, we have profiled the autoantibody repertoire in serum of a mouse model for systemic lupus erythematosus on these protein arrays and validated the putative autoantigens on highly sensitive protein microarrays.
Article
Peptide mass fingerprinting by MALDI-MS and sequencing by tandem mass spectrometry have evolved into the major methods for identification of proteins following separation by two-dimensional gel electrophoresis, SDS-PAGE or liquid chromatography. One main technological goal of proteome analyses beside high sensitivity and automation was the comprehensive analysis of proteins. Therefore, the protein species level with the essential information on co- and post-translational modifications must be achieved. The power of peptide mass fingerprinting for protein identification was described here, as exemplified by the identification of protein species with high molecular masses (spectrin alpha and beta), low molecular masses (elongation factor EF-TU fragments), splice variants (alpha A crystallin), aggregates with disulfide bridges (alkylhydroperoxide reductase), and phosphorylated proteins (heat shock protein 27). Helpful tools for these analyses were the use of the minimal protein identifier concept and the software program MS-Screener to remove mass peaks assignable to contaminants and neighbor spots.
Article
Dilated cardiomyopathy (DCM) is a myocardial disease characterized by progressive depression of myocardial contractile function and ventricular dilatation. Thirty percent of DCM patients belong to the inherited genetic form; the rest may be idiopathic, viral, autoimmune, or immune-mediated associated with a viral infection. Disturbances in humoral and cellular immunity have been described in cases of myocarditis and DCM. A number of autoantibodies against cardiac cell proteins have been identified in DCM. In this study, we have profiled the autoantibody repertoire of plasma from DCM patients against a human protein array consisting of 37,200 redundant, recombinant human proteins and performed qualitative and quantitative validation of these putative autoantigens on protein microarrays to identify novel putative DCM specific autoantigens. In addition to analyzing the whole IgG autoantibody repertoire, we have also analyzed the IgG3 antibody repertoire in the plasma samples to study the characteristics of IgG3 subclass antibodies. By combining screening of a protein expression library with protein microarray technology, we have detected 26 proteins identified by the IgG antibody repertoire and 6 proteins bound by the IgG3 subclass. Several of these autoantibodies found in plasma of DCM patients, such as the autoantibody against the Kv channel-interacting protein, are associated with heart failure.
Article
A rapid method for the identification of known proteins separated by two-dimensional gel electrophoresis is described in which molecular masses of peptide fragments are used to search a protein sequence database. The peptides are generated by in situ reduction, alkylation, and tryptic digestion of proteins electroblotted from two-dimensional gels. Masses are determined at the subpicomole level by matrix-assisted laser desorption/ionization mass spectrometry of the unfractionated digest. A computer program has been developed that searches the protein sequence database for multiple peptides of individual proteins that match the measured masses. To ensure that the most recent database updates are included, a theoretical digest of the entire database is generated each time the program is executed. This method facilitates simultaneous processing of a large number of two-dimensional gel spots. The method was applied to a two-dimensional gel of a crude Escherichia coli extract that was electroblotted onto poly(vinylidene difluoride) membrane. Ten randomly chosen spots were analyzed. With as few as three peptide masses, each protein was uniquely identified from over 91,000 protein sequences. All identifications were verified by concurrent N-terminal sequencing of identical spots from a second blot. One of the spots contained an N-terminally blocked protein that required enzymatic cleavage, peptide separation, and Edman degradation for confirmation of its identity.
Article
Molecular analysis of complex biological structures and processes increasingly requires sensitive methods for protein sequencing. Electrospray mass spectrometry has been applied to the high-sensitivity sequencing of short peptides, but technical difficulties have prevented similar success with gel-isolated proteins. Here we report a simple and robust technique for the sequencing of proteins isolated by polyacrylamide gel electrophoresis, using nano-electrospray tandem mass spectrometry. As little as 5 ng protein starting material on Coomassie- or silver-stained gels can be sequenced. Multiple-sequence stretches of up to 16 amino acids are obtained, which identify the protein unambiguously if already present in databases or provide information to clone the corresponding gene. We have applied this method to the sequencing and cloning of a protein which inhibits the proliferation of capillary endothelial cells in vitro and thus may have potential antiangiogenic effects on solid tumours.
Article
Microarrays containing 1046 human cDNAs of unknown sequence were printed on glass with high-speed robotics. These 1.0-cm² DNA "chips" were used to quantitatively monitor differential expression of the cognate human genes using a highly sensitive two-color hybridization assay. Array elements that displayed differential expression patterns under given experimental conditions were characterized by sequencing. The identification of known and novel heat shock and phorbol ester-regulated genes in human T cells demonstrates the sensitivity of the assay. Parallel gene analysis with microarrays provides a rapid and efficient method for large-scale human gene discovery.
Article
We have developed a technique to establish catalogues of protein products of arrayed cDNA clones identified by DNA hybridisation or sequencing. A human fetal brain cDNA library was directionally cloned in a bacterial vector that allows IPTG-inducible expression of His6-tagged fusion proteins. Using robot technology, the library was arrayed in microtitre plates and gridded onto high-density in situ filters. A monoclonal antibody recognising the N-terminal RGSH6 sequence of expressed proteins (RGS·His antibody, Qiagen) detected 20% of the library as putative expression clones. Two example genes, GAPDH and HSP90α, were identified on high-density filters using DNA probes and antibodies against their proteins.
Article
Many new gene products are being discovered by large-scale genomics and proteomics strategies; the challenge now is to develop high-throughput approaches to systematically analyse these proteins and to assign a biological function to them. Having access to these gene products as recombinantly expressed proteins would allow them to be robotically arrayed to generate protein chips. Other applications include using these proteins for the generation of specific antibodies, which can also be arrayed to produce antibody chips. The availability of such protein and antibody arrays would facilitate the simultaneous analysis of thousands of interactions within a single experiment. This chapter will focus on current strategies used to generate protein and antibody arrays and their current applications in biological research, medicine and diagnostics. The shortcomings of these approaches, the developments required, as well as the potential applications of protein and antibody arrays will be discussed.
Article
We developed a high-throughput technique for the generation of cDNA libraries in the yeast Saccharomyces cerevisiae which enables the selection of cloned cDNA inserts containing open reading frames (ORFs). For direct screening of random-primed cDNA libraries, we have constructed a yeast shuttle/expression vector, the so-called ORF vector pYEXTSH3, which allows the enriched growth of protein expression clones. The selection system is based on the HIS3 marker gene fused to the C terminus of the cDNA insert. The cDNAs cloned in-frame result in histidine prototrophic yeast cells growing on minimal medium, whereas clones bearing the vector without insert or out-of-frame inserts should not grow on this medium. A randomly primed cDNA library from human fetal brain tissue was cloned in this novel vector, and using robot technology the selected clones were arrayed in microtiter plates and were analyzed by sequencing and for protein expression. In the constructed cDNA expression library, about 60% of clones bear an insert in the correct reading frame. In comparison to unselected libraries it was possible to increase the clones with inserts in the correct reading frame more than fourfold, from 14% to 60%. With the expression system described here, we could avoid time-consuming and costly techniques for identification of clones expressing protein by using antibody screening on high-density filters and subsequently rearraying the selected clones in a new "daughter" library. The advantage of this ORF vector is that, in a one-step screening procedure, it allows the generation of expression libraries enriched for clones with correct reading frames as sources of recombinant proteins.
Article
The rapid evolution of proteomics has continued during the past year, with a series of innovations in the core technologies of two-dimensional electrophoresis and mass spectrometry, and a diversity of productive research programmes. Well-annotated proteomics databases are now emerging in a number of fields to provide a platform for systematic research, with particularly promising progress in clinical applications such as cardiology and oncology. Large-scale quantitative research, comparable in power and sensitivity to that achieved for gene expression, is thus becoming a reality at the protein level.
Article
A cDNA library was generated from rat brain tissues and organized into 1536-well plates, using a fluorescence activated cell sorter (FACS), acting as a single cell deposition system. The organized library containing 10,000 clones, with 60% full-length cDNA inserts, allowed the generation of multiple identical membrane replicas. Each replica was hybridized with a complex probe obtained from a particular brain tissue or a given cultured cell. The signal intensity for each of the clones present on the membrane, quantified with a standard image-analysis software, is proportional both to the abundance of the corresponding mRNA in the probe and to the amount of plasmid template on the membrane. The latter value was thus used to normalize the signals produced with complex probes, to optimize the comparison of mRNA expression levels for the different systems under study. The construction of high-quality cDNA libraries, the generation of identical membrane replicas and comparable probes, and the utilization of an image-analysis software package, coupled with the normalization of the spot intensity by assaying plasmid quantity, significantly improves the differential screening approach. Altogether, these technical improvements open the possibility to compare a great number of different probes and, in consequence, to accumulate biological information for each clone present in an organized cDNA library. The functional information obtained should complement data from DNA sequencing projects.
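The normalization step described above can be written out as a short calculation. The sketch assumes, as the abstract states, that the raw signal of a clone is proportional both to mRNA abundance in the probe and to the amount of plasmid on the membrane, so dividing by the plasmid amount leaves a value that tracks mRNA abundance. The function names, clone names, and numbers are hypothetical.

```python
import math


def normalize_signals(signal, plasmid_amount):
    """Divide each clone's hybridization signal by its plasmid amount.

    Clones with no measurable plasmid are skipped instead of dividing by zero.
    """
    return {
        clone: value / plasmid_amount[clone]
        for clone, value in signal.items()
        if plasmid_amount.get(clone, 0) > 0
    }


def log2_ratios(normalized_a, normalized_b):
    """log2(b / a) for clones present with positive values in both probes."""
    return {
        clone: math.log2(normalized_b[clone] / normalized_a[clone])
        for clone in normalized_a
        if clone in normalized_b
        and normalized_a[clone] > 0
        and normalized_b[clone] > 0
    }


# Invented values: the same membrane layout hybridized with two probes.
plasmid = {"clone_1": 2.0, "clone_2": 0.5}
probe_a = {"clone_1": 200.0, "clone_2": 50.0}
probe_b = {"clone_1": 400.0, "clone_2": 25.0}

norm_a = normalize_signals(probe_a, plasmid)
norm_b = normalize_signals(probe_b, plasmid)
ratios = log2_ratios(norm_a, norm_b)
```

In this toy data the two clones give very different raw signals with probe A, yet their normalized values are equal, which is the effect the normalization is meant to achieve.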
Article
The two-dimensional electrophoresis (2-DE) technique developed by Klose in 1975 (Humangenetik 1975, 26, 211-234), independently of the technique developed by O'Farrell (J. Biol. Chem. 1975, 250, 4007-4021), has been revised in our laboratory and an updated protocol is presented. This protocol is the result of our experience in using this method since its introduction. Many modifications and suggestions found in the literature were also tested and then integrated into our original method if advantageous. Gel and buffer composition, size of gels, use of stacking gels or not, necessity of isoelectric focusing (IEF) gel incubation, freezing of IEF gels or immediate use, carrier ampholytes versus Immobilines, regulation of electric current, conditions for staining and drying the gels - these and other problems were the subject of our concern. Among the technical details and special equipment which constitute our 2-DE method presented here, a few features are of particular significance: (i) sample loading onto the acid side of the IEF gel with the result that both acidic and basic proteins are well resolved in the same gel; (ii) use of large (46 x 30 cm) gels to achieve high resolution, but without the need of unusually large, flat gel equipment; (iii) preparation of ready-made gel solutions which can be stored frozen, a prerequisite, among others, for high reproducibility. Using the 2-DE method described we demonstrate that protein patterns revealing more than 10 000 polypeptide spots can be obtained from mouse tissues. This is by far the highest resolution so far reported in the literature for 2-DE of complex protein mixtures. The 2-DE patterns were of high quality with regard to spot shape and background. The reproducibility of the protein patterns is demonstrated and shown to be thoroughly satisfactory. 
An example is given to show how effectively 2-DE of high resolution and reproducibility can be used to study the genetic variability of proteins in an interspecific mouse backcross (Mus musculus x Mus spretus) established by the European Backcross Collaborative Group for mapping the mouse genome. We outline our opinion that the structural analysis of the human genome, currently pursued most intensively on a worldwide scale, should be accompanied by a functional analysis of the genome that starts from the proteins of the organism.
Article
We demonstrate a new approach to the identification of mass spectrometrically fragmented peptides. A fragmentation spectrum usually contains a short, easily identifiable series of sequence ions, which yields a partial sequence. This partial sequence divides the peptide into three parts-regions 1, 2, and 3-characterized by the added mass m1 of region 1, the partial sequence of region 2, and the added mass m3 of region 3. We call the construct, m1 partial sequence m3, a "peptide sequence tag" and show that it is a highly specific identifier of the peptide. An algorithm developed here that uses the sequence tag to find the peptide in a sequence database is up to 1 million-fold more discriminating than the partial sequence information alone. Peptides can be identified even in the presence of an unknown posttranslational modification or an amino acid substitution between an entry in the sequence database and the measured peptide. These concepts are demonstrated with model and practical examples of electrospray mass spectrometry/mass spectrometry of tryptic peptides. Just two to three amino acid residues derived by fragmentation are enough to identify these peptides. In peptide mapping applications, even less information is necessary.
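A simplified sketch of how a sequence tag could be tested against a candidate peptide is given below. It assumes that m1 and m3 are plain sums of monoisotopic residue masses (terminal groups and charge are ignored), that the tolerance is 0.5 Da, and that only the residues in the small table occur; the function names and example peptides are hypothetical, and this is not the algorithm of the paper.

```python
# Monoisotopic residue masses (Da) for the residues used in this example only.
RESIDUE_MASS = {
    "S": 87.03203, "A": 71.03711, "M": 131.04049, "P": 97.05276,
    "L": 113.08406, "E": 129.04259, "R": 156.10111, "K": 128.09496,
}


def residue_sum(fragment):
    """Sum of residue masses of a peptide fragment."""
    return sum(RESIDUE_MASS[aa] for aa in fragment)


def matches_tag(peptide, m1, partial, m3, tolerance=0.5):
    """Return True if the peptide contains `partial` flanked by masses m1 and m3."""
    start = peptide.find(partial)
    while start != -1:
        left = residue_sum(peptide[:start])
        right = residue_sum(peptide[start + len(partial):])
        if abs(left - m1) <= tolerance and abs(right - m3) <= tolerance:
            return True
        start = peptide.find(partial, start + 1)  # try the next occurrence
    return False


# Tag: 158.069 Da, then the partial sequence MPL, then 285.144 Da.
tag = (158.069, "MPL", 285.144)
candidates = ["SAMPLER", "ASMPLER", "SAMPLEK", "SAMPER"]
hits = [p for p in candidates if matches_tag(p, *tag)]
```

The example shows why the flanking masses add specificity: SAMPLEK contains the same partial sequence but fails on m3, whereas ASMPLER passes because the tag constrains only the mass, not the order, of the flanking residues.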
Article
A computer searching algorithm has been used to identify protein sequences in the Protein Information Resource (PIR) database with peptide mass information (mass map) obtained from proteolytic digests of proteins analyzed by microcapillary high-performance liquid chromatography electrospray ionization mass spectrometry. A theoretical analysis of the cytochrome c family demonstrates the ability to identify protein sequences in the PIR database with a high degree of accuracy using a set of six predicted tryptic peptide masses. This method was also applied to experimentally determined peptide masses for a small GTP-binding protein, a protein from pig uterus, the human sex steroid binding protein, and a thermostable DNA polymerase. The results demonstrate that a set of observed masses which is less than 50% of the total number of predicted masses can be used to identify a protein sequence in the database. For the analysis presented in this paper, a mass matching tolerance of 1 amu is used. Under these conditions, mass maps created by fast atom bombardment mass spectrometry and matrix-assisted laser desorption time-of-flight would also be applicable. In cases where multiple matches are observed or verification of the protein identification is needed, tandem mass spectrometry sequencing can be used to establish sequence similarity.
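Mass-map matching with a fixed tolerance can be sketched as follows. The 1 amu tolerance comes from the abstract; the scoring (a plain count of matched masses), the function names and the two-entry database with its mass values are assumptions made for illustration only.

```python
def count_matches(observed, predicted, tolerance=1.0):
    """Count observed masses lying within `tolerance` (amu) of any predicted mass."""
    return sum(
        1 for mass in observed
        if any(abs(mass - p) <= tolerance for p in predicted)
    )


def rank_candidates(observed, database, tolerance=1.0):
    """Rank database entries by the number of observed masses they explain."""
    scores = {
        name: count_matches(observed, masses, tolerance)
        for name, masses in database.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical predicted tryptic peptide masses for two database entries.
toy_database = {
    "protein_A": [500.3, 842.5, 1045.6, 1296.7, 1570.7, 2211.1],
    "protein_B": [612.4, 905.5, 1180.6, 1433.8, 1838.9, 2384.0],
}
observed_masses = [842.9, 1296.1, 1800.0]
ranking = rank_candidates(observed_masses, toy_database)
```

Here only two of the six predicted masses of protein_A are observed, yet this is enough to separate it from protein_B in the toy database.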
Article
During the last decade new ionization techniques have made it possible to measure the molecular weight of many intact proteins by mass spectrometry, and they have made it much easier to obtain a mass spectrometric peptide map of a protein. At the same time advances in protein and DNA sequencing technology are resulting in an exponential increase in the number of sequences deposited in databases. Here we investigate the possibility to use mass spectrometric data to identify proteins in databases. Searching a database by total molecular weight is found to be an easy and sometimes sufficient approach. For more specificity and for error tolerance in both the mass spectrometric data and the database information we search by partial mass spectrometric peptide map of the protein. In general, just four to six proteolytic peptides measured with a mass accuracy between 0.1 and 0.01% allow a useful search of databases such as the Protein Identification Resource (PIR). As the size of DNA and protein sequence databases grows, protein identification by partial mass spectrometric peptide maps should become increasingly powerful and may become a general method to identify and characterize proteins.
Article
We have developed an algorithm for identifying proteins at the sub-microgram level without sequence determination by chemical degradation. The protein, usually isolated by one- or two-dimensional gel electrophoresis, is digested by enzymatic or chemical means and the masses of the resulting peptides are determined by mass spectrometry. The resulting mass profile, i.e., the list of the molecular masses of peptides produced by the digestion, serves as a fingerprint which uniquely defines a particular protein. This fingerprint may be used to search the database of known sequences to find proteins with a similar profile. If the protein is not yet sequenced the profile can serve as a unique marker. This provides a rapid and sensitive link between genomic sequences and 2D gel electrophoresis mapping of cellular proteins.
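The mass profile of a sequenced protein can be predicted in silico, as in the following sketch. It assumes tryptic cleavage C-terminal to lysine or arginine except before proline, no missed cleavages, no modifications, and neutral monoisotopic masses; the function names and the example sequence are hypothetical.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # added once per peptide for the free termini


def tryptic_peptides(sequence):
    """Cleave after K or R, unless the next residue is P."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])  # C-terminal peptide
    return peptides


def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def fingerprint(sequence):
    """Sorted list of predicted peptide masses, i.e. the mass profile."""
    return sorted(peptide_mass(p) for p in tryptic_peptides(sequence))


profile = fingerprint("AKRPGKSAMPLER")
```

The resulting list of masses is what would be compared against the experimentally measured profile.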
Article
The correlation of uninterpreted tandem mass spectra of modified and unmodified peptides, produced under low-energy (10-50 eV) collision conditions, with nucleotide sequences is demonstrated. In this method nucleotide databases are translated in six reading frames, and the resulting amino acid sequences are searched "on the fly" to identify and fit linear sequences to the fragmentation patterns observed in the tandem mass spectra of peptides. A cross-correlation function is then used to provide a measurement of similarity between the mass-to-charge ratios for the fragment ions predicted by amino acid sequences translated from the nucleotide database and the fragment ions observed in the tandem mass spectrum. In general, a difference greater than 0.1 between the normalized cross-correlation functions for the first- and second-ranked search results indicates a successful match between sequence and spectrum. Measurements of the deviation from maximum similarity employing the spectral reconstruction method are made. The search method employing nucleotide databases is also demonstrated on the spectra of phosphorylated peptides. Specific sites of modification are identified even though no specific information relevant to sites of modification is contained in the character-based sequence information of nucleotide databases.
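The six-frame translation step mentioned above can be sketched as follows. The sketch assumes the standard genetic code, represents stop codons as "*" and unknown codons as "X", and does not implement the spectrum correlation; the function names and the example sequence are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order per position.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def reverse_complement(dna):
    """Reverse complement of an upper-case DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate complete codons only; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X")
        for i in range(0, len(dna) - 2, 3)
    )


def six_frames(dna):
    """Three forward frames followed by three reverse-complement frames."""
    dna = dna.upper()
    strands = (dna, reverse_complement(dna))
    return [translate(s[offset:]) for s in strands for offset in range(3)]


frames = six_frames("ATGGCCTAA")
```

Each of the six resulting amino acid sequences would then be searched for stretches that fit the observed fragmentation pattern.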
Article
Proteins from silver-stained gels can be digested enzymatically and the resulting peptides analyzed and sequenced by mass spectrometry. Standard proteins yield the same peptide maps when extracted from Coomassie- and silver-stained gels, as judged by electrospray and MALDI mass spectrometry. The low nanogram range can be reached by the protocols described here, and the method is robust. A silver-stained one-dimensional gel of a fraction from yeast proteins was analyzed by nano-electrospray tandem mass spectrometry. In the sequencing, more than 1000 amino acids were covered, resulting in no evidence of chemical modifications due to the silver staining procedure. Silver staining allows a substantial shortening of sample preparation time and may, therefore, be preferable over Coomassie staining. This work removes a major obstacle to the low-level sequence analysis of proteins separated on polyacrylamide gels.
Article
Advances in microarray technology enable massive parallel mining of biological data, with biological chips providing hybridization-based expression monitoring, polymorphism detection and genotyping on a genomic scale. Microarrays containing sequences representative of all human genes may soon permit the expression analysis of the entire human genome in a single reaction. These 'genome chips' will provide unprecedented access to key areas of human health, including disease prognosis and diagnosis, drug discovery, toxicology, aging, and mental illness. Microarray technology is rapidly becoming a central platform for functional genomics.
Article
We report the development of a method to compare collision-induced dissociation (CID) spectra of peptides. This method employs a cross-correlation analysis of a CID spectrum to a reference spectrum and normalizes the cross-correlation score to the autocorrelation of the CID spectra. The query spectrum is compared by using both mass information and fragmentation patterns. Fragmentation patterns are compared to each other using a correlation function. To evaluate the specificity of the approach, a set of 2180 tandem mass spectra obtained from both triple-quadrupole tandem mass spectrometers (TSQ) and quadrupole ion trap mass spectrometers (LCQ) was created. Comparisons are performed between tandem mass spectra obtained on the same instrument type as well as between different instrument types. Accurate and reliable comparisons are demonstrated in both types of analyses. The scores obtained in the cross-comparison of TSQ and LCQ tandem mass spectra of the same peptide are found to be slightly lower than comparisons performed with spectra obtained on the same instrument type. The method appears insensitive to variations in day-to-day performance of the instrument, minor variations in fragment ion abundance, and instrumental differences inherent in the same instrument model. The use of this method of comparison is demonstrated for library searching and subtractive analysis of tandem mass spectra obtained during LC/MS/MS experiments.
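A much reduced sketch of a normalized spectral comparison is shown below. It assumes unit-width m/z bins, evaluates the correlation at zero offset only, and normalizes by the geometric mean of the two autocorrelations; these choices, the function names and the peak lists are assumptions and are not claimed to reproduce the published scoring.

```python
import math
from collections import defaultdict


def bin_spectrum(peaks, bin_width=1.0):
    """Sum intensities of (m/z, intensity) peaks into fixed-width bins."""
    binned = defaultdict(float)
    for mz, intensity in peaks:
        binned[int(mz // bin_width)] += intensity
    return binned


def normalized_correlation(query, reference, bin_width=1.0):
    """Zero-offset cross-correlation divided by the autocorrelations."""
    q = bin_spectrum(query, bin_width)
    r = bin_spectrum(reference, bin_width)
    cross = sum(q[k] * r[k] for k in q if k in r)
    auto_q = sum(v * v for v in q.values())
    auto_r = sum(v * v for v in r.values())
    if auto_q == 0 or auto_r == 0:
        return 0.0  # an empty spectrum cannot be compared
    return cross / math.sqrt(auto_q * auto_r)


# Hypothetical fragment peak lists (m/z, intensity).
spectrum_a = [(175.1, 10.0), (300.2, 20.0)]
spectrum_b = [(175.1, 5.0), (300.2, 10.0)]   # same pattern, half the intensity
spectrum_c = [(410.3, 7.0), (520.4, 3.0)]    # no peaks in common

same_pattern = normalized_correlation(spectrum_a, spectrum_b)
unrelated = normalized_correlation(spectrum_a, spectrum_c)
```

Because of the normalization, the score depends on the fragmentation pattern rather than on absolute signal intensity, which is consistent with the insensitivity to minor abundance variations reported above.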
Article
Genome sequencing provides a wealth of information on predicted gene products (mostly proteins), but the majority of these have no known function. Two-dimensional gel electrophoresis and mass spectrometry have, coupled with searches in protein and EST databases, transformed the protein-identification process. The proteome is the expressed protein complement of a genome and proteomics is functional genomics at the protein level. Proteomics can be divided into expression proteomics, the study of global changes in protein expression, and cell-map proteomics, the systematic study of protein-protein interactions through the isolation of protein complexes.
Article
We have constructed a human fetal brain cDNA library in an Escherichia coli expression vector for high-throughput screening of recombinant human proteins. Using robot technology, the library was arrayed in microtiter plates and gridded onto high-density filter membranes. Putative expression clones were detected on the filters using an antibody against the N-terminal sequence RGS-His(6) of fusion proteins. Positive clones were rearrayed into a new sublibrary, and 96 randomly chosen clones were analyzed. Expression products were analyzed by SDS-PAGE, affinity purification, matrix-assisted laser desorption/ionization-time-of-flight mass spectrometry, and the determined protein masses were compared to masses predicted from DNA sequencing data. It was found that 66% of these clones contained inserts in a correct reading frame. Sixty-four percent of the correct reading frame clones comprised the complete coding sequence of a human protein. High-throughput microtiter plate methods were developed for protein expression, extraction, purification, and mass spectrometric analyses. An enzyme assay for glyceraldehyde-3-phosphate dehydrogenase activity in native extracts was adapted to the microtiter plate format. Our data indicate that high-throughput screening of an arrayed protein expression library is an economical way of generating large numbers of clones producing recombinant human proteins for structural and functional analyses.
Article
A new strategy for identifying proteins in sequence databases by MALDI-MS peptide mapping is reported. The strategy corrects for systematic deviations of determined peptide molecular masses using information contained in the opened database and thereby renders internal spectrum calibration unnecessary. As a result, data acquisition is simplified and less error-prone. Performance of the new strategy is demonstrated by identification of a set of recombinant, human cDNA expression products as well as native proteins isolated from crude mouse brain extracts by 2-D electrophoresis. Using one set of calibration constants for the mass spectrometric analyses, 20 proteins were identified without applying any molecular weight restrictions, which was not possible without data correction. A sequence database search program has been written that performs all necessary calculations automatically, access to which will be provided to the scientific community on the Internet.
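The abstract does not state what form the correction takes. Purely as an illustration, the sketch below assumes that the systematic deviation grows linearly with mass and estimates it by least squares from measured masses paired with their theoretical database values; the function names and numbers are hypothetical.

```python
def fit_linear_deviation(measured, theoretical):
    """Least-squares fit of deviation = slope * measured + intercept."""
    if len(measured) != len(theoretical) or len(measured) < 2:
        raise ValueError("need at least two matched mass pairs")
    n = len(measured)
    deviations = [m - t for m, t in zip(measured, theoretical)]
    mean_m = sum(measured) / n
    mean_d = sum(deviations) / n
    sxx = sum((m - mean_m) ** 2 for m in measured)
    if sxx == 0:
        raise ValueError("measured masses must not all be identical")
    sxy = sum((m - mean_m) * (d - mean_d) for m, d in zip(measured, deviations))
    slope = sxy / sxx
    intercept = mean_d - slope * mean_m
    return slope, intercept


def correct_mass(mass, slope, intercept):
    """Subtract the fitted systematic deviation from a measured mass."""
    return mass - (slope * mass + intercept)


# Hypothetical matched pairs: measured masses drift upward with mass.
measured = [1000.3, 1500.4, 2000.5]
theoretical = [1000.0, 1500.0, 2000.0]
slope, intercept = fit_linear_deviation(measured, theoretical)
corrected = [correct_mass(m, slope, intercept) for m in measured]
```

After such a correction a narrower mass tolerance can be applied in the database search than would be possible with the uncorrected values.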
Article
The interest in proteomics has recently increased dramatically and proteomic methods are now applied to many problems in cell biology. The method of choice in proteomics for identifying and characterizing proteins is mass spectrometry combined with database searching. Software tools have been improved to increase the sensitivity of protein identification and methods for evaluating the search results have been incorporated
Article
We have constructed a novel Pichia pastoris/Escherichia coli dual expression vector for the production of recombinant proteins in both host systems. In this vector, an E. coli T7 promoter region, including the ribosome binding site from the phage T7 major capsid protein for efficient translation, is placed downstream from the yeast alcohol oxidase promoter (AOX). For detection and purification of the target protein, the vector contains an amino-terminal oligohistidine domain (His6) followed by the hemagglutinin epitope (HA) adjacent to the cloning sites. A P. pastoris autonomous replicating sequence (PARS) was integrated, enabling simple propagation and recovery of plasmids from yeast and bacteria (1). In the present study, the expression of human proteins in P. pastoris and E. coli was compared using this single expression vector. For this purpose we subcloned a cDNA expression library derived from human fetal brain (2) into our dual expression T7 vector and investigated 96 randomly picked clones. After sequencing, 29 clones in the correct reading frame were identified, their plasmids isolated and shuttled from yeast to bacteria. All proteins were expressed in soluble form in P. pastoris, whereas in E. coli only 31% could be purified under native conditions. Our data indicate that this dual expression vector allows the economic expression and purification of proteins in different hosts without subcloning.
Article
An emerging field for the analysis of biological systems is the study of the complete protein complement of the genome, the 'proteome'. There are several complementary tools available for proteome analysis including 2D protein electrophoresis and mass spectrometry. Emerging technologies for proteome analysis include spotted-array-based methods and microfluidic devices. Taken together, these technologies provide a wealth of information that is useful in discovery-based science. However, there are some key limitations of these approaches and new technology is required to be able to fully integrate proteomic information with information obtained about DNA sequence, mRNA profiles and metabolite concentrations into effective models of biological systems.
Article
The large-gel two-dimensional electrophoresis (2-DE) technique, developed by Klose and co-workers over the past 25 years, provides the resolving power necessary to separate crude proteome extracts of higher eukaryotes. Matrix assisted laser desorption/ionization-time of flight-mass spectrometry (MALDI-TOF-MS) provides the sample throughput necessary to identify thousands of different protein species in an adequate time period. Spot excision, in situ proteolysis, and extraction of the cleavage products from the gel matrix, peptide purification and concentration as well as the mass spectrometric sample preparation are the crucial steps that interface the two analytical techniques. Today, these routines and not the mass spectrometric instrumentation determine how many protein digests can be analyzed per day per instrument. The present paper focuses on this analytical interface and reports on an integrated protocol and technology developed in our laboratory. Automated identification of proteins in sequence databases by mass spectrometric peptide mapping requires a powerful search engine that makes full use of the information contained in the experimental data, and scores the search results accordingly. This challenge is addressed in the second part of the paper.
Article
Developments in 'soft' ionisation techniques have revolutionized mass-spectrometric approaches for the analysis of protein structure. For more than a decade, such techniques have been used, in conjunction with digestion by specific proteases, to produce accurate peptide molecular weight 'fingerprints' of proteins. These fingerprints have commonly been used to screen known proteins, in order to detect errors of translation, to characterize post-translational modifications and to assign disulphide bonds. However, the extent to which peptide-mass information can be used alone to identify unknown sample proteins, independent of other analytical methods such as protein sequence analysis, has remained largely unexplored. We report here on the development of the molecular weight search (MOWSE) peptide-mass database at the SERC Daresbury Laboratory. Practical experience has shown that sample proteins can be uniquely identified from as few as three or four experimentally determined peptide masses when these are screened against a fragment database that is derived from over 50 000 proteins. Experimental errors of a few Daltons are tolerated by the scoring algorithms, thus permitting the use of inexpensive time-of-flight mass spectrometers. As with other types of physical data, such as amino-acid composition or linear sequence, peptide masses provide a set of determinants that are sufficiently discriminating to identify or match unknown sample proteins. Peptide-mass fingerprints can prove as discriminating as linear peptide sequences, but can be obtained in a fraction of the time using less protein. In many cases, this allows for a rapid identification of a sample protein before committing it to protein sequence analysis. Fragment masses also provide information, at the protein level, that is complementary to the information provided by large-scale DNA sequencing or mapping projects.
Egelhofer, V., Büssow, K., Luebbert, C., Lehrach, H., Nordhoff, E., Anal. Chem. 2000, 72, 2741–2750.
Mann, M., Wilm, M., Anal. Chem. 1994, 66, 4390–4399.
Büssow, K., Nordhoff, E., Lübbert, C., Lehrach, H., Walter, G., Genomics 2000, 65, 1–8.
Yates III, J. R., Speicher, S., Griffin, P., Hunkapiller, T., Anal. Biochem. 1993, 214, 397–408.
Gobom, J., Schuerenberg, M., Mueller, M., Theiss, D., Lehrach, H., Nordhoff, E., Anal. Chem. 2000, ASAP Article, Web Release Date: December 28.
Klose, J., Kobalz, U., Electrophoresis 1995, 16, 1034–1059.
Fenyö, D., Curr. Opin. Biotechnol. 2000, 11, 391–395.
Büssow, K., Cahill, D. J., Nietfeld, W., Bancroft, D., Scherzinger, E., Lehrach, H., Walter, G., Nucleic Acids Res. 1998, 26, 5007–5008.
Mann, M., Hojrup, P., Roepstorff, P., Biol. Mass Spectrom. 1993, 22, 338–345.
Lueking, A., Holz, C., Gotthold, C., Lehrach, H., Cahill, D. J., Prot. Expr. Purif. 2000, 20, 372–378.
Shevchenko, A., Wilm, M., Vorm, O., Mann, M., Anal. Chem. 1996, 68, 850–858.
Nordhoff, E., Egelhofer, V., Giavalisco, P., Eickhoff, H., Horn, M., Przewieslik, T., Theiss, D., Schneider, U., Lehrach, H., Gobom, J., Electrophoresis 2001, 22, 2844–2855.
Wilm, M., Shevchenko, A., Houthaeve, T., Breit, S., Schweigerer, L., Fotsis, T., Mann, M., Nature 1996, 379, 466–469.