Biotecnologia Aplicada
Elfos Scientiae
ISSN: 0684-4551
Vol. 17, Num. 1, 2000, pp. 46-47
New Approaches to Quantitative Proteome Analysis
Ruedi Aebersold, Beate Rist, Steven P Gygi
Department of Molecular Biotechnology, University of Washington,
Seattle, WA 98195, USA.
Code Number: BA00011
With a rapidly increasing number of complete genomic sequences
available, much attention is currently focused on whether and how
the information contained in sequence databases can be interpreted
in terms of the structure, function and control of biological
systems. Quantitative proteome analysis,
the global analysis of protein expression, has been proposed as a
method to study genes at a steady state and after
perturbation-induced changes. Here we discuss the justification for
gene expression analysis at the protein level, highlight the
limitations in the current standard proteome technology, and
introduce a new experimental approach to quantitative proteome
analysis.
The poor correlation between mRNA and protein levels in cells
provides justification for quantitative proteome analysis. With
recent technical advances including the development of differential
display-PCR [1], cDNA microarray and DNA chip technology
[2, 3], and serial analysis of gene expression (SAGE) [4, 5], it is now
feasible to establish global and quantitative mRNA expression maps
of cells and tissues in species for which the sequence of all the
genes is known. The discoveries of post-transcriptional mechanisms
which control translation rate [6] and protein and mRNA half-lives
[7] led us to predict that quantitative transcript expression
measurements are insufficient for predicting the quantity of
protein expression. To test this hypothesis we determined the
correlation between the mRNA and protein levels for a group of
genes expressed in exponentially growing cells of the yeast
Saccharomyces cerevisiae. Protein expression levels were
quantitated by metabolic labeling of the yeast proteins to a steady
state, followed by 2D-gel electrophoresis and liquid scintillation
counting of the selected, separated protein species. Separated
proteins were identified by tryptic digestion of spots with
subsequent analysis by microcapillary high performance liquid
chromatography-tandem mass spectrometry (µLC-MS/MS) and sequence
database searching [8-10]. The corresponding mRNA transcript levels
were calculated from serial analysis of gene expression (SAGE)
frequency tables [5].
The correlation between mRNA and protein levels was calculated
for a data set consisting of more than 100 mRNA and protein
products of selected genes. For the entire set of genes, there was
a general trend of increased mRNA levels resulting in increased
protein levels. The Pearson product moment correlation coefficient
for the whole data set was 0.935. This number is highly biased by a
relatively small number of genes with very large protein and
message levels. A more representative subset of the was the group
of genes for which message level was measured below 10 copies/cell.
This subset included 70% of the data used in the study. The Pearson
product moment correlation coefficient for this data set was 0.356.
This weak correlation is further evidenced by the observation that
levels of protein expression coded for by mRNA of comparable
abundance varied by as much as 30-fold, and that the mRNA
levels coding for proteins with comparable expression levels varied
by as much as 20-fold. This study, for the first time, correlated
the mRNA transcript and protein expression levels of a relatively
large number of genes expressed in cells representing the same
state. It is apparent that the observed correlation is not
sufficiently high to allow for protein levels to be predicted by
mRNA levels. We therefore conclude that quantitative proteome
analysis is an essential component of any comprehensive analysis of
biological systems.
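The effect described above, where a few very abundant genes dominate the correlation coefficient, can be reproduced with a minimal sketch. The data points below are hypothetical values invented purely for illustration and are not measurements from this study; the function name `pearson_r` and the 10 copies/cell cutoff variable are likewise illustrative choices.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation coefficient of two sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sums of squared deviations
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical (mRNA copies/cell, protein abundance) pairs, NOT real data.
# Seven low-abundance messages plus two very abundant ones.
pairs = [
    (1.0, 30.0), (2.0, 5.0), (3.0, 60.0), (4.0, 10.0),
    (6.0, 40.0), (8.0, 15.0), (9.0, 50.0),
    (150.0, 900.0), (300.0, 1800.0),
]

LOW_MESSAGE_CUTOFF = 10.0  # copies/cell, the cutoff used in the text

# Correlation over the whole set, dominated by the two abundant genes
r_all = pearson_r([m for m, _ in pairs], [p for _, p in pairs])

# Correlation over the low-abundance subset only
low = [(m, p) for m, p in pairs if m < LOW_MESSAGE_CUTOFF]
r_low = pearson_r([m for m, _ in low], [p for _, p in low])

print(f"all genes: r = {r_all:.3f}; low-abundance subset: r = {r_low:.3f}")
```

With data of this shape the whole-set coefficient is close to 1 while the subset coefficient is weak, which is the pattern the two reported values (0.935 and 0.356) describe.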
Current proteome technology is biased towards the analysis of
high abundance proteins: the current standard approach to
quantitative proteome analysis is based on the separation of
proteins by 2D-gel electrophoresis (2DE) and the subsequent
identification of individually separated and detected protein spots
by mass spectrometry or tandem mass spectrometry followed by
sequence database searching [9-11]. The method is sequential, labor
intensive and difficult to automate. It does, however, provide
precise quantitation and is well suited to reveal relative changes
in protein expression, clusters of concurrently regulated proteins
and additional features which affect the electrophoretic mobility
of proteins, including post-translational protein processing and
modifications. As a true proteome technology, the 2DE/MS/MS method
would be expected to display every protein in a protein mixture. To
assess to what extent the 2DE protein pattern obtained from a total
yeast lysate represented the proteome of this microorganism, we
related protein expression levels from proteins detected by silver
staining to the predicted expression levels of all the open reading
frames (ORFs) in the yeast genome.
Prediction of the level of protein expression was based on the
codon bias of the respective genes. The codon bias indicates the
propensity for a gene to utilize the same codon to encode an amino
acid even though other codons would insert the identical amino acid
into the growing polypeptide chain. Its value varies between -0.3
and 1.0, and it has further been found empirically that highly
expressed proteins have large codon bias values (>0.2) and
proteins expressed at low levels have low codon bias values
(<0.1) [12]. Comparison of the codon bias distributions for all
the yeast ORFs with the distribution of all the proteins analyzed
by 2DE, silver staining and tandem mass spectrometry indicated that
the population of proteins analyzed by the standard 2DE/MS/MS
proteome analysis technique was highly biased towards the most
highly expressed proteins. No proteins with codon bias values
<0.2 were detected, whereas the majority of ORFs predicted
from the yeast genome sequence have codon bias values <0.2. We
therefore conclude that the current proteome technology, used
without sample pre-enrichment, is not a true proteome technology and
that the construction of complete proteome maps will be very
challenging, even for relatively simple, unicellular organisms.
A novel method for quantitative proteome analysis
To address the limitations inherent in the 2DE/MS/MS approach to
proteome analysis, we have developed a new experimental approach.
It is intended to retain relative quantitative information while
still rapidly and conclusively identifying even the minor
components of a mixture. This method is based on a class of new
chemical reagents termed isotope coded affinity tags (ICAT) and
MS/MS. ICAT reagents consist of three functional units, namely a
chemical reactivity directed towards a functional group in proteins
(e.g. SH, NH2, COOH), a linker group synthesized in isotopically
heavy and light forms, respectively, and an affinity tag (typically
a biotin group) [13].
The ICAT strategy consists of the following steps. Proteins in
protein mixtures 1 and 2 are reduced and then treated with a
sulfhydryl-specific ICAT reagent. The reagents exist in two forms:
isotopically light (d0) and isotopically heavy (d8). The heavy and
light forms are used to derivatize the proteins in samples 1 and 2,
respectively. After treatment with the ICAT reagents the samples
are mixed. At this point, any optional fractionation technique can
be performed to enrich for low abundance proteins or to reduce the
complexity of the mixture, while the relative quantities are
maintained. The combined protein sample is then proteolyzed and the
ICAT-tagged peptides are selectively enriched by avidin-biotin
affinity chromatography. These peptides are separated and analyzed
by microcapillary HPLC-ESI-MS/MS. The relative ion intensities of
the two differentially isotopically tagged forms of a specific
peptide indicate their relative abundance. Such pairs of tagged
peptides are easily detected because they essentially co-elute from
the column and because of the eight-mass-unit difference encoded
in the ICAT tag, which is detected in the mass spectrometer. Every
other scan is devoted to fragmenting and then recording sequence
information about an eluting peptide (MS/MS spectrum). The protein
from which this peptide originated is then identified by searching
a sequence database with the recorded MS/MS spectrum. The procedure
thus provides the relative quantitation and identification of the
components of protein mixtures in a single analysis.
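The quantitation step can be illustrated with a short sketch that pairs light and heavy peaks in a single mass spectrum and reports their intensity ratio. Several points are assumptions made for illustration only: the peak list is hypothetical, the charge state is taken as known, one tag per peptide is assumed by default, the nominal shift of 8 mass units is used as stated in the text, and the function names and the m/z tolerance are invented here. It is not the authors' data-analysis software.

```python
ICAT_MASS_SHIFT = 8.0  # nominal d8 - d0 difference per tag, from the text


def expected_mz_shift(charge, n_labels=1):
    """m/z spacing between light and heavy forms of one peptide."""
    return ICAT_MASS_SHIFT * n_labels / charge


def find_icat_pairs(peaks, charge, tol=0.05, n_labels=1):
    """Pair peaks separated by the expected ICAT shift.

    peaks: list of (mz, intensity) tuples from one MS scan.
    Returns a list of (light_mz, heavy_mz, heavy/light intensity ratio).
    """
    shift = expected_mz_shift(charge, n_labels)
    ordered = sorted(peaks)
    pairs = []
    for i, (mz_light, int_light) in enumerate(ordered):
        for mz_heavy, int_heavy in ordered[i + 1:]:
            delta = mz_heavy - mz_light
            if delta > shift + tol:
                break  # peaks are sorted, so later ones are even further away
            if abs(delta - shift) <= tol and int_light > 0:
                pairs.append((mz_light, mz_heavy, int_heavy / int_light))
    return pairs


# Hypothetical doubly charged peptide: light form at m/z 500.30,
# heavy form 8/2 = 4 m/z units higher, plus one unrelated peak.
scan = [(500.30, 2000.0), (504.30, 1000.0), (610.00, 500.0)]
icat_pairs = find_icat_pairs(scan, charge=2)

for light_mz, heavy_mz, ratio in icat_pairs:
    print(f"light {light_mz:.2f} / heavy {heavy_mz:.2f}: heavy/light = {ratio:.2f}")
```

Because the heavy reagent labels sample 1 and the light reagent labels sample 2 in the scheme described above, the heavy/light ratio estimates the abundance of the parent protein in sample 1 relative to sample 2.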
In this manuscript, we argue that in the emerging post-genomic
era, technologies that can quantitatively, globally, and
automatically measure gene expression at the protein level are
essential for the comprehensive analysis of biological processes
and systems. We indicate the limitations of the current standard
method for large scale protein analysis with respect to the
analysis of low abundance proteins and propose a new approach to
quantitative proteome analysis. We anticipate that the new ICAT
strategy will provide broadly applicable means for the quantitative
cataloging and comparison of expressed proteins in a variety of
normal, developmental, and disease states.
Acknowledgments
This work was supported in part by the NSF Science and
Technology Center for Molecular Biotechnology, NIH grant T32HG00035
and a grant from the Merck Genome Research Institute.
References
Paper selected from Biotecnología Habana'99
Congress. November 28 - December 3, 1999.
1. Liang P, Pardee AB. Science 1992;257:967.
2. Shalon D, Smith SJ, Brown PO. Genome Research 1996;6:639.
3. Lashkari DA, DeRisi JL, McCusker JH, Namath AF, Gentile C,
Hwang SY, et al. PNAS USA 1997;94:13057.
4. Velculescu VE, Zhang L, Vogelstein B, Kinzler KW. Science
1995;270:484.
5. Velculescu VE, Zhang L, Zhou W, Vogelstein J, Basrai MA,
Bassett DE, et al. Cell 1997;88:243.
6. Harford JB, Morris DR. Post-transcriptional gene regulation.
Wiley-Liss, Inc., New York, 1997.
7. Varshavsky A. PNAS USA 1996;93:12142.
8. Eng J, McCormack AL, Yates JR. J Am Soc Mass Spectrom
1994;5:976.
9. Gygi SP, Rochon Y, Franza BR, Aebersold R. Molecular and
Cellular Biology 1999;19:1720.
10. Gygi SP, Han DKM, Gingras AC, Sonenberg N, Aebersold R.
Electrophoresis 1999;20:310.
11. Patterson SD, Aebersold R. Electrophoresis
1995;16:1791-1814.
12. Bennetzen JL, Hall BD. J Biological Chem 1982;257:3026.
13. Gygi SP, Rist B, Gerber SA, Turecek F, Gelb MH, Aebersold R.
Nature Biotechnology 1999; in press.
Copyright 2000 Elfos Scientiae