Defining, Comparing, and Improving iTRAQ Quantification in Mass Spectrometry Proteomics Data |
| |
Authors: | Lina Hultin-Rosenberg, Jenny Forshed, Rui M. M. Branca, Janne Lehtiö, Henrik J. Johansson |
| |
Affiliation: | Cancer Proteomics Mass Spectrometry, Department of Oncology-Pathology, Science for Life Laboratory and Karolinska Institutet, Box 1031, 171 21 Solna, Sweden |
| |
Abstract: | The purpose of this study was to generate a basis for deciding which protein quantities are reliable and to find a way to achieve accurate and precise protein quantification. To investigate this, we used thousands of peptide measurements to estimate variance and bias for quantification by iTRAQ (isobaric tags for relative and absolute quantification) mass spectrometry in complex human samples. A549 cell lysate was mixed in the proportions 2:2:1:1:2:2:1:1, fractionated by high resolution isoelectric focusing and liquid chromatography, and analyzed on three mass spectrometry platforms: LTQ Orbitrap Velos, 4800 MALDI-TOF/TOF, and 6530 Q-TOF. We investigated how variance and bias in the iTRAQ reporter ion data are affected by common experimental variables such as sample amount, sample fractionation, fragmentation energy, and instrument platform. Based on this, we suggest a concept for experimental design and a methodology for protein quantification. By using duplicate samples in each run, each experiment is validated based on its internal experimental variation. The duplicates are used for calculating peptide weights, unique to the experiment, which are used in the protein quantification. By weighting the peptides depending on reporter ion intensity, we can decrease the relative error in quantification at the protein level and assign a total weight to each protein that reflects the protein quantitation confidence. We also demonstrate the usability of this methodology in a cancer cell line experiment as well as in a clinical data set of lung cancer tissue samples. In conclusion, we have in this study developed a methodology for improved protein quantification in shotgun proteomics and introduced a way to assess quantification for proteins with few peptides.
The experimental design and developed algorithms decreased the relative protein quantification error in the analysis of complex biological samples.

Recent developments in methods and instruments for mass spectrometry enable quantitative proteomics analysis of complex samples with good coverage (1–4). Several techniques for quantification by mass spectrometry exist, using either isotopic labeling or label-free methods (5, 6). Quantification by isotopic labeling can be done at the precursor ion level or by quantifying isobaric label fragments in fragment spectra. Isotope-coded affinity tag (7), isobaric tags for relative and absolute quantification (iTRAQ) (8), and stable isotope labeling by amino acids in cell culture (SILAC) (9) are among the most commonly used labeling methods based on stable isotopes. iTRAQ allows for simultaneous relative quantification of up to eight samples within a single run. Quantification by mass spectrometry is, however, a challenge, and several factors contribute to the uncertainty in the quantitative estimate: differences in labeling efficiency, protein digestion, precursor mixing, ion suppression, peak detection, data preprocessing, and data analysis (10). The quality of quantitation methods can be measured in terms of precision and accuracy. Precision is affected by random errors, that is, random fluctuations around the true value (variance). Lack of accuracy is caused by systematic errors, that is, differences between true and observed values (bias).

Several studies have shown that iTRAQ labeling is associated with bias; fold changes are compressed toward one (11–14). It has been suggested that this underestimation of fold change is caused by co-eluting peptides with similar m/z values that are isolated together, creating mixed iTRAQ intensities in complex samples (14). Concerning precision, iTRAQ data have been reported to exhibit variance heterogeneity.
The coefficient of variation (CV) of the signal depends on the intensity, with larger CV for low intensity peaks (11, 12, 15, 16). Measurements of iTRAQ intensities for quantification are made in the MS/MS spectra of the peptides, and thereafter combined to calculate a summarized relative protein quantity. There are several different approaches for combining the iTRAQ peptide data to compute a reliable protein ratio. Methods to improve the protein quantification by addressing the variance heterogeneity have been based on excluding low intensity peptide data (17, 18), weighting the peptide data according to intensity (18–21), or stabilizing the variance (12).

Quantitative studies of complex human samples are subject to even more challenges, related to large biological variation, the large and unknown complexity of the human proteome, and a large concentration range of proteins. This in turn results in a large number and variety of peptides that can cause interference and related problems in the mass spectrometry analysis. In biomarker discovery research, for example, the goal is to measure quantitative changes or differences in protein levels between two or more clinical conditions. It is therefore crucial to obtain quantitative information from the data that is as accurate and precise as possible, and to correctly estimate the limitations of the quantification. Setting adequate standards for quantitative proteomics analysis is hence essential for being able to detect relevant changes in protein abundance, select important proteins, and further use those proteins to interpret the biological and clinical meaning (10, 22). Selecting a protein as significant and taking it to further validation in other clinical material using complementary techniques is time consuming and costly (23).
For successful use of iTRAQ labeling in biomarker discovery, and to avoid false discoveries, it is hence essential to assess the accuracy and precision of the methodology.

A common approach to study variance and bias in mass spectrometry-based protein quantification is to spike a set of standard proteins into a sample and then measure the CV and bias of the intensities of the resulting peptides. Spike-in of proteins has the benefit of looking at a small controlled set of peptides and how they behave in the studied system. This strategy has been used in several of the previously mentioned papers that address iTRAQ quantification (11–14). However, the number of data points studied is unlikely to represent the complexity of a real biological sample, which often contains thousands of proteins (24). In the current study, all peptides detected in a complex human cell line sample (A549) are used to estimate the quantitative accuracy and precision. This experimental setup is hence more similar to a real biomarker discovery study with highly complex human proteome samples. The quality of the protein quantifications is compared among several different mass spectrometers in this work; the influence of different loaded peptide amounts and of different methods for sample separation is also examined. Factors such as variance and bias of peptide quantification by iTRAQ are systematically evaluated in these highly complex samples. Further, methods for improving the protein quantification are investigated: filtering on the peptide level to remove low quality intensities, and weighting the peptide values to account for the higher risk of errors at low intensities (20).

We have described the factors contributing to bias and variance in protein quantification by iTRAQ labeling. This has generated guidelines for how to estimate the accuracy of protein quantities, which will be an essential tool in both biomarker discovery and studies of biological systems.
Based on the results, we suggest an experimental design where each labeling set (e.g., iTRAQ) includes duplicate samples, and we describe how these duplicates are used for calculating peptide weights that can be used to assess the accuracy of protein quantities. This novel approach is shown to improve protein quantification by iTRAQ in six data sets of A431 cell line samples treated with drug and in a clinical data set of lung cancer tissue samples. |
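
The idea of deriving intensity-dependent peptide weights from duplicate channels and using them to summarize a protein ratio can be sketched roughly as follows. This is only an illustration and not the authors' published algorithm. The equal-size intensity binning, the inverse-variance weight, the use of the lower reporter intensity as the reference, and all function names (`duplicate_weights`, `weight_for`, `protein_log_ratio`) are assumptions made for the example.

```python
import math


def duplicate_weights(dup_a, dup_b, n_bins=4):
    """Estimate one weight per intensity bin from a pair of duplicate channels.

    dup_a, dup_b: non-empty lists of positive reporter intensities measured
    for the SAME sample in two channels, one pair per peptide spectrum.
    Returns a list of (upper_log2_intensity, weight), sorted by intensity.
    Assumption: weight = 1 / (mean squared log2 ratio of the duplicates).
    """
    # Reference intensity is the lower of the two channels (an assumption).
    pairs = sorted(
        (math.log2(min(a, b)), math.log2(a / b)) for a, b in zip(dup_a, dup_b)
    )
    size = math.ceil(len(pairs) / n_bins)
    bins = []
    for start in range(0, len(pairs), size):
        chunk = pairs[start:start + size]
        # Duplicates have a true log ratio of 0, so the mean square of the
        # observed log ratios estimates the error variance in this bin.
        var = sum(r * r for _, r in chunk) / len(chunk)
        bins.append((chunk[-1][0], 1.0 / max(var, 1e-12)))
    return bins


def weight_for(intensity, bins):
    """Look up the weight of the first bin whose upper bound covers intensity."""
    x = math.log2(intensity)
    for upper, weight in bins:
        if x <= upper:
            return weight
    return bins[-1][1]  # above all bins: reuse the highest-intensity weight


def protein_log_ratio(peptides, bins):
    """Weighted mean log2 ratio for one protein, plus its total weight.

    peptides: list of (numerator_intensity, denominator_intensity) pairs.
    """
    acc = 0.0
    total = 0.0
    for num, den in peptides:
        w = weight_for(min(num, den), bins)
        acc += w * math.log2(num / den)
        total += w
    return acc / total, total


# Toy data: noisy duplicates at low intensity, tight duplicates at high.
bins = duplicate_weights(
    [100, 120, 10000, 12000], [200, 60, 10100, 11900], n_bins=2
)
# One low-intensity peptide (ratio 1) and one high-intensity peptide (ratio 2).
estimate, total_weight = protein_log_ratio([(80, 80), (20000, 10000)], bins)
```

In this toy example the high-intensity peptide dominates the estimate, and the returned total weight could serve as a rough confidence indicator for proteins with few peptides, in the spirit of the total protein weight described in the abstract.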
| |
Keywords: | |
This article is indexed in ScienceDirect and other databases. |
|