Similar Literature
20 similar records found.
1.

Background

Differences in sample collection, biomolecule extraction, and instrument variability introduce bias into data generated by liquid chromatography coupled with mass spectrometry (LC-MS). Normalization is used to address these issues. In this paper, we introduce a new normalization method using the Gaussian process regression model (GPRM) that utilizes information from individual scans within an extracted ion chromatogram (EIC) of a peak. The proposed method is particularly suited to normalization based on the analysis order of LC-MS runs. Our method uses measurement variabilities estimated through LC-MS data acquired from quality control samples to correct for bias caused by instrument drift. A maximum likelihood approach is used to find the optimal parameters for the fitted GPRM. We review several normalization methods and compare their performance with that of the GPRM.
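The abstract does not give implementation details, but the core idea (model intensity drift over analysis order with a Gaussian process fitted to quality control runs, then correct the study samples) can be sketched as follows. This is a minimal illustration assuming per-ion arrays of run order and intensity; the kernel choice and variable names are assumptions, not the authors' actual GPRM.

```python
# Hypothetical sketch: QC-based drift correction with Gaussian process regression.
# `qc_order`/`qc_intensity` hold analysis order and intensity of one ion in QC runs;
# `sample_order`/`sample_intensity` hold the same for study samples.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel, ConstantKernel

def gp_drift_correct(qc_order, qc_intensity, sample_order, sample_intensity):
    """Fit a GP to QC intensities vs. analysis order; remove the fitted drift from samples."""
    X_qc = np.asarray(qc_order, float).reshape(-1, 1)
    y_qc = np.log2(np.asarray(qc_intensity, float))

    # Smooth trend (RBF) plus measurement noise (WhiteKernel); hyperparameters are
    # optimized by maximizing the marginal likelihood, consistent with the abstract.
    kernel = ConstantKernel() * RBF(length_scale=10.0) + WhiteKernel()
    gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True)
    gp.fit(X_qc, y_qc)

    # Predicted drift at each sample's position in the run order.
    drift = gp.predict(np.asarray(sample_order, float).reshape(-1, 1))
    corrected = np.log2(np.asarray(sample_intensity, float)) - (drift - y_qc.mean())
    return 2.0 ** corrected  # back to the intensity scale
```

In practice such a correction would be applied per ion (or per EIC), looping the function over the feature table.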

Results

To evaluate the performance of different normalization methods, we consider LC-MS data from a study in which a metabolomic approach is used to discover biomarkers for liver cancer. The LC-MS data were acquired by analysis of sera from liver cancer patients and cirrhotic controls. In addition, LC-MS runs from a quality control (QC) sample are included to assess the run-to-run variability and to evaluate the ability of various normalization methods to reduce this undesired variability. ANOVA models are also applied to the normalized LC-MS data to identify ions with intensity measurements that are significantly different between cases and controls.

Conclusions

One of the challenges in using label-free LC-MS for quantitation of biomolecules is systematic bias in measurements. Several normalization methods have been introduced to overcome this issue, but there is no universally applicable approach at the present time. Each data set should be carefully examined to determine the most appropriate normalization method. We review here several existing methods and introduce the GPRM for normalization of LC-MS data. Through our in-house data set, we show that the GPRM outperforms other normalization methods considered here, in terms of decreasing the variability of ion intensities among quality control runs.

2.
Normalization is an important step in the analysis of quantitative proteomics data. If this step is ignored, systematic biases can lead to incorrect assumptions about regulation. Most statistical procedures for normalizing proteomics data have been borrowed from genomics, where their development has focused on the removal of so-called ‘batch effects.’ In general, a typical normalization step in proteomics works under the assumption that most peptides/proteins do not change; scaling is then used to give a median log-ratio of 0. The focus of this work was to identify other factors, derived from knowledge of the variables in proteomics, which might be used to improve normalization. Here we have examined the multi-laboratory data sets from Phase I of the NCI's CPTAC program. Surprisingly, the most important bias variables affecting peptide intensities within labs were retention time and charge state. The magnitude of these observations was exaggerated in samples of unequal concentrations or “spike-in” levels, presumably because the average precursor charge for peptides with higher charge state potentials is lower at higher relative sample concentrations. These effects are consistent with reduced protonation during electrospray and demonstrate that the physical properties of the peptides themselves can serve as good reporters of systematic biases. Between labs, retention time, precursor m/z, and peptide length were most commonly the top-ranked bias variables, ahead of the conventionally used average intensity (A). A larger set of variables was then used to develop a stepwise normalization procedure. This statistical model was found to perform as well as or better than other commonly used methods on the CPTAC mock biomarker data. Furthermore, the method described here does not require a priori knowledge of the systematic biases in a given data set. These improvements can be attributed to the inclusion of variables other than average intensity during normalization. The number of laboratories using MS as a quantitative tool for protein profiling continues to grow, propelling the field past simple qualitative measurements (i.e. cataloging) and toward establishing it as a robust method for detecting proteomic differences. By analogy, semiquantitative proteomic profiling by MS can be compared with the measurement of relative gene expression by genomics technologies such as microarrays or, more recently, RNA-seq. While proteomics is disadvantaged by the lack of a molecular amplification system for proteins, successful reports from discovery experiments are numerous in the literature and are increasing with advances in instrument resolution and sensitivity. In general, methods for performing relative quantitation can be broadly divided into two categories: those employing labels (e.g. iTRAQ, TMT, and SILAC (1)) and so-called “label-free” techniques. Labeling methods involve adding some form of isobaric or isotopic label(s) to the proteins or peptides prior to liquid chromatography-tandem MS (LC-MS/MS) analysis. Chemical labels are typically applied during sample processing, and isotopic labels are commonly added during cell culture (i.e. metabolic labeling). One advantage of label-based methods is that the two (or more) differently labeled samples can be mixed and run in single LC-MS analyses.
This is in contrast to label-free methods, which require the samples to be run independently and the data aligned post-acquisition. Many labs employ label-free methods because they are applicable to a wider range of samples and require fewer sample processing steps. Moreover, data from qualitative experiments can sometimes be re-analyzed using label-free software tools to provide semiquantitative data. Advances in these software tools have been extensively reviewed (2). While analysis of label-based data primarily uses full MS scan (MS1) or tandem MS scan (MS2) ion current measurements, analysis of label-free data can employ simple counts of confidently identified tandem mass spectra (3). So-called spectral counting makes the assumption that the number of times a peptide is identified is proportional to its concentration. These values are sometimes summed across all peptides for a given protein and scaled by protein length. Relative abundance can then be calculated for any peptide or protein of interest. While this approach may be easy to perform, its usefulness is particularly limited in smaller data sets and/or when counts are low. This report focuses only on the use of ion current measurements in label-free data sets, specifically those calculated from extracted MS1 ion chromatograms (XICs). In general terms, raw intensity values (i.e. ion counts in arbitrary units) cannot be used for quantitation in the absence of cognate internal standards because individual ion intensities depend on a response factor related to the chemical properties of the molecule. Intensities are instead almost always reserved for relative determinations. Furthermore, retention times are sometimes used to align the chromatograms between runs to ensure higher confidence prior to calculating relative intensities. This step is crucial for methods without corresponding identity information, particularly for experiments performed on low-resolution instruments. To support a label-free workflow, peptide identifications are commonly made from tandem mass spectra (MS/MS) acquired along with the direct electrospray signal (MS1). Alternatively, in workflows seeking deeper coverage, interesting MS1 components can be targeted for identification by MS/MS in follow-up runs (4). “Rolling up” the peptide ion information to the peptide and protein level is also done in different ways in different labs. In most cases, “peptide intensity” or “peptide abundance” is the summed or averaged value of the identified peptide ions. How the peptide information is transferred to the protein level differs between methods but typically involves summing one or more peptide intensities, following parsimony analysis. One such solution is the “Top 3” method developed by Silva and co-workers (5). Because peptides in label-free methods lack labeled analogs and require separate runs, they are more susceptible to analytical noise and systematic variations. These obscuring variations can come from many sources, including sample preparation, operator error, chromatography, electrospray, and even the data analysis itself. While analytical noise (e.g. chemical interference) is difficult to selectively reject, systematic biases can often be removed by statistical preprocessing. The goal of these procedures is to normalize the data prior to calculations of relative abundance.
Failure to resolve these issues is the common origin of batch effects, previously described for genomics data, which can severely limit meaningful interpretation of experimental data (6, 7). These effects have also recently been explored in proteomics data (8). Methods used to normalize proteomics data have been largely borrowed from the microarray community or are based on a simple mean/median intensity ratio correction. Methods developed for microarray and/or gene chip data and applied to proteomics data include scaling, linear regression, nonlinear regression, and quantile normalizations (9). Moreover, work has also been done to improve normalization by subselecting a peptide basis (10). Other work suggests that linear regression, followed by run order analysis, works better than other methods tested (11). Key to this last method is the incorporation of a variable other than intensity during normalization. It is also important to note that little work has been done towards identifying the underlying sources of these variations in proteomics data. Although cause and effect is often difficult to determine, understanding these relationships will undoubtedly help remove and avoid the major underlying sources of systematic variations. In this report, we have attempted to combine our efforts focused on understanding variability with the work initiated by others for normalizing ion current-based label-free proteomics data. We have identified several major variables commonly affecting peptide ion intensities both within and between labs. As test data, we used a subset of raw data acquired during Phase I of the National Cancer Institute's (NCI) Clinical Proteomics Technology Assessment for Cancer (CPTAC) program. With these data, we were able to develop a statistical model to rank bias variables and normalize the intensities using stepwise, semiparametric regression. The data analysis methods have been implemented within the National Institute of Standards and Technology (NIST) MS quality control (MSQC) pipeline. Finally, we have developed R code for removing systematic biases and have tested it using a reference standard spiked into a complex biological matrix (i.e. yeast cell lysate).
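As a concrete illustration of the scaling assumption described above (most peptides do not change, so each run is shifted to a median log-ratio of zero) followed by removal of a bias-variable trend, one might write something like the sketch below. It is a generic stand-in using a loess-style fit against retention time; it is not the stepwise semiparametric procedure or the NIST MSQC/R code described in the abstract, and the function and variable names are assumptions.

```python
# Hypothetical sketch: median log-ratio scaling plus removal of a retention-time bias.
import numpy as np
import statsmodels.api as sm

def normalize_run(intensity, ref_intensity, retention_time):
    """intensity, ref_intensity, retention_time: 1-D arrays over matched peptide ions."""
    log_ratio = np.log2(intensity) - np.log2(ref_intensity)

    # Step 1: scaling under the "most peptides unchanged" assumption.
    log_ratio -= np.median(log_ratio)

    # Step 2: remove the residual trend against a bias variable (retention time here;
    # charge state, precursor m/z, or peptide length could be handled the same way).
    trend = sm.nonparametric.lowess(log_ratio, retention_time,
                                    frac=0.3, return_sorted=False)
    log_ratio -= trend

    return np.log2(ref_intensity) + log_ratio  # normalized log2 intensity
```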

3.
In this review we examine techniques, software, and statistical analyses used in label-free quantitative proteomics studies for area under the curve and spectral counting approaches. Recent advances in the field are discussed in an order that reflects a logical workflow design. Examples of studies that follow this design are presented to highlight the requirement for statistical assessment and further experiments to validate results from label-free quantitation. Limitations of label-free approaches are considered, label-free approaches are compared with labelling techniques, and forward-looking applications for label-free quantitative data are presented. We conclude that label-free quantitative proteomics is a reliable, versatile, and cost-effective alternative to labelled quantitation.

4.
Matros A, Kaspar S, Witzel K, Mock HP. Phytochemistry 2011, 72(10):963-974
Recent innovations in liquid chromatography-mass spectrometry (LC-MS)-based methods have facilitated quantitative and functional proteomic analyses of large numbers of proteins derived from complex samples without any need for protein or peptide labelling. Despite their great potential, the application of these proteomics techniques to plant science started only recently. Here we present an overview of label-free quantitative proteomics features and their employment for analysing plants. Recent methods used for quantitative protein analyses by MS techniques are summarized, and major challenges associated with label-free LC-MS-based approaches, including sample preparation, peptide separation, quantification and kinetic studies, are discussed. Database search algorithms and specific aspects regarding protein identification in non-sequenced organisms are also addressed. So far, label-free LC-MS in plant science has been used to establish cellular or subcellular proteome maps, characterize plant-pathogen interactions or stress defence reactions, and profile protein patterns during developmental processes. Improvements in both analytical platforms (separation technology and bioinformatics/statistical analysis) and high-throughput nucleotide sequencing technologies will enhance the power of this method.

5.
High resolution proteomics approaches have been successfully utilized for the comprehensive characterization of the cell proteome. However, in the case of quantitative proteomics an open question still remains: which quantification strategy is best suited for identifying biologically relevant changes, especially in clinical specimens? In this study, a thorough comparison of a label-free (intensity-based) approach and 8-plex iTRAQ was conducted as applied to the analysis of tumor tissue samples from non-muscle-invasive and muscle-invasive bladder cancer. For the latter, two acquisition strategies were tested, including analysis of unfractionated and fractionated iTRAQ-labeled peptides. To reduce variability, aliquots of the same protein extract were used as starting material, whereas, to obtain representative results per method, further sample processing and MS analysis were conducted according to routinely applied protocols. Considering only multiple-peptide identifications, LC-MS/MS analysis resulted in the identification of 910, 1092 and 332 proteins by label-free, fractionated and unfractionated iTRAQ, respectively. The label-free strategy provided higher protein sequence coverage compared to both iTRAQ experiments. Even though pre-fractionation of the iTRAQ-labeled peptides allowed for a higher number of identifications, this was not accompanied by a corresponding increase in the number of differentially expressed changes detected. Validity of the proteomics output related to protein identification and differential expression was determined by comparison to existing data in the field (Protein Atlas and published data on the disease). All methods predicted changes that to a large extent agreed with published data, with label-free providing a higher number of significant changes than iTRAQ. In conclusion, both label-free and iTRAQ (when combined with peptide fractionation) provide high proteome coverage and apparently valid predictions in terms of differential expression; nevertheless, label-free provides higher sequence coverage and ultimately detects a higher number of differentially expressed proteins. The risk of false associations still exists, particularly when analyzing highly heterogeneous biological samples, raising the need for the analysis of larger sample numbers and/or the application of adjustment for multiple testing.

6.
Mass spectrometry-based proteomics has greatly benefited from recent improvements in instrument performance and the development of bioinformatics solutions facilitating the high-throughput quantification of proteins in complex biological samples. In addition to quantification approaches using stable isotope labeling, label-free quantification has emerged as the method of choice for many laboratories. Over the last few years, data-independent acquisition approaches have gained increasing popularity. The integration of ion mobility separation into commercial instruments has enabled researchers to achieve deep proteome coverage from limiting sample amounts. Additionally, ion mobility provides a new dimension of separation for the quantitative assessment of complex proteomes, facilitating precise label-free quantification even of highly complex samples. The present work provides a thorough overview of the combination of ion mobility with data-independent acquisition-based label-free quantitative LC-MS and its applications in biomedical research.

7.
Liquid chromatography (LC) coupled to electrospray mass spectrometry (MS) is well established in high-throughput proteomics. The technology enables rapid identification of large numbers of proteins in a relatively short time. Comparative quantification of identified proteins from different samples is often regarded as the next step in proteomics experiments, enabling the comparison of protein expression across different proteomes. Differential labeling of samples using stable isotope incorporation or conjugation is commonly used to compare protein levels between samples, but these procedures are difficult to carry out in the laboratory and for large numbers of samples. Recently, comparative quantification of label-free LC(n)-MS proteomics data has emerged as an alternative approach. In this review, we discuss different computational approaches for extracting comparative quantitative information from label-free LC(n)-MS proteomics data. The procedure for computationally recovering the quantitative information is described. Furthermore, statistical tests used to evaluate the relevance of results are also discussed.

8.
Normalization removes or minimizes the biases of systematic variation that exist in experimental data sets. This study presents a systematic variation normalization (SVN) procedure for removing systematic variation from two-channel microarray gene expression data. Based on an analysis of how systematic variation contributes to variability in microarray data sets, our normalization procedure includes background subtraction determined from the distribution of pixel intensity values from each data acquisition channel, log conversion, linear or non-linear regression, restoration or transformation, and multiarray normalization. In the case when a non-linear regression is required, an empirical polynomial approximation approach is used. Either the high terminated points or their averaged values in the distributions of the pixel intensity values observed in control channels may be used for rescaling multiarray datasets. These pre-processing steps remove systematic variation in the data attributable to variability in microarray slides, assay batches, the array process, or experimenters. Biologically meaningful comparisons of gene expression patterns between control and test channels or among multiple arrays are therefore unbiased when normalized, but not unnormalized, datasets are used.
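The steps listed in the abstract (background subtraction, log conversion, regression, rescaling) can be sketched roughly as follows. This is a simplified illustration, not the published SVN procedure: the low-quantile background estimate and the polynomial degree are assumptions.

```python
# Hypothetical sketch of a two-channel normalization in the spirit of SVN:
# background subtraction, log conversion, and a polynomial regression of the
# test channel against the control channel.
import numpy as np

def svn_like_normalize(control, test, degree=2):
    """control, test: raw spot/pixel intensities from the two channels of one array."""
    # Background subtraction from the low end of each channel's intensity distribution
    # (the paper derives this from pixel-intensity distributions; a low quantile is a
    # simple stand-in here).
    control_bg = np.quantile(control, 0.05)
    test_bg = np.quantile(test, 0.05)
    c = np.log2(np.clip(control - control_bg, 1, None))
    t = np.log2(np.clip(test - test_bg, 1, None))

    # Polynomial (non-linear) regression of test on control; the fitted curve is
    # treated as the systematic component and removed from the test channel.
    coeffs = np.polyfit(c, t, deg=degree)
    fitted = np.polyval(coeffs, c)
    t_normalized = t - fitted + c  # restore to the control-channel scale
    return c, t_normalized
```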

9.
Novak JP, Sladek R, Hudson TJ. Genomics 2002, 79(1):104-113
Large-scale gene expression measurement techniques provide a unique opportunity to gain insight into biological processes under normal and pathological conditions. To interpret the changes in expression profiles for thousands of genes, we face the nontrivial problem of understanding the significance of these changes. In practice, the sources of background variability in expression data can be divided into three categories: technical, physiological, and sampling. To assess the relative importance of these sources of background variation, we generated replicate gene expression profiles on high-density Affymetrix GeneChip oligonucleotide arrays, using either identical RNA samples or RNA samples obtained under similar biological states. We derived a novel measure of dispersion in two-way comparisons, using a linear characteristic function. When comparing expression profiles from replicate tests using the same RNA sample (a test for technical variability), we observed a level of dispersion similar to the pattern obtained with RNA samples from replicate cultures of the same cell line (a test for physiological variability). On the other hand, a higher level of dispersion was observed when tissue samples from different animals were compared (an example of sampling variability). This implies that, in experiments in which samples from different subjects are used, the variation induced by the stimulus may be masked by non-stimulus-related differences in the subjects' biological states. These analyses underscore the need for replicate experiments to reliably interpret large-scale expression data sets, even with simple microarray experiments.
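The paper's linear-characteristic-function measure is not detailed in the abstract, so the sketch below uses a plain median absolute log-ratio as a stand-in dispersion statistic to compare the three sources of variability; all names and the statistic itself are illustrative assumptions.

```python
# Hypothetical sketch: comparing dispersion between replicate expression profiles.
# The statistic (median absolute log2-ratio) is a simple stand-in, not the paper's
# linear-characteristic-function measure.
import numpy as np

def dispersion(profile_a, profile_b, floor=1.0):
    """profile_a, profile_b: expression values for the same probes on two arrays."""
    a = np.clip(np.asarray(profile_a, float), floor, None)
    b = np.clip(np.asarray(profile_b, float), floor, None)
    return np.median(np.abs(np.log2(a) - np.log2(b)))

# Usage: dispersion(same_rna_rep1, same_rna_rep2) gauges technical variability,
# dispersion(culture_rep1, culture_rep2) physiological variability, and
# dispersion(animal_1, animal_2) sampling variability; the abstract reports the
# first two to be similar and the third to be larger.
```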

10.
Label-free detection methods for protein microarrays
Yu X, Xu D, Cheng Q. Proteomics 2006, 6(20):5493-5503
With the growth of the "-omics" fields such as functional genomics and proteomics, one of the foremost challenges in biotechnology has become the development of novel methods to monitor biological processes and acquire information on biomolecular interactions in a systematic manner. To fully understand the roles of newly discovered genes or proteins, it is necessary to elucidate the functions of these molecules within their interaction networks. Microarray technology is becoming the method of choice for such a task. Although protein microarrays can provide a high-throughput analytical platform for protein profiling and protein-protein interaction studies, most current reports are limited to labeled detection using fluorescence or radioisotope techniques. These limitations deflate the potential of the method and prevent the technology from being adopted in a broader range of proteomics applications. In recent years, label-free analytical approaches have undergone intensified development and have been coupled successfully with protein microarrays. In many examples of label-free studies, the microarray has not only offered high-throughput detection in real time but also provided kinetic information as well as in situ identification. This article reviews the most significant label-free detection methods for microarray technology, including surface plasmon resonance imaging, atomic force microscopy, electrochemical impedance spectroscopy, and MS, and their applications in proteomics research.

11.
Various types of unwanted and uncontrollable signal variation in MS-based metabolomics and proteomics datasets severely disturb the accuracy of metabolite and protein profiling. Therefore, pooled quality control (QC) samples are often employed in quality management processes, which are indispensable to the success of metabolomics and proteomics experiments, especially in high-throughput cases and long-term projects. However, data consistency and QC sample stability are still difficult to guarantee because of the complexity of experimental operations and differences between experimenters. To make things worse, numerous proteomics projects do not take QC samples into consideration at the beginning of experimental design. Herein, a powerful and interactive web-based software tool, named pseudoQC, is presented to simulate QC sample data for actual metabolomics and proteomics datasets using four different machine learning-based regression methods. The simulated data are used for correction and normalization of the two published datasets, and the obtained results suggest that nonlinear regression methods perform better than linear ones. Additionally, the software is available as a web-based graphical user interface and can be utilized by scientists without a bioinformatics background. pseudoQC is open-source software and freely available at https://www.omicsolution.org/wukong/pseudoQC/ .
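pseudoQC's internals are not given in the abstract; as a rough illustration of the underlying idea (learn a run-order trend from the study samples with a nonlinear regressor and use the fitted values as a simulated QC reference), a Python sketch might look like the following. The choice of regressor, the per-feature loop, and all names are assumptions, not the pseudoQC implementation.

```python
# Hypothetical sketch: simulating a QC-like reference with nonlinear regression and
# using it to correct run-order drift. Illustrative only; not pseudoQC itself.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def simulate_qc_and_correct(intensity_matrix, injection_order):
    """intensity_matrix: features x samples; injection_order: run order per sample."""
    X = np.asarray(injection_order, float).reshape(-1, 1)
    log_int = np.log2(np.clip(intensity_matrix, 1, None))
    corrected = np.empty_like(log_int)

    for i, y in enumerate(log_int):            # one regression per feature
        model = RandomForestRegressor(n_estimators=100, min_samples_leaf=5,
                                      random_state=0)
        model.fit(X, y)
        simulated_qc = model.predict(X)        # simulated QC trend for this feature
        corrected[i] = y - simulated_qc + y.mean()

    return 2.0 ** corrected
```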

12.
Shotgun proteomics has become the standard proteomics technique for the large-scale measurement of protein abundances in biological samples. Although quantitative proteomics has usually been performed using label-based approaches, label-free quantitation offers advantages, including the avoidance of labeling steps, no limit on the number of samples to be compared, and a gain in protein detection sensitivity. However, since samples are analyzed separately, experimental design becomes critical. The exploration of LC-MS-based spectral counting quantitation presented here gathers experimental evidence of the influence of batch effects on comparative proteomics. The batch effects shown with spiking experiments clearly interfere with the biological signal. In order to minimize the interference from batch effects, a statistical correction is proposed and implemented. Our results show that batch effects can be attenuated statistically when a proper experimental design is used. Furthermore, the batch effect correction implemented leads to a substantial increase in the sensitivity of statistical tests. Finally, the applicability of our batch effect correction is shown on two different biomarker discovery projects involving cancer secretomes. We believe that these findings will allow better comparative proteomics projects to be designed and executed and will help avoid false conclusions in the field of proteomics biomarker discovery.
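The authors' exact statistical correction is not spelled out in the abstract; as a minimal, commonly used form of batch adjustment for spectral counts, per-batch centering of log-transformed counts can be sketched as below. This is an illustration under that assumption, not the paper's model.

```python
# Hypothetical sketch: per-batch median centering of log-transformed spectral counts.
import numpy as np
import pandas as pd

def center_batches(counts, batch_labels):
    """counts: DataFrame (proteins x runs); batch_labels: batch id per run (column)."""
    log_counts = np.log2(counts + 1.0)                       # pseudocount for zeros
    grand_median = log_counts.median(axis=1)

    for batch in set(batch_labels):
        cols = [c for c, b in zip(counts.columns, batch_labels) if b == batch]
        batch_median = log_counts[cols].median(axis=1)
        # Shift each protein in this batch so its batch median matches the grand median.
        log_counts[cols] = log_counts[cols].sub(batch_median - grand_median, axis=0)

    return (2.0 ** log_counts) - 1.0
```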

13.
A systematic evaluation of the value and potential of terminal-restriction fragment length polymorphism (T-RFLP) analysis for the study of microbial community structure has been undertaken. The reproducibility and robustness of the method have been assessed using environmental DNA samples isolated directly from PCB-polluted or pristine soil, and subsequent polymerase chain reaction (PCR) amplification of total community 16S rDNA. An initial investigation to assess the variability both within and between different polyacrylamide gel electrophoresis (PAGE) runs showed that almost identical community profiles were consistently produced from the same sample. Similarly, very little variability was observed as a result of variation between replicate restriction digestions, PCR amplifications or between replicate DNA isolations. Decreasing concentrations of template DNA produced a decline in both the complexity and the intensity of fragments present in the community profile, with no additional fragments detected in the higher dilutions that were not already present when more original template DNA was used. Reducing the number of cycles of PCR produced similar results. The greatest variation between profiles generated from the same DNA sample was produced using different Taq DNA polymerases, while lower levels of variability were found between PCR products that had been produced using different annealing temperatures. Incomplete digestion by the restriction enzyme may, as a result of the generation of partially digested fragments, lead to an overestimation of the overall diversity within a community. The results obtained indicate that, once standardized, T-RFLP analysis is a highly reproducible and robust technique that yields high-quality fingerprints consisting of fragments of precise sizes, which, in principle, could be phylogenetically assigned once an appropriate database is constructed.

14.

Background  

Mass spectrometry is a key technique in proteomics and can be used to analyze complex samples quickly. One key problem with the mass spectrometric analysis of peptides and proteins, however, is that absolute quantification is severely hampered by the unclear relationship between the observed peak intensity and the peptide concentration in the sample. While there are numerous approaches to circumvent this problem experimentally (e.g. labeling techniques), reliable prediction of peak intensities from peptide sequences could provide a peptide-specific correction factor. Thus, it would be a valuable tool towards label-free absolute quantification.
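A minimal sketch of the idea stated in this Background (predict peak intensity from the peptide sequence so it can serve as a correction factor) is given below, using simple sequence-derived features and a generic regressor. The feature set and model are illustrative assumptions, not the method developed in the paper.

```python
# Hypothetical sketch: predicting peptide MS1 peak intensity from sequence features
# (length, mean hydropathy, basic-residue count). Illustrative choices only.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

KD = {  # Kyte-Doolittle hydropathy values
    'A': 1.8, 'R': -4.5, 'N': -3.5, 'D': -3.5, 'C': 2.5, 'Q': -3.5, 'E': -3.5,
    'G': -0.4, 'H': -3.2, 'I': 4.5, 'L': 3.8, 'K': -3.9, 'M': 1.9, 'F': 2.8,
    'P': -1.6, 'S': -0.8, 'T': -0.7, 'W': -0.9, 'Y': -1.3, 'V': 4.2,
}

def features(peptide):
    return [
        len(peptide),
        sum(KD.get(aa, 0.0) for aa in peptide) / len(peptide),          # mean hydropathy
        peptide.count('K') + peptide.count('R') + peptide.count('H'),   # basic residues
    ]

def train_intensity_model(peptides, observed_intensities):
    X = np.array([features(p) for p in peptides])
    y = np.log10(np.asarray(observed_intensities, float))
    model = GradientBoostingRegressor(random_state=0).fit(X, y)
    # model.predict([features(new_peptide)]) then gives a log10 intensity estimate.
    return model
```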

15.
Mass spectrometry-driven proteomics is increasingly relying on quantitative analyses for biological discoveries. As a result, different methods and algorithms have been developed to perform relative or absolute quantification based on mass spectrometry data. Among the most popular quantification methods are the so-called label-free approaches, which require no special sample processing and can even be applied retroactively to existing data sets. Of these label-free methods, the MS/MS-based approaches are most often applied, mainly because of their inherent simplicity compared to MS-based methods. The main application of these approaches is the determination of relative protein amounts between different samples, expressed as protein ratios. However, as we demonstrate here, there are issues with the reproducibility across replicates of the protein ratio sets obtained from the various MS/MS-based label-free methods, indicating that the existing methods are not optimally robust. We therefore present two new methods (called RIBAR and xRIBAR) that use the available MS/MS data more effectively, achieving increased robustness. Both the accuracy and the precision of our novel methods are analyzed and compared to the existing methods to illustrate the increased robustness of our new methods over existing ones.
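RIBAR and xRIBAR are not described algorithmically in the abstract; as a baseline illustration of an MS/MS-based label-free protein ratio, the sketch below uses normalized spectral counts, one of the simple approaches such methods are typically compared against. All names are illustrative, and this is not the RIBAR/xRIBAR algorithm.

```python
# Hypothetical sketch: protein ratios from normalized spectral counts, a simple
# MS/MS-based label-free baseline.
from collections import Counter

def protein_ratios(psms_a, psms_b, pseudocount=0.5):
    """psms_a, psms_b: lists of protein accessions, one entry per identified spectrum."""
    counts_a, counts_b = Counter(psms_a), Counter(psms_b)
    total_a, total_b = sum(counts_a.values()), sum(counts_b.values())

    ratios = {}
    for protein in set(counts_a) | set(counts_b):
        # Normalize by total identified spectra per run; the pseudocount guards zeros.
        norm_a = (counts_a.get(protein, 0) + pseudocount) / total_a
        norm_b = (counts_b.get(protein, 0) + pseudocount) / total_b
        ratios[protein] = norm_a / norm_b
    return ratios

# Reproducibility across replicates can then be assessed by computing these ratios
# for each replicate pair and comparing, e.g., the spread of their log2 values.
```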

16.
Within the past decade, numerous methods for quantitative proteome analysis have been developed, all of which exhibit particular advantages and disadvantages. Here, we present the results of a study aiming for a comprehensive comparison of ion-intensity-based label-free proteomics and two label-based approaches using isobaric tags incorporated at the peptide and protein levels, respectively. As a model system for our quantitative analysis we used the three hepatoma cell lines HepG2, Hep3B and SK-Hep-1. Four biological replicates of each cell line were quantitatively analyzed using an RPLC–MS/MS setup. Each quantification experiment was performed twice to determine the technical variance of the different quantification techniques. We were able to show that the label-free approach by far outperforms both TMT methods regarding proteome coverage, as up to threefold more proteins were reproducibly identified in replicate measurements. Furthermore, we could demonstrate that all three methods show comparable reproducibility concerning protein quantification but differ slightly in terms of accuracy. Here, label-free quantification was found to be less accurate than both TMT approaches. It was also observed that the introduction of TMT labels at the protein level reduces the underestimation of protein ratios that is commonly observed with TMT peptide labeling. Previously reported differences in protein expression between the particular cell lines were furthermore reproduced, which confirms the applicability of each investigated quantification method for studying proteomic differences in such biological systems. This article is part of a Special Issue entitled: Biomarkers: A Proteomic Challenge.

17.
Nowadays, proteomic studies no longer focus only on identifying as many proteins as possible in a given sample but also aim for accurate quantification of them. Especially in clinical proteomics, the investigation of variable protein expression profiles can yield useful information on pathological pathways or on biomarkers and drug targets related to a particular disease. Over time, many quantitative proteomic approaches have been established, allowing researchers in the field of proteomics to draw on a comprehensive toolbox of different methodologies. In this review we give an overview of different methods of quantitative proteomics, with a focus on label-free proteomics and its use in clinical proteomics.

18.
Comprehensive comparisons of quantitative proteomics techniques are rare in the literature, yet they are crucially important for the optimal selection of approaches and methodologies that are ideal for a given proteomics initiative. In this study, two LC-based quantitative proteomics approaches, iTRAQ and label-free, were implemented using the LTQ-Orbitrap Velos platform. For this comparison, the model used was the total protein content from two Chlamydomonas reinhardtii strains in the context of alternative biofuels production. The strain comparison includes sta6 (a starch-less mutant of cw15) that produces twice as many lipid bodies (LB) containing triacylglycerols (TAGs) as its parental strain cw15 (a cell wall-deficient C. reinhardtii strain) under nitrogen starvation. Internal standard addition was used to rigorously assess the quantitation accuracy and precision of each method. Results from iTRAQ-4plex labeling using HCD (higher energy collision-induced dissociation) fragmentation were compared to those obtained using a label-free approach based on the peak area of intact peptides and collision-induced dissociation. The accuracy and precision, the number of identified/quantified proteins and statistically significant protein differences detected, as well as the efficiency of these two quantitative proteomics methods were evaluated and compared. Four technical and three biological replicates of each strain were performed to assess both the technical and biological variation of both approaches. A total of 896 and 639 proteins were identified with high confidence, and 329 and 124 proteins were quantified significantly, with label-free and iTRAQ, respectively, using biological replicates. The results showed that both iTRAQ labeling and label-free methods provide high-quality quantitative and qualitative data using nano-LC coupled with the LTQ-Orbitrap Velos mass spectrometer, but the selection of the optimal approach depends on the experimental design and the biological question to be addressed. The functional categorization of the differential proteins between cw15 and sta6 reveals already known but also new mechanisms likely responsible for the production of lipids in sta6 and sets the baseline for future studies aimed at engineering these strains for high oil production.
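As a generic illustration of how accuracy and precision might be summarized in such a comparison (accuracy as deviation of observed ratios from the known spike ratio of the internal standard, precision as coefficient of variation across replicates), one could use something like the sketch below; the variable names and summary statistics are assumptions, not the study's actual evaluation code.

```python
# Hypothetical sketch: summarizing quantification accuracy and precision.
import numpy as np

def accuracy(observed_ratios, expected_ratio):
    """Median absolute deviation of observed log2 ratios from the known spiked ratio."""
    obs = np.log2(np.asarray(observed_ratios, float))
    return float(np.median(np.abs(obs - np.log2(expected_ratio))))

def precision(replicate_matrix):
    """replicate_matrix: proteins x replicates matrix of intensities or ratios."""
    m = np.asarray(replicate_matrix, float)
    cv = m.std(axis=1, ddof=1) / m.mean(axis=1)
    return float(np.median(cv)) * 100.0  # median coefficient of variation, in percent
```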

19.
Jens Allmer. Amino Acids 2010, 38(4):1075-1087
Determining the differential expression of proteins under different conditions is of major importance in proteomics. Since mass spectrometry-based proteomics is often used to quantify proteins, several labelling strategies have been developed. While these are generally more precise than label-free quantitation approaches, they require specifically designed experiments and knowledge of which peptides are expected to be measured and modified. We recently designed the 2DB database, which aids the storage, analysis, and publication of data from mass spectrometric experiments to identify proteins. This database can aid in identifying peptides that can be used for quantitation. Here, an extension to the database application, named MSMAG, is presented that allows for more detailed analysis of the distribution of peptides and their associated proteins over the fractions of an experiment. Furthermore, given several biological samples in the database, label-free quantitation can be performed. Thus, interesting proteins, which may warrant further investigation, can be identified en passant while performing high-throughput proteomics studies.

20.
Advancements in mass spectrometry-based proteomics have enabled experiments encompassing hundreds of samples. While these large sample sets deliver much-needed statistical power, handling them introduces technical variability known as batch effects. Here, we present a step-by-step protocol for the assessment, normalization, and batch correction of proteomic data. We review established methodologies from related fields and describe solutions specific to proteomic challenges, such as ion intensity drift and missing values in quantitative feature matrices. Finally, we compile a set of techniques that enable control of batch effect adjustment quality. We provide an R package, "proBatch", containing functions required for each step of the protocol. We demonstrate the utility of this methodology on five proteomic datasets, each encompassing hundreds of samples and consisting of multiple experimental designs. In conclusion, we provide guidelines and tools to make the extraction of true biological signal from large proteomic studies more robust and transparent, ultimately facilitating reliable and reproducible research in clinical proteomics and systems biology.
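proBatch is an R package; the outline below sketches the same three protocol steps (assessment, normalization, batch correction) in Python for illustration only, using generic operations (PCA for assessment, per-run median normalization, per-batch median centering) rather than proBatch's actual functions.

```python
# Hypothetical Python sketch of the protocol steps: assessment, normalization,
# batch correction. None of proBatch's functions are used or reproduced here.
import numpy as np
from sklearn.decomposition import PCA

def assess_batches(log_matrix, batches):
    """Project runs (columns) onto 2 PCs; strong clustering by batch signals a batch effect."""
    scores = PCA(n_components=2).fit_transform(log_matrix.T)
    return {b: scores[np.asarray(batches) == b].mean(axis=0) for b in set(batches)}

def normalize_and_correct(matrix, batches):
    """matrix: features x runs of raw intensities (0 or NaN = missing); batches: label per run."""
    m = np.asarray(matrix, float)
    log_m = np.log2(np.where(m > 0, m, np.nan))            # keep missing values as NaN
    log_m -= np.nanmedian(log_m, axis=0)                   # per-run median normalization
    batches = np.asarray(batches)
    grand = np.nanmedian(log_m, axis=1, keepdims=True)     # per-feature grand median
    for b in set(batches):
        cols = batches == b
        shift = np.nanmedian(log_m[:, cols], axis=1, keepdims=True)
        log_m[:, cols] -= shift - grand                    # per-batch median centering
    return log_m
```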
