1.

Discussion on common data analysis strategies used in MS-based proteomics

Matthiesen R Azevedo L Amorim A Carvalho AS 《Proteomics》2011,11(4):604-619

Current proteomics technology is limited in resolving the proteome complexity of biological systems. The main issue at stake is to increase throughput and spectra quality so that spatiotemporal dimensions, population parameters and the complexity of protein modifications on a quantitative scale can be considered. MS-based proteomics and protein arrays are the main players in large-scale proteome analysis and an integration of these two methodologies is powerful but presently not sufficient for detailed quantitative and spatiotemporal proteome characterization. Improvements of instrumentation for MS-based proteomics have been achieved recently, resulting in data sets of approximately one million spectra, which is a large step in the right direction. The corresponding raw data range from 50 to 100 Gb and are frequently made available. Multidimensional LC-MS data sets have been demonstrated to identify and quantitate 2000-8000 proteins from whole cell extracts. The analysis of the resulting data sets requires several steps, from raw data processing to database-dependent search, statistical evaluation of the search result, quantitative algorithms and statistical analysis of quantitative data. A large number of software tools have been proposed for the above-mentioned tasks. However, it is not the aim of this review to cover all software tools, but rather to discuss common data analysis strategies used by various algorithms for each of the above-mentioned steps in a non-redundant approach and to argue that there are still some areas which need improvements.

2.

Protein extraction from plant tissues for 2DE and its application in proteomic analysis

Xiaolin Wu Fangping Gong Wei Wang 《Proteomics》2014,14(6):645-658

Plant tissues contain large amounts of secondary compounds that significantly interfere with protein extraction and 2DE analysis. Thus, sample preparation is a crucial step prior to 2DE in plant proteomics. This tutorial highlights the guidelines that need to be followed to perform an adequate total protein extraction before 2DE in plant proteomics. We briefly describe the history, development, and features of major sample preparation methods for the 2DE analysis of plant tissues, that is, trichloroacetic acid/acetone precipitation and phenol extraction. We introduce the interfering compounds in plant tissues and the general guidelines for tissue disruption, protein precipitation and resolubilization. We describe in detail the advantages, limitations, and application of the trichloroacetic acid/acetone precipitation and phenol extraction methods to enable the readers to select the appropriate method for a specific species, tissue, or cell type. The current applications of the sample preparation methods in plant proteomics in the literature are analyzed. A comparative proteomic analysis between male and female plants of Pistacia chinensis is used as an example to represent the sample preparation methodology in 2DE‐based proteomics. Finally, the current limitations and future development of these sample preparation methods are discussed. This Tutorial is part of the International Proteomics Tutorial Programme (IPTP17).

3.

GProX, a user-friendly platform for bioinformatics analysis and visualization of quantitative proteomics data

Rigbolt KT Vanselow JT Blagoev B 《Molecular & cellular proteomics : MCP》2011,10(8):O110.007450

Recent technological advances have made it possible to identify and quantify thousands of proteins in a single proteomics experiment. As a result of these developments, the analysis of data has become the bottleneck of proteomics experiments. To provide the proteomics community with a user-friendly platform for comprehensive analysis, inspection and visualization of quantitative proteomics data we developed the Graphical Proteomics Data Explorer (GProX). The program requires no special bioinformatics training, as all functions of GProX are accessible within its graphical user-friendly interface which will be intuitive to most users. Basic features facilitate the uncomplicated management and organization of large data sets and complex experimental setups as well as the inspection and graphical plotting of quantitative data. These are complemented by readily available high-level analysis options such as database querying, clustering based on abundance ratios, feature enrichment tests for, e.g., GO terms, and pathway analysis tools. A number of plotting options for visualization of quantitative proteomics data are available and most analysis functions in GProX create customizable high quality graphical displays in both vector and bitmap formats. The generic import requirements allow data originating from essentially all mass spectrometry platforms, quantitation strategies and software to be analyzed in the program. GProX represents a powerful approach to proteomics data analysis providing proteomics experimenters with a toolbox for bioinformatics analysis of quantitative proteomics data. The program is released as open-source and can be freely downloaded from the project webpage at http://gprox.sourceforge.net.

4.

A face in the crowd: recognizing peptides through database search

Eng JK Searle BC Clauser KR Tabb DL 《Molecular & cellular proteomics : MCP》2011,10(11):R111.009522

Peptide identification via tandem mass spectrometry sequence database searching is a key method in the array of tools available to the proteomics researcher. The ability to rapidly and sensitively acquire tandem mass spectrometry data and perform peptide and protein identifications has become a commonly used proteomics analysis technique because of advances in both instrumentation and software. Although many different tandem mass spectrometry database search tools are currently available from both academic and commercial sources, these algorithms share similar core elements while maintaining distinctive features. This review revisits the mechanism of sequence database searching and discusses how various parameter settings impact the underlying search.

5.

EBprot: Statistical analysis of labeling‐based quantitative proteomics data

Hiromi W. L. Koh Hannah L. F. Swa Damian Fermin Siok Ghee Ler Jayantha Gunaratne Hyungwon Choi 《Proteomics》2015,15(15):2580-2591

Labeling‐based proteomics is a powerful method for detection of differentially expressed proteins (DEPs). The current data analysis platform typically relies on protein‐level ratios, which are obtained by summarizing peptide‐level ratios for each protein. In shotgun proteomics, however, some proteins are quantified with more peptides than others, and this reproducibility information is not incorporated into the differential expression (DE) analysis. Here, we propose a novel probabilistic framework EBprot that directly models the peptide‐protein hierarchy and rewards the proteins with reproducible evidence of DE over multiple peptides. To evaluate its performance with known DE states, we conducted a simulation study to show that the peptide‐level analysis of EBprot provides better receiver‐operating characteristic and more accurate estimation of the false discovery rates than the methods based on protein‐level ratios. We also demonstrate superior classification performance of peptide‐level EBprot analysis in a spike‐in dataset. To illustrate the wide applicability of EBprot in different experimental designs, we applied EBprot to a dataset for lung cancer subtype analysis with biological replicates and another dataset for time course phosphoproteome analysis of EGF‐stimulated HeLa cells with multiplexed labeling. Through these examples, we show that the peptide‐level analysis of EBprot is a robust alternative to the existing statistical methods for the DE analysis of labeling‐based quantitative datasets. The software suite is freely available on the Sourceforge website http://ebprot.sourceforge.net/. All MS data have been deposited in the ProteomeXchange with identifier PXD001426 (http://proteomecentral.proteomexchange.org/dataset/PXD001426/).
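The protein-level summarization that this abstract contrasts with peptide-level modelling can be sketched roughly as below. This is not EBprot's probabilistic model; it is a minimal illustration that assumes median summarization of log2 ratios, and the function name and values are hypothetical.

```python
from collections import defaultdict
from statistics import median

def summarize_protein_ratios(peptide_log_ratios):
    """Collapse (protein, log2 ratio) pairs to (median log2 ratio, peptide count)."""
    by_protein = defaultdict(list)
    for protein, log_ratio in peptide_log_ratios:
        by_protein[protein].append(log_ratio)
    # Returning the peptide count next to the median keeps the
    # reproducibility information that a bare protein ratio discards.
    return {p: (median(v), len(v)) for p, v in by_protein.items()}

# Hypothetical peptide-level log2 ratios (illustrative values only).
peptides = [("P1", 1.0), ("P1", 1.2), ("P1", 0.8), ("P2", 1.0)]
summary = summarize_protein_ratios(peptides)
```

In this toy input both proteins end up with the same median ratio, yet one is supported by three peptides and the other by a single peptide, which is the kind of difference a ratio-only analysis cannot see.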

6.

Fishing for biomarkers: analyzing mass spectrometry data with the new ClinProTools software

Ketterlinus R Hsieh SY Teng SH Lee H Pusch W 《BioTechniques》2005,(Z1):37-40

Recently, applications of mass spectrometry in the field of clinical proteomics have gained tremendous visibility in the scientific and clinical community. One major objective is the search for potential biomarkers in complex body fluids like serum, plasma, urine, saliva, or cerebral spinal fluid. For this purpose, efficient visualization of large data sets derived from patient cohorts is crucial to provide clinical experts with an interactive impression of the data quality. Additionally, it is necessary to apply statistical analysis and pattern matching algorithms to attain validated signal patterns that may allow for later applications in sample classification. We introduce the new ClinProTools bioinformatics software, which performs all major steps of profiling, screening, and monitoring applications in clinical proteomics. ClinProTools is the data interpretation software of the mass spectrometry-based ClinProt solutions for biomarker analysis. ClinProTools performs data pretreatment, visualization, statistics, pattern determination, pattern evaluation, and classification of spectra. This article will focus on ClinProTools' powerful and intuitive visualization options for clinical proteomics applications.

7.

Bottom up proteomics data analysis strategies to explore protein modifications and genomic variants

Ana Sofia Carvalho Deborah Penque Rune Matthiesen 《Proteomics》2015,15(11):1789-1792

The quest to understand biological systems requires further attention of the scientific community to the challenges faced in proteomics. In fact, the complexity of the proteome reaches uncountable orders of magnitude. This means that significant technical and data‐analytic innovations will be needed for the full understanding of biology. Current state-of-the-art MS is probably our best choice for studying protein complexity, and exploring new ways to use MS and MS-derived data should be given higher priority. We present here a brief overview of visualization and statistical analysis strategies for quantitative peptide values on an individual protein basis. These analysis strategies can help pinpoint protein modifications, splice, and genomic variants of biological relevance. We demonstrate the application of these data analysis strategies using a bottom‐up proteomics dataset obtained in a drug profiling experiment. Furthermore, we have also observed that the presented methods are useful for studying peptide distributions from clinical samples from a large number of individuals. We expect that the presented data analysis strategy will be useful in the future to define functional protein variants in biological model systems and disease studies. Therefore, robust software implementing these strategies is urgently needed.

8.

Advances in strategies and methods for mass spectrometry-based quantitative proteomics

Chang Cheng (常乘) Zhu Yunping (朱云平) 《Scientia Sinica Vitae (中国科学:生命科学)》2015,45(5):425-438

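Quantitative proteomics has become one of the most active research areas in the omics field. Continuous innovation in experimental techniques and computational methods has greatly accelerated its development. Commonly used quantitative proteomics strategies fall into two broad classes, label-free and label-based, according to whether stable isotope labeling is required. Each class has given rise to numerous quantification methods and tools, which on the one hand have driven the field forward and on the other are continually updated as experimental strategies and techniques evolve. A systematic summary of these quantitative experimental strategies and methods should therefore benefit quantitative proteomics research. From a methodological perspective, this review summarizes the strategies and algorithms currently used in quantitative proteomics, describes in detail the algorithmic workflows of label-free and label-based quantification and compares their characteristics, summarizes absolute quantification algorithms aimed at determining absolute protein abundance, lists commonly used quantification software and tools, and finally outlines quality control methods for quantitative results and discusses prospects for the development of quantitative proteomics methods.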

9.

EBP, a program for protein identification using multiple tandem mass spectrometry datasets

Price TS Lucitt MB Wu W Austin DJ Pizarro A Yocum AK Blair IA FitzGerald GA Grosser T 《Molecular & cellular proteomics : MCP》2007,6(3):527-536

MS/MS combined with database search methods can identify the proteins present in complex mixtures. High throughput methods that infer probable peptide sequences from enzymatically digested protein samples create a challenge in how best to aggregate the evidence for candidate proteins. Typically the results of multiple technical and/or biological replicate experiments must be combined to maximize sensitivity. We present a statistical method for estimating probabilities of protein expression that integrates peptide sequence identifications from multiple search algorithms and replicate experimental runs. The method was applied to create a repository of 797 non-homologous zebrafish (Danio rerio) proteins, at an empirically validated false identification rate under 1%, as a resource for the development of targeted quantitative proteomics assays. We have implemented this statistical method as an analytic module that can be integrated with an existing suite of open-source proteomics software.
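As a rough illustration of what aggregating peptide evidence for a candidate protein can mean, the sketch below combines per-peptide identification probabilities into a protein-level probability under a naive independence assumption. This is a simplification for illustration, not the statistical method of the paper above; the function name and example probabilities are hypothetical.

```python
import math

def protein_probability(peptide_probs):
    """Probability that at least one peptide identification is correct,
    assuming (naively) that the identifications are independent."""
    return 1.0 - math.prod(1.0 - p for p in peptide_probs)

# Hypothetical probabilities for one protein, pooled over search
# algorithms and replicate runs (illustrative values only).
example = protein_probability([0.5, 0.5])
```

Real identifications from the same spectrum or from overlapping search engines are correlated, so a combination like this tends to overstate confidence; that is one reason a dedicated statistical model is needed.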

10.

Investigating sample pooling strategies for DIGE experiments to address biological variability

Natasha A. Karp Kathryn S. Lilley Dr. 《Proteomics》2009,9(2):388-397

If biological questions are to be answered using quantitative proteomics, it is essential to design experiments which have sufficient power to be able to detect changes in expression. Sample subpooling is a strategy that can be used to reduce the variance but still allow studies to encompass biological variation. Underlying sample pooling strategies is the biological averaging assumption that the measurements taken on the pool are equal to the average of the measurements taken on the individuals. This study finds no evidence of a systematic bias triggered by sample pooling for DIGE and that pooling can be useful in reducing biological variation. For the first time in quantitative proteomics, the two sources of variance were decoupled and it was found that technical variance predominates for mouse brain, while biological variance predominates for human brain. A power analysis found that as the number of individuals pooled increased, then the number of replicates needed declined but the number of biological samples increased. Repeat measures of biological samples decreased the numbers of samples required but increased the number of gels needed. An example cost benefit analysis demonstrates how researchers can optimise their experiments while taking into account the available resources.
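The trade-off described in this abstract can be made concrete with a small sketch. It assumes the biological averaging assumption holds, that individuals are independent and contribute equally to a pool, and that each pool is measured once; the function name and the variance values are hypothetical.

```python
def pooled_measurement_variance(bio_var, tech_var, pool_size):
    """Variance of one measurement taken on a pool of `pool_size` individuals.

    Averaging over individuals divides the biological variance by the pool
    size; the technical variance of the measurement itself is unaffected.
    """
    if pool_size < 1:
        raise ValueError("pool_size must be at least 1")
    return bio_var / pool_size + tech_var

# Biological variance dominates: pooling four individuals helps a lot.
bio_dominated = pooled_measurement_variance(bio_var=4.0, tech_var=1.0, pool_size=4)
# Technical variance dominates: the same pooling barely changes the total.
tech_dominated = pooled_measurement_variance(bio_var=1.0, tech_var=4.0, pool_size=4)
```

Under these assumptions pooling pays off mainly when biological variance is the larger component, which is consistent with the abstract's point that the two sources of variance need to be estimated separately before choosing a design.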

11.

A mass spectrometry-based method for quantifying the degree of protein N-terminal acetylation

Wang Jie (王洁) Zhang Xumin (张旭敏) 《Genomics and Applied Biology (基因组学与应用生物学)》2019,(1):135-142

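With the continued refinement and development of mass spectrometry and of various quantification methods, quantitative proteomics approaches are increasingly applied across biological research. Qualitative and quantitative proteomics data are mainly processed with multifunctional commercial or open-source software, such as the commonly used Proteome Discoverer and MaxQuant. However, for quantifying the degree of protein N-terminal acetylation by chemical labeling, Proteome Discoverer and MaxQuant are to some extent limited in accuracy and completeness. This study therefore wrote a quantification program in Java, Acequant, tailored to the characteristics of its experiments, to perform relative quantification of the degree of N-terminal acetylation. The program was validated on HeLa cells, for which related reports exist: Acequant quantified 1,587 protein N-termini, whereas Proteome Discoverer and MaxQuant quantified only 42 and 306 N-termini, respectively. Manual inspection of the raw spectra also confirmed that Acequant's quantification was more accurate. The method was then applied to the study of N-terminal acetylation in Caenorhabditis elegans, giving a preliminary picture of the overall N-terminal acetylation state of the nematode and providing support for further N-terminal research.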

12.

Relative and absolute quantitative shotgun proteomics: targeting low-abundance proteins in Arabidopsis thaliana

Wienkoop S Weckwerth W 《Journal of experimental botany》2006,57(7):1529-1535

13.

A suite of algorithms for the comprehensive analysis of complex protein mixtures using high-resolution LC-MS

Bellew M Coram M Fitzgibbon M Igra M Randolph T Wang P May D Eng J Fang R Lin C Chen J Goodlett D Whiteaker J Paulovich A McIntosh M 《Bioinformatics (Oxford, England)》2006,22(15):1902-1909

14.

Representation of selected‐reaction monitoring data in the mzQuantML data standard

Da Qi Craig Lawless Johan Teleman Fredrik Levander Stephen W. Holman Simon Hubbard Andrew R. Jones 《Proteomics》2015,15(15):2592-2596

The mzQuantML data standard was designed to capture the output of quantitative software in proteomics, to support submissions to public repositories, development of visualization software and pipeline/modular approaches. The standard is designed around a common core that can be extended to support particular types of technique through the release of semantic rules that are checked by validation software. The first release of mzQuantML supported four quantitative proteomics techniques via four sets of semantic rules: (i) intensity‐based (MS¹) label free, (ii) MS¹ label‐based (such as SILAC or N¹⁵), (iii) MS² tag‐based (iTRAQ or tandem mass tags), and (iv) spectral counting. We present an update to mzQuantML for supporting SRM techniques. The update includes representing the quantitative measurements, and associated meta‐data, for SRM transitions, the mechanism for inferring peptide‐level or protein‐level quantitative values, and support for both label‐based or label‐free SRM protocols, through the creation of semantic rules and controlled vocabulary terms. We have updated the specification document for mzQuantML (version 1.0.1) and the mzQuantML validator to ensure that consistent files are produced by different exporters. We also report the capabilities for production of mzQuantML files from popular SRM software packages, such as Skyline and Anubis.

15.

IsobariQ: software for isobaric quantitative proteomics using IPTL, iTRAQ, and TMT

Arntzen MØ Koehler CJ Barsnes H Berven FS Treumann A Thiede B 《Journal of proteome research》2011,10(2):913-920

Isobaric peptide labeling plays an important role in relative quantitative comparisons of proteomes. Isobaric labeling techniques utilize MS/MS spectra for relative quantification, which can be either based on the relative intensities of reporter ions in the low mass region (iTRAQ and TMT) or on the relative intensities of quantification signatures throughout the spectrum due to isobaric peptide termini labeling (IPTL). Due to the increased quantitative information found in MS/MS fragment spectra generated by the recently developed IPTL approach, new software was required to extract the quantitative information. IsobariQ was specifically developed for this purpose; however, support for the reporter ion techniques iTRAQ and TMT is also included. In addition, to address recently emphasized issues about heterogeneity of variance in proteomics data sets, IsobariQ employs the statistical software package R and variance stabilizing normalization (VSN) algorithms available therein. Finally, the functionality of IsobariQ is validated with data sets of experiments using 6-plex TMT and IPTL. Notably, protein substrates resulting from cleavage by proteases can be identified as shown for caspase targets in apoptosis.
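Relative quantification from reporter ion intensities, as used for iTRAQ and TMT, can be sketched for a single MS/MS spectrum as follows. This is a minimal illustration and not IsobariQ's implementation: it applies no isotope-impurity correction and no VSN, and the function name, channel labels and intensities are hypothetical.

```python
import math

def reporter_log2_ratios(intensities, reference):
    """log2 ratio of each reporter channel to the reference channel.

    Channels with non-positive intensity are skipped, because their
    log ratio is undefined.
    """
    ref = intensities[reference]
    if ref <= 0:
        raise ValueError("reference channel has no signal")
    return {
        channel: math.log2(value / ref)
        for channel, value in intensities.items()
        if channel != reference and value > 0
    }

# Hypothetical reporter intensities from one spectrum (illustrative only).
spectrum = {"126": 1000.0, "127": 2000.0, "128": 500.0, "129": 0.0}
ratios = reporter_log2_ratios(spectrum, reference="126")
```

Working on the log2 scale makes a two-fold increase and a two-fold decrease symmetric around zero, which is convenient before any normalization step.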

16.

Gaining confidence in high-throughput protein interaction networks

Bader JS Chaudhuri A Rothberg JM Chant J 《Nature biotechnology》2004,22(1):78-85

17.

Evaluation of two-dimensional difference gel electrophoresis for protein profiling. Soluble proteins of the marine bacterium Pirellula sp. strain 1

Gade D Thiermann J Markowsky D Rabus R 《Journal of molecular microbiology and biotechnology》2003,5(4):240-251

Two-dimensional gel electrophoresis (2DE) is a central tool of proteome research, since it allows separation of complex protein mixtures at highest resolution. Quantification of gene expression at the protein level requires sensitive visualization of protein spots over a wide linear range. Two-dimensional difference gel electrophoresis (2D DIGE) is a new fluorescent technique for protein labeling in 2DE gels. Proteins are labeled prior to electrophoresis with fluorescent CyDyes™ and differently labeled samples are then co-separated on the same 2DE gel. We evaluated 2D DIGE for detection and quantification of proteins specific for glucose or N-acetylglucosamine metabolism in the marine bacterium Pirellula sp. strain 1. The experiment was based on 10 parallel 2DE gels. Detection and comparison of the protein spots were performed with the DeCyder™ software that uses an internal standard to quantify differences in protein abundance with high statistical confidence; 24 proteins differing in abundance by a factor of at least 1.5 (t-test value <10⁻⁹) were identified. For comparison, another experiment was carried out with four SYPRO-Ruby-stained 2DE gels for each of the two growth conditions; image analysis was done with the ImageMaster™ 2D Elite software. Sensitivity of the CyDye fluors was evaluated by comparing Cy2, Cy3, Cy5, SYPRO Ruby, silver, and colloidal Coomassie staining. Three replicate gels, each loaded with 50 µg of protein, were run for each stain and the gels were analyzed with the ImageMaster software. Labeling with CyDyes allowed detection of almost as many protein spots as staining with silver or SYPRO Ruby.
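The abundance criterion quoted in this abstract, a change by a factor of at least 1.5, can be sketched as a simple filter. The sketch assumes spot abundances have already been normalized (for example against an internal standard) and covers only the fold-change part, not the t-test; the function name, spot identifiers and values are hypothetical.

```python
def exceeds_fold_change(abundance_a, abundance_b, threshold=1.5):
    """True if the two abundances differ by at least `threshold`-fold
    in either direction."""
    ratio = abundance_a / abundance_b
    return ratio >= threshold or ratio <= 1.0 / threshold

# Hypothetical normalized spot abundances under two growth conditions.
spots = {"spot_1": (4.0, 2.0), "spot_2": (2.0, 4.0), "spot_3": (1.2, 1.0)}
changed = sorted(name for name, (a, b) in spots.items() if exceeds_fold_change(a, b))
```

Checking both `ratio >= threshold` and `ratio <= 1 / threshold` treats increases and decreases alike; a statistical test on replicate gels would still be needed before calling a spot differentially abundant.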

18.

Detecting and accounting for multiple sources of positional variance in peak list registration analysis and spin system grouping

Andrey Smelter Eric C. Rouchka Hunter N. B. Moseley

《Journal of biomolecular NMR》

Peak lists derived from nuclear magnetic resonance (NMR) spectra are commonly used as input data for a variety of computer assisted and automated analyses. These include automated protein resonance assignment and protein structure calculation software tools. Prior to these analyses, peak lists must be aligned to each other and sets of related peaks must be grouped based on common chemical shift dimensions. Even when programs can perform peak grouping, they require the user to provide uniform match tolerances or use default values. However, peak grouping is further complicated by multiple sources of variance in peak position limiting the effectiveness of grouping methods that utilize uniform match tolerances. In addition, no method currently exists for deriving peak positional variances from single peak lists for grouping peaks into spin systems, i.e. spin system grouping within a single peak list. Therefore, we developed a complementary pair of peak list registration analysis and spin system grouping algorithms designed to overcome these limitations. We have implemented these algorithms into an approach that can identify multiple dimension-specific positional variances that exist in a single peak list and group peaks from a single peak list into spin systems. The resulting software tools generate a variety of useful statistics on both a single peak list and pairwise peak list alignment, especially for quality assessment of peak list datasets. We used a range of low and high quality experimental solution NMR and solid-state NMR peak lists to assess performance of our registration analysis and grouping algorithms. Analyses show that an algorithm using a single iteration and uniform match tolerances approach is only able to recover from 50 to 80% of the spin systems due to the presence of multiple sources of variance. Our algorithm recovers additional spin systems by reevaluating match tolerances in multiple iterations. 
To facilitate evaluation of the algorithms, we developed a peak list simulator within our nmrstarlib package that generates user-defined assigned peak lists from a given BMRB entry or database of entries. In addition, over 100,000 simulated peak lists with one or two sources of variance were generated to evaluate the performance and robustness of these new registration analysis and peak grouping algorithms.
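The baseline that this abstract argues against, a single pass with uniform match tolerances, can be sketched as follows. This is not the authors' algorithm and does not use nmrstarlib; it is a greedy illustration in which peaks are tuples of the shared chemical shift dimensions, and the function name, tolerances and peak values are hypothetical.

```python
def group_peaks(peaks, tolerances):
    """Greedy single-pass grouping with fixed per-dimension tolerances.

    A peak joins the first group whose seed (first member) it matches
    within tolerance in every dimension; otherwise it starts a new group.
    """
    groups = []
    for peak in peaks:
        for group in groups:
            seed = group[0]
            if all(abs(p - s) <= tol for p, s, tol in zip(peak, seed, tolerances)):
                group.append(peak)
                break
        else:
            # No existing group matched, so this peak seeds a new one.
            groups.append([peak])
    return groups

# Hypothetical (1H, 15N) shifts in ppm; tolerances are illustrative only.
peaks = [(8.20, 120.1), (8.21, 120.2), (7.50, 115.0), (8.19, 120.0)]
groups = group_peaks(peaks, tolerances=(0.02, 0.2))
```

Because the tolerances are fixed in advance, peaks whose positional variance is larger than expected fall outside every group, which is the failure mode that iterative re-estimation of tolerances is meant to address.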

19.

Comparison of statistical approaches for the analysis of proteome expression data of differentiating neural stem cells

Maurer MH Feldmann RE Brömme JO Kalenka A 《Journal of proteome research》2005,4(1):96-100

Comparative proteomic studies often use statistical tests included in the software for the analysis of digitized images of two-dimensional electrophoresis gels. As these programs include only limited capabilities for statistical analysis, many studies do not further describe their statistical approach. To find potential differences produced by different data processing, we compared the results of (1) Student's t-test using a spreadsheet program, (2) the intrinsic algorithms implemented in the Phoretix 2D gel analysis software, and (3) the SAM algorithm originally developed for microarray analysis. We applied the algorithms to proteome data of undifferentiated neural stem cells versus in vitro differentiated neural stem cells. We found (1) 367 spots differentially expressed using Student's t-test, (2) 203 spots using the algorithms in Phoretix 2D, and (3) 119 spots using the algorithms in SAM, respectively, with an overlap of 42 spots detected by all three algorithms. Applying different statistical approaches to the same dataset resulted in divergent sets of protein spots labeled as statistically "significant". Currently, there is no agreement on statistical data processing of 2DE datasets, but the statistical tests applied in 2DE studies should be documented. Tools for the statistical analysis of proteome data should be implemented and documented in the existing 2DE software.

20.

PROTEOMER: A workflow‐optimized laboratory information management system for 2‐D electrophoresis‐centered proteomics

Grit Nebrich Marion Herrmann Daniela Hartl Madeleine Diedrich Thomas Kreitler Christoph Wierling Joachim Klose Patrick Giavalisco Claus Zabel Dr. Lei Mao 《Proteomics》2009,9(7):1795-1808

In recent years, proteomics has become increasingly important to functional genomics. Although a large amount of data is generated by high throughput large‐scale techniques, a connection of these mostly heterogeneous data from different analytical platforms and of different experiments is limited. Data mining procedures and algorithms are often insufficient to extract meaningful results from large datasets and therefore limit the exploitation of the generated biological information. In our proteomic core facility, which almost exclusively focuses on 2‐DE/MS‐based proteomics, we developed a proteomic database custom‐tailored to our needs, aiming at connecting MS protein identification information to 2‐DE derived protein expression profiles. The tools developed should not only enable an automatic evaluation of single experiments, but also link multiple 2‐DE experiments with MS‐data on different levels and thereby help to create a comprehensive network of our proteomics data. Therefore the key feature of our “PROTEOMER” database is its high cross‐referencing capacity, enabling integration of a wide range of experimental data. To illustrate the workflow and utility of the system, two practical examples are provided to demonstrate that proper data cross‐referencing can transform information into biological knowledge.