Similar literature
 20 similar documents found (search time: 31 ms)
1.
Liquid chromatography-mass spectrometry (LC-MS)-based proteomics is becoming an increasingly important tool in characterizing the abundance of proteins in biological samples of various types and across conditions. Effects of disease or drug treatments on protein abundance are of particular interest for the characterization of biological processes and the identification of biomarkers. Although state-of-the-art instrumentation can make high-quality measurements and commercial software is available to process the data, the complexity of the technology and data presents challenges for bioinformaticians and statisticians. Here, we describe a pipeline for the analysis of quantitative LC-MS data. Key components of this pipeline include experimental design (sample pooling, blocking, and randomization) as well as deconvolution and alignment of mass chromatograms to generate a matrix of molecular abundance profiles. An important challenge in LC-MS-based quantitation is to accurately identify and assign abundance measurements to members of protein families. To address this issue, we implement a novel statistical method for inferring the relative abundance of related members of protein families from tryptic peptide intensities. This pipeline has been used to analyze quantitative LC-MS data from multiple biomarker discovery projects. We illustrate our pipeline here with examples from two of these studies, and show that the pipeline constitutes a complete workable framework for LC-MS-based differential quantitation. Supplementary material is available at http://iec01.mie.utoronto.ca/~thodoros/Bukhman/.
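The blocking and randomization step mentioned in this abstract can be illustrated with a small sketch. It is not the authors' pipeline: the function name `randomized_block_order`, the group labels, and the sample names are hypothetical, and the sketch assumes that each block should hold one sample per group and that injection order is shuffled inside each block.

```python
import random


def randomized_block_order(samples_by_group, seed=0):
    """Build blocks holding one sample per group, in shuffled run order.

    samples_by_group maps a group label to its list of sample names.
    """
    rng = random.Random(seed)
    # Shuffle inside each group so block membership is random.
    shuffled = {g: rng.sample(s, len(s)) for g, s in samples_by_group.items()}
    # Only as many complete blocks as the smallest group allows.
    n_blocks = min(len(s) for s in shuffled.values())
    blocks = []
    for i in range(n_blocks):
        block = [(g, shuffled[g][i]) for g in shuffled]
        rng.shuffle(block)  # randomize injection order within the block
        blocks.append(block)
    return blocks


# Hypothetical two-group study with three samples per group.
design = randomized_block_order({
    "control": ["c1", "c2", "c3"],
    "treated": ["t1", "t2", "t3"],
})
for number, block in enumerate(design, start=1):
    print(number, block)
```

Keeping every group in every block means that drift between blocks (for example, column ageing) affects all groups about equally, which is the usual motivation for blocking.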

2.
Recent advances in proteomics technologies provide tremendous opportunities for biomarker-related clinical applications; however, the distinctive characteristics of human biofluids such as the high dynamic range in protein abundances and extreme complexity of the proteomes present tremendous challenges. In this review we summarize recent advances in LC-MS-based proteomics profiling and its applications in clinical proteomics as well as discuss the major challenges associated with implementing these technologies for more effective candidate biomarker discovery. Developments in immunoaffinity depletion and various fractionation approaches in combination with substantial improvements in LC-MS platforms have enabled the plasma proteome to be profiled with considerably greater dynamic range of coverage, allowing many proteins at low ng/ml levels to be confidently identified. Despite these significant advances and efforts, major challenges associated with the dynamic range of measurements and extent of proteome coverage, confidence of peptide/protein identifications, quantitation accuracy, analysis throughput, and the robustness of present instrumentation must be addressed before a proteomics profiling platform suitable for efficient clinical applications can be routinely implemented.

3.

Background

Quantitative proteomics holds great promise for identifying proteins that are differentially abundant between populations representing different physiological or disease states. A range of computational tools is now available for both isotopically labeled and label-free liquid chromatography mass spectrometry (LC-MS) based quantitative proteomics. However, they are generally not comparable to each other in terms of functionality, user interfaces, or information input/output, and they do not readily facilitate appropriate statistical data analysis. These limitations, along with the array of choices, present a daunting prospect for biologists, and other researchers not trained in bioinformatics, who wish to use LC-MS-based quantitative proteomics.

Results

We have developed Corra, a computational framework and tools for discovery-based LC-MS proteomics. Corra extends and adapts existing algorithms used for LC-MS-based proteomics, along with statistical algorithms originally developed for microarray data analysis that are appropriate for LC-MS data. Corra also adapts software engineering technologies (e.g. Google Web Toolkit, distributed processing) so that computationally intense data processing and statistical analyses can run on a remote server, while the user controls and manages the process from their own computer via a simple web interface. Corra also allows the user to output significantly differentially abundant LC-MS-detected peptide features in a form compatible with subsequent sequence identification via tandem mass spectrometry (MS/MS). We present two case studies to illustrate the application of Corra to commonly performed LC-MS-based biological workflows: a pilot biomarker discovery study of glycoproteins isolated from human plasma samples relevant to type 2 diabetes, and a study in yeast to identify in vivo targets of the protein kinase Ark1 via phosphopeptide profiling.

Conclusion

The Corra computational framework leverages computational innovation to enable biologists or other researchers to process, analyze and visualize LC-MS data using what would otherwise be a complex, user-unfriendly suite of tools. Corra enables appropriate statistical analyses, with controlled false-discovery rates, ultimately to inform subsequent targeted identification of differentially abundant peptides by MS/MS. For the user not trained in bioinformatics, Corra represents a complete, customizable, free and open source computational platform enabling LC-MS-based proteomic workflows, and as such, addresses an unmet need in the LC-MS proteomics field.

4.
MS-BID (MS Biomarker Discovery Platform) is an integrative computational pipeline for biomarker discovery using LC-MS-based comparative proteomic analysis. This platform consists of several computational tools for: (i) detecting peptides in the collected patterns; (ii) matching detected peptides across a number of LC-MS datasets; and (iii) selecting discriminatory peptides between classes of samples. AVAILABILITY: MS-BID source code, binaries and documentation are freely available under LGPL from http://tools.proteomecenter.org/msBID.php.

5.
Despite their potential to impact diagnosis and treatment of cancer, few protein biomarkers are in clinical use. Biomarker discovery is plagued with difficulties ranging from technological (inability to globally interrogate proteomes) to biological (genetic and environmental differences among patients and their tumors). We urgently need paradigms for biomarker discovery. To minimize biological variation and facilitate testing of proteomic approaches, we employed a mouse model of breast cancer. Specifically, we performed LC-MS/MS of tumor and normal mammary tissue from a conditional HER2/Neu-driven mouse model of breast cancer, identifying 6758 peptides representing >700 proteins. We developed a novel statistical approach (SASPECT) for prioritizing proteins differentially represented in LC-MS/MS datasets and identified proteins over- or under-represented in tumors. Using a combination of antibody-based approaches and multiple reaction monitoring-mass spectrometry (MRM-MS), we confirmed the overproduction of multiple proteins at the tissue level, identified fibulin-2 as a plasma biomarker, and extensively characterized osteopontin as a plasma biomarker capable of early disease detection in the mouse. Our results show that a staged pipeline employing shotgun-based comparative proteomics for biomarker discovery and multiple reaction monitoring for confirmation of biomarker candidates is capable of finding novel tissue and plasma biomarkers in a mouse model of breast cancer. Furthermore, the approach can be extended to find biomarkers relevant to human disease.

6.
7.
Metabolomics aims at identification and quantitation of small molecules involved in metabolic reactions. LC-MS has enjoyed a growing popularity as the platform for metabolomic studies due to its high throughput, soft ionization, and good coverage of metabolites. The success of an LC-MS-based metabolomic study often depends on multiple experimental, analytical, and computational steps. This review presents a workflow of a typical LC-MS-based metabolomic analysis for identification and quantitation of metabolites indicative of biological/environmental perturbations. Challenges and current solutions in each step of the workflow are reviewed. The review intends to help investigators understand the challenges in metabolomic studies and to determine appropriate experimental, analytical, and computational methods to address these challenges.

8.
Matros A, Kaspar S, Witzel K, Mock HP. Phytochemistry. 2011;72(10):963-974
Recent innovations in liquid chromatography-mass spectrometry (LC-MS)-based methods have facilitated quantitative and functional proteomic analyses of large numbers of proteins derived from complex samples without any need for protein or peptide labelling. Despite their great potential, the application of these proteomics techniques to plant science started only recently. Here we present an overview of label-free quantitative proteomics features and their employment for analysing plants. Recent methods used for quantitative protein analyses by MS techniques are summarized and major challenges associated with label-free LC-MS-based approaches, including sample preparation, peptide separation, quantification and kinetic studies, are discussed. Database search algorithms and specific aspects regarding protein identification of non-sequenced organisms are also addressed. So far, label-free LC-MS in plant science has been used to establish cellular or subcellular proteome maps, to characterize plant-pathogen interactions or stress defence reactions, and to profile protein patterns during developmental processes. Improvements in both analytical platforms (separation technology and bioinformatics/statistical analysis) and high throughput nucleotide sequencing technologies will enhance the power of this method.

9.

Introduction

This paper presents a proof-of-concept demonstration of label-free quantitative glycoproteomics in a biomarker discovery workflow, using a mouse model for skin cancer as an example.

Materials and Methods

Blood plasma was collected from ten control mice and ten mice having a mutation in the p19ARF gene, which confers on them a high propensity to develop skin cancer after carcinogen exposure. We enriched for N-glycosylated plasma proteins, ultimately generating deglycosylated forms of the tryptic peptides for liquid chromatography mass spectrometry (LC-MS) analyses. LC-MS runs for each sample were then performed with a view to identifying proteins that were differentially abundant between the two mouse populations. We then used a recently developed computational framework, Corra, to perform peak picking and alignment, and to compute the statistical significance of any observed changes in individual peptide abundances. Once determined, the most discriminating peptide features were then fragmented and identified by tandem mass spectrometry with the use of inclusion lists.

Results and Discussions

We assessed the identified proteins to see if there were sets of proteins indicative of specific biological processes that correlate with the presence of disease, and specifically cancer, according to their functional annotations. As expected for such sick animals, many of the proteins identified were related to host immune response. However, a significant number of proteins are also directly associated with processes linked to cancer development, including proteins related to the cell cycle, localization, transport, and cell death. Additional analysis of the same samples in profiling mode, and in triplicate, confirmed that replicate MS analysis of the same plasma sample generated less variation than that observed between plasma samples from different individuals, demonstrating that the reproducibility of the LC-MS platform was sufficient for this application.

Conclusion

These results thus show that an LC-MS-based workflow can be a useful tool for the generation of candidate proteins of interest as part of a disease biomarker discovery effort.

10.
There is an increasing interest in the quantitative proteomic measurement of the protein contents of substantially similar biological samples, e.g. for the analysis of cellular response to perturbations over time or for the discovery of protein biomarkers from clinical samples. Technical limitations of current proteomic platforms such as limited reproducibility and low throughput make this a challenging task. A new LC-MS-based platform is able to generate complex peptide patterns from the analysis of proteolyzed protein samples at high throughput and represents a promising approach for quantitative proteomics. A crucial component of the LC-MS approach is the accurate evaluation of the abundance of detected peptides over many samples and the identification of peptide features that can stratify samples with respect to their genetic, physiological, or environmental origins. We present here a new software suite, SpecArray, that generates a peptide versus sample array from a set of LC-MS data. A peptide array stores the relative abundance of thousands of peptide features in many samples and is in a format identical to that of a gene expression microarray. A peptide array can be subjected to an unsupervised clustering analysis to stratify samples or to a discriminant analysis to identify discriminatory peptide features. We applied SpecArray to analyze two sets of LC-MS data: one was from four repeat LC-MS analyses of the same glycopeptide sample, and another was from LC-MS analysis of serum samples of five male and five female mice. We demonstrate through these two case studies that the SpecArray software suite can serve as an effective software platform in the LC-MS approach for quantitative proteomics.
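The peptide versus sample array described in this abstract can be pictured with a short sketch. It does not reproduce SpecArray itself: the function `build_peptide_array`, the feature identifiers, the sample names, and the choice of a log2 transform with `None` for undetected features are all assumptions made for illustration.

```python
from math import log2


def build_peptide_array(runs):
    """Arrange per-sample feature intensities as a feature-by-sample matrix.

    runs maps a sample name to {feature id: intensity}. Rows follow the
    sorted feature ids and columns follow the sorted sample names. Cells
    hold log2 intensities, or None where the feature was not detected.
    """
    samples = sorted(runs)
    features = sorted({f for feats in runs.values() for f in feats})
    matrix = [
        [log2(runs[s][f]) if f in runs[s] else None for s in samples]
        for f in features
    ]
    return features, samples, matrix


# Hypothetical intensities for two samples.
runs = {
    "male_1": {"pep_A": 1024.0, "pep_B": 256.0},
    "female_1": {"pep_A": 512.0},
}
features, samples, matrix = build_peptide_array(runs)
for feature, row in zip(features, matrix):
    print(feature, row)
```

Because the result has the same rows-by-columns shape as a gene expression matrix, clustering or discriminant analysis written for microarray data can in principle be applied to it, which is the point the abstract makes.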

11.
The combined method of LC-MS/MS is increasingly being used to explore differences in the proteomic composition of complex biological systems. The reliability and utility of such comparative protein expression profiling studies is critically dependent on an accurate and rigorous assessment of quantitative changes in the relative abundance of the myriad of proteins typically present in a biological sample such as blood or tissue. In this review, we provide an overview of key statistical and computational issues relevant to bottom-up shotgun global proteomic analysis, with an emphasis on methods that can be applied to improve the dependability of biological inferences drawn from large proteomic datasets. Focusing on a start-to-finish approach, we address the following topics: 1) low-level data processing steps, such as formation of a data matrix, filtering, and baseline subtraction to minimize noise; 2) mid-level processing steps, such as data normalization, alignment in time, peak detection, peak quantification, peak matching, and error models, to facilitate profile comparisons; and 3) high-level processing steps such as sample classification and biomarker discovery, and related topics such as significance testing, multiple testing, and choice of feature space. We report on approaches that have recently been developed for these steps, discussing their merits and limitations, and propose areas deserving of further research.
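As one concrete instance of the multiple-testing topic listed in this abstract, the sketch below adjusts a set of per-peptide p-values with the Benjamini-Hochberg procedure. The abstract does not name a specific correction, so the choice of Benjamini-Hochberg is an assumption, and the p-values are made up.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values for four peptide features.
p = [0.01, 0.04, 0.03, 0.20]
q = benjamini_hochberg(p)
print(q)
```

Features whose adjusted value falls below a chosen threshold would be reported as differentially abundant at that false discovery rate.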

12.

Background

Recent advances in liquid chromatography-mass spectrometry (LC-MS) technology have led to more effective approaches for measuring changes in peptide/protein abundances in biological samples. Label-free LC-MS methods have been used for extraction of quantitative information and for detection of differentially abundant peptides/proteins. However, difference detection by analysis of data derived from label-free LC-MS methods requires various preprocessing steps including filtering, baseline correction, peak detection, alignment, and normalization. Although several specialized tools have been developed to analyze LC-MS data, determining the most appropriate computational pipeline remains challenging partly due to lack of established gold standards.

Results

The work in this paper is an initial study to develop a simple model with a "presence" or "absence" condition using spike-in experiments and to identify these "true differences" using available software tools. In addition to the preprocessing pipelines, choosing appropriate statistical tests and determining critical values are important. We observe that individual statistical tests could lead to different results due to different assumptions and employed metrics. It is therefore preferable to incorporate several statistical tests for either exploration or confirmation purposes.

Conclusions

The LC-MS data from our spike-in experiment can be used for developing and optimizing LC-MS data preprocessing algorithms and to evaluate workflows implemented in existing software tools. Our current work is a stepping stone towards optimizing LC-MS data acquisition and testing the accuracy and validity of computational tools for difference detection in future studies that will be focused on spiking peptides of diverse physicochemical properties in different concentrations to better represent biomarker discovery of differentially abundant peptides/proteins.

13.
Membrane proteins play a crucial role in various cellular processes and are essential components of cell membranes. Computational methods have emerged as a powerful tool for studying membrane proteins due to their complex structures and properties that make them difficult to analyze experimentally. Traditional features for protein sequence analysis based on amino acid types, composition, and pair composition have limitations in capturing higher-order sequence patterns. Recently, multiple sequence alignment (MSA) and pre-trained language models (PLMs) have been used to generate features from protein sequences. However, the significant computational resources required for MSA-based feature generation can be a major bottleneck for many applications. Several methods and tools have been developed to accelerate the generation of MSAs and reduce their computational cost, including heuristics and approximate algorithms. Additionally, the use of PLMs such as BERT has shown great potential in generating informative embeddings for protein sequence analysis. In this review, we provide an overview of traditional and more recent methods for generating features from protein sequences, with a particular focus on MSAs and PLMs. We highlight the advantages and limitations of these approaches and discuss the methods and tools developed to address the computational challenges associated with feature generation. Overall, the advancements in computational methods and tools provide a promising avenue for gaining deeper insights into the function and properties of membrane proteins, which can have significant implications in drug discovery and personalized medicine.
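The traditional composition and pair-composition features mentioned in this abstract are simple to compute. The sketch below is one common formulation rather than the review's own code: it assumes the 20 standard residues, normalizes counts to fractions, and uses a made-up sequence.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def composition(seq):
    """Fraction of each standard residue in seq (20 features)."""
    counts = Counter(seq)
    return [counts[a] / len(seq) for a in AMINO_ACIDS]


def pair_composition(seq):
    """Fraction of each adjacent residue pair in seq (400 features)."""
    pairs = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    total = len(seq) - 1
    return [pairs[a + b] / total for a, b in product(AMINO_ACIDS, repeat=2)]


seq = "MKTLLVAGA"  # made-up sequence for illustration
feature_vector = composition(seq) + pair_composition(seq)
print(len(feature_vector))
```

Both feature sets discard everything beyond adjacent-residue order, which is the limitation in capturing higher-order sequence patterns that the abstract points to.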

14.
The excitement associated with clinical applications of proteomics was initially focused on its potential to serve as a vehicle for biomarker discovery, drug discovery, and routine clinical sample analysis. Some approaches were thought to be able to "identify" mass spectral characteristics that distinguished between control and disease samples, and thereafter it was believed that the same tool could be employed to screen samples in a high-throughput clinical setting. However, this has been difficult to achieve, and the early promise is yet to be fully realized. While we see an important place for mass spectrometry in drug and biomarker discovery, we believe that alternative strategies will prove more fruitful for routine analysis. Here we discuss the power and versatility of 2D gels and mass spectrometry in the discovery phase of biomarker work but argue that it is better to rely on immunochemical methods for high-throughput validation and routine assay applications.

15.
Reactive carbonyl species (RCS) generated by lipid peroxidation, leading to protein carbonylation, are involved in several human diseases. Protein carbonylation constitutes one of the best characterised biomarkers of oxidative damage to proteins. Albumin and actin have been identified, through different proteomic approaches, as the main protein targets for RCS in plasma and tissues, respectively. By a combined LC-MS/MS and computational approach, we have demonstrated their high reactivity towards alpha,beta-unsaturated aldehydes, and established the stoichiometry of reaction with HNE and acrolein, as well as the amino acid residues most susceptible to carbonyl attack. A new mass spectrometric approach, based on LC-MS/MS analysis of tag HNE/ACR-modified peptides of carbonylated albumin and actin, is proposed, and its advantages over the conventional methods for RCS and RCS-adducted protein analyses are discussed.

16.
Discovery of urinary biomarkers (total citations: 4; self-citations: 0; citations by others: 4)
A myriad of proteins and peptides can be identified in normal human urine. These are derived from a variety of sources including glomerular filtration of blood plasma, cell sloughing, apoptosis, proteolytic cleavage of cell surface glycosylphosphatidylinositol-linked proteins, and secretion of exosomes by epithelial cells. Mass spectrometry-based approaches to urinary protein and peptide profiling can, in principle, reveal changes in excretion rates of specific proteins/peptides that can have predictive value in the clinical arena, e.g. in the early diagnosis of disease, in classification of disease with regard to likely therapeutic responses, in assessment of prognosis, and in monitoring response to therapy. These approaches have potential value, not only in diseases of the kidney and urinary tract but also in systemic diseases that are associated with circulating small protein and peptide markers that can pass the glomerular filter. Most large scale biomarker discovery studies reported thus far have used one of two approaches to identify proteins and peptides whose excretion in urine changes in specific disease states: 1) two-dimensional electrophoresis with mass spectrometric and/or immunochemical identification of proteins and 2) top-down mass spectrometric methods (SELDI-TOF-MS and capillary electrophoresis-MS). These studies have been chiefly in the areas of nephrology, urology, and oncology. We review these applications, focusing on two areas of progress, viz. in bladder cancer and in acute rejection of renal transplants. Progress has been limited so far. However, with the advent of powerful LC-MS/MS methods along with methods for quantifying LC-MS/MS output, there is hope for an accelerated discovery and validation of disease biomarkers in urine.

17.
18.
Nanoparticle biological activity, biocompatibility and fate can be directly affected by layers of readily adsorbed host proteins in biofluids. Here, we report a study on the interactions between human blood plasma proteins and nanoparticles with a controlled systematic variation of properties using (18)O-labeling and LC-MS-based quantitative proteomics. We developed a novel protocol to both simplify isolation of nanoparticle bound proteins and improve reproducibility. LC-MS analysis identified and quantified 88 human plasma proteins associated with polystyrene nanoparticles covering three different surface chemistries, two sizes, and four different exposure times (for a total of 24 different samples). Quantitative comparison of relative protein abundances was achieved by spiking an (18)O-labeled "universal" reference into each individually processed unlabeled sample as an internal standard, enabling simultaneous application of both label-free and isotopic labeling quantification across the entire sample set. Clustering analysis of the quantitative proteomics data resulted in distinctive patterns that classified the nanoparticles based on their surface properties and size. In addition, temporal data indicated that the formation of the stable protein corona was at equilibrium within 5 min. The comprehensive quantitative proteomics results obtained in this study provide rich data for computational modeling and have potential implications towards predicting nanoparticle biocompatibility.

19.
Discovering and detecting transposable elements in genome sequences (total citations: 2; self-citations: 0; citations by others: 2)
The contribution of transposable elements (TEs) to genome structure and evolution as well as their impact on genome sequencing, assembly, annotation and alignment has generated increasing interest in developing new methods for their computational analysis. Here we review the diversity of innovative approaches to identify and annotate TEs in the post-genomic era, covering both the discovery of new TE families and the detection of individual TE copies in genome sequences. These approaches span a broad spectrum in computational biology including de novo, homology-based, structure-based and comparative genomic methods. We conclude that the integration and visualization of multiple approaches and the development of new conceptual representations for TE annotation will further advance the computational analysis of this dynamic component of the genome.

20.
We have developed an integrated suite of algorithms, statistical methods, and computer applications to support large-scale LC-MS-based gel-free shotgun profiling of complex protein mixtures using basic experimental procedures. The programs automatically detect and quantify large numbers of peptide peaks in feature-rich ion mass chromatograms, compensate for spurious fluctuations in peptide signal intensities and retention times, and reliably match related peaks across many different datasets. Application of this toolkit markedly facilitates pattern recognition and biomarker discovery in global comparative proteomic studies, simplifying mechanistic investigation of physiological responses and the detection of proteomic signatures of disease.
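Matching related peaks across datasets, as described in this abstract, can be illustrated with a deliberately simple sketch. It is not the toolkit's algorithm: the greedy nearest-retention-time rule, the function `match_peaks`, the tolerance values, and the peak lists are all assumptions, and a real tool would first correct retention-time drift between runs.

```python
def match_peaks(ref, other, mz_tol=0.01, rt_tol=0.5):
    """Greedily pair peaks from two runs that agree in m/z and retention time.

    Each peak is an (mz, rt, intensity) tuple. A reference peak is paired
    with the unused candidate closest in retention time, provided both the
    m/z and the retention time differences are within tolerance.
    """
    matches, used = [], set()
    for r in ref:
        best, best_dist = None, None
        for j, o in enumerate(other):
            if j in used:
                continue
            if abs(r[0] - o[0]) <= mz_tol and abs(r[1] - o[1]) <= rt_tol:
                dist = abs(r[1] - o[1])
                if best is None or dist < best_dist:
                    best, best_dist = j, dist
        if best is not None:
            used.add(best)  # each peak in `other` is matched at most once
            matches.append((r, other[best]))
    return matches


# Hypothetical peak lists: (m/z, retention time in minutes, intensity).
run_a = [(500.250, 12.1, 1.0e5), (612.300, 20.4, 3.0e4)]
run_b = [(500.255, 12.3, 9.0e4), (700.100, 25.0, 5.0e4)]
pairs = match_peaks(run_a, run_b)
print(pairs)
```

Peaks left unmatched, such as the second peak in each run here, would appear as missing values in the resulting abundance matrix.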
