Similar articles
1.
Contemporary protein microarrays such as the ProtoArray® are used in autoimmune antibody screening studies to discover biomarker panels. For ProtoArray data analysis, the manufacturer suggests the software Prospector and a default workflow. While analyzing a large data set from a discovery study for diagnostic biomarkers of Parkinson's disease (ParkCHIP), we found that the suggested workflow needs distinct improvements concerning raw data acquisition, the availability of normalization and preselection methods, batch effects, feature selection, and feature validation. In this work, appropriate improvements to the default workflow are proposed. It is shown that fully automatic batch data acquisition, a re‐implementation of Prospector's pre‐selection method, multivariate or hybrid feature selection, and validation of the selected protein panel using an independent test set together define an improved workflow for large studies.

2.
Ovarian cancer recurs at a rate of 75% within a few months to several years after therapy. Early recurrence, though it responds better to treatment, is difficult to detect. Surface-enhanced laser desorption/ionization time-of-flight (SELDI-TOF) mass spectrometry has shown the potential to accurately identify disease biomarkers to aid early diagnosis. A major challenge in the interpretation of SELDI-TOF data is the high dimensionality of the feature space. To tackle this problem, we have developed a multi-step data processing method composed of t-test, binning and backward feature selection. A new algorithm, support vector machine-Markov blanket/recursive feature elimination (SVM-MB/RFE), is presented for the backward feature selection. This method integrates minimum-weight feature elimination by SVM-RFE with information-theory-based removal of redundant and irrelevant features by Markov blanket. Subsequently, SVM was used for classification. We applied the biomarker selection algorithm to 113 serum samples to identify early relapse in ovarian cancer patients after primary therapy. To validate the performance of the proposed algorithm, experiments were carried out in comparison with several other feature selection and classification algorithms.
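
The first step of the multi-step method described above, a univariate t-test filter, can be sketched as follows. This is only an illustration of that filtering step, not the SVM-MB/RFE algorithm itself. The use of Welch's variant of the t statistic, the function names, and the toy intensities are all assumptions, since the abstract gives no such details.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic for two independent samples (assumed variant)."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se if se > 0 else 0.0


def rank_features(group_a, group_b):
    """Rank feature indices by |t|, largest first.

    group_a, group_b: lists of samples; each sample is a list of
    feature intensities of equal length.
    """
    n_features = len(group_a[0])
    scores = []
    for j in range(n_features):
        a = [sample[j] for sample in group_a]
        b = [sample[j] for sample in group_b]
        scores.append((abs(welch_t(a, b)), j))
    return [j for _, j in sorted(scores, reverse=True)]


# Hypothetical toy data: 3 samples per group, 3 features each.
relapse = [[5.1, 1.0, 3.0], [4.9, 1.2, 3.1], [5.3, 0.9, 2.9]]
no_relapse = [[1.0, 1.1, 3.0], [1.2, 0.8, 3.2], [0.9, 1.0, 2.8]]
ranking = rank_features(relapse, no_relapse)
```

In a pipeline of this kind, only the top-ranked features would be passed on to the later binning and backward-selection steps, which shrinks the feature space before the more expensive multivariate search.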

3.
In this paper, we compare the performance of six different feature selection methods for LC-MS-based proteomics and metabolomics biomarker discovery—t test, the Mann–Whitney–Wilcoxon test (mww test), nearest shrunken centroid (NSC), linear support vector machine–recursive features elimination (SVM-RFE), principal component discriminant analysis (PCDA), and partial least squares discriminant analysis (PLSDA)—using human urine and porcine cerebrospinal fluid samples that were spiked with a range of peptides at different concentration levels. The ideal feature selection method should select the complete list of discriminating features that are related to the spiked peptides without selecting unrelated features. Whereas many studies have to rely on classification error to judge the reliability of the selected biomarker candidates, we assessed the accuracy of selection directly from the list of spiked peptides. The feature selection methods were applied to data sets with different sample sizes and extents of sample class separation determined by the concentration level of spiked compounds. For each feature selection method and data set, the performance for selecting a set of features related to spiked compounds was assessed using the harmonic mean of the recall and the precision (f-score) and the geometric mean of the recall and the true negative rate (g-score). We conclude that the univariate t test and the mww test with multiple testing corrections are not applicable to data sets with small sample sizes (n = 6), but their performance improves markedly with increasing sample size up to a point (n > 12) at which they outperform the other methods. PCDA and PLSDA select small feature sets with high precision but miss many true positive features related to the spiked peptides. NSC strikes a reasonable compromise between recall and precision for all data sets independent of spiking level and number of samples. 
Linear SVM-RFE performs poorly for selecting features related to the spiked compounds, even though the classification error is relatively low.

Biomarkers play an important role in advancing medical research through the early diagnosis of disease and prognosis of treatment interventions (1, 2). Biomarkers may be proteins, peptides, or metabolites, as well as mRNAs or other kinds of nucleic acids (e.g. microRNAs) whose levels change in relation to the stage of a given disease and which may be used to accurately assign the disease stage of a patient. The accurate selection of biomarker candidates is crucial, because it determines the outcome of further validation studies and the ultimate success of efforts to develop diagnostic and prognostic assays with high specificity and sensitivity. The success of biomarker discovery depends on several factors: consistent and reproducible phenotyping of the individuals from whom biological samples are obtained; the quality of the analytical methodology, which in turn determines the quality of the collected data; the accuracy of the computational methods used to extract quantitative and molecular identity information to define the biomarker candidates from raw analytical data; and finally the performance of the applied statistical methods in the selection of a limited list of compounds with the potential to discriminate between predefined classes of samples. De novo biomarker research consists of a biomarker discovery part and a biomarker validation part (3). Biomarker discovery uses analytical techniques that try to measure as many compounds as possible in a relatively low number of samples. 
The goal of subsequent data preprocessing and statistical analysis is to select a limited number of candidates, which are subsequently subjected to targeted analyses in a large number of samples for validation.

Advanced technology, such as high-performance liquid chromatography–mass spectrometry (LC-MS), is increasingly applied in biomarker discovery research. Such analyses detect tens of thousands of compounds, as well as background-related signals, in a single biological sample, generating enormous amounts of multivariate data. Data preprocessing workflows reduce data complexity considerably by trying to extract only the information related to compounds, resulting in a quantitative feature matrix in which rows and columns correspond to samples and extracted features, respectively, or vice versa. Features may also be related to data preprocessing artifacts, and the ratio of such erroneous features to compound-related features depends on the performance of the data preprocessing workflow (4). Preprocessed LC-MS data sets contain a large number of features relative to the sample size. These features are characterized by their m/z value and retention time, and in the ideal case they can be combined and linked to compound identities such as metabolites, peptides, and proteins. In LC-MS-based proteomics and metabolomics studies, sample analysis is so time consuming that it is practically impossible to increase the number of samples to a level that balances the number of features in a data set. Therefore, the success of biomarker discovery depends on powerful feature selection methods that can deal with a low sample size and a high number of features. 
Because of the unfavorable statistical situation and the risk of overfitting the data, it is ultimately pivotal to validate the selected biomarker candidates in a larger set of independent samples, preferably in a double-blinded fashion, using targeted analytical methods (1).

Biomarker selection is often based on classification methods that are preceded by feature selection methods (filters) or which have built-in feature selection modules (wrappers and embedded methods) that can be used to select a list of compounds/peaks/features that provide the best classification performance for predefined sample groups (e.g. healthy versus diseased) (5). Classification methods are able to classify an unknown sample into a predefined sample class. Univariate feature selection methods such as filters (t test or Wilcoxon–Mann–Whitney tests) cannot be used for sample classification. Some classification methods, such as the nearest shrunken centroid method, have intrinsic feature selection ability, whereas others, such as principal component discriminant analysis (PCDA) and partial least squares regression coupled with discriminant analysis (PLSDA), should be augmented with a feature selection method. There are also classifiers with no feature selection option that perform the classification using all variables, such as support vector machines that use non-linear kernels (6). Classification methods without the ability to select features cannot be used for biomarker discovery, because these methods aim to classify samples into predefined classes but cannot identify the limited number of variables (features or compounds) that form the basis of the classification (6, 7). Different statistical methods with feature selection have been developed according to the complexity of the analyzed data, and these have been extensively reviewed (5, 6, 8, 9). 
Ways of optimizing such methods to improve sensitivity and specificity are a major topic in current biomarker discovery research and in the many “omics-related” research areas (6, 10, 11). Comparisons of classification methods with respect to their classification and learning performance have been initiated. Van der Walt et al. (12) focused on finding the most accurate classifiers for simulated data sets with sample sizes ranging from 20 to 100. Rubingh et al. (13) compared the influence of sample size in an LC-MS metabolomics data set on the performance of three different statistical validation tools: cross validation, jack-knifing model parameters, and a permutation test. That study concluded that for small sample sets, the outcome of these validation methods is influenced strongly by individual samples and therefore cannot be trusted, and the validation tool cannot be used to indicate problems due to sample size or the representativeness of sampling. This implies that reducing the dimensionality of the feature space is critical when approaching a classification problem in which the number of features exceeds the number of samples by a large margin. Dimensionality reduction retains a smaller set of features to bring the feature space in line with the sample size and thus allow the application of classification methods that perform with acceptable accuracy only when the sample size and the feature size are similar.

In this study we compared different classification methods focusing on feature selection in two types of spiked LC-MS data sets that mimic the situation of a biomarker discovery study. Our results provide guidelines for researchers who will engage in biomarker discovery or other differential profiling “omics” studies with respect to sample size and selecting the most appropriate feature selection method for a given data set. 
We evaluated the following approaches: univariate t test and Mann–Whitney–Wilcoxon test (mww test) with multiple testing correction (14), nearest shrunken centroid (NSC) (15, 16), support vector machine–recursive features elimination (SVM-RFE) (17), PLSDA (18), and PCDA (19). PCDA and PLSDA were combined with the rank-product as a feature selection criterion (20). These methods were evaluated with data sets having three characteristics: different biological background, varying sample size, and varying within- and between-class variability of the added compounds. Data were acquired via LC-MS from human urine and porcine cerebrospinal fluid (CSF) samples that were spiked with a set of known peptides (true positives) at different concentration levels. These samples were then combined in two classes containing peptides spiked at low and high concentration levels. The performance of the classification methods with feature selection was measured based on their ability to select features that were related to the spiked peptides. Because true positives were known in our data set, we compared performance based on the f-score (the harmonic mean of precision and recall) and the g-score (the geometric mean of recall and the true negative rate).
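
The two scores used in this abstract, the f-score (harmonic mean of recall and precision) and the g-score (geometric mean of recall and true negative rate), can be computed directly when the truly discriminating features are known. The sketch below is a minimal illustration. The function name and the toy feature indices are hypothetical and not taken from the study.

```python
from math import sqrt


def selection_scores(selected, true_features, all_features):
    """Return (f_score, g_score) for a feature selection result.

    selected:      features picked by the selection method
    true_features: features known to be related to spiked compounds
    all_features:  every feature in the data set
    """
    selected = set(selected)
    true_features = set(true_features)
    all_features = set(all_features)

    tp = len(selected & true_features)                 # correctly selected
    fp = len(selected - true_features)                 # wrongly selected
    fn = len(true_features - selected)                 # missed
    tn = len(all_features - selected - true_features)  # correctly ignored

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    tnr = tn / (tn + fp) if tn + fp else 0.0

    denom = precision + recall
    f_score = 2 * precision * recall / denom if denom else 0.0
    g_score = sqrt(recall * tnr)
    return f_score, g_score


# Hypothetical example: 10 features, features 0-3 are spiked,
# and a method selects features 0, 1, 2 and 7.
f, g = selection_scores([0, 1, 2, 7], [0, 1, 2, 3], range(10))
```

Because true negatives vastly outnumber true positives in LC-MS feature matrices, the g-score is far less sensitive to a handful of false positives than the f-score, which is one reason to report both.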

4.
Immler D, Greven S, Reinemer P. Proteomics, 2006, 6(10): 2947-2958
Authentic biomarkers, which distill the essence of a complex, functionally significant process in a mammalian system into a precise physicochemical measurement, have been implicated as a tool of increasing importance for drug discovery and development. However, despite recent technological advances, validating a new biomarker candidate is still a lengthy task when suitable antibodies must be generated. Methods to accelerate initial validation by MS approaches have been suggested, but all methods described so far have serious drawbacks that ultimately lead to non-generic methods of detection and quantification. Moreover, when complex body fluids are used as samples, efficient debulking strategies are crucial to open a window of analytical sensitivity in the ng/mL range, where many diagnostically relevant analytes are present. Here we report the proof-of-principle of a multi-dimensional strategy for accelerated initial validation of biomarker candidates by MS, which promises to be generally applicable, sensitive and quantitative. The method employs a combination of electrophoretic and chromatographic steps at the peptide level, followed by MS quantification using isotopically labeled synthetic peptides as internal standards. Our proposed workflow includes up to four dimensions, resulting in an LOD sufficient to detect and quantify diagnostically relevant analytes in complex samples. Although the current state of the method represents only a starting point for further validation and development, it reveals great potential in biomarker validation.

5.
Background

Mass spectrometry (MS) is becoming the gold standard for biomarker discovery. Several MS-based bioinformatics methods have been proposed for this application, but the divergence of the findings by different research groups on the same MS data suggests that the definition of a reliable method has not been achieved yet. In this work, we propose an integrated software platform, MASCAP, intended for comparative biomarker detection from MALDI-TOF MS data.

Results

MASCAP integrates denoising and feature extraction algorithms, which have already been shown to provide consistent peaks across mass spectra; furthermore, it relies on statistical analysis and graphical tools to compare the results between groups. The effectiveness in mass spectrum processing is demonstrated using MALDI-TOF data, as well as SELDI-TOF data. The usefulness in detecting potential protein biomarkers is shown by comparing MALDI-TOF mass spectra collected from serum and plasma samples belonging to the same clinical population.

Conclusions

The analysis approach implemented in MASCAP may simplify biomarker detection by assisting the recognition of proteomic expression signatures of the disease. A MATLAB implementation of the software and the data used for its validation are available at http://www.unich.it/proteomica/bioinf.

6.

Background

Recent advances in liquid chromatography-mass spectrometry (LC-MS) technology have led to more effective approaches for measuring changes in peptide/protein abundances in biological samples. Label-free LC-MS methods have been used for extraction of quantitative information and for detection of differentially abundant peptides/proteins. However, difference detection by analysis of data derived from label-free LC-MS methods requires various preprocessing steps including filtering, baseline correction, peak detection, alignment, and normalization. Although several specialized tools have been developed to analyze LC-MS data, determining the most appropriate computational pipeline remains challenging partly due to lack of established gold standards.

Results

The work in this paper is an initial study to develop a simple model with "presence" or "absence" condition using spike-in experiments and to be able to identify these "true differences" using available software tools. In addition to the preprocessing pipelines, choosing appropriate statistical tests and determining critical values are important. We observe that individual statistical tests could lead to different results due to different assumptions and employed metrics. It is therefore preferable to incorporate several statistical tests for either exploration or confirmation purpose.

Conclusions

The LC-MS data from our spike-in experiment can be used for developing and optimizing LC-MS data preprocessing algorithms and to evaluate workflows implemented in existing software tools. Our current work is a stepping stone towards optimizing LC-MS data acquisition and testing the accuracy and validity of computational tools for difference detection in future studies that will be focused on spiking peptides of diverse physicochemical properties in different concentrations to better represent biomarker discovery of differentially abundant peptides/proteins.

7.
The paper presents two analyses of the MALDI-TOF mass spectrometry dataset. Both analyses use the support vector machine as a tool to build a prediction model. The first analysis, which is our contribution to the competition, uses the given spectra data without further processing. In the second analysis, we employed an additional preprocessing step consisting of peak detection, peak alignment and feature selection based on statistical tests. The experimental results suggest that the preprocessing step with feature selection improves prediction accuracy.

8.
Peak detection is one of the most important steps in mass spectrometry (MS) analysis. However, the detection result is greatly affected by severe spectrum variations. Unfortunately, most current peak detection methods are neither flexible enough to revise false detection results nor robust enough to resist spectrum variations. To improve flexibility, we introduce a peak tree to represent the peak information in MS spectra. Each tree node is a peak judgment on a range of scales, and each tree decomposition, as a set of nodes, is a candidate peak detection result. To improve robustness, we combine peak detection and common peak alignment into a closed-loop framework, which finds the optimal decomposition via both peak intensity and common peak information. The common peak information is derived and iteratively refined from the density clustering of the latest peak detection result. Finally, we present an improved ant colony optimization biomarker selection method to build a whole MS analysis system. Experiments show that our peak detection method can better resist spectrum variations and provide higher sensitivity and lower false detection rates than conventional methods. The benefits of our peak-tree-based system for MS disease analysis are also demonstrated on real SELDI data.

9.
Shaoxiong Chen. Proteomics, 2015, 15(13): 2358-2368
Chondrosarcoma is the third most common primary bone cancer, requiring surgical resection. However, differentiation of low‐grade chondrosarcoma (grade 1) from enchondroma that is benign and only requires regular follow‐up is one of the most frequent diagnostic dilemmas facing orthopedic oncologists in clinical management. Although multiple techniques are applied to make the distinction, immunohistochemistry is an important ancillary technique, especially when a histopathological stain of specimen must be obtained in order to guarantee an accurate confirmation. Currently, no adequate immunohistochemical diagnostic protein biomarkers are available to distinguish low‐grade chondrosarcoma from enchondroma. To discover novel protein biomarker candidates, an LC‐MS/MS approach was applied to directly compare formalin‐fixed, paraffin‐embedded low‐grade chondrosarcoma with enchondroma tissue samples. The proteomics analysis revealed 17 protein biomarker candidates. A principle was developed to prioritize the candidates using category and ranking. An algorithm, prioritization index of biomarker candidates for immunohistochemistry on tissue specimens, was developed to rank the candidates inside each category. Using the proteomics data and bioinformatics results, the prioritization index of biomarker candidates for immunohistochemistry on tissue revealed periostin as a top candidate. Immunohistochemical staining of periostin in 23 low‐grade chondrosarcoma and 31 enchondroma tissue specimens disclosed 87% specificity and 70% sensitivity.  相似文献   

10.
MOTIVATION: Novel methods, both molecular and statistical, are urgently needed to take advantage of recent advances in biotechnology and the human genome project for disease diagnosis and prognosis. Mass spectrometry (MS) holds great promise for biomarker identification and genome-wide protein profiling. It has been demonstrated in the literature that biomarkers can be identified to distinguish normal individuals from cancer patients using MS data. Such progress is especially exciting for the detection of early-stage ovarian cancer patients. Although various statistical methods have been utilized to identify biomarkers from MS data, there has been no systematic comparison among these approaches in their relative ability to analyze MS data. RESULTS: We compare the performance of several classes of statistical methods for the classification of cancer based on MS spectra. These methods include: linear discriminant analysis, quadratic discriminant analysis, k-nearest neighbor classifier, bagging and boosting classification trees, support vector machine, and random forest (RF). The methods are applied to ovarian cancer and control serum samples from the National Ovarian Cancer Early Detection Program clinic at Northwestern University Hospital. We found that RF outperforms other methods in the analysis of MS data.  相似文献   

11.
Lee HJ, Na K, Kwon MS, Park T, Kim KS, Kim H, Paik YK. Proteomics, 2011, 11(10): 1976-1984
Disease biomarkers are predicted to be in low abundance; thus, the most crucial step of biomarker discovery is the efficient fractionation of clinical samples into protein sets that define disease stages and/or predict disease development. For this purpose, we developed a new platform that uses peptide-based size exclusion chromatography (pep-SEC) to quantify disease biomarker candidates. This new platform has many advantages over previously described biomarker profiling platforms, including short run time, high resolution, and good reproducibility, which make it suitable for large-scale analysis. We combined this platform with isotope labeling and label-free methods to identify and quantitate differentially expressed proteins in hepatocellular carcinoma (HCC) tissues. When we combined pep-SEC with a gas phase fractionation method, which broadens precursor ion selection, the protein coverage was significantly increased, which is critical for the global profiling of HCC specimens. Furthermore, pep-SEC-LC-MS/MS analysis enhanced the detection of low-abundance proteins (e.g. insulin receptor substrate 2 and carboxylesterase 1) and glycopeptides in HCC plasma. Thus, our pep-SEC platform is an efficient and versatile pre-fractionation system for the large-scale profiling and quantitation of candidate biomarkers in complex disease proteomes.  相似文献   

12.
Plasma biomarkers of exposure to environmental contaminants play an important role in early detection of disease. The emerging field of proteomics presents an attractive opportunity for candidate biomarker discovery, as it simultaneously measures and analyzes a large number of proteins. This article presents a case study for measuring arsenic concentrations in a population residing in an As-endemic region of Bangladesh using plasma protein expressions measured by SELDI-TOF mass spectrometry. We analyze the data using a unified statistical method based on functional learning to preprocess mass spectra and extract mass spectrometry (MS) features and to associate the selected MS features with arsenic exposure measurements. The task is challenging due to several factors: the high dimensionality of mass spectrometry data, complicated error structures, and a multiple comparison problem. We use nonparametric functional regression techniques for MS modeling, peak detection based on the significant zero-downcrossing method, and peak alignment using a warping algorithm. Our results show significant associations between arsenic exposure and the under- or overexpression of 20 proteins.

13.
14.
Although dysfunctional protein homeostasis (proteostasis) is a key factor in many age‐related diseases, the untargeted identification of structurally modified proteins remains challenging. Peptide location fingerprinting is a proteomic analysis technique capable of identifying structural modification‐associated differences in mass spectrometry (MS) data sets of complex biological samples. A new webtool (Manchester Peptide Location Fingerprinter, MPLF), applied to photoaged and intrinsically aged skin proteomes, can relatively quantify peptides and map statistically significant differences to regions within protein structures. New photoageing biomarker candidates were identified in multiple pathways including extracellular matrix organisation (collagens and proteoglycans), protein synthesis and folding (ribosomal proteins and TRiC complex subunits), cornification (keratins) and hemidesmosome assembly (plectin and integrin α6β4). Crucially, peptide location fingerprinting uniquely identified 120 protein biomarker candidates in the dermis and 71 in the epidermis which were modified as a consequence of photoageing but did not differ significantly in relative abundance (measured by MS1 ion intensity). By applying peptide location fingerprinting to published MS data sets (identifying biomarker candidates including collagen V and versican in ageing tendon), we demonstrate the potential of the MPLF webtool for biomarker discovery.

15.
Shotgun proteome analysis platforms based on multidimensional liquid chromatography-tandem mass spectrometry (LC-MS/MS) provide a powerful means to discover biomarker candidates in tissue specimens. Analysis platforms must balance sensitivity for peptide detection, reproducibility of detected peptide inventories and analytical throughput for protein amounts commonly present in tissue biospecimens (< 100 microg), such that platform stability is sufficient to detect modest changes in complex proteomes. We compared shotgun proteomics platforms by analyzing tryptic digests of whole cell and tissue proteomes using strong cation exchange (SCX) and isoelectric focusing (IEF) separations of peptides prior to LC-MS/MS analysis on a LTQ-Orbitrap hybrid instrument. IEF separations provided superior reproducibility and resolution for peptide fractionation from samples corresponding to both large (100 microg) and small (10 microg) protein inputs. SCX generated more peptide and protein identifications than did IEF with small (10 microg) samples, whereas the two platforms yielded similar numbers of identifications with large (100 microg) samples. In nine replicate analyses of tryptic peptides from 50 microg colon adenocarcinoma protein, overlap in protein detection by the two platforms was 77% of all proteins detected by both methods combined. IEF more quickly approached maximal detection, with 90% of IEF-detectable medium abundance proteins (those detected with a total of 3-4 peptides) detected within three replicate analyses. In contrast, the SCX platform required six replicates to detect 90% of SCX-detectable medium abundance proteins. High reproducibility and efficient resolution of IEF peptide separations make the IEF platform superior to the SCX platform for biomarker discovery via shotgun proteomic analyses of tissue specimens.  相似文献   

16.
Conventional biomarker discovery focuses mostly on the identification of single markers and thus often has limited success in disease diagnosis and prognosis. This study proposes a method to identify an optimized protein biomarker panel based on MS studies for predicting the risk of major adverse cardiac events (MACE) in patients. Because the simplicity and concision required for immunoassay development tolerate only prediction models built on a very few selected discriminative biomarkers, established optimization methods, such as the conventional genetic algorithm (GA), fail in the high‐dimensional space. In this paper, we present a novel variant of GA that embeds the recursive local floating enhancement technique to discover a panel of protein biomarkers with far better prognostic value for prediction of MACE than existing methods, including the one approved recently by the FDA (Food and Drug Administration). The new pragmatic method applies the constraints of MACE relevance and biomarker redundancy to shrink the local search space in order to avoid the heavy computational penalty resulting from the local floating optimization. The proposed method is compared with standard GA and other variable selection approaches based on the MACE prediction experiments. Two powerful classification techniques, partial least squares logistic regression (PLS‐LR) and support vector machine classifier (SVMC), are deployed as the MACE predictors owing to their ability to deal with small-scale, binary response data. New preprocessing algorithms, such as low‐level signal processing, duplicated spectra elimination, and removal of outlier patient samples, are also included in the proposed method. The experimental results show that an optimized panel of seven selected biomarkers can provide more than 77.1% MACE prediction accuracy using SVMC. 
The experimental results empirically demonstrate that the new GA algorithm with local floating enhancement (GA‐LFE) can achieve better MACE prediction performance than the existing techniques. The method has been applied to SELDI/MALDI MS datasets to discover an optimized panel of protein biomarkers to distinguish disease from control.

17.
Jiang Zhongjun (姜忠俊), Li Xiaobo (李小波). Acta Microbiologica Sinica (微生物学报), 2022, 62(8): 2954-2968
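Metagenomics can extract the entire genetic material of microorganisms directly from the environment, without requiring pure culture on growth media as traditional methods do. The emergence of this technology has given scientists an important means of understanding the structure and function of microbial communities, and it is of great significance for the diagnosis and treatment of disease, environmental remediation, and the understanding of life. All microbial genetic material is extracted from the environment and sequenced to obtain reads, which read assembly tools can further assemble into contigs. Binning the contigs makes it possible to reconstruct more complete genes from metagenomic samples. The quality of binning directly affects downstream biological analysis, so how to effectively bin contig sequences that contain a mixture of genes from different microorganisms has become both a focus and a difficulty of metagenomics research. Machine learning methods are widely used for metagenomic contig binning and are usually divided into supervised contig classification methods and unsupervised contig clustering methods. This review gives a fairly comprehensive account of metagenomic contig binning methods and analyzes contig classification and clustering methods in depth, identifying problems such as low classification accuracy, long binning times, and difficulty in reconstructing more microbial genes from complex data sets. It also offers an outlook on future research and development of contig binning methods. The authors suggest using semi-supervised learning, ensemble learning, and deep learning, together with more effective representations of data features, to improve binning performance.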
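
Both supervised classification and unsupervised clustering of contigs need each contig expressed as a numeric feature vector. The review abstract does not say which representation is used. The sketch below assumes one common choice, a normalized k-mer (by default tetranucleotide) frequency profile, and the function name is hypothetical.

```python
from itertools import product


def kmer_profile(contig, k=4):
    """Normalized k-mer frequency vector over the alphabet ACGT.

    The vector has 4**k entries in lexicographic k-mer order.
    Windows containing other symbols (for example N) are skipped.
    """
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    seq = contig.upper()
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if word in counts:
            counts[word] += 1
    total = sum(counts.values())
    return [counts[m] / total if total else 0.0 for m in kmers]


# Toy contig, using k=2 so the 16-entry profile is easy to inspect.
profile = kmer_profile("ACGTACGTAC", k=2)
```

Vectors of this kind can be fed to any classifier or clustering routine. Real binning tools typically combine composition with other signals such as read coverage, which this sketch leaves out.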

18.
Verification of candidate biomarker proteins in blood is typically done using multiple reaction monitoring (MRM) of peptides by LC-MS/MS on triple quadrupole MS systems. MRM assay development for each protein requires significant time and cost, much of which is likely to be of little value if the candidate biomarker is below the detection limit in blood or a false positive in the original discovery data. Here we present a new technology, accurate inclusion mass screening (AIMS), designed to provide a bridge from unbiased discovery to MS-based targeted assay development. Masses on the software inclusion list are monitored in each scan on the Orbitrap MS system, and MS/MS spectra for sequence confirmation are acquired only when a peptide from the list is detected with both the correct accurate mass and charge state. The AIMS experiment confirms that a given peptide (and thus the protein from which it is derived) is present in the plasma. Throughput of the method is sufficient to qualify up to a hundred proteins/week. The sensitivity of AIMS is similar to MRM on a triple quadrupole MS system using optimized sample preparation methods (low tens of ng/ml in plasma), and MS/MS data from the AIMS experiments on the Orbitrap can be directly used to configure MRM assays. The method was shown to be at least 4-fold more efficient at detecting peptides of interest than undirected LC-MS/MS experiments using the same instrumentation, and relative quantitation information can be obtained by AIMS in case versus control experiments. Detection by AIMS ensures that a quantitative MRM-based assay can be configured for that protein. The method has the potential to qualify a large number of biomarker candidates based on their detection in plasma prior to committing to the time- and resource-intensive steps of establishing a quantitative assay.
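
The triggering rule described for AIMS, acquiring MS/MS only when an observed ion matches an inclusion-list entry in both accurate mass and charge state, can be illustrated with a small matching function. This is a sketch of the matching logic only, not the instrument software. The 10 ppm tolerance, the dictionary layout, and the peptide labels are assumptions made for illustration.

```python
def ppm_error(observed_mz, target_mz):
    """Signed mass error of an observation in parts per million."""
    return (observed_mz - target_mz) / target_mz * 1e6


def match_inclusion(observed_mz, observed_charge, inclusion_list, tol_ppm=10.0):
    """Return inclusion-list entries that match both m/z and charge.

    tol_ppm is an assumed tolerance; the abstract gives no value.
    """
    return [
        entry
        for entry in inclusion_list
        if entry["charge"] == observed_charge
        and abs(ppm_error(observed_mz, entry["mz"])) <= tol_ppm
    ]


# Hypothetical inclusion list with placeholder peptide labels.
inclusion = [
    {"peptide": "PEPTIDE_A", "mz": 500.2500, "charge": 2},
    {"peptide": "PEPTIDE_B", "mz": 500.2500, "charge": 3},
    {"peptide": "PEPTIDE_C", "mz": 650.3300, "charge": 2},
]

# About 4 ppm away from PEPTIDE_A with the right charge, so it matches.
hits = match_inclusion(500.2520, 2, inclusion)
```

Requiring the charge state in addition to the mass is what rejects `PEPTIDE_B` here even though its m/z is identical, which mirrors the two-condition trigger in the abstract.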

19.
We describe an integrated suite of algorithms and software for general accurate mass and time (AMT) tagging data analysis of mass spectrometry data. The AMT approach combines identifications from liquid chromatography (LC) tandem mass spectrometry (MS/MS) data with peptide accurate mass and retention time locations from high-resolution LC-MS data. Our workflow includes the traditional AMT approach, in which MS/MS identifications are located in external databases, as well as methods based on more recent hybrid instruments such as the LTQ-FT or Orbitrap, where MS/MS identifications are embedded with the MS data. We demonstrate our AMT workflow's utility for general data synthesis by combining data from two dissimilar biospecimens. Specifically, we demonstrate its use relevant to serum biomarker discovery by identifying which peptides sequenced by MS/MS analysis of tumor tissue may also be present in the plasma of tumor-bearing and control mice. The analysis workflow, referred to as msInspect/AMT, extends and combines existing open-source platforms for LC-MS/MS (CPAS) and LC-MS (msInspect) data analysis and is available in an unrestricted open-source distribution.

20.
Mass spectrometry-based peptidomics approaches have proven their usefulness in several areas, such as the discovery of physiologically active peptides or biomarker candidates derived from various biological fluids including blood and cerebrospinal fluid. However, to identify biomarkers that are reproducible and clinically applicable, development of a novel technology that enables rapid, sensitive, and quantitative analysis using hundreds of clinical specimens has been eagerly awaited. Here we report an integrative peptidomic approach for identification of lung cancer-specific serum peptide biomarkers. It is based on the one-step effective enrichment of peptidome fractions (molecular weight of 1,000-5,000) with size exclusion chromatography in combination with the precise label-free quantification analysis of the nano-LC/MS/MS data set using the Expressionist proteome server platform. We applied this method to 92 serum samples handled under our standard operating procedure (SOP) (30 healthy controls and 62 lung adenocarcinoma patients), and quantitatively assessed the 3,537 detected peptide signals. Among them, 118 peptides showed significantly altered serum levels between the control and lung cancer groups (p<0.01 and fold change >5.0). Subsequently we identified peptide sequences by MS/MS analysis and further assessed the reproducibility of the Expressionist-based quantification results and their diagnostic power by MRM-based relative-quantification analysis of 96 independently prepared serum samples, and found that APOA4 273-283, FIBA 5-16, and LBN 306-313 should be clinically useful biomarkers for both early detection and tumor staging of lung cancer. Our peptidome profiling technology can provide simple, high-throughput, and reliable quantification of a large number of clinical samples, and is applicable to diverse peptidome-targeting biomarker discoveries using any type of biological specimen.
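The selection criterion quoted in this abstract (p<0.01 and fold change >5.0) can be sketched as a simple filter. This is a hypothetical illustration, not the Expressionist workflow: it assumes p-values and group means were already computed, treats fold change as symmetric (a 5-fold increase or a 5-fold decrease both qualify), and uses made-up peptide identifiers and numbers.

```python
from typing import Dict, List


def fold_change(mean_case: float, mean_control: float) -> float:
    """Symmetric fold change between two positive group means.

    Treating up- and down-regulation alike is an assumption; the
    abstract does not say how direction was handled.
    """
    ratio = mean_case / mean_control
    return max(ratio, 1.0 / ratio)


def select_altered(peptides: List[Dict],
                   p_max: float = 0.01,
                   fc_min: float = 5.0) -> List[str]:
    """Return IDs of peptides passing both thresholds (strict inequalities)."""
    return [
        p["id"] for p in peptides
        if p["p_value"] < p_max
        and fold_change(p["mean_case"], p["mean_control"]) > fc_min
    ]


# Made-up example values, for illustration only.
peptides = [
    {"id": "pep1", "p_value": 0.001, "mean_case": 60.0, "mean_control": 10.0},  # 6-fold up
    {"id": "pep2", "p_value": 0.001, "mean_case": 30.0, "mean_control": 10.0},  # only 3-fold
    {"id": "pep3", "p_value": 0.050, "mean_case": 90.0, "mean_control": 10.0},  # p too high
    {"id": "pep4", "p_value": 0.005, "mean_case": 10.0, "mean_control": 80.0},  # 8-fold down
]

selected = select_altered(peptides)
```

Applying both thresholds together keeps only signals that are statistically significant and large in magnitude, which is how the 3,537 detected signals would be narrowed to a short candidate list.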
