Similar articles
 Found 20 similar articles; search took 31 ms
1.
Proteomic biomarker discovery has led to the identification of numerous potential candidates for disease diagnosis, prognosis, and prediction of response to therapy. However, very few of these identified candidate biomarkers reach clinical validation and go on to be routinely used in clinical practice. One particular issue with biomarker discovery is the identification of significantly changing proteins in the initial discovery experiment that do not validate when subsequently tested on separate patient sample cohorts. Here, we seek to highlight some of the statistical challenges surrounding the analysis of LC‐MS proteomic data for biomarker candidate discovery. We show that common statistical algorithms run on data with low sample sizes can overfit and yield misleading misclassification rates and AUC values. A common solution to this problem is to prefilter variables (via, e.g., ANOVA and/or correction methods such as Bonferroni or false discovery rate) to give a smaller dataset and reduce the size of the apparent statistical challenge. However, we show that this exacerbates the problem, yielding even higher performance metrics while reducing the predictive accuracy of the biomarker panel. To illustrate some of these limitations, we have run simulation analyses with known biomarkers. For our chosen algorithm (random forests), we show that the above problems are substantially reduced if a sufficient number of samples are analyzed and the data are not prefiltered. Our view is that LC‐MS proteomic biomarker discovery data should be analyzed without prefiltering and that increasing the sample size in biomarker discovery experiments should be a very high priority.
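The selection-bias effect described in this abstract can be illustrated with a small simulation. The following is a minimal sketch, not the authors' analysis: it assumes pure-noise data (no true biomarkers), a t-score filter in place of ANOVA, and a nearest-centroid classifier in place of random forests. All function names and sizes (20 samples, 500 features, 10 retained) are hypothetical choices for illustration. It compares leave-one-out accuracy when features are prefiltered on all samples with accuracy when the filter is re-run inside each fold.

```python
import random


def mean(xs):
    return sum(xs) / len(xs)


def variance(xs):
    """Unbiased sample variance."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


def t_score(values, labels):
    """Absolute Welch-style t statistic between the two label groups."""
    a = [v for v, lab in zip(values, labels) if lab == 0]
    b = [v for v, lab in zip(values, labels) if lab == 1]
    se = (variance(a) / len(a) + variance(b) / len(b)) ** 0.5
    return abs(mean(a) - mean(b)) / se if se > 0 else 0.0


def top_features(X, labels, k):
    """Indices of the k features with the largest t score."""
    n_feat = len(X[0])
    scores = [t_score([row[j] for row in X], labels) for j in range(n_feat)]
    return sorted(range(n_feat), key=lambda j: scores[j], reverse=True)[:k]


def predict_centroid(train_X, train_y, x, feats):
    """Nearest-centroid prediction restricted to the chosen features."""
    dists = {}
    for cls in (0, 1):
        rows = [r for r, lab in zip(train_X, train_y) if lab == cls]
        centroid = [mean([r[j] for r in rows]) for j in feats]
        dists[cls] = sum((x[j] - c) ** 2 for j, c in zip(feats, centroid))
    return min(dists, key=dists.get)


def loo_accuracy(X, y, k, filter_inside_cv):
    """Leave-one-out accuracy; the filter runs either once on all data or per fold."""
    global_feats = None if filter_inside_cv else top_features(X, y, k)
    hits = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        feats = top_features(train_X, train_y, k) if filter_inside_cv else global_feats
        hits += int(predict_centroid(train_X, train_y, X[i], feats) == y[i])
    return hits / len(X)


# Pure-noise data: labels carry no information about any feature (assumption).
rng = random.Random(0)
n_samples, n_features, k = 20, 500, 10
X = [[rng.gauss(0, 1) for _ in range(n_features)] for _ in range(n_samples)]
y = [0] * (n_samples // 2) + [1] * (n_samples // 2)

biased = loo_accuracy(X, y, k, filter_inside_cv=False)
honest = loo_accuracy(X, y, k, filter_inside_cv=True)
print(f"prefilter on all samples: {biased:.2f}")
print(f"filter inside each fold:  {honest:.2f}")
```

Because the simulated data contain no signal, an honest estimate should sit near chance. Any apparent accuracy well above chance in the prefiltered case comes only from letting the held-out sample influence feature selection.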

2.
Xiao H, Wong DT. Bioinformation, 2010, 5(7): 294-296
Human saliva is a biological fluid with enormous diagnostic potential. Because saliva can be non-invasively collected, it provides an attractive alternative to blood, serum or plasma. It has been postulated that the blood concentrations of many components are reflected in saliva. Saliva harbors a wide array of proteins, which can be informative for the detection of diseases. Profiling the proteins in saliva over the course of disease progression could reveal potential biomarkers indicative of different stages of diseases, which may be useful in medical diagnostics. With advanced instrumentation and refined analytical techniques, proteomics is widely envisioned as a useful and powerful approach for salivary proteomic biomarker discovery. As proteomic technologies continue to mature, salivary proteomics has great potential for biomarker research and clinical applications. The progress and current status of salivary proteomics and its application in the biomarker discovery of oral and systemic diseases will be reviewed. The scientific and clinical challenges underlying this approach will also be discussed.

4.

Background  

The use of mass spectrometry as a proteomics tool is poised to revolutionize early disease diagnosis and biomarker identification. Unfortunately, before standard supervised classification algorithms can be employed, the "curse of dimensionality" needs to be solved. Due to the sheer amount of information contained within the mass spectra, most standard machine learning techniques cannot be directly applied. Instead, feature selection techniques are used to first reduce the dimensionality of the input space and thus enable the subsequent use of classification algorithms. This paper examines feature selection techniques for proteomic mass spectrometry.

5.

Background

As a promising way to transform medicine, mass spectrometry-based proteomics technologies have seen great progress in identifying disease biomarkers for clinical diagnosis and prognosis. However, there is a lack of effective feature selection methods that are able to capture essential data behaviors to achieve clinical-level disease diagnosis. Moreover, the field faces a challenge of data reproducibility: no two independent studies have been found to produce the same proteomic patterns. This reproducibility issue causes the identified biomarker patterns to lose repeatability and prevents them from being used in real clinical practice.

Methods

In this work, we propose a novel machine-learning algorithm, derivative component analysis (DCA), for high-dimensional mass spectral proteomic profiles. As an implicit feature selection algorithm, DCA examines input proteomics data in a multi-resolution approach by seeking derivatives of the data to capture latent data characteristics and conduct de-noising. We further demonstrate DCA's advantages in disease diagnosis by viewing the input proteomics data as a profile biomarker and integrating it with support vector machines to tackle the reproducibility issue, and we compare it with state-of-the-art peer methods.

Results

Our results show that high-dimensional proteomics data are actually linearly separable under the proposed derivative component analysis (DCA). As a novel multi-resolution feature selection algorithm, DCA not only overcomes the weakness of traditional methods in discovering subtle data behavior, but also suggests an effective way to overcome the reproducibility problem of proteomics data, and provides new techniques and insights for translational bioinformatics and machine learning. The DCA-based profile biomarker diagnosis makes clinical-level diagnostic performance reproducible across different proteomic data, and is more robust and systematic than existing diagnosis based on biomarker discovery.

Conclusions

Our findings demonstrate the feasibility and power of the proposed DCA-based profile biomarker diagnosis in achieving high sensitivity and overcoming the data reproducibility issue in serum proteomics. Furthermore, derivative component analysis suggests that gleaning subtle data characteristics and de-noising are essential for separating true signals from red herrings in high-dimensional proteomic profiles, and can be more important than conventional feature selection or dimension reduction. In particular, because DCA is a generic data analysis method, our profile biomarker diagnosis can be generalized to other omics data.

6.
Protein glycosylation, as an important post-translational modification, is implicated in a number of ailments. Applying proteomic approaches, including mass spectrometry (MS) analyses that have played a significant role in biomarker detection and early diagnosis of diseases, to the study of glycoproteins or glycopeptides will facilitate a deeper understanding of many physiological functions and biological pathways involved in cancer, inflammatory and degenerative diseases. The abundance of glycopeptides and their ionization potential are lower than those of non-glycopeptides; therefore, sample enrichment is necessary for glycopeptides prior to MS analysis. Over the past decade, nanotechnology has rapidly penetrated many diverse scientific research disciplines. Particularly in what we now refer to as the “glycoproteomics area”, nanotechnologies have enabled enhanced sensitivity and specificity of glycopeptide detection in complex biological fluids, which are critical for disease diagnosis and monitoring. In this review, we highlight some recent studies that combine the capabilities of specific nanotechnologies with the comprehensive features of glycoproteomics. In particular, we focus on the ways in which nanotechnology has facilitated the detection of glycopeptides in complex biological samples and enhanced their characterization by MS, in terms of intensity and resolution. These studies reveal an increasingly important role for nanotechnology in helping to overcome certain technical challenges in biomarker discovery, in general, and glycoproteomics research, in particular.

7.
During the last two decades, biomarker research has benefited from the introduction of new proteomic analytical techniques. In this article, we review the application of surface-enhanced laser desorption/ionization time-of-flight (SELDI-TOF) mass spectrometry in urologic cancer research. After reviewing the literature from MEDLINE on proteomics and urologic oncology, we found that SELDI-TOF is an emerging proteomic technology in biomarker discovery that allows for rapid and sensitive analysis of complex protein mixtures. SELDI-TOF is a novel proteomic technology that has the potential to contribute further to the understanding and clinical exploitation of new, clinically relevant biomarkers.

8.
Annotated formalin-fixed, paraffin-embedded (FFPE) tissue archives constitute a valuable resource for retrospective biomarker discovery. However, proteomic exploration of archival tissue is impeded by extensive formalin-induced covalent cross-linking. Robust methodology enabling proteomic profiling of archival resources is urgently needed. Recent work is beginning to support the feasibility of biomarker discovery in archival tissues, but further developments in extraction methods that are compatible with quantitative approaches are urgently needed. We report a cost-effective extraction methodology permitting quantitative proteomic analyses of small amounts of FFPE tissue for biomarker investigation. This surfactant/heat-based approach results in effective and reproducible protein extraction in FFPE tissue blocks. In combination with a liquid chromatography-mass spectrometry-based label-free quantitative proteomics methodology, the protocol enables robust, representative, and quantitative analyses of the archival proteome. Preliminary validation studies in renal cancer tissues have typically identified 250-300 proteins per 500 ng of tissue with 1D LC-MS/MS, with comparable extraction in FFPE and fresh frozen tissue blocks and preservation of tumor/normal differential expression patterns (205 proteins, r = 0.682; p < 10^-15). The initial methodology presented here provides a quantitative approach for assessing the potential suitability of the vast FFPE tissue archives as an alternate resource for biomarker discovery and will allow exploration of methods to increase depth of coverage and investigate the impact of preanalytical factors.

9.
An ability to predict the likelihood of cellular response towards particular chemotherapeutic agents based upon protein expression patterns could facilitate the identification of biological molecules with previously undefined roles in the process of chemoresistance/chemosensitivity, and if robust enough these patterns might also be exploited towards the development of novel predictive assays. To ascertain whether proteomic-based molecular profiling in conjunction with artificial neural network (ANN) algorithms could be applied towards the specific recognition of phenotypic patterns between either control or drug treated and chemosensitive or chemoresistant cellular populations, a combined approach involving matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) mass spectrometry, Ciphergen protein chip technology and ANN algorithms has been applied to specifically identify proteomic 'fingerprints' indicative of treatment regimen for chemosensitive (MCF-7, T47D) and chemoresistant (MCF-7/ADR) breast cancer cell lines following exposure to Doxorubicin or Paclitaxel. The results indicate that proteomic patterns can be identified by ANN algorithms to correctly assign 'class' for treatment regimen (e.g. control/drug treated or chemosensitive/chemoresistant) with a high degree of accuracy using bootstrap statistical validation techniques and that biomarker ion patterns indicative of response/non-response phenotypes are associated with MCF-7 and MCF-7/ADR cells exposed to Doxorubicin. We have also examined the predictive capability of this approach towards MCF-7 and T47D cells to ascertain whether prediction could be made based upon treatment regimen irrespective of cell lineage. Models were identified that could correctly assign class (control or Paclitaxel treatment) for 35/38 samples of an independent dataset. A similar level of predictive capability was also found (> 92%; n = 28) when proteomic patterns derived from the drug resistant cell line MCF-7/ADR were compared against those derived from MCF-7 and T47D as a model system of drug resistant and drug sensitive phenotypes. This approach might offer a potential methodology for predicting the biological behaviour of cancer cells towards particular chemotherapeutics and through protein isolation and sequence identification could result in the identification of biological molecules associated with chemosensitive/chemoresistant tumour phenotypes.
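As a reading aid for the hold-out figure quoted above (35 of 38 independent samples assigned correctly), the short sketch below computes the accuracy and an approximate 95% confidence interval. The Wilson score interval is our choice for illustration and is not stated in the abstract; the function name `wilson_interval` is hypothetical.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z = 1.96 is roughly 95%)."""
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


correct, total = 35, 38  # counts quoted in the abstract
accuracy = correct / total
low, high = wilson_interval(correct, total)
print(f"accuracy = {accuracy:.3f}, approx. 95% interval = ({low:.3f}, {high:.3f})")
```

With only 38 test samples the interval is fairly wide, which is worth keeping in mind when comparing such accuracies across studies.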

10.

Background  

New, more sensitive and specific biomarkers are needed to support other means of clinical diagnosis of neurodegenerative disorders. Proteomics technology is widely used in discovering new biomarkers. There are several difficulties with in-depth analysis of human plasma/serum, including that there is no one proteomic platform that can offer complete identification of differences in proteomic profiles. Another set of problems is associated with the heterogeneity of human samples, in addition to the intrinsic variability associated with every step of proteomic investigation. Validation is the very last step of proteomic investigation and it is very often difficult to validate a potential biomarker with the desired sensitivity and specificity. Even though it may be possible to validate a differentially expressed protein, it may not necessarily prove to be a valid diagnostic biomarker.

11.
12.
13.
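With advances in mass spectrometry and the development of bioinformatics and statistical algorithms, the Human Proteome Project, one of whose main aims is disease research, is progressing rapidly. Protein biomarkers are of great importance for early disease diagnosis and clinical treatment, and research on strategies and methods for discovering them has become an important and active field. Feature selection and machine learning are effective in addressing the "high dimensionality" and "sparsity" of proteomic data, and have therefore been increasingly widely applied to protein biomarker discovery. This article describes strategies for protein biomarker discovery, together with the principles, application examples, and scope of applicability of the feature selection and machine learning methods involved, and discusses the prospects and limitations of deep learning in this field, with the aim of providing a reference for related research.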

14.
Proteomic technologies have experienced major improvements in recent years. Such advances have facilitated the discovery of potential tumor markers with improved sensitivities and specificities for the diagnosis, prognosis and treatment monitoring of cancer patients. This review will focus on four state-of-the-art proteomic technologies, namely 2D difference gel electrophoresis, MALDI imaging mass spectrometry, electron transfer dissociation mass spectrometry and reverse-phase protein array. The major advancements these techniques have brought about and examples of their applications in cancer biomarker discovery will be presented in this review, so that readers can appreciate the immense progress in proteomic technologies from 1997 to 2008. Finally, a summary will be presented that discusses current hurdles faced by proteomic researchers, such as the wide dynamic range of protein abundance, standardization of protocols and validation of cancer biomarkers, and a 5-year view of potential solutions to such problems will be provided.

15.
An enormous amount of research effort has been devoted to biomarker discovery and validation. With the completion of the human genome, proteomics is now playing an increasing role in this search for new and better biomarkers. Here, what leads to successful biomarker development is reviewed and how these features may be applied in the context of proteomic biomarker research is considered. The “fit‐for‐purpose” approach to biomarker development suggests that untargeted proteomic approaches may be better suited for early stages of biomarker discovery, while targeted approaches are preferred for validation and implementation. A systematic screening of published biomarker articles using MS‐based proteomics reveals that while both targeted and untargeted technologies are used in proteomic biomarker development, most researchers do not combine these approaches. Three topics are discussed: (i) the reasons for this discrepancy, (ii) how proteomic technologies can overcome technical challenges that seem to limit their translation into the clinic, and (iii) how MS can improve, complement, or replace existing clinically important assays in the future.

16.
17.
Here, we report on our proteomic studies in the field of cardiovascular medicine. Our research has focused on understanding the role of proteins in cardiovascular disease, with a particular focus on epigenetic regulation and biomarker discovery, with the objective of better understanding cardiovascular pathophysiology and thereby developing new and better diagnostic and therapeutic methods. We have used mass spectrometry for over 5 years as a viable method to investigate protein-protein interactions and post-translational modifications in cellular proteins, as well as to investigate the role of extra-cellular proteins. Use of mass spectrometry not only as a research tool but also as a potential diagnostic tool is a topic of interest. In addition to these functional proteomics studies, structural proteomic studies are also carried out in the expectation that they will allow for pinpoint drug design and therapeutic intervention. Collectively, our proteomics studies are focused on understanding the functional role and potential therapeutically exploitable properties of proteins in cardiovascular disease from both intra-cellular and extra-cellular aspects, using both functional and structural proteomics approaches to allow for comprehensive analysis.

18.

Background  

Feature selection is a pattern recognition approach to choose important variables according to some criteria in order to distinguish or explain certain phenomena (i.e., for dimensionality reduction). There are many genomic and proteomic applications that rely on feature selection to answer questions such as selecting signature genes which are informative about some biological state, e.g., normal tissues and several types of cancer; or inferring a prediction network among elements such as genes, proteins and external stimuli. In these applications, a recurrent problem is the lack of samples to perform an adequate estimate of the joint probabilities between element states. A myriad of feature selection algorithms and criterion functions have been proposed, although it is difficult to point to the best solution for each application.

19.
Candidate proteomic biomarker discovery from human plasma holds both incredible clinical potential as well as significant challenges. The dynamic range of proteins within plasma is known to exceed 10^10, and many potential biomarkers are likely present at lower protein abundances. At present, proteomic based MS analyses provide a dynamic range typically not exceeding approximately 10^3 in a single spectrum, and approximately 10^4-10^6 when combined with on-line separations (e.g., reversed-phase gradient liquid chromatography), and thus are generally insufficient for low level biomarker detection directly from human plasma. This limitation is providing an impetus for the development of experimental methodologies and strategies to increase the possible number of detections within this biofluid. Discussed is the diversity of available approaches currently used by our laboratory and others to utilize human plasma as a viable medium for biomarker discovery. Various separation, depletion, enrichment, and quantitative efforts as well as recent improvements in MS capabilities have resulted in measurable improvements in the detection and identification of lower abundance proteins (by approximately 10-10^2). Despite these improvements, further advances are needed to provide a basis for discovery of candidate biomarkers at very low levels. Continued development of depletion and enrichment techniques, coupled with improved pre-MS separations (both at the protein and peptide level) holds promise in extending the dynamic range of proteomic analysis.
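The orders of magnitude quoted in this abstract can be lined up in a few lines of arithmetic. This is only a back-of-the-envelope sketch: it works with the approximate exponents from the text and assumes, for illustration, that the reported 10 to 10^2 gain simply adds to the LC-MS dynamic range, which the abstract does not state. Variable names are hypothetical.

```python
# Base-10 exponents taken from the abstract (all approximate).
plasma_exp = 10          # plasma protein dynamic range exceeds 10^10
single_spectrum_exp = 3  # single MS spectrum: about 10^3
lc_ms_exp = (4, 6)       # with on-line separations: about 10^4 to 10^6
gain_exp = (1, 2)        # reported improvement: about 10 to 10^2

# Shortfall, in orders of magnitude, between plasma and each measurement mode.
gap_single = plasma_exp - single_spectrum_exp
gap_lc_ms = (plasma_exp - lc_ms_exp[1], plasma_exp - lc_ms_exp[0])

# Assumption: the improvement stacks directly on top of the LC-MS range.
gap_after_gain = (gap_lc_ms[0] - gain_exp[1], gap_lc_ms[1] - gain_exp[0])

print(f"single spectrum falls short by ~{gap_single} orders of magnitude")
print(f"LC-MS falls short by ~{gap_lc_ms[0]} to {gap_lc_ms[1]} orders")
print(f"after the reported gains: ~{gap_after_gain[0]} to {gap_after_gain[1]} orders remain")
```

Under these assumptions a shortfall of at least two orders of magnitude remains even in the best case, consistent with the abstract's statement that further advances are needed.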

20.
Complicating proteomic analysis of whole tissues is the obvious problem of cell heterogeneity in tissues, which often results in misleading or confusing molecular findings. Thus, the coupling of tissue microdissection for tumor cell enrichment with capillary isotachophoresis-based selective analyte concentration not only serves as a synergistic strategy to characterize low abundance proteins, but it can also be employed to conduct comparative proteomic studies of human astrocytomas. A set of fresh frozen brain biopsies were selectively microdissected to provide an enriched, high quality, and reproducible sample of tumor cells. Despite sharing many common proteins, there are significant differences in the protein expression level among different grades of astrocytomas. A large number of proteins, such as plasma membrane proteins EGFR and Erbb2, are up-regulated in glioblastoma. Besides facilitating the prioritization of follow-on biomarker selection and validation, comparative proteomics involving measurements in changes of pathways are expected to reveal the molecular relationships among different pathological grades of gliomas and potential molecular mechanisms that drive gliomagenesis.
