Similar literature
1.

Background  

Despite the recognized diagnostic potential of biomarkers, the quest to suppress noise and extract more information from a given set of biomarkers continues. Here, we suggest a statistical algorithm that, treating each molecular biomarker as a diagnostic test, enriches the diagnostic performance of an optimized set of independent biomarkers using established statistical techniques. We validated the proposed algorithm using several simulation datasets and four publicly available real datasets that compared i) subjects with cancer and those without; ii) subjects with two different cancers; iii) subjects with two different types of one cancer; and iv) subjects with the same cancer but differing time to metastasis.
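
As a rough illustration of treating each biomarker as a diagnostic test, the sketch below combines several binary test results through likelihood ratios. It is not the authors' algorithm: the abstract does not name its "established statistical techniques", so the likelihood-ratio combination, the function names, and the assumption of conditionally independent tests with sensitivity and specificity strictly between 0 and 1 are all illustrative choices.

```python
from math import prod


def likelihood_ratio(sensitivity, specificity, positive):
    """Likelihood ratio of one binary test result (assumes 0 < se, sp < 1)."""
    if positive:
        return sensitivity / (1.0 - specificity)
    return (1.0 - sensitivity) / specificity


def posterior_probability(prior, tests):
    """Combine conditionally independent test results (an assumption).

    tests: iterable of (sensitivity, specificity, positive) tuples.
    Returns the posterior probability of disease.
    """
    prior_odds = prior / (1.0 - prior)
    # Under independence, posterior odds = prior odds x product of likelihood ratios.
    odds = prior_odds * prod(likelihood_ratio(se, sp, pos) for se, sp, pos in tests)
    return odds / (1.0 + odds)
```

Calling `posterior_probability(prior, tests)` with an empty list of tests returns the prior unchanged; each additional biomarker result then shifts the odds up or down by its likelihood ratio.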

2.
Computationally identifying effective cancer biomarkers from gene expression profiles is an important and challenging task. The challenge lies in the complicated pathogenesis of cancers, which often involves the dysfunction of many genes and regulatory interactions. A sophisticated classification model is therefore urgently needed. In this study, we propose an efficient approach, called ellipsoidFN (ellipsoid Feature Net), that models disease complexity with ellipsoids and seeks a set of heterogeneous biomarkers. Our approach achieves a non-linear classification scheme for mixed samples through the ellipsoid concept, and at the same time uses a linear programming framework to efficiently select biomarkers from a high-dimensional space. ellipsoidFN reduces the redundancy and improves the complementarity between the identified biomarkers, thus significantly enhancing the distinction between cancer and normal samples, and even between cancer types. Numerical evaluation on real prostate cancer, breast cancer and leukemia gene expression datasets suggests that ellipsoidFN outperforms state-of-the-art biomarker identification methods and can serve as a useful tool for cancer biomarker identification. The Matlab code of ellipsoidFN is freely available from http://doc.aporc.org/wiki/EllipsoidFN.

3.
Biomarkers are widely used in clinical diagnosis, prognosis and therapy monitoring. Here, we developed a protocol for the efficient and selective enrichment of small, low-concentration biomarkers from human serum, involving a 95% effective depletion of high-abundance serum proteins by partial denaturation and enrichment of low-abundance biomarkers by size exclusion chromatography. The recovery of low-abundance biomarkers was above 97%. Using this protocol, we quantified the tumour markers DcR3 and growth/differentiation factor (GDF)15 from 100 μl human serum by isotope dilution mass spectrometry, using 15N metabolically labelled and concatamerized fingerprint peptides for both proteins. Analysis of three different fingerprint peptides for each protein by liquid chromatography electrospray ionization mass spectrometry resulted in comparable concentrations in three healthy human serum samples (DcR3: 27.23 ± 2.49 fmol/ml; GDF15: 98.11 ± 0.49 fmol/ml). In contrast, serum levels were significantly elevated in tumour patients for DcR3 (116.94 ± 57.37 fmol/ml) and GDF15 (164.44 ± 79.31 fmol/ml). The data obtained were in good agreement with ELISA and qPCR measurements, as well as with literature data. In summary, our protocol allows the reliable quantification of biomarkers, shows a higher resolution at low biomarker concentrations than antibody-based strategies, and offers the possibility of multiplexing. Our proof-of-principle studies in patient sera encourage future analysis of the prognostic value of DcR3 and GDF15 for colon cancer patients in larger patient cohorts.

4.
Biological treatment of many cancers currently targets membrane-bound receptors located on the cell surface. To identify novel membrane proteins associated with migration and metastasis of breast cancer cells, a more migratory subpopulation of the MDA-MB-231 breast cancer cell line is selected and characterized. High-resolution quantitative mass spectrometry with SILAC labeling is applied to analyze its surfaceome, which is compared with that of parental MDA-MB-231 cells. Among 824 identified proteins (FDR < 0.01), 128 differentially abundant cell surface proteins with at least one transmembrane domain are found. Of these, i) desmocollin-1 (DSC1) is validated as a protein connected with lymph node status of luminal A breast cancer, tumor grade, and Her-2 status by immunohistochemistry in a set of 96 primary breast tumors, and ii) catechol-O-methyltransferase is successfully verified as a protein associated with lymph node metastasis of triple negative breast cancer as well as with tumor grade by targeted data extraction from the SWATH-MS data of the same set of tissues. The findings indicate the importance of both proteins for breast cancer development and metastasis and highlight the potential of a biomarker validation strategy based on targeted data extraction from SWATH-MS datasets.

5.
Biomarkers for the lung cancer diagnosis and their advances in proteomics   (Total citations: 1; self-citations: 0; citations by others: 1)
Sung HJ, Cho JY. BMB Reports, 2008, 41(9): 615-625
Over the last decade, intense interest has focused on biomarker discovery and the clinical use of biomarkers. This interest has been accelerated by the completion of the Human Genome Project and by progress in proteomics techniques. Cancer biomarker discovery is especially prominent in this field because of its anticipated critical role in early diagnosis, therapy guidance, and prognosis monitoring of cancers. Among cancers, lung cancer, one of the top three major cancers, shows the highest mortality because of failure in early diagnosis. Numerous potential lung cancer biomarkers have been discovered, including DNA biomarkers such as promoter hypermethylation and mutations in K-ras and p53, and protein biomarkers such as carcinoembryonic antigen (CEA), CYFRA21-1, plasma kallikrein B1 (KLKB1), and neuron-specific enolase. Despite extensive studies thus far, few have turned out to be useful in the clinic. Even those used in the clinic do not show sufficient sensitivity, specificity and reproducibility for general use. This review describes what cancer biomarkers are for, the various types of lung cancer biomarkers discovered to date, and the predicted future advances in lung cancer biomarker discovery with proteomics technology.

6.
Class and biomarker discovery continue to be among the preeminent goals in gene microarray studies of cancer. We have developed a new data mining technique, which we call Binary State Pattern Clustering (BSPC), that is specifically adapted for these purposes with cancer and other categorical datasets. BSPC is capable of uncovering statistically significant sample subclasses and associated marker genes in a completely unsupervised manner. This is accomplished through the application of a digital paradigm, where the expression level of each potential marker gene is treated as being representative of its discrete functional state. Multiple genes that divide samples into states along the same boundaries form a kind of gene-cluster that has an associated sample-cluster. BSPC is an extremely fast deterministic algorithm that scales well to large datasets. Here we describe results of its application to three publicly available oligonucleotide microarray datasets. Using an alpha-level of 0.05, clusters reproducing many of the known sample classifications were identified along with associated biomarkers. In addition, a number of simulations were conducted using shuffled versions of each of the original datasets, noise-added datasets, as well as completely artificial datasets. The robustness of BSPC was compared to that of three other publicly available clustering methods: ISIS, CTWC and SAMBA. The simulations demonstrate BSPC's substantially greater noise tolerance and confirm the accuracy of our calculations of statistical significance.
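
The "digital paradigm" described above can be pictured with a minimal sketch: each gene's expression is reduced to a binary state per sample, and genes that split the samples along the same boundary are grouped. This is only an illustration, not BSPC itself. The median cutoff, the treatment of a pattern and its complement as the same split, and all function names are assumptions, and no statistical significance is computed here.

```python
from collections import defaultdict
from statistics import median


def binarize(values):
    """Map expression values to 0/1 states, splitting at the median (assumed cutoff)."""
    cutoff = median(values)
    return tuple(1 if v > cutoff else 0 for v in values)


def canonical(pattern):
    """Treat a pattern and its complement as the same division of the samples."""
    flipped = tuple(1 - s for s in pattern)
    return min(pattern, flipped)


def group_genes_by_pattern(expression):
    """Group genes whose binary states divide the samples identically.

    expression: dict mapping gene name -> list of values, one per sample,
    with samples in the same order for every gene.
    """
    groups = defaultdict(list)
    for gene, values in expression.items():
        groups[canonical(binarize(values))].append(gene)
    # Keep only sample splits supported by at least two genes.
    return {p: genes for p, genes in groups.items() if len(genes) >= 2}
```

Each key of the returned dictionary is a sample split (the associated sample-cluster), and its value is the list of genes that support it (the gene-cluster).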

7.
Despite their potential to impact diagnosis and treatment of cancer, few protein biomarkers are in clinical use. Biomarker discovery is plagued with difficulties ranging from technological (inability to globally interrogate proteomes) to biological (genetic and environmental differences among patients and their tumors). We urgently need paradigms for biomarker discovery. To minimize biological variation and facilitate testing of proteomic approaches, we employed a mouse model of breast cancer. Specifically, we performed LC-MS/MS of tumor and normal mammary tissue from a conditional HER2/Neu-driven mouse model of breast cancer, identifying 6758 peptides representing >700 proteins. We developed a novel statistical approach (SASPECT) for prioritizing proteins differentially represented in LC-MS/MS datasets and identified proteins over- or under-represented in tumors. Using a combination of antibody-based approaches and multiple reaction monitoring-mass spectrometry (MRM-MS), we confirmed the overproduction of multiple proteins at the tissue level, identified fibulin-2 as a plasma biomarker, and extensively characterized osteopontin as a plasma biomarker capable of early disease detection in the mouse. Our results show that a staged pipeline employing shotgun-based comparative proteomics for biomarker discovery and multiple reaction monitoring for confirmation of biomarker candidates is capable of finding novel tissue and plasma biomarkers in a mouse model of breast cancer. Furthermore, the approach can be extended to find biomarkers relevant to human disease.

8.
Clinical management of prostate cancer remains a significant challenge due to the lack of available tests for guiding treatment decisions. The blood prostate-specific antigen test has facilitated early detection of, and intervention in, prostate cancer. However, blood prostate-specific antigen levels are less effective in distinguishing aggressive from indolent prostate cancers and other benign prostatic diseases. Thus, the development of novel approaches specific for prostate cancer that can differentiate aggressive from indolent disease remains an urgent medical need. In the current study, we evaluated urine specimens from prostate cancer patients using LC-MS/MS, with the aim of identifying effective urinary prostate cancer biomarkers. Glycoproteins from urine samples of prostate cancer patients with different Gleason scores were characterized via solid phase extraction of N-linked glycosite-containing peptides and LC-MS/MS. A total of 2923 unique glycosite-containing peptides were identified. Glycoproteomic comparison of urine and tissues from aggressive (AG) and non-aggressive prostate cancers, as well as sera from prostate cancer patients, revealed that the majority of AG prostate cancer associated glycoproteins were more readily detected in patients' urine than in serum samples. Our data collectively indicate that urine provides a potential source for biomarker testing in patients with AG prostate cancer.

9.

Background  

The four heterogeneous childhood cancers, neuroblastoma, non-Hodgkin lymphoma, rhabdomyosarcoma, and Ewing sarcoma, present a similar histology of small round blue cell tumor (SRBCT) and are therefore often misdiagnosed. Identification of biomarkers for distinguishing these cancers is a well-studied problem. Existing methods typically evaluate each gene separately and do not take into account the nonlinear interaction between genes and the tools that are used to design the diagnostic prediction system. Consequently, more genes than necessary are usually identified for prediction. We propose a general scheme for finding a small set of biomarkers to design a diagnostic system for accurate classification of the cancer subgroups. We use multilayer networks with online gene selection ability and relational fuzzy clustering to identify a small set of biomarkers for accurate classification of the training and blind test cases of a well-studied data set.

10.

Background

Many mathematical and statistical models and algorithms have been proposed for biomarker identification in recent years. However, the biomarkers inferred from different datasets suffer from a lack of reproducibility due to the heterogeneity of data generated on different platforms or in different laboratories. This motivates us to develop robust biomarker identification methods that integrate multiple datasets.

Methods

In this paper, we developed an integrative method for classification based on logistic regression. Different constant terms are set in the logistic regression model to measure the heterogeneity of the samples. By minimizing the differences of the constant terms within the same dataset, both the homogeneity within the same dataset and the heterogeneity across multiple datasets can be preserved. The model is formulated as an optimization problem with a network penalty measuring the differences of the constant terms. The L1 penalty, elastic penalty and network related penalties are added to the objective function for the purpose of biomarker discovery. Algorithms based on the proximal Newton method are proposed to solve the optimization problem.
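
One way to read the formulation above is as a penalized negative log-likelihood with a separate constant term per sample. The sketch below only evaluates such an objective; it is an assumed form, not the paper's exact model. The squared-difference network penalty over pairs of samples in the same dataset, the parameter names `lam_l1` and `lam_net`, and the omission of the elastic and variable-network penalties are all illustrative simplifications, and no proximal Newton solver is included.

```python
from math import exp, log1p


def objective(beta, intercepts, X, y, dataset_ids, lam_l1, lam_net):
    """Penalized logistic loss with one intercept per sample (assumed form)."""
    # Negative log-likelihood of the logistic model.
    nll = 0.0
    for x_i, y_i, b_i in zip(X, y, intercepts):
        z = b_i + sum(w * v for w, v in zip(beta, x_i))
        nll += log1p(exp(z)) - y_i * z

    # L1 penalty on the coefficients, encouraging a sparse biomarker set.
    l1 = lam_l1 * sum(abs(w) for w in beta)

    # Network penalty: intercepts of samples from the same dataset
    # are pulled toward each other; different datasets are not coupled.
    net = 0.0
    n = len(intercepts)
    for i in range(n):
        for j in range(i + 1, n):
            if dataset_ids[i] == dataset_ids[j]:
                net += (intercepts[i] - intercepts[j]) ** 2

    return nll + l1 + lam_net * net
```

With `lam_net` large, samples within a dataset effectively share one intercept, while intercepts remain free to differ between datasets, which is the homogeneity/heterogeneity trade-off the paragraph describes.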

Results

We first applied the proposed method to simulated datasets. Both the AUC of the prediction and the biomarker identification accuracy were improved. We then applied the method to two breast cancer gene expression datasets. By integrating both datasets, the prediction AUC was improved over directly merging the datasets and over MetaLasso. It was also comparable to the best AUC obtained when identifying biomarkers in an individual dataset. The biomarkers identified using the network related penalty for variables were further analyzed. Meaningful subnetworks enriched for breast cancer were identified.

Conclusion

A network-based integrative logistic regression model is proposed in the paper. It improves both the prediction and biomarker identification accuracy.

11.
Wubin Ding. Epigenetics, 2019, 14(1): 67-80
DNA methylation status is closely associated with diverse diseases and is generally more stable than gene expression; thus, abnormal DNA methylation could provide important biomarkers for tumor diagnosis, treatment and prognosis. However, signatures of DNA methylation changes for pan-cancer diagnosis and prognosis are less explored. Here we systematically analyzed the genome-wide DNA methylation patterns in diverse TCGA cancers with machine learning. We identified seven CpG sites that could effectively discriminate tumor samples from adjacent normal tissue samples for 12 main cancers of TCGA (1216 samples, AUC > 0.99). Those seven potential diagnostic biomarkers were further validated in the other 9 TCGA cancers and 4 independent datasets (AUC > 0.92). Three of the seven CpG sites were correlated with cell division, DNA replication and cell cycle. We also identified 12 CpG sites that can effectively distinguish 26 different cancers (7605 samples), and the result was repeatable in independent datasets as well as in two disparate tumors with metastases (micro-average AUC > 0.89). Furthermore, a series of potential signatures that could significantly predict the prognosis of tumor patients for 7 different cancers were identified via survival analysis (p-value < 1e-4). Collectively, DNA methylation patterns vary greatly between tumor and adjacent normal tissues, as well as among different types of cancers. Our identified signatures may aid clinical decisions on diagnosis and prognosis across cancers, and the potential cancer-specific biomarkers could be used to predict the primary site of metastatic breast and prostate cancers.

12.
The development of molecular diagnostic tools to achieve individualized medicine requires identifying predictive biomarkers associated with subgroups of individuals who might receive beneficial or harmful effects from different available treatments. However, because of the large number of candidate biomarkers in large-scale genetic and molecular studies, and the complex relationships among clinical outcome, biomarkers, and treatments, ordinary statistical tests for interactions between treatments and covariates suffer from limited statistical power. In this paper, we propose an efficient method for detecting predictive biomarkers. We employ the weighted loss functions of Chen et al. to directly estimate individual treatment scores and propose synthetic posterior inference for the effect sizes of biomarkers. We develop an empirical Bayes approach; that is, we estimate unknown hyperparameters in the prior distribution from the data. We then provide efficient screening methods for the candidate biomarkers via the optimal discovery procedure with adequate control of the false discovery rate. The proposed method is demonstrated in simulation studies and in an application to a breast cancer clinical study, in which it detected a much larger number of significant biomarkers than existing standard methods.

13.
Research has shown that microRNAs are promising biomarkers that can be used to promote a more accurate diagnosis of cancer. In this study, we developed an integrated multi-step selection process to analyze available high-throughput datasets to obtain information on microRNAs as cancer biomarkers. Applying this approach to the microRNA expression profiles of prostate cancer and the datasets in The Cancer Genome Atlas Data Portal, we identified miRNA-182, miRNA-200c and miRNA-221 as possible biomarkers for prostate cancer. The associations between the expression of these three microRNAs and clinical parameters, as well as their diagnostic capability, were studied. Several online databases were used to predict the target genes of these three microRNAs, and the results were confirmed by significant statistical correlations. Compared with the other 18 types of cancers listed in The Cancer Genome Atlas Data Portal, we found that the combination of miRNA-182 and miRNA-200c both being up-regulated and miRNA-221 being down-regulated occurs only in prostate cancer. This provides a unique biological characteristic for prostate cancer that can potentially be used for diagnosis based on tissue testing. In addition, our study also revealed that these three microRNAs are associated with the pathological status of prostate cancer.
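
The three-microRNA pattern reported above can be expressed as a simple rule check. This is a minimal sketch under stated assumptions: the input is taken to be log2 fold changes of tumor versus normal tissue, the `threshold` of 1.0 is a hypothetical cutoff that does not come from the study, and the function name is illustrative.

```python
def matches_prostate_signature(log2_fold_change, threshold=1.0):
    """Check the reported pattern: miRNA-182 and miRNA-200c up, miRNA-221 down.

    log2_fold_change: dict mapping microRNA name -> log2(tumor / normal).
    threshold: hypothetical cutoff for calling a microRNA up- or down-regulated.
    """
    up = ("miRNA-182", "miRNA-200c")
    down = ("miRNA-221",)
    return (all(log2_fold_change[m] >= threshold for m in up)
            and all(log2_fold_change[m] <= -threshold for m in down))
```

A sample matches only when all three conditions hold together, which mirrors the abstract's point that the combination, rather than any single microRNA, is what distinguishes prostate cancer.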

14.
15.
Introduction: Cancer is one of the major causes of human death. Identifying proteins as biomarkers for early detection of cancer and as therapeutic targets for cancer treatment is an important issue in precision medicine. The secretome of cancer cells is the collection of proteins secreted or shed from cancer cells. Proteomic profiling of the cancer cell secretome has proven to be a convenient and efficient way to discover cancer biomarkers and/or therapeutic targets.

Areas covered: There have been numerous reviews describing the history and application of secretome analysis in cancer biomarker/therapeutic target research. The present review focuses on the technological advancement for profiling low-molecular-mass proteins in secretome, the latest information regarding the new candidate biomarkers and molecular mechanisms discovered on the basis of cancer cell secretome analysis, as well as the previously discovered candidate biomarkers that enter into clinical trials.

Expert commentary: Current technologies for protein sample preparation/separation and MS-based protein identification have allowed in-depth analysis of the cancer cell secretome. Future efforts should focus on the comprehensiveness of the cancer cell secretome, meta-analysis of different secretome datasets and integrated analysis combining other omics datasets, as well as the incorporation of MS-based biomarker verification pipelines into both preclinical studies and clinical trials.


16.
Smoking is the leading cause of lung cancer development, and several genes have been identified as potential biomarkers for lung cancer. To contribute to present scientific knowledge of biomarkers for lung cancer, two different data sets, GDS3257 and GDS3054, were downloaded from NCBI's GEO database and normalized with the RMA and GRMA packages (Bioconductor). Differentially expressed genes were extracted using R (3.1.2); the DAVID online tool was used for gene annotation and the GENE MANIA tool was used to construct the gene regulatory network. Nine smoking-independent genes were found, whose average expression was nearly the same in both datasets. Five of them were found to be associated with cancer subtypes. Thirty smoking-specific genes were identified; eight of these were associated with cancer subtypes. GPR110, IL1RN and HSP90AA1 were found to be directly associated with lung cancer. SEMA6A is differentially expressed only in non-smoking lung cancer samples. FLG is a differentially expressed smoking-specific gene and is related to the onset of various cancer subtypes. Functional annotation and network analysis revealed that FLG participates in various epidermal tissue developmental processes and is co-expressed with other genes. Lung tissues are epidermal tissues, which suggests that alteration in FLG may cause lung cancer. We conclude that smoking alters the expression of several genes and associated biological pathways during the development of lung cancers.

17.
18.
Tsai YS, Aguan K, Pal NR, Chung IF. PLoS ONE, 2011, 6(9): e24259
Informative genes from microarray data can be used to construct prediction models and investigate biological mechanisms. Differentially expressed genes, the main targets of most gene selection methods, can be classified as single- and multiple-class specific signature genes. Here, we present a novel gene selection algorithm based on a Group Marker Index (GMI), which is intuitive, of low computational complexity, and efficient in identifying both types of genes. Most gene selection methods identify only single-class specific signature genes and cannot easily identify multiple-class specific signature genes. Our algorithm can detect de novo certain conditions of multiple-class specificity of a gene and makes use of a novel non-parametric indicator to assess the discrimination ability between classes. Our method is effective even when the sample size is small as well as when the class sizes are significantly different. To compare effectiveness and robustness, we formulate an intuitive template-based method and use four well-known datasets. We demonstrate that our algorithm outperforms the template-based method in difficult cases with unbalanced distribution. Moreover, the multiple-class specific genes are good biomarkers and play important roles in biological pathways. Our literature survey supports that the proposed method identifies unique multiple-class specific marker genes (not reported earlier to be related to cancer) in the Central Nervous System data. It also discovers unique biomarkers indicating the intrinsic difference between subtypes of lung cancer. We also associate the pathway information with the multiple-class specific signature genes and cross-reference to published studies. We find that the identified genes participate in the pathways directly involved in cancer development in the leukemia data. Our method offers a promising way to find genes that may be involved in pathways of multiple diseases and hence opens up the possibility of using an existing drug on other diseases as well as designing a single drug for multiple diseases.

19.
20.