Similar literature
20 similar articles found (search time: 31 ms)
1.
High-throughput molecular-profiling technologies provide rapid, efficient and systematic approaches to search for biomarkers. Supervised learning algorithms are naturally suited to analysing the large amounts of data these technologies generate in biomarker discovery efforts. The study demonstrates, with two examples, a data-driven approach to the analysis of large, complicated datasets collected with high-throughput technologies in the context of biomarker discovery. The approach consists of two analytic steps: an initial unsupervised analysis to obtain accurate knowledge about sample clustering, followed by a supervised analysis to identify a small set of putative biomarkers for further experimental characterization. By comparing the most widely applied clustering algorithms on a leukaemia DNA microarray dataset, it was established that principal component analysis (PCA)-assisted projection of samples from a high-dimensional molecular feature space into a few low-dimensional subspaces provides a more effective and accurate way to visually explore and identify data structures that confirm intended experimental effects based on expected group membership. A supervised analysis method, the shrunken centroid algorithm, was chosen to use the knowledge of sample clustering gained or confirmed in the first step to identify a small set of molecules as candidate biomarkers for further experimentation. The approach was applied to two molecular-profiling studies. In the first study, PCA-assisted analysis of DNA microarray data revealed that discrete data structures exist in rat liver gene expression and correlate with blood clinical chemistry and liver pathological damage in response to the chemical toxicant diethylhexylphthalate, a peroxisome proliferator-activated receptor agonist. Sixteen genes were then identified by the shrunken centroid algorithm as the best candidate biomarkers for liver damage. Functional annotation of these genes revealed roles in acute phase response and in lipid and fatty acid metabolism, which are functionally relevant to the observed toxicities. In the second study, 26 urine ions identified from a GC/MS spectrum, two of which were glucose fragment ions included as positive controls, showed robust changes with the development of diabetes in Zucker diabetic fatty rats. Further experiments are needed to define their chemical identities and establish functional relevance to disease development.
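
The supervised step named in this abstract, the shrunken centroid algorithm, can be illustrated with a minimal sketch. This is not the study's code: it follows the commonly described nearest-shrunken-centroid recipe (standardized centroid differences, soft-thresholded by a parameter here called `delta`), it omits the classification step, and the toy data, the class labels and the function name `shrunken_centroids` are all hypothetical.

```python
import math
import statistics
from typing import Dict, List, Sequence, Tuple


def shrunken_centroids(
    X: Sequence[Sequence[float]], y: Sequence[str], delta: float
) -> Tuple[Dict[str, List[float]], List[int]]:
    """Return shrunken class centroids and the indices of features that survive shrinkage."""
    n, p = len(X), len(X[0])
    classes = sorted(set(y))
    rows = {k: [x for x, label in zip(X, y) if label == k] for k in classes}
    overall = [sum(x[i] for x in X) / n for i in range(p)]
    centroid = {k: [sum(x[i] for x in rows[k]) / len(rows[k]) for i in range(p)] for k in classes}

    # Pooled within-class standard deviation per feature.
    s = [
        math.sqrt(sum((x[i] - centroid[label][i]) ** 2 for x, label in zip(X, y)) / (n - len(classes)))
        for i in range(p)
    ]
    s0 = statistics.median(s)  # offset that guards against tiny variances

    shrunk: Dict[str, List[float]] = {}
    selected = set()
    for k in classes:
        m_k = math.sqrt(1.0 / len(rows[k]) - 1.0 / n)
        shrunk[k] = []
        for i in range(p):
            scale = m_k * (s[i] + s0)
            d = (centroid[k][i] - overall[i]) / scale
            # Soft thresholding: move d toward zero by delta, clipping at zero.
            d_shrunk = math.copysign(max(abs(d) - delta, 0.0), d)
            if d_shrunk != 0.0:
                selected.add(i)
            shrunk[k].append(overall[i] + scale * d_shrunk)
    return shrunk, sorted(selected)


# Hypothetical toy data: 3 features, 3 samples per class; only feature 0 separates the classes.
X = [
    [1.0, 5.0, 2.0], [1.2, 5.2, 2.1], [0.8, 4.8, 1.9],   # class "A"
    [3.0, 5.2, 2.0], [3.2, 5.0, 2.2], [2.8, 5.1, 1.8],   # class "B"
]
y = ["A", "A", "A", "B", "B", "B"]
centroids, selected = shrunken_centroids(X, y, delta=2.0)
```

Features whose shrunken difference is zero in every class drop out of the candidate list, which is how a larger `delta` yields a smaller set of putative biomarkers.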

3.
The application of mass spectrometry to identify disease biomarkers in clinical fluids such as serum using high-throughput protein expression profiling continues to evolve as technology development, clinical study design, and bioinformatics improve. Previous protein expression profiling studies have offered needed insight into issues of technical reproducibility, instrument calibration, sample preparation, study design, and supervised bioinformatic data analysis. In this overview, new strategies to increase the utility of protein expression profiling for clinical biomarker assay development are discussed, with an emphasis on differential lectin-based glycoprotein capture and targeted immunoassays. The carbohydrate-binding specificities of different lectins offer a biological affinity approach that complements existing mass spectrometer capabilities and retains automated throughput options. Specific examples using serum samples from prostate cancer and hepatocellular carcinoma subjects are provided, along with suggested experimental strategies for integrating lectin-based methods into clinical fluid expression profiling. Our example workflow reflects the need for early validation in biomarker discovery by using an immunoaffinity-based targeted analytical approach that integrates well with upstream discovery technologies.

4.
The search for and validation of novel disease biomarkers require the complementary power of professional study planning and execution, modern profiling technologies and related bioinformatics tools for data analysis and interpretation. Biomarkers have considerable impact on the care of patients and are urgently needed for advancing diagnostics, prognostics and treatment of disease. This survey article highlights emerging bioinformatics methods for biomarker discovery in clinical metabolomics, focusing on data preprocessing and consolidation and on the data-driven search, verification, prioritization and biological interpretation of putative metabolic candidate biomarkers in disease. In particular, data mining tools suitable for omic data gathered from the most frequently used types of experimental design, such as case-control or longitudinal biomarker cohort studies, are reviewed, and case examples of selected discovery steps are delineated in more detail. This review demonstrates that clinical bioinformatics has evolved into an essential element of biomarker discovery, translating innovations and successes in profiling technologies and bioinformatics to clinical application.

5.
Mass spectrometry (MS) technology in clinical proteomics is very promising for the discovery of new biomarkers for disease management. To overcome the obstacle of data noise in MS analysis, we proposed a new approach of knowledge-integrated biomarker discovery using data from Major Adverse Cardiac Events (MACE) patients. We first built a cardiovascular-related network based on protein information from protein annotations in UniProt, protein-protein interaction (PPI) data, and a signal transduction database. In contrast to previous machine learning methods for MS data processing, we then used statistical methods to discover biomarkers in the cardiovascular-related network. By trading off known protein information against data noise in the mass spectrometry data, we could finally identify high-confidence biomarkers. Most importantly, aided by the protein-protein interaction network (that is, the cardiovascular-related network), we proposed a new type of biomarker, the network biomarker, composed of a set of proteins and the interactions among them. The candidate network biomarkers can classify the two groups of patients more accurately than current single biomarkers, which do not consider biological molecular interactions.

6.
Results obtained from expression profiling of renal cell carcinoma using different “ome”‐based approaches and comprehensive data analysis demonstrated that proteome‐based technologies and cDNA microarray analyses complement each other during the discovery phase for disease‐related candidate biomarkers. The integration of the respective data revealed the uniqueness and complementarity of the different technologies. While comparative cDNA microarray analyses, though restricted to up‐regulated targets, largely revealed genes involved in controlling gene/protein expression (19%) and signal transduction processes (13%), proteomics/PROTEOMEX‐defined candidate biomarkers include enzymes of cellular metabolism (36%), transport proteins (12%), and cell motility/structural molecules (10%). Candidate biomarkers defined by proteomics and PROTEOMEX are frequently shared, whereas the sharing rate between cDNA microarray and proteome‐based profiling is limited. Putative candidate biomarkers provide insights into their cellular (dys)function and their diagnostic/prognostic value but still warrant further validation in larger patient numbers. Because merely three candidate biomarkers were shared by all applied technologies, namely annexin A4, tubulin α‐1A chain, and ubiquitin carboxyl‐terminal hydrolase L1, analysis at a single hierarchical level of biological regulation seems to provide only limited results, which emphasizes the importance and benefit of combinatorial screenings that can complement the standard clinical predictors.

7.
MOTIVATION: Our purpose is to develop a statistical modeling approach for cancer biomarker discovery and provide new insights into early cancer detection. We propose the concept of a dependence network, apply it to identify cancer biomarkers, and study the difference between the protein or gene samples from cancer and non-cancer subjects based on mass-spectrometry (MS) and microarray data. RESULTS: Three MS and two gene microarray datasets are studied. Clear differences are observed in the dependence networks for cancer and non-cancer samples. Protein/gene features are examined three at a time through an exhaustive search. Dependence networks are constructed by binding triples identified by the eigenvalue pattern of the dependence model, and are further compared to identify cancer biomarkers. Such dependence-network-based biomarkers show much greater consistency under 10-fold cross-validation than classification-performance-based biomarkers. Furthermore, the biological relevance of the dependence-network-based biomarkers using microarray data is discussed. The proposed scheme is shown to be promising for cancer diagnosis and prediction. AVAILABILITY: See supplements: http://dsplab.eng.umd.edu/~genomics/dependencenetwork/

8.
DNA microarrays may be used to identify potential molecular targets for drug discovery. Yet DNA microarray experiments provide massive amounts of data. To limit the choice of potential molecular targets, it may be desirable to eliminate genes coincidentally up-regulated in tissues implicated in absorption, distribution, metabolism, and excretion (ADME) pharmacokinetics. DNA microarray experiments were performed to demonstrate a gene-exclusion approach using, as an example, RNA samples of neural origin, i.e., a human neuroblastoma cell line (SK-N-SH) and brain tissue, as the intended hypothetical site(s) of drug action. Biomarkers were identified using PharmArray DNA microarrays. The lists of neuroblastoma and neural biomarkers were constrained by limiting selection to the subset of genes that were not highly expressed in three transformed cell lines from liver, colon, and kidney (HepG2, Caco-2, and 786-O, respectively) that are routinely used as representatives of the ADME system during in vitro pharmacology and toxicology experiments. Principal component analysis methods with likelihood ratio-related bioinformatic tools were used to identify robust potential biomarker genes for the three ADME-related cell lines, neuroblastoma, and normal brain. Biomarkers of each sample were identified and selected genes were validated by qRT-PCR. Hundreds of biomarkers of the three ADME-related cell types, representing hepatocytes, kidney epithelium, and gastrointestinal tract, may now be used as a valuable database to restrict selection of biomarkers as potential molecular targets from the intended samples (e.g., neuroblastoma in this work). In addition to biomarker discovery per se, this demonstration suggests that our model method may help restrict gene lists during selection of potential molecular targets for subsequent drug discovery.

9.
Early detection and diagnosis of cancer can allow timely medical intervention, which greatly improves chances of survival and enhances quality of life. Biomarkers play an important role in assisting clinicians and health care providers in cancer diagnosis and treatment follow‐up. In spite of years of research and the discovery of thousands of candidate cancer biomarkers, only a few have transitioned to routine usage in the clinic. This review highlights advances in proteomics technologies that have enabled high rates of discovery of candidate cancer biomarkers and evaluates integration with other omics technologies to improve their progress through to validation and clinical translation. Furthermore, it gauges the role of metabolomics technology in cancer biomarker research and assesses it as a complementary tool in aiding cancer biomarker discovery and validation.

10.
Non-alcoholic steatohepatitis (NASH) is a severe form of non-alcoholic fatty liver disease (NAFLD). The molecular pathological mechanism of NASH is poorly understood. Recently, high-throughput data such as microarray data, together with bioinformatics methods, have become a powerful way to identify biomarkers and to investigate the pathogenesis of diseases. Taking advantage of well-characterized microarray datasets of NASH livers, we performed a systematic analysis of potential biomarkers and the possible pathological mechanism of NASH from a bioinformatics perspective. CodeLink Human Whole Genome Bioarrays were analyzed to find differentially expressed genes (DEGs) between controls and NASH patients. Four methods were used to identify DEGs, and the intersection of the DEGs identified by these methods was subsequently used for both biomarker prediction and molecular pathological mechanism analysis. For biomarker prediction, rank aggregation was used to rank the DEGs identified by all of these methods according to the significance of their differential expression. Alcohol dehydrogenase 4 (ADH4) exhibited the highest rank, suggesting the most significant differential expression between the normal and disease conditions. Together with a previous report demonstrating the association between ADH4 and the pathogenesis of NASH, our data suggest that ADH4 could be a potential biomarker for NASH. For molecular pathological mechanism analysis, two clusters of highly correlated annotation terms, and the genes in these terms, were identified based on the intersection of DEGs. Pathways enriched with these genes were then identified to construct a network. Using this network, amino acid catabolism is implicated as playing a pivotal role, and the urea cycle as being involved, in the development of NASH, both for the first time. The results of our study identified potential biomarkers and suggested a possible molecular pathological mechanism of NASH. These findings provide a comprehensive and systematic understanding of the pathogenesis of NASH and may facilitate the diagnosis, prevention and treatment of NASH.
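
The "intersect, then rank-aggregate" step described in this abstract can be sketched as follows. The abstract does not say which rank aggregation method was used, so this sketch assumes a simple mean-rank (Borda-style) aggregation; the method names, gene names and the function `aggregate_ranks` are hypothetical placeholders, not data from the study.

```python
from typing import Dict, List


def aggregate_ranks(ranked_lists: Dict[str, List[str]]) -> List[str]:
    """Intersect ranked gene lists and order the shared genes by mean rank (best first)."""
    shared = set.intersection(*(set(genes) for genes in ranked_lists.values()))
    mean_rank = {
        gene: sum(genes.index(gene) + 1 for genes in ranked_lists.values()) / len(ranked_lists)
        for gene in shared
    }
    # Ties are broken alphabetically so the output is deterministic.
    return sorted(shared, key=lambda gene: (mean_rank[gene], gene))


# Hypothetical DEG lists from four methods, each ordered from most to least significant.
ranked_lists = {
    "method_1": ["gene_a", "gene_b", "gene_c", "gene_d"],
    "method_2": ["gene_b", "gene_a", "gene_c", "gene_e"],
    "method_3": ["gene_a", "gene_c", "gene_b", "gene_d"],
    "method_4": ["gene_a", "gene_b", "gene_f", "gene_c"],
}
consensus = aggregate_ranks(ranked_lists)
```

Only genes reported by every method enter the consensus list, and the gene at its head plays the role that the abstract reports for the top-ranked DEG.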

11.
High-quality biomarkers for disease progression, drug efficacy and toxicity liability are essential for improving the efficiency of drug discovery and development. The identification of drug-activity biomarkers is often limited by access to, and the quantity of, target tissue. Peripheral blood has increasingly become an attractive alternative to tissue samples from organs as a source for biomarker discovery, especially during early clinical studies. However, given the heterogeneous blood cell population, possible artifacts from ex vivo activation, and technical difficulties associated with overall assay performance, it is challenging to profile peripheral blood cells directly for biomarker discovery. In the present study, Applied BioSystems’ blood collection system was evaluated for its ability to isolate RNA suitable for use on the Affymetrix microarray platform. Blood was collected in a TEMPUS tube and RNA was extracted using an ABI-6100 semi-automated workstation. Using human and rat whole blood samples, it was demonstrated that the RNA isolated using this approach was stable, of high quality and suitable for Affymetrix microarray applications. The microarray data were statistically analysed and compared with other blood protocols. Minimal haemoglobin interference with RNA labelling efficiency and chip hybridization was found using the TEMPUS tube and extraction method. The RNA quality, stability and ease of handling make the TEMPUS tube protocol an attractive approach for expression profiling of whole blood to support target and biomarker discovery.

13.
We developed a pipeline to integrate the proteomic technologies used from the discovery to the verification stages of plasma biomarker identification and applied it to identify early biomarkers of cardiac injury from the blood of patients undergoing a therapeutic, planned myocardial infarction (PMI) for treatment of hypertrophic cardiomyopathy. Sampling of blood directly from patient hearts before, during and after controlled myocardial injury ensured enrichment for candidate biomarkers and allowed patients to serve as their own biological controls. LC-MS/MS analyses detected 121 highly differentially expressed proteins, including previously credentialed markers of cardiovascular disease and >100 novel candidate biomarkers for myocardial infarction (MI). Accurate inclusion mass screening (AIMS) qualified a subset of the candidates based on highly specific, targeted detection in peripheral plasma, including some markers unlikely to have been identified without this step. Analyses of peripheral plasma from controls and patients with PMI or spontaneous MI by quantitative multiple reaction monitoring mass spectrometry or immunoassays suggest that the candidate biomarkers may be specific to MI. This study demonstrates that modern proteomic technologies, when coherently integrated, can yield novel cardiovascular biomarkers meriting further evaluation in large, heterogeneous cohorts.

15.
Recent advances in proteomics technology have stimulated widespread research and development in the area of biomarker discovery using mass spectrometry (MS). The final goal of biomarker discovery and development is to establish clinically useful and reliable diagnostic methods for various diseases. Specific alterations in the nature and composition of glycans attached to proteins are seen during the development and progression of a number of diseases and disorders. Therefore, the development of glyco-biomarkers, which detect disease-specific glycoproteins and changes in glycoforms, is gaining much attention. The combined use of multiple technologies, not solely MS, is the key to the discovery of clinically significant and reliable biomarkers. We have employed a combination of quantitative real-time polymerase chain reaction (PCR), lectin microarray, a liquid chromatography/mass spectrometry-based technique with isotope-coded glycosylation site-specific tagging (IGOT-LC/MS), and bioinformatics to successfully develop a novel diagnostic kit for the quantitative evaluation of liver fibrosis. Efforts to develop highly effective glyco-biomarkers for other diseases are also currently underway.

16.
Proteomic profiling of pancreatic cancer for biomarker discovery   (total citations: 15; self-citations: 0; citations by others: 15)
Pancreatic cancer is a uniformly lethal disease that is difficult to diagnose at an early stage and even more difficult to cure. In recent years, there has been substantial interest in applying proteomics technologies to identify protein biomarkers for early detection of cancer. Quantitative proteomic profiling of body fluids, tissues, or other biological samples to identify differentially expressed proteins represents a very promising approach for improving the outcome of this disease. Proteins associated with pancreatic cancer identified through proteomic profiling technologies could be useful as biomarkers for early diagnosis, as therapeutic targets, and as disease response markers. In this article, we discuss recent progress and challenges in applying quantitative proteomics technologies to biomarker discovery in pancreatic cancer.

17.
Prediction of the diagnostic category of a tissue sample from its gene-expression profile and selection of relevant genes for class prediction have important applications in cancer research. We have developed the uncorrelated shrunken centroid (USC) and error-weighted, uncorrelated shrunken centroid (EWUSC) algorithms that are applicable to microarray data with any number of classes. We show that removing highly correlated genes typically improves classification results using a small set of genes.
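
The idea of removing highly correlated genes can be illustrated with a minimal sketch. It is not the USC or EWUSC algorithm: it only shows one plausible way to filter a ranked gene list, greedily keeping a gene when its absolute Pearson correlation with every gene already kept stays at or below a threshold. The threshold `max_abs_corr`, the gene names, the expression values and the function names are all hypothetical.

```python
import math
from typing import Dict, List, Sequence


def pearson(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length, non-constant sequences."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def drop_correlated(
    ranked_genes: List[str], expression: Dict[str, List[float]], max_abs_corr: float
) -> List[str]:
    """Walk the ranked list and keep a gene only if it is not too correlated with any kept gene."""
    kept: List[str] = []
    for gene in ranked_genes:
        if all(abs(pearson(expression[gene], expression[k])) <= max_abs_corr for k in kept):
            kept.append(gene)
    return kept


# Hypothetical expression values across four samples; gene_2 closely tracks gene_1.
expression = {
    "gene_1": [1.0, 2.0, 3.0, 4.0],
    "gene_2": [2.0, 4.0, 6.0, 8.5],
    "gene_3": [4.0, 1.0, 3.0, 2.0],
}
kept = drop_correlated(["gene_1", "gene_2", "gene_3"], expression, max_abs_corr=0.9)
```

Because the list is processed in rank order, the higher-ranked member of a correlated pair is the one retained.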

18.
Multivariate biomarkers that can predict the effectiveness of targeted therapy in individual patients are highly desired. Previous biomarker discovery studies have largely focused on the identification of single biomarker signatures, aimed at maximizing prediction accuracy. Here, we present a different approach that identifies multiple biomarkers by simultaneously optimizing their predictive power, number of features, and proximity to the drug target in a protein-protein interaction network. To this end, we incorporated NSGA-II, a fast and elitist multi-objective optimization algorithm that is based on the principle of Pareto optimality, into the biomarker discovery workflow. The method was applied to quantitative phosphoproteome data of 19 non-small cell lung cancer (NSCLC) cell lines from a previous biomarker study. The algorithm successfully identified a total of 77 candidate biomarker signatures predicting response to treatment with dasatinib. Through filtering and similarity clustering, this set was trimmed to four final biomarker signatures, which were then validated on an independent set of breast cancer cell lines. All four candidates reached the same good prediction accuracy (83%) as the originally published biomarker. Although the newly discovered signatures were diverse in their composition and in their size, the central protein of the originally published signature, integrin β4 (ITGB4), was also present in all four Pareto signatures, confirming its pivotal role in predicting dasatinib response in NSCLC cell lines. In summary, the method presented here allows for a robust and simultaneous identification of multiple multivariate biomarkers that are optimized for prediction performance, size, and relevance.
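
The principle of Pareto optimality that NSGA-II builds on can be illustrated with a minimal sketch. This is not NSGA-II and not the authors' workflow: it only filters a fixed set of candidate signatures down to the non-dominated ones. All three objectives (prediction error, number of features, network distance to the drug target) are assumed to be minimized, and the candidate names and values are hypothetical.

```python
from typing import Dict, List, Tuple

Objectives = Tuple[float, float, float]  # (error, n_features, distance), all minimized


def dominates(a: Objectives, b: Objectives) -> bool:
    """True if a is no worse than b on every objective and strictly better on at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(candidates: Dict[str, Objectives]) -> List[str]:
    """Return the names of candidates that no other candidate dominates."""
    front = []
    for name, objs in candidates.items():
        others = (o for other_name, o in candidates.items() if other_name != name)
        if not any(dominates(o, objs) for o in others):
            front.append(name)
    return sorted(front)


# Hypothetical candidate signatures: (prediction error, number of features, distance to target).
candidates = {
    "sig_a": (0.17, 12.0, 2.0),
    "sig_b": (0.17, 5.0, 3.0),
    "sig_c": (0.25, 5.0, 3.0),  # dominated by sig_b: same size and distance, higher error
    "sig_d": (0.30, 2.0, 1.0),
}
front = pareto_front(candidates)
```

Several signatures can sit on the front at once because each trades one objective against another, which is why a multi-objective search returns a set of candidates instead of a single best one.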

19.
High-throughput technologies can now identify hundreds of candidate protein biomarkers for any disease with relative ease. However, because there are no assays for the majority of proteins and de novo immunoassay development is prohibitively expensive, few candidate biomarkers are tested in clinical studies. We tested whether the analytical performance of a biomarker identification pipeline based on targeted mass spectrometry would be sufficient for data-dependent prioritization of candidate biomarkers, de novo development of assays and multiplexed biomarker verification. We used a data-dependent triage process to prioritize a subset of putative plasma biomarkers from >1,000 candidates previously identified using a mouse model of breast cancer. Eighty-eight novel quantitative assays based on selected reaction monitoring mass spectrometry were developed, multiplexed and evaluated in 80 plasma samples. Thirty-six proteins were verified as being elevated in the plasma of tumor-bearing animals. The analytical performance of this pipeline suggests that it should support the use of an analogous approach with human samples.

20.
Associating changes in protein levels with the onset of cancer has been widely investigated to identify clinically relevant diagnostic biomarkers. In the present study, we analyzed sera from 205 patients recruited in the United States and Egypt for biomarker discovery using label‐free proteomic analysis by LC‐MS/MS. We performed untargeted proteomic analysis of sera to identify candidate proteins with statistically significant differences between patients with hepatocellular carcinoma (HCC) and patients with liver cirrhosis. We further evaluated the significance of 101 proteins in sera from the same 205 patients through targeted quantitation by MRM on a triple quadrupole mass spectrometer. This led to the identification of 21 candidate protein biomarkers that were significantly altered in both the United States and Egyptian cohorts. Among the 21 candidates, ten were previously reported as HCC‐associated proteins (eight exhibiting trends consistent with our observations), whereas 11 are new candidates discovered by this study. Pathway analysis based on the significant proteins reveals upregulation of the complement and coagulation cascades pathway and downregulation of the antigen processing and presentation pathway in HCC cases versus patients with liver cirrhosis. The results of this study demonstrate the power of combining untargeted and targeted quantitation methods for a comprehensive serum proteomic analysis, to evaluate changes in protein levels and discover novel diagnostic biomarkers. All MS data have been deposited in ProteomeXchange with identifier PXD001171 (http://proteomecentral.proteomexchange.org/dataset/PXD001171).
