1.

Identification of hub genes with diagnostic values in pancreatic cancer by bioinformatics analyses and supervised learning methods

Chunyang Li Xiaoxi Zeng Haopeng Yu Yonghong Gu Wei Zhang 《World journal of surgical oncology》2018,16(1):223

Background

Pancreatic cancer is one of the most lethal tumors, with a poor prognosis and a lack of effective biomarkers for diagnosis and treatment. The aim of this investigation was to identify hub genes in pancreatic cancer that could serve as potential biomarkers for cancer diagnosis and therapy in the future.

Methods

The combination of two expression profiles, GSE16515 and GSE22780, from the Gene Expression Omnibus (GEO) database served as the training set. Differentially expressed genes (DEGs) in the top 25% by variance were identified, and protein-protein interaction (PPI) network analysis was then performed to find candidate genes. Hub genes were further screened by survival and Cox analyses in The Cancer Genome Atlas (TCGA) database. Finally, the hub genes were validated in the GSE15471 dataset from GEO using two supervised learning methods, the k-nearest neighbor (kNN) and random forest algorithms.

Results

After quality control and batch-effect removal on the training set, 181 DEGs in the top 25% by variance were identified as candidate genes. Two hub genes, MMP7 and ITGA2, which correlated with the diagnosis and prognosis of pancreatic cancer, were then selected using the bioinformatics methods described above. Finally, the hub genes distinguished tumor samples from normal tissues with predictive accuracies of 93.59% and 81.31% using the kNN and random forest algorithms, respectively.

Conclusions

All the hub genes were associated with regulation of the tumor microenvironment, which is implicated in tumor proliferation, progression, migration, and metastasis. Our results provide a novel prospect for the diagnosis and treatment of pancreatic cancer, which may have further clinical application.
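The two computational steps named in this abstract, keeping the genes in the top 25% by variance and classifying samples with kNN, can be sketched roughly as follows. This is a minimal illustration and not the authors' pipeline: the function names `top_variance_genes` and `knn_predict`, the Euclidean distance, the choice of `k=3`, and all toy values are assumptions made for the example.

```python
import math
from collections import Counter
from statistics import pvariance


def top_variance_genes(expr, fraction=0.25):
    """Return the genes whose variance across samples is in the top `fraction`.

    expr maps a gene name to its list of expression values (one per sample).
    """
    ranked = sorted(expr, key=lambda gene: pvariance(expr[gene]), reverse=True)
    keep = max(1, int(len(ranked) * fraction))
    return ranked[:keep]


def knn_predict(train_x, train_y, sample, k=3):
    """Label `sample` by majority vote among its k nearest training points."""
    # Euclidean distance is an assumption; the abstract does not name a metric.
    nearest = sorted(
        (math.dist(row, sample), label) for row, label in zip(train_x, train_y)
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy data, for illustration only.
toy_expr = {
    "g1": [1, 1, 1, 1],
    "g2": [0, 10, 0, 10],
    "g3": [2, 3, 2, 3],
    "g4": [5, 5, 5, 6],
}
toy_x = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.1), (5.0, 5.2), (5.1, 4.9), (4.8, 5.0)]
toy_y = ["normal", "normal", "normal", "tumor", "tumor", "tumor"]

selected = top_variance_genes(toy_expr)
predicted = knn_predict(toy_x, toy_y, (5.0, 5.0))
```

In practice each sample vector would hold the expression values of the selected hub genes, and accuracy would be measured on a held-out dataset as the abstract describes.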

2.

Prediction of side chain orientations in proteins by statistical machine learning methods

Yan A Kloczkowski A Hofmann H Jernigan RL 《Journal of biomolecular structure & dynamics》2007,25(3):275-288

We develop ways to predict the side chain orientations of residues within a protein structure by using several different statistical machine learning methods. Here side chain orientation of a given residue i is measured by an angle Omega(i) between the vector pointing from the center of the protein structure to the C(i)(alpha) atom and the vector pointing from the C(i)(alpha) atom to the center of its side chain atoms. To predict the Omega(i) angles, we construct statistical models by using several different methods such as general linear regression, a regression tree and bagging, a neural network, and a support vector machine. The root mean square errors for the different models range only from 36.67 to 37.60 degrees and the correlation coefficients are all between 30% and 34%. The performances of different models in the test set are, thus, quite similar, and show the relative predictive power of these models to be significant in comparison with random side chain orientations.
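The angle Omega(i) defined in this abstract can be computed from three sets of coordinates. The sketch below is an illustration under assumptions: the helper names `centroid` and `omega_angle` are hypothetical, the side-chain center is taken as the unweighted mean of the side-chain atom positions, and the protein center is passed in directly because the abstract does not say how it is computed.

```python
import math


def centroid(points):
    """Unweighted mean of a list of 3D points (an assumed definition of 'center')."""
    n = len(points)
    return tuple(sum(p[axis] for p in points) / n for axis in range(3))


def omega_angle(protein_center, c_alpha, side_chain_atoms):
    """Angle in degrees between (center -> C-alpha) and (C-alpha -> side-chain center)."""
    v1 = tuple(a - c for a, c in zip(c_alpha, protein_center))
    side_center = centroid(side_chain_atoms)
    v2 = tuple(s - a for s, a in zip(side_center, c_alpha))
    dot = sum(x * y for x, y in zip(v1, v2))
    norm = math.sqrt(sum(x * x for x in v1)) * math.sqrt(sum(y * y for y in v2))
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cosine = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cosine))
```

With this convention, an angle near 0 degrees means the side chain points away from the protein center and an angle near 180 degrees means it points toward the center.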

3.

Towards reconstruction of gene networks from expression data by supervised learning (total citations: 2; self-citations: 0; citations by others: 2)

Soinov LA Krestyaninova MA Brazma A 《Genome biology》2003,4(1):R6

4.

Functional impact of missense variants in BRCA1 predicted by supervised learning

Karchin R Monteiro AN Tavtigian SV Carvalho MA Sali A 《PLoS computational biology》2007,3(2):e26

Many individuals tested for inherited cancer susceptibility at the BRCA1 gene locus are discovered to have variants of unknown clinical significance (UCVs). Most UCVs cause a single amino acid residue (missense) change in the BRCA1 protein. They can be biochemically assayed, but such evaluations are time-consuming and labor-intensive. Computational methods that classify and suggest explanations for UCV impact on protein function can complement functional tests. Here we describe a supervised learning approach to classification of BRCA1 UCVs. Using a novel combination of 16 predictive features, the algorithms were applied to retrospectively classify the impact of 36 BRCA1 C-terminal (BRCT) domain UCVs biochemically assayed to measure transactivation function and to blindly classify 54 documented UCVs. Majority vote of three supervised learning algorithms is in agreement with the assay for more than 94% of the UCVs. Two UCVs found deleterious by both the assay and the classifiers reveal a previously uncharacterized putative binding site. Clinicians may soon be able to use computational classifiers such as those described here to better inform patients. These classifiers can be adapted to other cancer susceptibility genes and systematically applied to prioritize the growing number of potential causative loci and variants found by large-scale disease association studies.
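The majority-vote step mentioned in this abstract, and the way its agreement with an assay could be scored, might look like the sketch below. It is illustrative only: the function names `majority_vote` and `agreement_rate` and the labels "deleterious" and "neutral" are assumptions, and the three underlying classifiers are not reproduced here.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label predicted by most classifiers for one variant."""
    return Counter(predictions).most_common(1)[0][0]


def agreement_rate(votes_per_variant, assay_labels):
    """Fraction of variants where the majority vote matches the assay label."""
    matches = sum(
        majority_vote(votes) == assay
        for votes, assay in zip(votes_per_variant, assay_labels)
    )
    return matches / len(assay_labels)
```

With three classifiers and two labels a tie cannot occur, which is one practical reason to vote with an odd number of classifiers.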

5.

Prediction of hot spot residues at protein-protein interfaces by combining machine learning and energy-based methods

Stefano Lise Cedric Archambeau Massimiliano Pontil David T Jones 《BMC bioinformatics》2009,10(1):365

Background

Alanine scanning mutagenesis is a powerful experimental methodology for investigating the structural and energetic characteristics of protein complexes. Individual amino-acids are systematically mutated to alanine and changes in free energy of binding (ΔΔG) measured. Several experiments have shown that protein-protein interactions are critically dependent on just a few residues ("hot spots") at the interface. Hot spots make a dominant contribution to the free energy of binding and if mutated they can disrupt the interaction. As mutagenesis studies require significant experimental efforts, there is a need for accurate and reliable computational methods. Such methods would also add to our understanding of the determinants of affinity and specificity in protein-protein recognition.

6.

Improved prediction of malaria degradomes by supervised learning with SVM and profile kernel

Rui Kuang Jianying Gu Hong Cai Yufeng Wang 《Genetica》2009,137(2):243-243

7.

Improved prediction of malaria degradomes by supervised learning with SVM and profile kernel

Rui Kuang Jianying Gu Hong Cai Yufeng Wang 《Genetica》2009,136(1):189-209

The spread of drug resistance through malaria parasite populations calls for the development of new therapeutic strategies. However, the seemingly promising genomics-driven target identification paradigm is hampered by the weak annotation coverage. To identify potentially important yet uncharacterized proteins, we apply support vector machines using profile kernels, a supervised discriminative machine learning technique for remote homology detection, as a complement to the traditional alignment based algorithms. In this study, we focus on the prediction of proteases, which have long been considered attractive drug targets because of their indispensable roles in parasite development and infection. Our analysis demonstrates that an abundant and complex repertoire is conserved in five Plasmodium parasite species. Several putative proteases may be important components in networks that mediate cellular processes, including hemoglobin digestion, invasion, trafficking, cell cycle fate, and signal transduction. This catalog of proteases provides a short list of targets for functional characterization and rational inhibitor design. Rui Kuang and Jianying Gu have contributed equally to this work. An erratum to this article has been published.

8.

The rise and fall of supervised machine learning techniques

Jensen LJ Bateman A 《Bioinformatics (Oxford, England)》2011,27(24):3331-3332

9.

A highly efficient system to produce infectious human papillomavirus: Elucidation of natural virus-host interactions

《Cell cycle (Georgetown, Tex.)》2013,12(9):1319-1323

A simple, efficient system has been developed to produce high titers of infectious human papillomavirus type 18 (HPV-18) in organotypic raft cultures of primary human keratinocytes (PHKs). Molecular characterization elucidated key early and late events in the reproductive program. The system obviates the need for immortalized cells and allows the analyses of mutant HPV genomes not previously possible. An E6 deletion mutant incapable of causing p53 degradation is defective in viral DNA amplification and capsid protein production. The high levels of p53 protein which accumulated in numerous cells did not lead to apoptosis over a prolonged duration. Time course and metabolic labeling experiments revealed novel interactions with the host. Notably, post-mitotic, differentiated cells are induced by HPV E7 expression to reenter S phase, whereupon host chromosomes replicate, but HPV DNA does not amplify until the cells have progressed to and are arrested in G2 phase. Here, we present data that strongly suggest that the abundant cytoplasmic viral E1^E4 protein is not responsible for this G2 arrest, as described in the literature upon ectopic expression in cell lines. We provide additional insights into the viral life cycle and contrast them to conclusions derived from experiments in cell lines.

10.

Model selection methodology in supervised learning with evolutionary computation (total citations: 4; self-citations: 0; citations by others: 4)

Rowland JJ 《Bio Systems》2003,72(1-2):187-196

The expressive power, powerful search capability, and the explicit nature of the resulting models make evolutionary methods very attractive for supervised learning applications in bioinformatics. However, their characteristics also make them highly susceptible to overtraining or to discovering chance relationships in the data. Identification of appropriate criteria for terminating evolution and for selecting an appropriately validated model is vital. Some approaches that are commonly applied to other modelling methods are not necessarily applicable in a straightforward manner to evolutionary methods. An approach to model selection is presented that is not unduly computationally intensive. To illustrate the issues and the technique two bioinformatic datasets are used, one relating to metabolite determination and the other to disease prediction from gene expression data.

11.

Prediction of protein-protein interactions by docking methods

Smith GR Sternberg MJ 《Current opinion in structural biology》2002,12(1):28-35

Recently, developments have been made in predicting the structure of docked complexes when the coordinates of the components are known. The process generally consists of a stage during which the components are combined rigidly and then a refinement stage. Several rapid new algorithms have been introduced in the rigid docking problem and promising refinement techniques have been developed, based on modified molecular mechanics force fields and empirical measures of desolvation, combined with minimisations that switch on the short-range interactions gradually. There has also been progress in developing a benchmark set of targets for docking and a blind trial, similar to the trials of protein structure prediction, has taken place.

12.

Activity-dependent regulation of receptive field properties of cat area 17 by supervised Hebbian learning.

Y Frégnac D E Shulz 《Journal of neurobiology》1999,41(1):69-82

Most algorithms currently used to model synaptic plasticity in self-organizing cortical networks suppose that the change in synaptic efficacy is governed by the same structuring factor, i.e., the temporal correlation of activity between pre- and postsynaptic neurons. Functional predictions generated by such algorithms have been tested electrophysiologically in the visual cortex of anesthetized and paralyzed cats. Supervised learning procedures were applied at the cellular level to change receptive field (RF) properties during the time of recording of an individual functionally identified cell. The protocols were devised as cellular analogs of the plasticity of RF properties, which is normally expressed during a critical period of postnatal development. We summarize here evidence demonstrating that changes in covariance between afferent input and postsynaptic response imposed during extracellular and intracellular conditioning can acutely induce selective long-lasting up- and down-regulations of visual responses. The functional properties that could be modified in 40% of cells submitted to differential pairing protocols include ocular dominance, orientation selectivity and orientation preference, interocular orientation disparity, and the relative dominance of ON and OFF responses. Since changes in RF properties can be induced in the adult as well, our findings also suggest that similar activity-dependent processes may occur during development and during active phases of learning under the supervision of behavioral attention or contextual signals. Such potential for plasticity in primary visual cortical neurons suggests the existence of a hidden connectivity expressing a wider functional competence than the one revealed at the spiking level. In particular, in the spatial domain the sensory synaptic integration field is larger than the classical discharge field. It can be shaped by supervised learning and its subthreshold extent can be unmasked by the pharmacological blockade of intracortical inhibition.

13.

Diffuse large B-cell lymphoma outcome prediction by gene-expression profiling and supervised machine learning. (total citations: 84; self-citations: 0; citations by others: 84)

Margaret A Shipp Ken N Ross Pablo Tamayo Andrew P Weng Jeffery L Kutok Ricardo C T Aguiar Michelle Gaasenbeek Michael Angelo Michael Reich Geraldine S Pinkus Tane S Ray Margaret A Koval Kim W Last Andrew Norton T Andrew Lister Jill Mesirov Donna S Neuberg Eric S Lander Jon C Aster Todd R Golub 《Nature medicine》2002,8(1):68-74

Diffuse large B-cell lymphoma (DLBCL), the most common lymphoid malignancy in adults, is curable in less than 50% of patients. Prognostic models based on pre-treatment characteristics, such as the International Prognostic Index (IPI), are currently used to predict outcome in DLBCL. However, clinical outcome models identify neither the molecular basis of clinical heterogeneity, nor specific therapeutic targets. We analyzed the expression of 6,817 genes in diagnostic tumor specimens from DLBCL patients who received cyclophosphamide, adriamycin, vincristine and prednisone (CHOP)-based chemotherapy, and applied a supervised learning prediction method to identify cured versus fatal or refractory disease. The algorithm classified two categories of patients with very different five-year overall survival rates (70% versus 12%). The model also effectively delineated patients within specific IPI risk categories who were likely to be cured or to die of their disease. Genes implicated in DLBCL outcome included some that regulate responses to B-cell-receptor signaling, critical serine/threonine phosphorylation pathways and apoptosis. Our data indicate that supervised learning classification techniques can predict outcome in DLBCL and identify rational targets for intervention.

14.

Cerebellar supervised learning revisited: biophysical modeling and degrees-of-freedom control (total citations: 1; self-citations: 0; citations by others: 1)

Kawato M Kuroda S Schweighofer N 《Current opinion in neurobiology》2011,21(5):791-800

The biophysical models of spike-timing-dependent plasticity have explored dynamics with a molecular basis for such computational concepts as coincidence detection, synaptic eligibility trace, and Hebbian learning. Overall, they support different learning algorithms in different brain areas, especially supervised learning in the cerebellum. Because a single spine is physically very small, chemical reactions at it are essentially stochastic, and thus a sensitivity-longevity dilemma exists in synaptic memory. Here, a cascade of excitable and bistable dynamics is proposed to overcome this difficulty. Learning algorithms of all kinds in different brain regions confront difficult generalization problems. To resolve this issue, the degrees of freedom can be controlled by changing the synchronicity of neural firing. In particular, for cerebellar supervised learning, the triangle closed-loop circuit consisting of Purkinje cells, the inferior olive nucleus, and the cerebellar nucleus is proposed as a circuit to optimally control synchronous firing and degrees of freedom in learning.

15.

Combining Pareto-optimal clusters using supervised learning for identifying co-expressed genes

Ujjwal Maulik Anirban Mukhopadhyay Sanghamitra Bandyopadhyay 《BMC bioinformatics》2009,10(1):27

16.

Information-theoretic identification of predictive SNPs and supervised visualization of genome-wide association studies (total citations: 2; self-citations: 0; citations by others: 2)

Bhasi K Zhang L Brazeau D Zhang A Ramanathan M 《Nucleic acids research》2006,34(14):e101

The size, dimensionality and limited range of the data values make visualization of single nucleotide polymorphism (SNP) datasets challenging. The purpose of this study is to evaluate the usefulness of 3D VizStruct, a novel multi-dimensional data visualization technique for SNP datasets capable of identifying informative SNPs in genome-wide association studies. VizStruct is an interactive visualization technique that reduces multi-dimensional data to three dimensions using a combination of the discrete Fourier transform and the Kullback–Leibler divergence. The performance of 3D VizStruct was challenged with several diverse, biologically relevant published datasets including the human lipoprotein lipase (LPL) gene locus, the human Y-chromosome in several populations and a multi-locus genotype dataset of coral samples from four populations. In every case, the SNPs and/or polymorphic markers identified by the 3D VizStruct mapping were predictive of the underlying biology.

17.

Transcription-based prediction of response to IFNbeta using supervised computational methods

Baranzini SE Mousavi P Rio J Caillier SJ Stillman A Villoslada P Wyatt MM Comabella M Greller LD Somogyi R Montalban X Oksenberg JR 《PLoS biology》2005,3(1):e2

18.

Comparison of supervised clustering methods to discriminate genotoxic from non-genotoxic carcinogens by gene expression profiling

van Delft JH van Agen E van Breda SG Herwijnen MH Staal YC Kleinjans JC 《Mutation research》2005,575(1-2):17-33

19.

TargetSpy: a supervised machine learning approach for microRNA target prediction

Martin Sturm Michael Hackenberg David Langenberger Dmitrij Frishman 《BMC bioinformatics》2010,11(1):292

Background

Virtually all currently available microRNA target site prediction algorithms require the presence of a (conserved) seed match to the 5' end of the microRNA. Recently however, it has been shown that this requirement might be too stringent, leading to a substantial number of missed target sites.

20.

Prediction of calcium-binding sites by combining loop-modeling with machine learning

Tianyun Liu Russ B Altman 《BMC structural biology》2009,9(1):72-17

Background

Protein ligand-binding sites in the apo state exhibit structural flexibility. This flexibility often frustrates methods for structure-based recognition of these sites because it leads to the absence of electron density for these critical regions, particularly when they are in surface loops. Methods for recognizing functional sites in these missing loops would be useful for recovering additional functional information.