Similar literature
1.
Hu Jialu, Gao Yiqun, Li Jing, Zheng Yan, Wang Jingru, Shang Xuequn. BMC Bioinformatics, 2019, 20(18): 1-12
Background

Identifying cancer genes is an urgent task: it helps us understand the mechanisms of biochemical processes at the biomolecular level and facilitates the development of bioinformatics. Although a large number of methods for identifying cancer genes have been proposed recently, most of them still draw on only a few types of biological data, and so do not adequately consider the variety of factors that relate genes to diseases.

Results

In this paper, we propose a two-round random walk algorithm for identifying cancer genes based on multiple biological data (TRWR-MB), including a protein-protein interaction (PPI) network, a pathway network, a microRNA similarity network, an lncRNA similarity network, a cancer similarity network and protein complexes. In the first round, all cancer nodes, cancer-related genes, cancer-related microRNAs and cancer-related lncRNAs, associated with any of the cancers, are used as seed nodes, and a random walker walks on a quadruple-layer heterogeneous network constructed from the multiple biological data. The first round aims to select the top-k scoring potential cancer genes. In the second round, the genes, microRNAs and lncRNAs associated with one specific cancer in the corresponding cancer class are used as seed nodes, and the walker walks on a new quadruple-layer heterogeneous network constructed from lncRNAs, microRNAs, cancers and the selected potential cancer genes. After both walks finish, we combine the results of the two rounds of random walk with restart (RWR) into a ranking score for experimental analysis. As a result, a higher area under the receiver operating characteristic curve (AUC) is obtained. In addition, case studies on identifying new cancer genes are presented in the corresponding section.
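The random walk with restart that both rounds rely on can be sketched as follows. This is only an illustrative single-layer version under stated assumptions: the toy network, the node names, the restart probability of 0.5 and the helper names (`build_network`, `random_walk_with_restart`, `top_k`) are all hypothetical, and the abstract does not specify the layer-to-layer transition weights or how the two rounds are combined, so neither is reproduced here.

```python
def build_network(edges):
    """Build an undirected weighted adjacency map from (u, v, weight) triples."""
    adj = {}
    for u, v, w in edges:
        adj.setdefault(u, {})[v] = w
        adj.setdefault(v, {})[u] = w
    return adj


def random_walk_with_restart(adj, seeds, restart=0.5, tol=1e-12, max_iter=1000):
    """Iterate p <- restart * p0 + (1 - restart) * W p until convergence.

    Assumes every node has at least one neighbour with positive weight.
    """
    nodes = list(adj)
    # Restart distribution: uniform over the seed nodes.
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    out_weight = {n: sum(adj[n].values()) for n in nodes}
    for _ in range(max_iter):
        nxt = {n: restart * p0[n] for n in nodes}
        for u in nodes:
            for v, w in adj[u].items():
                # The walker at u moves to v with probability w / out_weight[u].
                nxt[v] += (1.0 - restart) * p[u] * w / out_weight[u]
        change = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if change < tol:
            break
    return p


def top_k(scores, candidates, k):
    """Return the k highest-scoring candidate nodes."""
    return sorted(candidates, key=lambda n: scores[n], reverse=True)[:k]


# Hypothetical toy network: one cancer node, one microRNA, three genes.
toy_network = build_network([
    ("cancer_A", "gene_1", 1.0),
    ("cancer_A", "miRNA_1", 1.0),
    ("miRNA_1", "gene_1", 1.0),
    ("gene_1", "gene_2", 1.0),
    ("gene_2", "gene_3", 1.0),
])
scores = random_walk_with_restart(toy_network, seeds={"cancer_A"})
candidates = top_k(scores, ["gene_1", "gene_2", "gene_3"], k=2)
print(candidates)
```

In a two-round scheme of the kind described, the selected candidates would then form the gene layer of a second, smaller network on which the walk is repeated with cancer-specific seeds.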

Conclusion

In summary, TRWR-MB integrates multiple biological data to identify cancer genes by analyzing the relationship between genes and cancer from a variety of biomolecular perspectives.


2.
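Overview and development trends of microRNA target gene prediction algorithms   Total citations: 6, self-citations: 2, citations by others: 6
microRNAs (miRNAs) are a class of small non-coding RNA molecules about 22 nucleotides (nt) long that are widely present in animal and plant cells; through imperfect complementary pairing with their target genes, they cleave mRNA or inhibit the initiation of translation. Accurately predicting miRNA target genes and correctly understanding the mechanisms by which miRNAs act on their targets have become a focus of current research. The authors review the underlying principles, applicable scope and algorithmic innovations of more than ten commonly used miRNA target gene prediction programs for higher organisms, in order to provide a reference for designers of target prediction algorithms and better theoretical guidance for experimental validation.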

3.
MicroRNAs (miRNAs) are a class of small single-stranded RNAs of about 22 nt that serve as important negative gene regulators. In animals, miRNAs mainly repress protein translation by binding to the 3′ UTR regions of mRNAs with imperfect complementary pairing. Although bioinformatics investigations have produced a number of target prediction tools, all of them share a common shortcoming: a high false positive rate. It is therefore important to further filter the predicted targets. In this paper, based on the miRNA:target duplex, we construct a second-order Hidden Markov Model, implement the Baum-Welch training algorithm and apply this model to further process predicted targets. The classifier is trained on 244 positive and 49 negative miRNA:target interaction pairs and achieves a sensitivity of 72.54%, a specificity of 55.10% and an accuracy of 69.62% in 10-fold cross-validation experiments. To further verify the applicability of the algorithm, previously collected datasets, comprising 195 positive and 38 negative pairs, are used to test it, with consistent results. We believe that our method will provide some guidance for experimental biologists, especially in choosing miRNA targets for validation.
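The three reported percentages are mutually consistent, which a short calculation can illustrate. The counts of 177 true positives and 27 true negatives below are not stated in the abstract; they are back-calculated assumptions chosen because they reproduce the reported figures when the 10 folds are pooled over the 244 positive and 49 negative pairs.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)            # fraction of positives recovered
    specificity = tn / (tn + fp)            # fraction of negatives rejected
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# 244 positives and 49 negatives come from the abstract;
# tp=177 and tn=27 are assumed (back-calculated), not reported values.
sens, spec, acc = classification_metrics(tp=177, fn=244 - 177, tn=27, fp=49 - 27)
print(f"sensitivity={sens:.2%} specificity={spec:.2%} accuracy={acc:.2%}")
# prints "sensitivity=72.54% specificity=55.10% accuracy=69.62%"
```

Because positives outnumber negatives about five to one, the accuracy sits much closer to the sensitivity than to the specificity.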

4.

Background  

Several studies have demonstrated that synthetic lethal genetic interactions between gene mutations provide an indication of functional redundancy between molecular complexes and pathways. These observations help explain the finding that organisms are able to tolerate single gene deletions for a large majority of genes. For example, system-wide gene knockout/knockdown studies in S. cerevisiae and C. elegans revealed non-viable phenotypes for a mere 18% and 10% of the genome, respectively. It has been postulated that the low percentage of essential genes reflects the extensive amount of genetic buffering that occurs within genomes. Consistent with this hypothesis, systematic double-knockout screens in S. cerevisiae and C. elegans show that, on average, 0.5% of tested gene pairs are synthetic sick or synthetic lethal. While knowledge of synthetic lethal interactions provides valuable insight into molecular functionality, testing all combinations of gene pairs represents a daunting task for molecular biologists, as the combinatorial nature of these relationships imposes a large experimental burden. Still, the task of mapping pairwise interactions between genes is essential to discovering functional relationships between molecular complexes and pathways, as they form the basis of genetic robustness. Towards the goal of alleviating the experimental workload, computational techniques that accurately predict genetic interactions can potentially aid in targeting the most likely candidate interactions. Building on previous studies that analyzed properties of network topology to predict genetic interactions, we apply random walks on biological networks to accurately predict pairwise genetic interactions. Furthermore, we incorporate all published non-interactions into our algorithm for measuring the topological relatedness between two genes. We apply our method to S. cerevisiae and C. elegans datasets and, using a decision tree classifier, integrate diverse biological networks and show that our method outperforms established methods.

5.
MOTIVATION: The fast-growing number of protein structures in the Protein Data Bank that carry the information needed to predict protein-protein interaction sites motivates us to develop methods for identifying the residues that participate in protein-protein interactions. We compare a conditional random fields (CRFs)-based method with conventional classification-based methods, which omit the relation between the labels of neighboring residues, to show the advantages of the CRFs-based method in predicting protein-protein interaction sites. RESULTS: The prediction of protein-protein interaction sites is solved as a sequential labeling problem by applying CRFs with features including protein sequence profile and residue accessible surface area. The CRFs-based method achieves performance comparable with state-of-the-art methods when 1276 nonredundant hetero-complex protein chains are used as the training and test set. Experimental results show that the CRFs-based method is a powerful and robust protein-protein interaction site prediction method and can be used to guide biologists in designing specific experiments on proteins. AVAILABILITY: http://www.insun.hit.edu.cn/~mhli/site_CRFs/index.html. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

6.
An algorithm has been developed to improve the success rate in the prediction of the secondary structure of proteins by taking into account the predicted class of the proteins. This method has been called the 'double prediction method' and consists of a first prediction of the secondary structure from a new algorithm which uses parameters of the type described by Chou and Fasman, and the prediction of the class of the proteins from their amino acid composition. These two independent predictions allow one to optimize the parameters calculated over the secondary structure database to provide the final prediction of secondary structure. This method has been tested on 59 proteins in the database (i.e. 10,322 residues) and yields 72% success in class prediction, 61.3% of residues correctly predicted for three states (helix, sheet and coil) and a good agreement between observed and predicted contents in secondary structure.

7.
In this paper, we present a method based on local density and random walks (LDRW) for detecting core-attachment complexes in protein-protein interaction (PPI) networks, whether they are weighted or not. Our LDRW method consists of two stages. First, it finds all the protein-complex cores based on the local density of subnetworks. Then it uses random walks with restart to find the attachment proteins of each detected core and so form complexes. We evaluate the effectiveness of our method using two different yeast PPI networks and validate the biological significance of the predicted protein complexes using known complexes in the Munich Information Center for Protein Sequence (MIPS) and Gene Ontology (GO) databases. We also perform a comprehensive comparison between our method and other existing methods. The results show that our method finds more protein complexes with high biological significance and obtains a significant improvement. Furthermore, our method is able to identify biologically significant overlapping protein complexes.

8.
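Zhang Chao, Zhang Hui, Li Jixin, Gao Hong. 《生物信息学》 (Chinese Journal of Bioinformatics), 2006, 4(3): 128-131
The genetic algorithm (GA) is derived from the laws of evolution in nature and is an adaptive, heuristic, probabilistic, iterative global search algorithm. This paper introduces the basic principles, algorithm and advantages of the GA, and summarizes the models and execution strategies used when applying the GA to protein structure prediction, as well as research progress in combining multiple algorithms to predict protein structure.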
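The basic principle of a GA (selection, crossover and mutation applied iteratively to a population) can be sketched in a few lines. This is a generic illustration, not the method of the reviewed paper: the bit-string encoding, the tournament selection, the elitism step, all parameter values and the toy fitness function (counting 1-bits) are assumptions; a protein structure application would replace the fitness function with an energy or scoring function over conformations.

```python
import random


def genetic_algorithm(fitness, n_bits=20, pop_size=30, generations=40,
                      mutation_rate=0.02, seed=0):
    """Maximise `fitness` over bit strings; returns (best, best-fitness history)."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        history.append(fitness(pop[0]))
        next_pop = [pop[0][:]]  # elitism: the current best always survives
        while len(next_pop) < pop_size:
            # Tournament selection: the fitter of three random individuals.
            parent_a = max(rng.sample(pop, 3), key=fitness)
            parent_b = max(rng.sample(pop, 3), key=fitness)
            # One-point crossover.
            cut = rng.randrange(1, n_bits)
            child = parent_a[:cut] + parent_b[cut:]
            # Bit-flip mutation.
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]
            next_pop.append(child)
        pop = next_pop
    best = max(pop, key=fitness)
    history.append(fitness(best))
    return best, history


# Toy fitness (assumed for illustration): number of 1-bits in the string.
best, history = genetic_algorithm(fitness=sum)
print(history[0], history[-1])
```

Because the best individual is copied unchanged into each new generation, the best fitness recorded in `history` never decreases.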

9.
We used fluorescence recovery after photobleaching (FRAP) and single particle tracking (SPT) techniques to compare diffusion of class I major histocompatibility complex molecules (MHC) on normal and alpha-spectrin-deficient murine erythroleukemia (MEL) cells. Because the cytoskeleton mesh acts as a barrier to lateral mobility of membrane proteins, we expected that diffusion of membrane proteins in alpha-spectrin-deficient MEL cells would differ greatly from that in normal MEL cells. In the event, diffusion coefficients derived from either FRAP or SPT analysis were similar for alpha-spectrin-deficient and normal MEL cells, differing by a factor of approximately 2, on three different timescales: tens of seconds, 1-10 s, and 100 ms. SPT analysis showed that the diffusion of most class I MHC molecules was confined on both cell types. On the normal MEL cells, the mean diagonal length of the confined area was 330 nm with a mean residency time of 40 s. On the alpha-spectrin-deficient MEL cells, the mean diagonal length was 650 nm with a mean residency time of 45 s. Thus there are fewer barriers to lateral diffusion on cytoskeleton mutant MEL cells than on normal MEL cells, but this difference does not strongly affect lateral diffusion on the scales measured here.

10.
11.
Summary. When a very large number of phytosociological types have to be compared, a reduction of the number of relevés is desirable. In this paper a method of relevé selection from given phytosociological tables is suggested. The method is based on a sum of squares criterion. The advantage, in comparison with other selection procedures, is that this method provides a means on the basis of which the efficiency of a relevé selection can be objectively measured. Contribution from the Working Group for Data-Processing in Phytosociology, International Society for Vegetation Science. The work was completed at the Department of Plant Sciences of the University of Western Ontario, London, Canada. We wish to thank Prof. L. Orlóci for the hospitality and the helpful discussions. The work was supported by Italian C.N.R., within the project Promozione qualità dell'ambiente subproject Metodologie matematiche e basi di dati.

12.

Background

Large amounts of data are being generated by high-throughput genome sequencing methods, but the rate of experimental functional characterization falls far behind. To fill the gap between the number of sequences and their annotations, fast and accurate automated annotation methods are required. Many methods, such as GOblet, GOFigure, and Gotcha, are based on BLAST searches. Unfortunately, the sequence coverage of these methods is low because they cannot detect remote homologues. In addition, the lack of annotation specificity underscores the need to improve automated protein function prediction.

Results

We designed a novel automated protein functional assignment method based on the neural response algorithm, which simulates the neuronal behavior of the visual cortex in the human brain. We first predict the most similar target protein for a given query protein and then assign its GO term to the query sequence. When assessed on the test set, our method ranked the actual leaf GO term among the top 5 probable GO terms with an accuracy of 86.93%.

Conclusions

The proposed algorithm is the first instance of a neural response algorithm being used in the biological domain. The use of HMM profiles along with secondary structure information to define the neural response gives our method an edge over other available methods in annotation accuracy. The results of 5-fold cross-validation and the comparison with the PFP and FFPred servers indicate the strong performance of our method. The program, the dataset, and help files are available at http://www.jjwanglab.org/NRProF/.

13.
Developing accurate prognostic models is one of the biggest challenges in “omics”-based cancer research. Here, we propose a novel computational method for identifying dysregulated gene subnetworks as biomarkers to predict cancer recurrence. Applying our method to the DNA methylome of endometrial cancer patients, we identified a subnetwork consisting of differentially methylated (DM) genes and non-differentially methylated genes, termed Epigenetic Connectors (ECs), that are topologically important for connecting the DM genes in a protein-protein interaction network. The ECs are statistically significantly enriched in well-known tumorigenesis and metastasis pathways, and include known epigenetic regulators. Importantly, combining the DMs and ECs as features using a novel random walk procedure, we constructed a support vector machine classifier that significantly improved the prediction accuracy of cancer recurrence and outperformed several alternative methods, demonstrating the effectiveness of our network-based approach.

14.
15.
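MicroRNAs (miRNAs) are effective regulators of many biological processes and act through the quantitative regulation of genes. Emerging evidence indicates that miRNAs are involved in regulating the innate immune response. This regulation helps maintain the balance between the host immune response and the protection of infected tissue. A deeper understanding of how miRNAs regulate the innate immune response will help identify new targets for immune modulation and establish effective miRNA-based therapies. This review focuses on the roles of miRNAs in regulating immune cell development, Toll-like receptor signaling and inflammatory cytokine signaling.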

16.
Sampling rate effects on measurements of correlated and biased random walks   Total citations: 2, self-citations: 0, citations by others: 2
When observing the two-dimensional movement of animals or microorganisms, it is usually necessary to impose a fixed sampling rate, so that observations are made at certain fixed intervals of time and the trajectory is split into a set of discrete steps. A sampling rate that is too small will result in information about the original path and correlation being lost. If random walk models are to be used to predict movement patterns or to estimate parameters to be used in continuum models, then it is essential to be able to quantify and understand the effect of the sampling rate imposed by the observer on real trajectories. We use a velocity jump process with a realistic reorientation model to simulate correlated and biased random walks and investigate the effect of sampling rate on the observed angular deviation, apparent speed and mean turning angle. We discuss a method of estimating the values of the reorientation parameters used in the original random walk from the rediscretized data that assumes a linear relation between sampling time step and the parameter values.
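The loss of path information at coarse sampling can be illustrated with a small simulation. This sketch does not implement the velocity jump process or the reorientation model of the paper; it assumes a simpler discrete correlated random walk with constant speed and Gaussian turning angles, and the function names and parameter values are hypothetical. It shows one of the effects discussed: the apparent speed measured from a subsampled trajectory cannot exceed the true speed.

```python
import math
import random


def simulate_crw(n_steps, speed=1.0, dt=1.0, turn_sd=0.5, seed=0):
    """Simulate a 2-D correlated random walk; returns a list of (x, y) points."""
    rng = random.Random(seed)
    x = y = heading = 0.0
    path = [(x, y)]
    for _ in range(n_steps):
        # The new heading is the old one plus a small random turn (correlation).
        heading += rng.gauss(0.0, turn_sd)
        x += speed * dt * math.cos(heading)
        y += speed * dt * math.sin(heading)
        path.append((x, y))
    return path


def apparent_speed(path, k, dt=1.0):
    """Speed estimated when only every k-th position is observed."""
    sampled = path[::k]
    length = sum(math.dist(a, b) for a, b in zip(sampled, sampled[1:]))
    elapsed = k * dt * (len(sampled) - 1)
    return length / elapsed


path = simulate_crw(n_steps=1000)
speeds = {k: apparent_speed(path, k) for k in (1, 5, 10, 50)}
for k, value in speeds.items():
    print(f"observe every {k:>2} steps: apparent speed {value:.3f}")
```

Each coarser observation interval here is a multiple of the previous one, so by the triangle inequality the measured path length, and hence the apparent speed, can only stay equal or shrink as the interval grows.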

17.
18.
19.

Background

Polygenic diseases are usually caused by the dysfunction of multiple genes. Unravelling such disease genes is crucial to fully understanding the genetic landscape of diseases at the molecular level. With the advent of the ‘omic’ data era, network-based methods have markedly boosted disease gene discovery. However, how to make better use of different types of data for the prediction of disease genes remains a challenge.

Results

In this study, we improved the performance of disease gene prediction by integrating the similarity of disease phenotype, biological function and network topology. First, for each phenotype, a phenotype-specific network was constructed by mapping the phenotype similarity information of the given phenotype onto the protein-protein interaction (PPI) network. Then, we developed a gene gravity-like algorithm to score candidate genes based not only on topological similarity but also on functional similarity. We tested the proposed network and algorithm by conducting leave-one-out and leave-10%-out cross-validation and compared them with state-of-the-art algorithms. The results favored both the phenotype-specific network and the gene gravity-like algorithm. Finally, we tested the predictive capacity of the proposed algorithms on a test gene set derived from the DisGeNET database. In addition, potential disease genes of three polygenic diseases (obesity, prostate cancer and lung cancer) were predicted by the proposed methods. We found that the predicted disease genes are highly consistent with literature and database evidence.

Conclusions

The good performance of phenotype-specific networks indicates that phenotype similarity information has a positive effect on the prediction of disease genes. The proposed gene gravity-like algorithm outperforms the Random Walk with Restart (RWR) algorithm, indicating the predictive capacity gained by combining topological similarity with functional similarity. Our work gives insight into the discovery of disease genes by fusing multiple similarities of genes and diseases.

20.
OrienTM is a software tool that uses an initial definition of transmembrane segments to predict the topology of transmembrane proteins from their sequence. It uses position-specific statistical information for amino acid residues that belong to putative non-transmembrane segments, derived from statistical analysis of non-transmembrane regions of membrane proteins stored in the SwissProt database. Its accuracy compares well with that of other popular existing methods. A web-based version of OrienTM is publicly available at http://biophysics.biol.uoa.gr/OrienTM.
