Similar literature
 19 similar documents found
1.
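Application of text mining in integrating protein-disease relationship resources   Cited by: 3 (self-citations: 0; citations by others: 3)
To integrate the large volume of information on human protein-disease relationships in the literature, the authors used text mining and pathway analysis to extract protein-disease associations from PubMed abstracts, then used pathway information from KEGG to build a network linking human proteins and diseases. They also built a query database that users can search by protein name, disease name, or pathway name.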

2.
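Using an integrative analysis of functional genomic information and network topology, this study mined risk genes for atherosclerosis (AS) from gene expression profiles and protein-protein interaction data, offering a new, genome-level perspective on atherosclerosis. Two-stage screening, by differential expression analysis followed by a support vector machine (SVM) classifier, identified risk genes with a relatively high level of confidence. This suggests new ways to study the topological properties of atherosclerosis genes in the network and to link genes to the onset and progression of the disease. The analysis yielded 59 risk genes in macrophage samples and 61 in foam cells. These risk genes share most atherosclerosis-related biological processes and signalling pathways with known disease genes. The approach can also be applied to studying the pathogenic mechanisms of other complex diseases.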

3.
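Physiological activity in the cell is regulated and carried out mainly through protein-protein interactions. A detailed map of the protein-protein interaction network is important for understanding the cell's complex regulatory, metabolic, and signalling pathways. Prediction of new protein-protein interactions has advanced rapidly in recent years. Here, a Bayesian algorithm is combined with associated GO (Gene Ontology) terms to predict protein interactions. Non-redundant protein interaction data were used to examine the properties of GO pairs and to derive the probabilities of GO associations. Positive and negative gold-standard control data confirm that the new method discriminates well between the two classes, with good sensitivity and a very low false-positive rate. Compared with known high-throughput experimental data, the method is highly sensitive and fast. It can also supply new information about protein interactions within the cell, providing a theoretical basis for further experiments.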
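As an illustration of the idea in the abstract above, the sketch below scores a protein pair by summing smoothed log likelihood ratios of its GO-term pairs in a positive versus a negative reference set. It is a naive Bayes style simplification, not the authors' algorithm; the function names, the smoothing parameter `alpha`, and the placeholder protein and GO identifiers are all assumptions made for illustration.

```python
from collections import Counter
from itertools import product
from math import log


def go_pair_counts(pairs, annotations):
    """Count how often each unordered GO-term pair occurs across protein pairs."""
    counts = Counter()
    for a, b in pairs:
        for ga, gb in product(annotations[a], annotations[b]):
            counts[frozenset((ga, gb))] += 1
    return counts


def log_likelihood_ratio(pair, annotations, pos_counts, neg_counts, alpha=1.0):
    """Sum smoothed log ratios of GO-pair frequencies (positive vs. negative set)."""
    pos_total = sum(pos_counts.values())
    neg_total = sum(neg_counts.values())
    vocab = len(set(pos_counts) | set(neg_counts)) or 1
    a, b = pair
    score = 0.0
    for ga, gb in product(annotations[a], annotations[b]):
        key = frozenset((ga, gb))
        # Laplace smoothing keeps unseen GO pairs from producing log(0).
        p_pos = (pos_counts[key] + alpha) / (pos_total + alpha * vocab)
        p_neg = (neg_counts[key] + alpha) / (neg_total + alpha * vocab)
        score += log(p_pos / p_neg)
    return score


# Hypothetical toy data: placeholder proteins and GO labels, not real annotations.
toy_annotations = {"P1": {"GO:A"}, "P2": {"GO:A"}, "P3": {"GO:B"}, "P4": {"GO:C"}}
toy_pos = go_pair_counts([("P1", "P2")], toy_annotations)
toy_neg = go_pair_counts([("P3", "P4")], toy_annotations)
score_pos = log_likelihood_ratio(("P1", "P2"), toy_annotations, toy_pos, toy_neg)
score_neg = log_likelihood_ratio(("P3", "P4"), toy_annotations, toy_pos, toy_neg)
```

A positive score means the pair's GO-term combinations are more typical of the positive set. In practice the decision threshold would be tuned on held-out positive and negative pairs.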

4.
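Objective: To predict novel disease genes for myocardial infarction using an integrated network and a combined screening strategy. Methods: From a systems biology perspective, a compartmentalized, integrated protein-protein interaction network was built from protein subcellular localization information. Novel disease genes for myocardial infarction were then predicted with a new strategy that jointly screens candidate risk genes by their degree of functional consistency with known disease genes and by the strength of their interaction correlation with those genes. Results: Ten novel disease genes for myocardial infarction were predicted (CCL19, CCL25, COMP, CCL11, CCL7, F2, KLKB1, HTR6, ADRB1, BDKRB2). Eight of them (CCL19, CCL25, CCL11, CCL7, F2, KLKB1, ADRB1, BDKRB2) are confirmed by the literature to be closely linked to the onset and progression of myocardial infarction; the other two (COMP, HTR6) still require experimental validation. Conclusion: Ten novel disease genes for myocardial infarction were predicted using an integrated network and a combined strategy. The method offers a new way to explore disease genes for complex diseases and helps clarify their pathogenic mechanisms.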

5.
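In living organisms, genes and other molecules interact to form complex regulatory networks, and life processes all take the form of such networks, from metabolic pathway networks to transcriptional regulatory networks, and from signal transduction networks to protein-protein interaction networks. Network phenomena are therefore central to the complexity of life and one of its main characteristics. This article systematically introduces the Boolean network, linear, differential equation, and Bayesian network models for constructing gene regulatory networks from expression profile data, and analyses and summarizes each of them. It also discusses research on constructing gene regulatory networks from genomic sequence information, protein-protein interaction information, and the biomedical literature, which is a valuable reference for uncovering the complex mechanisms of life at the level of systems biology.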
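Of the model families named in the abstract above, the Boolean network is the simplest to show in code. The sketch below is a generic synchronous Boolean network, not taken from the reviewed article; the three genes A, B, C and their update rules are hypothetical.

```python
def step(state, rules):
    """Synchronously update every gene from the current state."""
    return {gene: bool(rule(state)) for gene, rule in rules.items()}


def trajectory(state, rules, max_steps=20):
    """Iterate until a state repeats; return the visited states and the repeated one."""
    seen = []
    while state not in seen and len(seen) < max_steps:
        seen.append(state)
        state = step(state, rules)
    return seen, state


# Hypothetical three-gene circuit.
toy_rules = {
    "A": lambda s: not s["C"],         # C represses A
    "B": lambda s: s["A"],             # A activates B
    "C": lambda s: s["A"] and s["B"],  # A and B jointly activate C
}

start = {"A": True, "B": False, "C": False}
visited, repeated = trajectory(start, toy_rules)
```

Because the state space is finite, every trajectory eventually enters a fixed point or a cycle. For this toy circuit the start state lies on a cycle, so `repeated` equals `start`.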

6.
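Cytoscape is software widely used to visualize molecular interaction networks. The authors developed PNmerger, a Java-based Cytoscape plugin. Given a protein-protein interaction network, PNmerger automatically annotates the proteins in the network using pathway information from the KEGG database. By comparing the network with the pathways, it finds known pathway components in the network and predicts possible pathway components and pathway cross-talk components. The software can visualize the pathway modules present in the network and display potential cross-talk components that connect different pathways. PNmerger can effectively help experimentalists find important functional clues in a network and design experiments. The PNmerger plugin can be downloaded from http://www.hupo.org.cn/PNmerger.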

7.
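Domains are evolutionarily conserved sequence units and the standard building blocks of protein structure and function. An interaction between two proteins typically involves binding between specific domains, and identifying interacting domains is important for understanding protein function and evolution at the domain level, building protein-protein interaction networks, and analysing biological pathways. To date, further mining of experimental data and computational prediction from a variety of input data have identified a number of interacting or functionally linked domain pairs, from which rich and regularly updated domain interaction databases have been built. This review covers eight computational methods for predicting domain interactions and describes five public domain interaction databases (3DID, iPfam, InterDom, DIMA, and DOMINE) and their latest developments. It also outlines, with examples, applications of domain interactions in the computational prediction and confidence assessment of protein-protein interactions, in protein domain annotation, and in biological pathway analysis.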

8.
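Protein function annotation is one of the core research topics of the post-genomic era, and function prediction methods based on protein-protein interaction networks are attracting growing attention. This paper proposes a protein function prediction method based on Bayesian networks and the confidence of protein-protein interactions. The method builds a Bayesian network prediction model for each protein to be annotated and takes the confidence of protein interactions fully into account. Three-fold cross-validation on a purpose-built budding yeast dataset shows that accounting for interaction confidence during function prediction effectively improves prediction performance. Compared with several existing algorithms, the method gives satisfactory predictions.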

9.
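Liu Yang  Wang Liru  Zhang Yan 《生物信息学》2021,19(4):240-248
To identify prognosis-related subtypes of colon adenocarcinoma by analysing DNA methylation profiles, methylation data for colon adenocarcinoma patients were obtained from the TCGA database. CpG sites significantly associated with prognosis were selected by differential methylation analysis and Cox proportional hazards regression, and consensus clustering identified seven subtypes. Survival analysis and tests of clinical features showed that prognosis differed significantly among the seven subtypes and that subtype characteristics were reflected in multiple clinical features. In addition, a prediction model based on SMO (sequential minimal optimization), built from the differentially methylated sites identified among the seven subtypes, achieved high AUC values for every subtype and was validated on a test set. In summary, the study used bioinformatics algorithms to identify seven colon adenocarcinoma subtypes with different prognoses and mined their specific methylation markers. The results may allow the prognosis of colon adenocarcinoma to be assessed more precisely and suggest new ideas for early diagnosis and treatment planning.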

10.
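Objective: To identify important functional regions in glioma using a network of gene copy number variation (CNV) regions. Methods: A distinctive method of computing co-correlation values across samples was used to relate CNV data to gene data. Using protein-protein interaction relationships as a bridge between CNV regions and genes, a CNV region network was built, and its topological properties were analysed to identify important functional CNV regions in glioma. Results: Eleven candidate important functional CNV regions associated with glioma were identified, and functional annotation and pathway analysis confirmed that these regions are closely linked to glioma. Conclusion: Through the link between genes and phenotypes, CNV regions were connected to genes using the homology, function, interaction, and domain features of known phenotype genes. The functions of CNV regions can be inferred from the functions of these genes, which is important for disease prediction and diagnosis.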

11.
12.
ABSTRACT: BACKGROUND: In this study we explored preeclampsia through a bioinformatics approach. We created a comprehensive genes/proteins dataset by analysing both public proteomic data and text mining of public scientific literature. From this dataset the associated protein-protein interaction network was obtained. Several indexes of centrality were explored for hub detection, as well as statistical enrichment analysis of metabolic pathways and diseases. RESULTS: We confirmed the well known relationship between preeclampsia and cardiovascular diseases but also identified statistically significant relationships with respect to cancer and aging. Moreover, significant metabolic pathways such as apoptosis, cancer and cytokine-cytokine receptor interaction have also been identified by enrichment analysis. We obtained FLT1, VEGFA, FN1, F2 and PGF genes with the highest scores by hubs analysis; however, we also found other genes such as PDIA3, LYN, SH2B2 and NDRG1 with high scores. CONCLUSIONS: The applied methodology not only led to the identification of well known genes related to preeclampsia but also proposed new candidates, poorly explored or completely unknown in the pathogenesis of preeclampsia, which eventually need to be validated experimentally. Moreover, new possible connections were detected between preeclampsia and other diseases that could open new areas of research. More must be done in this area to resolve the identification of unknown interactions of proteins/genes and also for a better integration of metabolic pathways and diseases.
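The abstract above does not say which centrality indexes were used. As one common possibility, the sketch below ranks nodes by degree centrality to pick out hub candidates. The edge list uses hypothetical node names (`g1` to `g5`), and the function names are illustrative only.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Return degree / (n - 1) for every node of an undirected network."""
    neighbors = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-interactions
            neighbors[u].add(v)
            neighbors[v].add(u)
    n = len(neighbors)
    if n < 2:
        return {}
    return {node: len(nb) / (n - 1) for node, nb in neighbors.items()}


def top_hubs(edges, k=1):
    """Return the k nodes with the highest degree centrality (ties by name)."""
    scores = degree_centrality(edges)
    return sorted(scores, key=lambda node: (-scores[node], node))[:k]


# Hypothetical interaction list; these are not real protein interactions.
toy_edges = [("g1", "g2"), ("g1", "g3"), ("g1", "g4"), ("g2", "g3"), ("g4", "g5")]
hubs = top_hubs(toy_edges, k=1)
```

Other indexes, such as betweenness or closeness centrality, can rank the same network differently, which is one reason to compare several of them before naming hubs.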

13.
14.
We present a novel dataset assessing the specificity of protein-protein interactions between 69 transmitter and receiver domains from two-component system (TCS)-signalling pathways. TCS require a conserved protein-protein interaction between partner transmitter and receiver domains for signal transduction. The complex prokaryote Myxococcus xanthus possesses an unusually large number of TCS genes, many of which have no obvious interaction partners. Interactions between TCS domains of M. xanthus were assessed using a yeast two-hybrid assay, in which domains were expressed as both bait and prey translational fusions. LacZ production was monitored as an indicator of protein-protein interaction, and the strength of interactions classified as weak, medium or strong. Two-hundred and fifty-five transmitter-receiver domain interactions were observed (46 strong), allowing identification of potential signalling partners for individual M. xanthus TCS proteins. In addition, the dataset provides interesting 'meta' information. For instance, many strong interactions were identified between different transmitter domain pairs (34) and receiver domain pairs (23), suggesting a surprisingly large degree of heterodimerisation of these domains. Proteins in our dataset that exhibited similar 'profiles' of interactions often shared a similar biological function, suggesting that interaction profiles can provide information on biological function, even considering sets of homologous domains.

15.
Li C  Li Y  Xu J  Lv J  Ma Y  Shao T  Gong B  Tan R  Xiao Y  Li X 《Gene》2011,489(2):119-129
Detection of the synergetic effects between variants, such as single-nucleotide polymorphisms (SNPs), is crucial for understanding the genetic characters of complex diseases. Here, we proposed a two-step approach to detect differentially inherited SNP modules (synergetic SNP units) from a SNP network. First, SNP-SNP interactions are identified based on prior biological knowledge, such as their adjacency on the chromosome or degree of relatedness between the functional relationships of their genes. These interactions form SNP networks. Second, disease-risk SNP modules (or sub-networks) are prioritised by their differentially inherited properties in IBD (Identity by Descent) profiles of affected and unaffected sibpairs. The search process is driven by the disease information and follows the structure of a SNP network. Simulation studies have indicated that this approach achieves high accuracy and a low false-positive rate in the identification of known disease-susceptible SNPs. Applying this method to an alcoholism dataset, we found that flexible patterns of susceptible SNP combinations do play a role in complex diseases, and some known genes were detected through these risk SNP modules. One example is GRM7, a known alcoholism gene successfully detected by a SNP module comprised of two SNPs, but neither of the two SNPs was significantly associated with the disease in single-locus analysis. These identified genes are also enriched in some pathways associated with alcoholism, including the calcium signalling pathway, axon guidance and neuroactive ligand-receptor interaction. The integration of network biology and genetic analysis provides putative functional bridges between genetic variants and candidate genes or pathways, thereby providing new insight into the aetiology of complex diseases.

16.
Groups of distinct but related diseases often share common symptoms, which suggests likely overlaps in underlying pathogenic mechanisms. Identifying the shared pathways and common factors among those disorders can be expected to deepen our understanding of them and help in designing new treatment strategies for them. Neurodegenerative diseases, including Alzheimer's disease (AD), Parkinson's disease (PD) and Huntington's disease (HD), were taken as a case study in this research. Reported susceptibility genes for AD, PD and HD were collected and a human protein-protein interaction network (hPPIN) was used to identify biological pathways related to neurodegeneration. 81 KEGG pathways were found to be correlated with neurodegenerative disorders. 36 out of the 81 are human disease pathways, and the remaining ones are involved in miscellaneous human functional pathways. Cancers and infectious diseases are two major subclasses within the disease group. Apoptosis is one of the most significant functional pathways. Most of the pathways found here are consistent with prior knowledge of neurodegenerative diseases, except two cell communication pathways: adherens and tight junctions. Gene expression analysis showed a high probability that the two pathways were related to neurodegenerative diseases. A combination of common susceptibility genes and hPPIN is an effective method to study shared pathways involved in a group of closely related disorders. Common modules, which might play a bridging role in linking neurodegenerative disorders and the enriched pathways, were identified by clustering analysis. The identified shared pathways and common modules can be expected to yield clues for effective target discovery efforts on neurodegeneration.

17.
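To find potential key genes and signalling pathways associated with the development and prognosis of colorectal cancer, the colorectal cancer gene expression dataset GSE106582 was obtained from the GEO database of the US National Center for Biotechnology Information (NCBI). Samples were grouped by PCA, and GEO2R was used for a combined analysis to screen for genes differentially expressed between colorectal cancer and adjacent normal control tissue. The DAVID online tool was then used for GO analysis and KEGG pathway enrichment analysis of the differentially expressed genes, followed by a preliminary analysis...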

18.
Li BQ  Huang T  Liu L  Cai YD  Chou KC 《PloS one》2012,7(4):e33393
One of the most important and challenging problems in biomedicine and genomics is how to identify the disease genes. In this study, we developed a computational method to identify colorectal cancer-related genes based on (i) the gene expression profiles, and (ii) the shortest path analysis of functional protein association networks. The former has been used to select differentially expressed genes as disease genes for quite a long time, while the latter has been widely used to study the mechanism of diseases. With the existing protein-protein interaction data from STRING (Search Tool for the Retrieval of Interacting Genes), a weighted functional protein association network was constructed. By means of the mRMR (Maximum Relevance Minimum Redundancy) approach, six genes were identified that can distinguish the colorectal tumors and normal adjacent colonic tissues from their gene expression profiles. Meanwhile, according to the shortest path approach, we further found an additional 35 genes, of which some have been reported to be relevant to colorectal cancer and some are very likely to be relevant to it. Interestingly, the genes we identified from both the gene expression profiles and the functional protein association network have more cancer genes than the genes identified from the gene expression profiles alone. Besides, these genes also had greater functional similarity with the reported colorectal cancer genes than the genes identified from the gene expression profiles alone. All these indicate that our method as presented in this paper is quite promising. The method may become a useful tool, or at least play a complementary role to the existing method, for identifying colorectal cancer genes. It has not escaped our notice that the method can be applied to identify the genes of other diseases as well.
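The shortest path step described in the abstract above can be sketched with Dijkstra's algorithm. The abstract does not state how edge weights were derived, so the sketch assumes a cost of `1 - confidence` per edge; the node names, confidence values, and function names are hypothetical.

```python
import heapq


def build_graph(scored_edges):
    """Turn (u, v, confidence) triples into an undirected cost graph."""
    graph = {}
    for u, v, confidence in scored_edges:
        cost = 1.0 - confidence  # assumption: high confidence means low cost
        graph.setdefault(u, {})[v] = cost
        graph.setdefault(v, {})[u] = cost
    return graph


def shortest_path(graph, source, target):
    """Dijkstra's algorithm; returns the node list, or [] if unreachable."""
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    done = set()
    while heap:
        d, node = heapq.heappop(heap)
        if node in done:
            continue
        done.add(node)
        if node == target:
            break
        for nxt, cost in graph.get(node, {}).items():
            nd = d + cost
            if nd < dist.get(nxt, float("inf")):
                dist[nxt] = nd
                prev[nxt] = node
                heapq.heappush(heap, (nd, nxt))
    if target not in dist:
        return []
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]


# Hypothetical network: s1 and s2 stand for seed genes, x and y for other proteins.
toy_scored_edges = [("s1", "x", 0.9), ("x", "s2", 0.9),
                    ("s1", "y", 0.4), ("y", "s2", 0.4), ("s1", "s2", 0.5)]
toy_graph = build_graph(toy_scored_edges)
path = shortest_path(toy_graph, "s1", "s2")
candidates = path[1:-1]  # interior nodes on the path between the two seeds
```

Interior nodes on shortest paths between seed genes become candidates. Here the two-hop route through `x` is cheaper than the direct low-confidence edge, so `x` is selected.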

19.

Background

One of the major goals in gene and protein expression profiling of cancer is to identify biomarkers and build classification models for prediction of disease prognosis or treatment response. Many traditional statistical methods, based on microarray gene expression data alone and individual genes' discriminatory power, often fail to identify biologically meaningful biomarkers thus resulting in poor prediction performance across data sets. Nonetheless, the variables in multivariable classifiers should synergistically interact to produce more effective classifiers than individual biomarkers.

Results

We developed an integrated approach, namely network-constrained support vector machine (netSVM), for cancer biomarker identification with an improved prediction performance. The netSVM approach is specifically designed for network biomarker identification by integrating gene expression data and protein-protein interaction data. We first evaluated the effectiveness of netSVM using simulation studies, demonstrating its improved performance over state-of-the-art network-based methods and gene-based methods for network biomarker identification. We then applied the netSVM approach to two breast cancer data sets to identify prognostic signatures for prediction of breast cancer metastasis. The experimental results show that: (1) network biomarkers identified by netSVM are highly enriched in biological pathways associated with cancer progression; (2) prediction performance is much improved when tested across different data sets. Specifically, many genes related to apoptosis, cell cycle, and cell proliferation, which are hallmark signatures of breast cancer metastasis, were identified by the netSVM approach. More importantly, several novel hub genes, biologically important with many interactions in PPI network but often showing little change in expression as compared with their downstream genes, were also identified as network biomarkers; the genes were enriched in signaling pathways such as TGF-beta signaling pathway, MAPK signaling pathway, and JAK-STAT signaling pathway. These signaling pathways may provide new insight to the underlying mechanism of breast cancer metastasis.

Conclusions

We have developed a network-based approach for cancer biomarker identification, netSVM, resulting in an improved prediction performance with network biomarkers. We have applied the netSVM approach to breast cancer gene expression data to predict metastasis in patients. Network biomarkers identified by netSVM reveal potential signaling pathways associated with breast cancer metastasis, and help improve the prediction performance across independent data sets.
