Similar literature
1.
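The onset and progression of cancer are closely linked to genetic alterations in the body, which manifest clinically as abnormal symptoms or test indicators. Mining and analyzing the relationships between clinical manifestations and genetic alterations can provide clinical decision support for early diagnosis and precision treatment of cancer. Starting from literature data, using conclusive data to mine the relationships between genes and clinical manifestations is of considerable value. This paper proposes a method for mining relationships between biomedical entities based on Medical Subject Headings (MeSH). The method uses the literature information provided by PubMed and borrows the idea of the vector space model: each entity under study is represented as a vector of MeSH terms, a correction for mutual citation among articles is applied to the results, and relationship mining is thereby converted into mathematical operations between vectors, enabling quantitative analysis. Applied to the study of relationships between clinical manifestations and genes in colorectal cancer, the method yielded 203 genes associated with colorectal cancer and 462 corresponding clinical manifestation-gene relationships. The results were analyzed and validated with gene function and pathway analysis tools such as g:Profiler and KEGG. They show that the MeSH-based literature mining method avoids both the limits that traditional "co-occurrence" methods place on discovering latent relationships and the heavy computation required by complex semantic analysis, offering a new approach to mining latent relationships between biological entities.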
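The vector-space idea in this abstract can be illustrated with a minimal sketch. It assumes that each entity (a gene or a clinical manifestation) is represented by counts of the MeSH headings attached to the articles about it, and that the "mathematical operation between vectors" is cosine similarity; the abstract does not name the exact operation, and the citation-based correction it mentions is not modeled here. The function names and the toy data are hypothetical.

```python
import math
from collections import Counter


def mesh_vector(articles):
    """Build a MeSH-term vector for one entity.

    `articles` is a list of MeSH heading lists, one list per article
    about the entity. Each heading is counted once per article.
    """
    counts = Counter()
    for headings in articles:
        counts.update(set(headings))
    return counts


def cosine(u, v):
    """Cosine similarity between two sparse term vectors (0.0 if either is empty)."""
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical toy data: MeSH headings of articles about a gene and a symptom.
gene_articles = [
    ["Colorectal Neoplasms", "Mutation"],
    ["Colorectal Neoplasms", "Anemia"],
]
symptom_articles = [
    ["Anemia", "Colorectal Neoplasms"],
    ["Anemia", "Hemorrhage"],
]

score = cosine(mesh_vector(gene_articles), mesh_vector(symptom_articles))
print(f"gene-symptom similarity: {score:.3f}")
```

Because the two entities are compared through shared MeSH headings rather than shared articles, a pair can score above zero even if no single article mentions both, which is the advantage over co-occurrence counting that the abstract claims.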

2.
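Objective: To predict and screen tissue-specific genes in colorectal cancer as candidate targets for targeted therapy. Methods: An in-house Python program was used to analyze the tissue specificity of mRNA expression in normal human tissues and colorectal cancer tissue. Combined with a gene set enriched in human embryonic stem cells and with literature mining results, genes potentially related to the onset or progression of colorectal cancer were screened as candidate targets and subjected to pathway analysis and gene enrichment analysis. Results: Four genes that are specific to colorectal cancer tissue and closely related to tumor biology pathways were obtained as candidate targets for further study. Conclusion: Mining microarray data with bioinformatics methods can provide candidate targets for targeted therapy of colorectal cancer and lay a foundation for subsequent drug design.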

3.
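Wang Shuang, Wei Yunwei. 《微生物学通报》 (Microbiology China), 2021, 48(9): 3065-3070
Colorectal cancer is one of the most common cancers in Western countries, and most patients show alterations in the gut microbiota. Studies have shown that the gut microbiota regulates host gene expression through microRNAs, while host microRNAs in turn regulate the growth and gene expression of the microbiota. This review therefore outlines the specific mechanisms of interaction between the gut microbiota and host microRNAs, and the research progress on this interaction in the onset, progression, and treatment of colorectal cancer, providing a theoretical basis for further study of the relationship between the gut microbiota and colorectal cancer.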

4.
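Colorectal cancer is a disease involving multiple risk factors, including genetics, environment, and lifestyle. A growing body of research shows that the gut microbiota plays an important role in the onset and progression of colorectal cancer, and that the symbiosis between the microbiota and the digestive tract is also essential for maintaining a stable intestinal environment. The microbiota plays many roles in inflammation, drug metabolism, and even cancer development; however, changes in microbiota composition caused by infection, diet, or lifestyle can disturb this symbiotic relationship. Likewise, changes in microbiota composition allow some bacterial species to trigger inflammatory responses in the gut or even to cause cancer, substantially affecting the onset and progression of colorectal cancer. This review summarizes the potential links currently known between the gut microbiota and colorectal cancer, focusing on the various reactions in which bacteria participate in the gut, in order to offer more research directions for treating colorectal cancer.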

5.
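Colorectal cancer is a major malignant tumor that occurs frequently in the digestive tract; its incidence and mortality are rising year by year, seriously threatening human life. Although many chemotherapy drugs are widely used in the clinic, their potential toxic side effects and drug resistance lead to poor patient compliance and, in turn, to chemotherapy failure. Highly effective, low-toxicity drugs against colorectal cancer are therefore urgently needed to address this clinical predicament. Monomeric components of traditional Chinese medicine (TCM), the main effector substances through which TCM exerts its pharmacological effects, show prominent advantages in the clinical treatment of colorectal cancer. Compared with synthetic chemical agents, they come from abundant sources and are safer, and they show considerable potential in the prevention and treatment of colorectal cancer. From the perspective of chemically active substances, this article systematically summarizes the anti-colorectal-cancer effects and main molecular mechanisms of active TCM monomers and, in light of the current state of research, offers a preliminary discussion of research on TCM treatment of colorectal cancer, with the aim of providing a theoretical basis and reference for research on and clinical application of TCM monomers against colorectal cancer.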

6.
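This study aimed to identify potential key genes and signaling pathways related to the progression and prognosis of colorectal cancer (CRC). The CRC gene expression dataset GSE106582 was obtained from the GEO database of the US National Center for Biotechnology Information (NCBI). Samples were grouped by PCA, and GEO2R was used for integrated analysis to screen differentially expressed genes (DEGs) between CRC tissue and adjacent non-tumor controls. The DAVID online tool was used for GO analysis and KEGG pathway enrichment analysis of the DEGs as a first assessment of their biological roles. Protein-protein interaction network analysis of the DEGs was based on the STRING database, with Cytoscape used for visualization and for screening key genes. The key genes were assessed by survival analysis and ROC curve diagnostics and validated with dataset GSE21510. A total of 199 DEGs were identified, of which 53 were upregulated and 146 downregulated. Upregulated DEGs were mainly enriched in collagen catabolic process, extracellular matrix disassembly, extracellular matrix-receptor interaction, and the PI3K/AKT signaling pathway; downregulated DEGs were mainly enriched in bicarbonate transport, one-carbon metabolic process, mineral absorption, drug metabolism-cytochrome P450, and nitrogen metabolism. MCODE analysis, survival analysis, and ROC diagnostics together identified three genes, BGN, COL1A2, and TIMP1, that may be related to the onset and progression of CRC; their abnormally high expression in tumor tissue was positively correlated with poorer patient survival, and the GSE21510 validation results agreed with the GSE106582 analysis. By mining CRC microarray data with bioinformatics methods, this study explores the potential pathogenesis of CRC at the gene level and provides a reference and theoretical basis for screening tumor markers and prognostic molecules and for identifying possible drug targets.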

7.
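MicroRNAs (miRNAs) play very important roles in the onset and progression of cancer. To date, the role of miR-92b in colorectal cancer and its mechanisms have not been reported. This study examined the function and potential mechanisms of miR-92b in the onset and progression of colorectal cancer. RT-qPCR showed that miR-92b is significantly overexpressed in human colorectal cancer clinical samples compared with adjacent non-tumor tissue. Using stably transfected cells of the colon cancer cell line SW620 and a subcutaneous tumor model in nude mice, overexpression of miR-92b was found to significantly promote cell proliferation and in vivo tumor growth. miR-92b was also found to exist in secreted form outside cells and in peripheral blood, suggesting that it is a microRNA with secretory properties. In terms of molecular mechanism, c-MYC promotes transcription of miR-92b by regulating its promoter activity, and c-MYC is also highly expressed in colorectal cancer tissue samples. Furthermore, online prediction, reporter plasmid activity assays, and Western blotting confirmed FBXW7 as a novel target gene of miR-92b. Because FBXW7 has been reported as one of the key ubiquitin ligases in the ubiquitin-mediated degradation of c-MYC, these results suggest that a molecular regulatory loop may exist among c-MYC, miR-92b, and FBXW7 in colorectal cancer. In summary, this study offers a new perspective on the function and mechanism of miR-92b in colorectal cancer and a new reference for its use in early diagnosis of colorectal cancer.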

8.
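Objective: The human microbiome is a frontier field in which the world's major countries are competing to establish a position. This study uses bibliometric methods to quantify, from a macro perspective, international research trends and hot topics in the human microbiome field. Methods: Using the PubMed/Medline database, a subject-heading search was used to build a bibliographic database of international "human microbiome" papers containing 33,024 records; the 31,060 records published since 2008 (the most recent 10 years) were selected for topic analysis. The main research characteristics of the field were obtained by bibliometric analysis of the records. High-frequency subject headings (MeSH terms) were divided into four categories (microbial groups, body sites, diseases, and populations), and term-frequency standardization and linear regression were used to analyze hot topics and trends. Co-word analysis was used to quantify the association of subject headings between countries, between microbial groups and diseases, and between diseases and populations, and network graphs were used to analyze the relationships among subject headings and identify hot areas. Results: There were 33,024 research papers in the human microbiome field, 94.1% of which were published in the most recent 10 years (since 2008). US authors published the most papers, accounting for 27.1% (8,414 papers); Chinese authors ranked second (3,182 papers, 10.2%); China and the United States co-authored 432 papers, the highest number of collaborative papers between any two countries. Among hot topics, the mean standardized frequencies of the subject headings for key microbial groups (lactic acid bacteria, Bifidobacterium, archaea, Clostridium difficile, and fungi) were 5.1%, 4.7%, 1.6%, 1.5%, and 1.8%, respectively; those for the main body sites (gut, vagina, oral cavity, and skin) were 12.5%, 1.8%, 1.7%, and 1.6%; and those for the related diseases inflammatory bowel disease, obesity, Crohn's disease, immune system, colorectal cancer, and diabetes were 3.5%, 3.5%, 1.6%, 1.1%, 0.9%, and 0.8%. In terms of associations among key subject headings, relatively close associations were found between Clostridium difficile and both Clostridium infections and dysentery; between Clostridium infections, colorectal cancer, and type 2 diabetes and middle-aged and elderly populations; between ulcerative colitis and Crohn's disease and young and middle-aged populations; and between obesity and immune system diseases and pregnant women and newborns. Conclusion: On the basis of quantitative bibliometric evidence, this study provides an integrated analysis of hot topics and their trends with respect to key microbial groups, major diseases, and key populations in the human microbiome field, and can serve as a reference for future research.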
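The trend analysis described in the Methods can be sketched as follows. This is only an illustration under stated assumptions: "standardized frequency" is taken to mean the share of a year's papers that carry a given subject heading, and the trend is taken to be the ordinary least-squares slope of that share against publication year. The abstract does not define either step precisely, and the function names and toy counts below are hypothetical.

```python
def standardized_frequency(term_counts, total_counts):
    """Share of each year's papers indexed with the term (assumed definition)."""
    return {
        year: term_counts.get(year, 0) / total
        for year, total in total_counts.items()
    }


def ols_slope(points):
    """Ordinary least-squares slope of y against x for (x, y) pairs."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    n = len(points)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("need at least two distinct years")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return sxy / sxx


# Hypothetical toy counts: papers per year, and papers per year with one MeSH term.
total_papers = {2008: 100, 2009: 200, 2010: 400}
term_papers = {2008: 5, 2009: 20, 2010: 60}

freq = standardized_frequency(term_papers, total_papers)
slope = ols_slope(sorted(freq.items()))
print(f"change in standardized frequency per year: {slope:.3f}")
```

Dividing by the yearly total before fitting matters because the field as a whole grew rapidly; a raw count would rise for almost every heading, whereas a positive slope on the share indicates a topic gaining ground relative to the rest of the field.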

9.
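Colorectal cancer is one of the most common malignant tumors; its incidence ranks third among malignant tumors worldwide, and its mortality is rising year by year. China now has the largest annual number of new colorectal cancer cases and deaths in the world. Identifying the mutation status of colorectal cancer genes and precisely classifying the course of onset and progression make individualized precision treatment possible, and precision treatment in turn depends on gene sequencing. At present, next-generation sequencing (NGS) combined with gene capture technology performs parallel sequencing focused on candidate genes or exons of interest, which has greatly expanded knowledge of tumor signature genes and laid a foundation for new treatments and treatment strategies. The integrative cancer genomics database IntOGen has identified 72 driver mutation genes in colorectal cancer, including TP53, KRAS, and PIK3CA; the cancer gene database Cancer Gene Census currently lists 59 mutated genes in colorectal cancer, including the proto-oncogene BRAF and the tumor suppressor gene SMAD4; and the Online Mendelian Inheritance in Man (OMIM) database lists 55 somatically mutated genes associated with colorectal cancer, including SRC and APC. Drawing on 26 Chinese and international publications, this article reviews the consensus genes for mutation testing in colorectal cancer and summarizes mutated gene markers associated with clinicopathological features such as clinical diagnosis, subtyping, prognosis, and treatment.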

10.
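Mining potential disease-causing genes from protein-protein interaction networks on the basis of functional consistency is crucial for understanding disease mechanisms and improving clinical treatment. In this work, genes were associated with diseases on the basis of gene functional consistency and their topological properties in the protein-protein interaction network; the pathogenic risk of genes within disease risk loci was predicted, and candidates were further screened by GO and KEGG functional enrichment analysis to predict new disease genes. Fifty-one new disease genes for coronary heart disease were predicted, and analysis showed that most of them participate in its pathogenesis. The work offers a new approach to mining disease genes and thus aids the study of the mechanisms of complex diseases.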

11.
Gene function annotation remains a key challenge in modern biology. This is especially true for high-throughput techniques such as gene expression experiments. Vital information about genes is available electronically from biomedical literature in the form of full texts and abstracts. In addition, various publicly available databases (such as GenBank, Gene Ontology and Entrez) provide access to gene-related information at different levels of biological organization, granularity and data format. This information is being used to assess and interpret the results from high-throughput experiments. To improve keyword extraction for annotational clustering and other types of analyses, we have developed a novel text mining approach, which is based on keywords identified at the level of gene annotation sentences (in particular sentences characterizing biological function) instead of entire abstracts. Further, to improve the expressiveness and usefulness of gene annotation terms, we investigated the combination of sentence-level keywords with terms from the Medical Subject Headings (MeSH) and Gene Ontology (GO) resources. We find that sentence-level keywords combined with MeSH terms outperform the typical 'baseline' set-up (term frequencies at the level of abstracts) by a significant margin, whereas the addition of GO terms improves matters only marginally. We validated our approach using a manually annotated corpus of 200 abstracts generated from 2 cancer categories and 10 genes per category. We applied the method in the context of three sets of differentially expressed genes obtained from pediatric brain tumor samples. This analysis suggests novel interpretations of discovered gene expression patterns.

12.
MicroRNAs (miRNAs) are short non-coding RNAs that can regulate gene expression during various crucial cell processes such as differentiation, proliferation and apoptosis. Changes in miRNA expression profiles play an important role in the development of many cancers, including colorectal cancer (CRC). Therefore, the identification of cancer-related miRNAs and their target genes is important for cancer biology research. In this paper, we applied a TSK-type recurrent neural fuzzy network (TRNFN) to infer the miRNA–mRNA association network from paired miRNA and mRNA expression profiles of CRC patients. We demonstrated that the proposed method achieved good performance in recovering known, experimentally verified miRNA–mRNA associations. Moreover, our approach proved successful in identifying 17 validated cancer miRNAs which are directly involved in CRC-related pathways. Targeting such miRNAs may help not only to prevent the recurrence of disease but also to control the growth of advanced metastatic tumors. Our regulatory modules provide valuable insights into the pathogenesis of cancer.

13.
In colorectal cancer (CRC), chromosomal instability (CIN) is typically studied using comparative genomic hybridization (CGH) arrays. We studied paired (tumor and surrounding healthy) fresh frozen tissue from 86 CRC patients using Illumina's Infinium-based SNP array. This method allowed us to study CIN in CRC, with simultaneous analysis of copy number (CN) and B-allele frequency (BAF), a representation of allelic composition. These data helped us to detect mono-allelic and bi-allelic amplifications/deletions, copy-neutral loss of heterozygosity, and levels of mosaicism for mixed cell populations, some of which cannot be assessed with other methods that do not measure BAF. We identified associations between CN abnormalities and different CRC phenotypes (histological diagnosis, location, tumor grade, stage, MSI and presence of lymph node metastasis). We showed commonalities between regions of CN change observed in CRC and the regions reported in previous studies of other solid cancers (e.g. amplifications of 20q, 13q, 8q, 5p and deletions of 18q, 17p and 8p). From the Therapeutic Target Database, we identified relevant drugs, targeted to the genes located in these regions with CN changes, that are approved or in trials for other cancers and common diseases. These drugs may be considered for future therapeutic trials in CRC, based on personalized cytogenetic diagnosis. We also found many regions harboring genes that are not currently targeted by any relevant drugs and that may be considered for future drug discovery studies. Our study shows the application of high-density SNP arrays for cytogenetic study in CRC and its potential utility for personalized treatment.

14.
Literature search is a process in which external developers provide alternative representations for efficient data mining of biomedical literature, such as ranking search results, displaying summarized knowledge of semantics and clustering results into topics. In clustering search results, prominent vocabularies such as GO (Gene Ontology) and MeSH (Medical Subject Headings), as well as frequent terms extracted from retrieved PubMed abstracts, have been used as topics for grouping. In this study, we propose the FNeTD (Frequent Nearer Terms of the Domain) method for clustering PubMed abstracts. This is achieved through a two-step process: i) identifying frequent words or phrases in the abstracts through the frequent multi-word extraction algorithm, and ii) identifying nearer terms of the domain from the extracted frequent phrases using nearest-neighbor search. The efficiency of clustering PubMed abstracts using nearer terms of the domain was measured using the F-score. The present study suggests that nearer terms of the domain can be used for clustering the search results.

15.
16.
The cell division control protein (Cdc2) kinase is a catalytic subunit of a protein kinase complex, called the M phase promoting factor, which induces entry into mitosis and is universal among eukaryotes. This protein is believed to play a major role in cell division and control. The lives of biological cells are controlled by proteins interacting in metabolic and signaling pathways, in complexes that replicate genes and regulate gene activity, and in the assembly of the cytoskeletal infrastructure. Our knowledge of protein–protein (P–P) interactions has been accumulated from biochemical and genetic experiments, including the widely used yeast two-hybrid test. In this paper we examine whether P–P interactions in regenerating tissues and cells of the anuran Xenopus laevis can be discovered from biomedical literature using computational and literature mining techniques. Using literature mining techniques, we have identified a set of implicitly interacting proteins in regenerating tissues and cells of Xenopus laevis that may interact with Cdc2 to control cell division. Genome sequence-based bioinformatics tools were then applied to validate a set of proteins that appear to interact with the Cdc2 protein. Pathway analysis of these proteins suggests that Myc proteins function as the regulator of M phase initiation by controlling expression of the Akt1 molecule that ultimately inhibits the Cdc2-cyclin B complex in cells. P–P interactions that appear implicitly in the literature can be effectively discovered using literature mining techniques. By applying evolutionary principles to the P–P interacting pairs, it is possible to quantitatively analyze the significance of the associations with biological relevance. The developed BioMap system allows implicit P–P interactions to be discovered from large quantities of biomedical literature data. The unique similarities and differences observed within the interacting proteins can lead to the development of new hypotheses that can be used to design further laboratory experiments.

17.
MOTIVATION: The recent explosion of interest in mining the biomedical literature for associations between defined entities such as genes, diseases and drugs has made apparent the need for robust methods of identifying occurrences of these entities in biomedical text. Such concept-based indexing is strongly dependent on the availability of a comprehensive ontology or lexicon of biomedical terms. However, such ontologies are very difficult and expensive to construct, and often require extensive manual curation to render them suitable for use by automatic indexing programs. Furthermore, the use of statistically salient noun phrases as surrogates for curated terminology is not without difficulties, due to the lack of high-quality part-of-speech taggers specific to medical nomenclature. RESULTS: We describe a method of improving the quality of automatically extracted noun phrases by employing prior knowledge during the HMM training procedure for the tagger. This enhancement, when combined with appropriate training data, can greatly improve the quality and relevance of the extracted phrases, thereby enabling greater accuracy in downstream literature mining tasks.

18.
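Objective: In recent years, as the volume of biomedical literature has grown rapidly, many implicit patterns and new findings have been buried in the vast body of publications. Applying text mining to the biomedical field makes it possible to integrate and analyze massive biomedical literature data, extract valuable information, and improve understanding of biomedical phenomena. This paper presents a bibliometric analysis of the application of text mining in the biomedical field in China over the past ten years, with the aim of providing a reference for further research by Chinese researchers. Methods: Literature on biomedical text mining formally published in China was retrieved and screened, and analyzed by annual change, regional distribution, research institution, journal source, and research area. Results: The total volume of Chinese biomedical text mining literature shows an upward trend and is concentrated mainly on mining algorithms and on applications of text mining in traditional Chinese medicine and systems biology; research in Beijing, Shanghai, Guangdong, and other regions is in a leading position. Conclusion: Compared with more mature research topics, the application of text mining in biomedicine is still a relatively new field in China, but understanding of the field is steadily improving and research is deepening. A group of core research regions, institutions, and areas has begun to take shape, and further research should bring new vitality to the development of the biomedical field.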

19.
MOTIVATION: We report on the development of a generic text categorization system designed to automatically assign biomedical categories to any input text. Unlike usual automatic text categorization systems, which rely on data-intensive models extracted from large sets of training data, our categorizer is largely data-independent. METHODS: In order to evaluate the robustness of our approach we test the system on two different biomedical terminologies: the Medical Subject Headings (MeSH) and the Gene Ontology (GO). Our lightweight categorizer, based on two ranking modules, combines a pattern matcher and a vector space retrieval engine, and uses both stems and linguistically-motivated indexing units. RESULTS AND CONCLUSION: Results show the effectiveness of phrase indexing for both GO and MeSH categorization, but we observe the categorization power of the tool depends on the controlled vocabulary: precision at high ranks ranges from above 90% for MeSH to <20% for GO, establishing a new baseline for categorizers based on retrieval methods.

20.
Text mining can support the interpretation of the enormous quantity of textual data produced in the biomedical field. Recent developments in biomedical text mining include advances in the reliability of the recognition of named entities (NEs) such as specific genes and proteins, as well as movement toward richer representations of the associations of NEs. We argue that this shift in representation should be accompanied by the adoption of a more detailed model of the relations holding between NEs and other relevant domain terms. As a step toward this goal, we study NE-term relations with the aim of defining a detailed, broadly applicable set of relation types based on accepted domain standard concepts for use in corpus annotation and domain information extraction approaches.
