Similar articles
 Found 20 similar articles; search time 31 ms
1.
The cell division control protein (Cdc2) kinase is a catalytic subunit of a protein kinase complex, called the M phase promoting factor, which induces entry into mitosis and is universal among eukaryotes. This protein is believed to play a major role in cell division and control. The lives of biological cells are controlled by proteins interacting in metabolic and signaling pathways, in complexes that replicate genes and regulate gene activity, and in the assembly of the cytoskeletal infrastructure. Our knowledge of protein–protein (P–P) interactions has been accumulated from biochemical and genetic experiments, including the widely used yeast two-hybrid test. In this paper we examine whether P–P interactions in regenerating tissues and cells of the anuran Xenopus laevis can be discovered from the biomedical literature using computational and literature mining techniques. Using literature mining techniques, we have identified a set of implicitly interacting proteins in regenerating tissues and cells of Xenopus laevis that may interact with Cdc2 to control cell division. Genome sequence based bioinformatics tools were then applied to validate a set of proteins that appear to interact with the Cdc2 protein. Pathway analysis of these proteins suggests that Myc proteins function as the regulator of M phase initiation by controlling expression of the Akt1 molecule that ultimately inhibits the Cdc2-cyclin B complex in cells. P–P interactions that appear only implicitly in the literature can be effectively discovered using literature mining techniques. By applying evolutionary principles to the P–P interacting pairs, it is possible to quantitatively analyze the significance of the associations with biological relevance. The developed BioMap system allows implicit P–P interactions to be discovered from large quantities of biomedical literature data. The unique similarities and differences observed within the interacting proteins can lead to the development of new hypotheses that can be used to design further laboratory experiments.

2.
A huge amount of important biomedical information is hidden in the bulk of research articles in biomedical fields. At the same time, the publication of databases of biological information and of experimental datasets generated by high-throughput methods is expanding rapidly, and a wealth of annotated gene databases and chemical, genomic (including microarray datasets), clinical and other types of data repositories are now available on the Web. Thus a current challenge of bioinformatics is to develop targeted methods and tools that integrate scientific literature, biological databases and experimental data, both to reduce the time of database curation and to access evidence, either in the literature or in the datasets, useful for the analysis at hand. Under this scenario, this article reviews the knowledge discovery systems that fuse information from the literature, gathered by text mining, with microarray data in order to enrich the lists of down- and upregulated genes with elements for biological understanding and to generate and validate new biological hypotheses. Finally, an easy to use and freely accessible tool, GeneWizard, that exploits text mining and microarray data fusion to support researchers in discovering gene-disease relationships is described.

3.
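The onset and progression of cancer are closely linked to genetic alterations in the body, which manifest clinically as abnormal symptoms or test indicators. Mining and analyzing the relationships between clinical manifestations and gene alterations can provide clinical decision support for the early diagnosis and precision treatment of cancer. Starting from literature data, using conclusive data to mine the relationships between genes and clinical manifestations is of great significance. This paper proposes a biomedical entity relation mining method based on Medical Subject Headings (MeSH). The method uses the literature information provided by PubMed, borrows the idea of the vector space model, represents the entities under study as vectors of MeSH headings, introduces citation relationships among articles to correct the results, and converts relation mining into mathematical operations between vectors, enabling quantitative analysis. We applied the method to the study of relationships between clinical manifestations and genes in colorectal cancer, obtaining 203 genes related to colorectal cancer and 462 corresponding clinical manifestation-gene relationships. The results were analyzed and validated with gene function and pathway analysis tools such as g:Profiler and KEGG. They show that the MeSH-based literature mining method avoids both the limitation that the traditional "co-occurrence" approach places on discovering latent relationships and the heavy computation required by complex semantic analysis, offering a new idea and method for mining latent relationships between biological entities.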
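
The vector-space idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the vectors are weighted or how the citation-based correction is applied, so the correction is omitted here, raw heading counts are assumed as weights, and cosine similarity is assumed as the "mathematical operation between vectors". The function names and the toy MeSH annotations are hypothetical.

```python
from collections import Counter
from math import sqrt


def mesh_vector(docs):
    """Build a MeSH-heading count vector for one entity.

    `docs` is a list of documents, each given as a list of MeSH headings
    (hypothetical input format; each heading counts once per document).
    """
    counts = Counter()
    for headings in docs:
        counts.update(set(headings))
    return counts


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy data (invented for illustration): documents about a gene and documents
# about a clinical manifestation. The two document sets are disjoint, so the
# entities never co-occur, yet their MeSH profiles can still be compared.
gene_docs = [
    ["Colorectal Neoplasms", "Mutation"],
    ["Colorectal Neoplasms", "Intestinal Obstruction"],
]
symptom_docs = [
    ["Intestinal Obstruction", "Colorectal Neoplasms"],
]

score = cosine(mesh_vector(gene_docs), mesh_vector(symptom_docs))
print(round(score, 3))
```

Because the score depends only on the heading profiles, two entities can be related without ever appearing in the same article, which is the limitation of co-occurrence methods that the abstract says the approach avoids.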

4.
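Objective: In recent years, with the rapid growth of the biomedical literature, a large body of implicit patterns and new knowledge has been buried in a vast sea of publications. Applying text mining to the biomedical domain makes it possible to integrate and analyze massive biomedical literature data, extract valuable information, and improve our understanding of biomedical phenomena. This paper presents a bibliometric analysis of the application of text mining in the biomedical domain in China over the past decade, with the aim of providing a reference for further research in this field by Chinese researchers. Methods: Formally published Chinese literature on biomedical text mining was retrieved and screened, then analyzed by annual trend, regional distribution, research institution, source journal, and research area. Results: The total volume of Chinese biomedical text mining literature is rising, and it concentrates on research into mining algorithms and on applications of text mining in traditional Chinese medicine and systems biology; research in Beijing, Shanghai, Guangdong, and other regions is in the lead. Conclusions: Compared with more mature research topics, the application of text mining in biomedicine is still a relatively new field in China, but domestic understanding of the field is growing and research is deepening. A set of core regions, core institutions, and core research areas has begun to take shape, and further work will surely bring new vitality to the biomedical field.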

5.

Background  

Manual curation of experimental data from the biomedical literature is an expensive and time-consuming endeavor. Nevertheless, most biological knowledge bases still rely heavily on manual curation for data extraction and entry. Text mining software that can semi- or fully automate information retrieval from the literature would thus provide a significant boost to manual curation efforts.

6.
The immense growth of MEDLINE, coupled with the realization that a vast amount of biomedical knowledge is recorded in free-text format, has led to the appearance of a large number of literature mining techniques aiming to extract biomedical terms and their inter-relations from the scientific literature. Ontologies have been extensively utilized in the biomedical domain either as controlled vocabularies or to provide the framework for mapping relations between concepts in biology and medicine. Literature-based approaches and ontologies have been used in the past for the purpose of hypothesis generation in connection with drug discovery. Here, we review the application of literature mining and ontology modeling and traversal to the area of drug repurposing (DR). In recent years, DR has emerged as a noteworthy alternative to the traditional drug development process, in response to the decreased productivity of the biopharmaceutical industry. Thus, systematic approaches to DR have been developed, involving a variety of in silico, genomic and high-throughput screening technologies. Attempts to integrate literature mining with other types of data arising from the use of these technologies, as well as visualization tools assisting in the discovery of novel associations between existing drugs and new indications, will also be presented.

7.
In this paper, we present a novel approach, Bio-IEDM (biomedical information extraction and data mining), that integrates text mining and predictive modeling to analyze biomolecular networks from biomedical literature databases. Our method consists of two phases. In phase 1, we discuss an efficient semisupervised learning approach that automatically extracts biological relationships, such as protein-protein and protein-gene interactions, from the biomedical literature databases to construct the biomolecular network. Our method automatically learns patterns from a few user-supplied seed tuples and then extracts new tuples from the biomedical literature based on the discovered patterns. The derived biomolecular network forms a large scale-free network graph. In phase 2, we present a novel clustering algorithm that analyzes the biomolecular network graph to identify biologically meaningful subnetworks (communities). The clustering algorithm considers the characteristics of scale-free network graphs and is based on the local density of a vertex and its neighborhood functions, which can be used to find more meaningful clusters with different density levels. The experimental results indicate that our approach is very effective in extracting biological knowledge from a huge collection of biomedical literature. The integration of data mining and information extraction provides a promising direction for analyzing the biomolecular network.
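
Phase 1 of this abstract (learn patterns from seed tuples, then extract new tuples with those patterns) can be sketched as one bootstrapping round. This is an illustrative simplification, not the Bio-IEDM algorithm: a "pattern" is assumed to be the literal text between the two entities, entities are assumed to be single word tokens, and the function names and example sentences are hypothetical.

```python
import re


def learn_patterns(sentences, seeds):
    """Collect the literal text found between the two members of each seed tuple."""
    patterns = set()
    for sentence in sentences:
        for left, right in seeds:
            match = re.search(re.escape(left) + r"(.+?)" + re.escape(right), sentence)
            if match:
                patterns.add(match.group(1))
    return patterns


def extract_tuples(sentences, patterns):
    """Apply each learned pattern to find (word, word) pairs it connects."""
    found = set()
    for sentence in sentences:
        for pattern in patterns:
            regex = r"(\w+)" + re.escape(pattern) + r"(\w+)"
            for match in re.finditer(regex, sentence):
                found.add((match.group(1), match.group(2)))
    return found


# Invented sentences for illustration only.
sentences = [
    "Cdc2 interacts with CyclinB in mitosis.",
    "Myc interacts with Max during growth.",
    "Akt1 was measured in cells.",
]
seeds = {("Cdc2", "CyclinB")}

patterns = learn_patterns(sentences, seeds)
tuples = extract_tuples(sentences, patterns)
print(patterns)
print(sorted(tuples))
```

In a fuller system the newly extracted tuples would be fed back in as seeds for further rounds, with some scoring step to filter unreliable patterns; the abstract does not describe those details, so they are left out here.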

8.
A survey of current work in biomedical text mining   Total citations: 3 (self-citations: 0, citations by others: 3)
The volume of published biomedical research, and therefore the underlying biomedical knowledge base, is expanding at an increasing rate. Among the tools that can aid researchers in coping with this information overload are text mining and knowledge extraction. Significant progress has been made in applying text mining to named entity recognition, text classification, terminology extraction, relationship extraction and hypothesis generation. Several research groups are constructing integrated flexible text-mining systems intended for multiple uses. The major challenge of biomedical text mining over the next 5-10 years is to make these systems useful to biomedical researchers. This will require enhanced access to full text, better understanding of the feature space of biomedical literature, better methods for measuring the usefulness of systems to users, and continued cooperation with the biomedical research community to ensure that their needs are addressed.

9.
MOTIVATION: As biomedical researchers are amassing a plethora of information in a variety of forms resulting from the advancements in biomedical research, there is a critical need for innovative information management and knowledge discovery tools to sift through these vast volumes of heterogeneous data and analysis tools. In this paper we present a general model for an information management system that is adaptable and scalable, followed by a detailed design and implementation of one component of the model. The prototype, called BioSifter, was applied to problems in the bioinformatics area. RESULTS: BioSifter was tested using 500 documents obtained from the PubMed database on two biological problems related to genetic polymorphism and extracorporeal shockwave lithotripsy. The results indicate that BioSifter is a powerful tool for biological researchers to automatically retrieve relevant text documents from the biological literature based on their interest profile. The results also indicate that the first stage of the information management process, i.e. data to information transformation, significantly reduces the size of the information space. The filtered data obtained through BioSifter is relevant as well as much smaller in dimension compared to all the retrieved data. This would in turn significantly reduce the complexity associated with the next level transformation, i.e. information to knowledge.

10.
With biomedical literature increasing at a rate of several thousand papers per week, it is impossible to keep abreast of all developments; therefore, automated means to manage the information overload are required. Text mining techniques, which involve the processes of information retrieval, information extraction and data mining, provide a means of solving this. By adding meaning to text, these techniques produce a more structured analysis of textual knowledge than simple word searches, and can provide powerful tools for the production and analysis of systems biology models.

11.
Computational techniques have been adopted in medical and biological systems for a long time. There is no doubt that the development and application of computational methods will be of great help in better understanding biomedical and biological functions. Large numbers of datasets have been produced by biomedical and biological experiments and simulations. In order for researchers to gain knowledge from original data, nontrivial transformation is necessary, which is regarded as a critical link in the chain of knowledge acquisition, sharing, and reuse. Challenges that have been encountered include: how to efficiently and effectively represent human knowledge in formal computing models, how to take advantage of semantic text mining techniques rather than traditional syntactic text mining, and how to handle security issues during knowledge sharing and reuse. This paper summarizes the state of the art in these research directions. We aim to provide readers with an introduction to the major computing themes to be applied to medical and biological research.

12.
Quorum sensing (QS) plays a pivotal role in Pseudomonas aeruginosa’s virulence. This paper reviews experimental results on antimicrobial strategies based on quorum sensing inhibition and discusses current targets in the regulatory network that determines P. aeruginosa biofilm formation and virulence. A bioinformatics framework combining literature mining with information from biomedical ontologies and curated databases was used to create a knowledge network of potential anti-quorum sensing agents for P. aeruginosa. A total of 110 scientific articles, corresponding to 1,004 annotations, have so far been included in the network and are analysed in this work. Information on the most studied agents, QS targets and methods is detailed. This knowledge network offers a unique view of existing strategies for quorum sensing inhibition and their main regulatory targets and may be used to readily access otherwise scattered information and to help generate new testable hypotheses. This knowledge network is publicly available at http://pcquorum.org/.

13.
Bioinformatics     
Bioinformatics is an interdisciplinary field that blends computer science and biostatistics with biological and biomedical sciences such as biochemistry, cell biology, developmental biology, genetics, genomics, and physiology. An important goal of bioinformatics is to facilitate the management, analysis, and interpretation of data from biological experiments and observational studies. The goal of this review is to introduce some of the important concepts in bioinformatics that must be considered when planning and executing a modern biological research study. We review database resources as well as data mining software tools.

14.
The past decade has seen a tremendous growth in the amount of experimental and computational biomedical data, specifically in the areas of genomics and proteomics. This growth is accompanied by an accelerated increase in the number of biomedical publications discussing the findings. In the last few years, there has been a lot of interest within the scientific community in literature-mining tools to help sort through this abundance of literature and find the nuggets of information most relevant and useful for specific analysis tasks. This paper provides a road map to the various literature-mining methods, both in general and within bioinformatics. It surveys the disciplines involved in unstructured-text analysis, categorizes current work in biomedical literature mining with respect to these disciplines, and provides examples of text analysis methods applied towards meeting some of the current challenges in bioinformatics.

15.
Text mining can support the interpretation of the enormous quantity of textual data produced in the biomedical field. Recent developments in biomedical text mining include advances in the reliability of the recognition of named entities (NEs) such as specific genes and proteins, as well as movement toward richer representations of the associations of NEs. We argue that this shift in representation should be accompanied by the adoption of a more detailed model of the relations holding between NEs and other relevant domain terms. As a step toward this goal, we study NE-term relations with the aim of defining a detailed, broadly applicable set of relation types based on accepted domain standard concepts for use in corpus annotation and domain information extraction approaches.

16.
Molecular perturbations provide a powerful toolset for biomedical researchers to scrutinize the contributions of individual molecules in biological systems. Perturbations qualify the context of experimental results and, despite their diversity, share properties in different dimensions in ways that can be formalized. We propose a formal framework to describe and classify perturbations that allows accumulation of knowledge in order to inform the process of biomedical scientific experimentation and target analysis. We apply this framework to develop a novel algorithm for automatic detection and characterization of perturbations in text and show its relevance in the study of gene–phenotype associations and protein–protein interactions in diabetes and cancer. Analyzing perturbations introduces a novel view of the multivariate landscape of biological systems.

17.
18.
MOTIVATION: Protein annotation is a task that describes protein X in terms of topic Y. Usually, this is constructed using information from the biomedical literature. Until now, most literature-based protein annotation work has been done manually by human annotators. However, as the number of biomedical papers grows ever more rapidly, manual annotation becomes more difficult, and there is an increasing need to automate the process. Recently, information extraction (IE) has been used to address this problem. Typically, IE requires pre-defined relations and hand-crafted IE rules or annotated corpora, and these requirements are difficult to satisfy in real-world scenarios such as the biomedical domain. In this article, we describe an IE system that requires only sentences labelled by domain experts as relevant or irrelevant to a given topic. RESULTS: We applied our system to meet the annotation needs of a well-known protein family database; the results show that our IE system can annotate proteins with a set of extracted relations by learning relations and IE rules for disease, function and structure from only relevant and irrelevant sentences.

19.
Although various ontologies and knowledge sources have been developed in recent years to facilitate biomedical research, it is difficult to assimilate information from multiple knowledge sources. To enable researchers to easily gain understanding of a biomedical concept, a biomedical Semantic Web that seamlessly integrates knowledge from biomedical ontologies, publications and patents would be very helpful. In this paper, current research efforts in representing biomedical knowledge in Semantic Web languages are surveyed. Techniques are presented for information retrieval and knowledge discovery from the Semantic Web that extend traditional keyword search and database querying techniques. Finally, some of the challenges that have to be addressed to make the vision of a biomedical Semantic Web a reality are discussed.

20.
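With the continued development of deep sequencing and gene chip technologies, genome, transcriptome, and expression profile data have accumulated in large quantities. To date, the genomes of more than 10 insects have been sequenced, and transcriptome data have been reported for more than 30 insects. Clearly, traditional biostatistical methods cannot handle biological data on this scale. Quantitative change leads to qualitative change: the massive accumulation of biological data has given rise to a new discipline, bioinformatics. Bioinformatics merges the theory and research content of statistics, information science, biology, and other disciplines, and has been widely applied in medicine, basic biology, agricultural science, and entomology. The goals of bioinformatics are data storage, data management, and data mining. Accordingly, building and maintaining biological databases, designing and developing biological software based on methods such as pattern recognition, machine learning, and data mining, and using these tools for in-depth data mining are important research topics in bioinformatics. This paper first briefly introduces the history and current state of bioinformatics and its application in entomology, then reviews research progress in insect genomics and transcriptomics, and finally discusses the prospects for applying bioinformatics in entomological research.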

Copyright©北京勤云科技发展有限公司  京ICP备09084417号