Similar literature
 Found 20 similar articles (search time: 15 ms)
1.
The scientific literature represents a rich source for retrieval of knowledge on associations between biomedical concepts such as genes, diseases and cellular processes. A commonly used method to establish relationships between biomedical concepts from literature is co-occurrence. Apart from its use in knowledge retrieval, the co-occurrence method is also well-suited to discover new, hidden relationships between biomedical concepts following a simple ABC-principle, in which A and C have no direct relationship, but are connected via shared B-intermediates. In this paper we describe CoPub Discovery, a tool that mines the literature for new relationships between biomedical concepts. Statistical analysis using ROC curves showed that CoPub Discovery performed well over a wide range of settings and keyword thesauri. We subsequently used CoPub Discovery to search for new relationships between genes, drugs, pathways and diseases. Several of the newly found relationships were validated using independent literature sources. In addition, new predicted relationships between compounds and cell proliferation were validated and confirmed experimentally in an in vitro cell proliferation assay. The results show that CoPub Discovery is able to identify novel associations between genes, drugs, pathways and diseases that have a high probability of being biologically valid. This makes CoPub Discovery a useful tool to unravel the mechanisms behind disease, to find novel drug targets, or to find novel applications for existing drugs.
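The ABC-principle in the abstract above can be illustrated with a minimal sketch. It assumes each document has already been reduced to the set of concepts it mentions; the function names and the toy corpus are hypothetical, and the sketch only counts shared intermediates rather than reproducing whatever scoring CoPub Discovery actually uses.

```python
from collections import defaultdict


def cooccurrence(docs):
    """Map each concept to the set of concepts it shares a document with."""
    neighbors = defaultdict(set)
    for concepts in docs:
        for a in concepts:
            for b in concepts:
                if a != b:
                    neighbors[a].add(b)
    return neighbors


def hidden_links(docs, a):
    """Return {C: shared B-intermediates} for concepts C that never co-occur with A."""
    neighbors = cooccurrence(docs)
    direct = neighbors.get(a, set())
    candidates = defaultdict(set)
    for b in direct:                      # B: concepts directly linked to A
        for c in neighbors[b]:            # C: concepts linked to B
            if c != a and c not in direct:
                candidates[c].add(b)      # A and C are connected only via B
    return dict(candidates)


# Toy corpus (invented for illustration): one set of concepts per document.
docs = [
    {"drugX", "geneB1"},
    {"drugX", "geneB2"},
    {"geneB1", "diseaseC"},
    {"geneB2", "diseaseC"},
    {"geneB2", "pathwayD"},
]
links = hidden_links(docs, "drugX")
```

Here `links` maps each candidate C-concept to the B-intermediates that connect it to `drugX`; a real system would presumably weight these links statistically instead of treating every co-occurrence equally.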

2.
We report the production and availability of over 7000 fully sequence verified plasmid ORF clones representing over 3400 unique human genes. These ORF clones were derived using the human MGC collection as template and were produced in two formats: with and without stop codons. Thus, this collection supports the production of either native protein or proteins with fusion tags added to either or both ends. The template clones used to generate this collection were enriched in three ways. First, gene redundancy was removed. Second, clones were selected to represent the best available GenBank reference sequence. Finally, a literature-based software tool was used to evaluate the list of target genes to ensure that it broadly reflected biomedical research interests. The target gene list was compared with 4000 human diseases and over 8500 biological and chemical MeSH classes in approximately 15 million publications recorded in PubMed at the time of analysis. This analysis revealed that, relative to the genome and the MGC collection, this collection is enriched for genes with published associations with a wide range of diseases and biomedical terms, without displaying a particular bias towards any single disease or concept. Thus, this collection is likely to be a powerful resource for researchers who wish to study protein function in a set of genes with documented biomedical significance.

3.

Background  

Microarray data are often used for patient classification and gene selection. An appropriate tool for end users and biomedical researchers should combine user friendliness with statistical rigor, including carefully avoiding selection biases and allowing analysis of multiple solutions, together with access to additional functional information of selected genes. Methodologically, such a tool would be of greater use if it incorporates state-of-the-art computational approaches and makes source code available.

4.
5.
Since the development of technologies that can determine the base-pair sequence of DNA, the ability to sequence genes has contributed much to science and medicine. However, it has remained a relatively costly and laborious process, hindering its use as a routine biomedical tool. The field is now developing rapidly, both in the availability of novel sequencing platforms and in supporting technologies involved in processes such as targeting and data analysis. This is leading to significant reductions in the cost of sequencing a human genome and the potential for its use as a routine biomedical tool. This review is a snapshot of this rapidly moving field, examining the current state of the art, forthcoming developments and some of the issues still to be resolved prior to the use of new sequencing technologies in routine clinical diagnosis.

6.
An important and ongoing focus of biomedical and agricultural avian research is to understand gene function, which for a significant fraction of genes remains unknown. A first step is to determine when and where genes are expressed during development and in the adult. Whole mount in situ hybridization gives precise spatial and temporal resolution of gene expression throughout an embryo, and a comprehensive analysis and centralized repository of in situ hybridization information would provide a valuable research tool. The GEISHA project (gallus expression in situ hybridization analysis) was initiated to explore the utility of using high-throughput in situ hybridization as a means for gene discovery and annotation in chicken embryos, and to provide a unified repository for in situ hybridization information. This report describes the design and implementation of a new GEISHA database and user interface (www.geisha.arizona.edu), and illustrates its utility for researchers in the biomedical and poultry science communities. Results obtained from a high throughput screen of microRNA expression in chicken embryos are also presented.

7.
BACKGROUND: The semantic integration of biomedical resources is still a challenging issue which is required for effective information processing and data analysis. The availability of comprehensive knowledge resources such as biomedical ontologies and integrated thesauri greatly facilitates this integration effort by means of semantic annotation, which allows disparate data formats and contents to be expressed under a common semantic space. In this paper, we propose a multidimensional representation for such a semantic space, where dimensions regard the different perspectives in biomedical research (e.g., population, disease, anatomy and protein/genes). RESULTS: This paper presents a novel method for building multidimensional semantic spaces from semantically annotated biomedical data collections. This method consists of two main processes: knowledge and data normalization. The former one arranges the concepts provided by a reference knowledge resource (e.g., biomedical ontologies and thesauri) into a set of hierarchical dimensions for analysis purposes. The latter one reduces the annotation set associated to each collection item into a set of points of the multidimensional space. Additionally, we have developed a visual tool, called 3D-Browser, which implements OLAP-like operators over the generated multidimensional space. The method and the tool have been tested and evaluated in the context of the Health-e-Child (HeC) project. Automatic semantic annotation was applied to tag three collections of abstracts taken from PubMed, one for each target disease of the project, the Uniprot database, and the HeC patient record database. We adopted the UMLS Meta-thesaurus 2010AA as the reference knowledge resource. CONCLUSIONS: Current knowledge resources and semantic-aware technology make possible the integration of biomedical resources. Such an integration is performed through semantic annotation of the intended biomedical data resources. 
This paper shows how these annotations can be exploited for integration, exploration, and analysis tasks. Results over a real scenario demonstrate the viability and usefulness of the approach, as well as the quality of the generated multidimensional semantic spaces.

8.
Deciphering the genetic basis of human diseases is an important goal of biomedical research. On the basis of the assumption that phenotypically similar diseases are caused by functionally related genes, we propose a computational framework that integrates human protein–protein interactions, disease phenotype similarities, and known gene–phenotype associations to capture the complex relationships between phenotypes and genotypes. We develop a tool named CIPHER to predict and prioritize disease genes, and we show that the global concordance between the human protein network and the phenotype network reliably predicts disease genes. Our method is applicable to genetically uncharacterized phenotypes, effective in the genome‐wide scan of disease genes, and also extendable to explore gene cooperativity in complex diseases. The predicted genetic landscape of over 1000 human phenotypes reveals the global modular organization of phenotype–genotype relationships. The genome‐wide prioritization of candidate genes for over 5000 human phenotypes, including those with under‐characterized disease loci or even those lacking known association, is publicly released to facilitate future discovery of disease genes.

9.

Background

A key challenge in the realm of human disease research is next generation sequencing (NGS) interpretation, whereby identified filtered variant-harboring genes are associated with a patient’s disease phenotypes. This necessitates bioinformatics tools linked to comprehensive knowledgebases. The GeneCards suite databases, which include GeneCards (human genes), MalaCards (human diseases) and PathCards (human pathways) together with additional tools, are presented with the focus on MalaCards utility for NGS interpretation as well as for large scale bioinformatic analyses.

Results

VarElect, our NGS interpretation tool, leverages the broad information in the GeneCards suite databases. MalaCards algorithms unify disease-related terms and annotations from 69 sources. Further, MalaCards defines hierarchical relatedness—aliases, disease families, a related diseases network, categories and ontological classifications. GeneCards and MalaCards delineate and share a multi-tiered, scored gene-disease network, with stringency levels, including the definition of elite status—high quality gene-disease pairs, coming from manually curated trustworthy sources, that includes 4500 genes for 8000 diseases. This unique resource is key to NGS interpretation by VarElect. VarElect, a comprehensive search tool that helps infer both direct and indirect links between genes and user-supplied disease/phenotype terms, is robustly strengthened by the information found in MalaCards. The indirect mode benefits from GeneCards’ diverse gene-to-gene relationships, including SuperPaths—integrated biological pathways from 12 information sources. We are currently adding an important information layer in the form of “disease SuperPaths”, generated from the gene-disease matrix by an algorithm similar to that previously employed for biological pathway unification. This allows the discovery of novel gene-disease and disease–disease relationships. The advent of whole genome sequencing necessitates capacities to go beyond protein coding genes. GeneCards is highly useful in this respect, as it also addresses 101,976 non-protein-coding RNA genes. In a more recent development, we are currently adding an inclusive map of regulatory elements and their inferred target genes, generated by integration from 4 resources.

Conclusions

MalaCards provides a rich big-data scaffold for in silico biomedical discovery within the gene-disease universe. VarElect, which depends significantly on both GeneCards and MalaCards power, is a potent tool for supporting the interpretation of wet-lab experiments, notably NGS analyses of disease. The GeneCards suite has thus transcended its 2-decade role in biomedical research, maturing into a key player in clinical investigation.

10.

Background  

Once specific genes are identified through high throughput genomics technologies, there is a need to sort the final gene list to a manageable size for validation studies. The triaging and sorting of genes often relies on the use of supplemental information related to gene structure, metabolic pathways, and chromosomal location. Yet in disease states where the genes may have no identifiable structural elements, poorly defined metabolic pathways, or limited chromosomal data, flexible systems for obtaining additional data are necessary. In these situations, a tool for searching the biomedical literature using the list of identified genes while simultaneously defining additional search terms would be useful.

11.
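Autophagy is an important and highly conserved protein degradation process in eukaryotes. During this process, organelles, long-lived proteins and other macromolecules in the cell are enclosed by double-membrane autophagosomes and delivered to degradative organelles, where they are degraded and recycled. Autophagy plays an important role in various cellular processes of pathogenic fungi, such as cell differentiation, nutrient homeostasis and pathogenicity. In this review, we briefly introduce the autophagy process and, taking the human pathogenic fungus Cryptococcus neoformans as an example, describe sexual reproduction in pathogenic fungi. We also summarize current research on autophagy-related genes in model pathogenic fungi and the possible mechanisms by which autophagy regulates asexual and sexual reproduction in pathogenic fungi. Finally, we summarize the review and discuss future directions for research on the mechanisms by which autophagy regulates fungal sexual reproduction.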

12.
In the last decade, advances in high-throughput technologies such as DNA microarrays have made it possible to simultaneously measure the expression levels of tens of thousands of genes and proteins. This has resulted in large amounts of biological data requiring analysis and interpretation. Nonnegative matrix factorization (NMF) was introduced as an unsupervised, parts-based learning paradigm involving the decomposition of a nonnegative matrix V into two nonnegative matrices, W and H, via a multiplicative updates algorithm. In the context of a p × n gene expression matrix V consisting of observations on p genes from n samples, each column of W defines a metagene, and each column of H represents the metagene expression pattern of the corresponding sample. NMF has been primarily applied in an unsupervised setting in image and natural language processing. More recently, it has been successfully utilized in a variety of applications in computational biology. Examples include molecular pattern discovery, class comparison and prediction, cross-platform and cross-species analysis, functional characterization of genes and biomedical informatics. In this paper, we review this method as a data analytical and interpretive tool in computational biology with an emphasis on these applications.
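As a rough illustration of the decomposition V ≈ WH described above, the following sketch implements the standard multiplicative updates for the squared-error objective in plain Python. The helper names, the random initialisation, the small `eps` guard against division by zero, and the toy matrix are all assumptions made for this example; it is not taken from the reviewed paper, and a practical analysis would normally use an optimised numerical library.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def frobenius_error(v, w, h):
    """Squared Frobenius distance between V and the product WH."""
    wh = matmul(w, h)
    return sum((x - y) ** 2 for rv, rw in zip(v, wh) for x, y in zip(rv, rw))


def nmf(v, k, iters=200, seed=0, eps=1e-9):
    """Factor a nonnegative p x n matrix V into W (p x k) and H (k x n)."""
    rng = random.Random(seed)
    p, n = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(k)] for _ in range(p)]
    h = [[rng.random() + 0.1 for _ in range(n)] for _ in range(k)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(n)] for i in range(k)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)] for i in range(p)]
    return w, h


# Toy 4-gene x 4-sample matrix with two obvious blocks (invented data).
v = [[5, 4, 0, 0], [4, 5, 0, 0], [0, 0, 3, 4], [0, 0, 4, 3]]
w, h = nmf(v, k=2)
```

With `k=2`, each column of `w` plays the role of a metagene and each column of `h` gives one sample's metagene expression pattern, in the sense used in the abstract. Because the updates only multiply by nonnegative ratios, both factors stay nonnegative throughout.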

13.
14.
MOTIVATION: The problems of analyzing dose effects on gene expression are gaining attention in biomedical research. A specific challenge is to detect genes with expression levels that change according to dose levels in a non-random manner, but nonetheless may be considered as potential biomarkers. METHOD: We are among the first to formally apply a tool that uses an isotonic (monotonic) regression approach to this area of study. We introduce a test statistic to select genes with significant dose-response expression in a monotonic fashion based on a permutation procedure. We then compare the results with those achieved from the application of a likelihood ratio-based test. RESULTS: We apply the isotonic regression approach to a study of gene expression in the RKO colon carcinoma cell line in response to varying dosage levels of the chemotherapeutic agent 5-fluorouracil. A feature of both Affymetrix and printed 75mer oligomer cDNA arrays produced from the same samples provides an opportunity to compare the two microarray platforms. AVAILABILITY: S-PLUS code to implement the method is available from the authors. CONTACT: kcoombes@mdanderson.org
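The general idea of combining an isotonic fit with a permutation procedure can be sketched as follows. The abstract does not give the authors' test statistic, so this example assumes a stand-in: the fraction of variance explained by a non-decreasing least-squares fit (computed with the pool-adjacent-violators algorithm). The function names and the toy expression values are hypothetical, and only the increasing direction is handled.

```python
import random


def pava(y):
    """Pool-adjacent-violators: least-squares non-decreasing fit to y."""
    blocks = []  # each block holds [sum of values, number of values]
    for value in y:
        blocks.append([value, 1])
        # Merge backwards while the previous block mean exceeds the last one.
        while len(blocks) > 1 and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]:
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    fit = []
    for s, c in blocks:
        fit.extend([s / c] * c)
    return fit


def explained_fraction(y):
    """Assumed statistic: share of variance captured by the isotonic fit."""
    mean = sum(y) / len(y)
    total = sum((v - mean) ** 2 for v in y)
    if total == 0:
        return 0.0
    resid = sum((v - f) ** 2 for v, f in zip(y, pava(y)))
    return 1.0 - resid / total


def permutation_pvalue(y, n_perm=2000, seed=0):
    """Estimate how often shuffled dose labels match or beat the observed statistic."""
    rng = random.Random(seed)
    observed = explained_fraction(y)
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if explained_fraction(shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Expression of one hypothetical gene, ordered by increasing dose (invented data).
expression = [1.0, 1.2, 1.1, 1.9, 2.4, 2.3, 3.1, 3.0]
p_value = permutation_pvalue(expression)
```

A decreasing dose response could be tested the same way by negating the values first. In a genome-wide setting the resulting p-values would also need a multiple-testing correction, which this sketch leaves out.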

15.
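Objective: Using existing research results and data, to build an evaluation model for breast cancer susceptibility genes with a multi-objective evaluation method, to analyze and rank other genes closely related to known breast cancer genes, and to present the results as a network. Methods: By analyzing the existing literature and using data from relevant gene databases and published studies, we derived a multi-objective evaluation system for breast cancer susceptibility genes, constructed an evaluation model based on the weighted-sum method, and used Cytoscape software to compute the evaluation results and display them as a network. Results: The evaluation results obtained with the multi-objective model are consistent with existing research findings. Among them, the breast cancer susceptibility gene TopBP1 ranked second, and the known candidate breast cancer susceptibility gene HMMR ranked sixth. Conclusion: The proposed multi-objective evaluation model can accurately evaluate the relationship between candidate genes and breast cancer susceptibility. Used together with the relevant software, the proposed evaluation method will be an effective analytical approach for research on cancer susceptibility genes.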
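A weighted-sum ranking of the kind mentioned in this abstract can be sketched in a few lines. The criteria, weights, gene labels and min-max normalisation below are all assumptions chosen for illustration; the abstract does not list the actual evaluation criteria or weights used in the study.

```python
def weighted_sum_ranking(scores, weights):
    """Rank items by a weighted sum of min-max normalised criterion scores."""
    criteria = list(weights)
    lo = {c: min(s[c] for s in scores.values()) for c in criteria}
    hi = {c: max(s[c] for s in scores.values()) for c in criteria}

    def norm(c, x):
        # Rescale each criterion to [0, 1] so the weights are comparable.
        span = hi[c] - lo[c]
        return 0.0 if span == 0 else (x - lo[c]) / span

    totals = {
        gene: sum(weights[c] * norm(c, s[c]) for c in criteria)
        for gene, s in scores.items()
    }
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Hypothetical candidate genes scored on two hypothetical criteria.
scores = {
    "geneA": {"interaction": 0.9, "coexpression": 0.2},
    "geneB": {"interaction": 0.5, "coexpression": 0.8},
    "geneC": {"interaction": 0.1, "coexpression": 0.4},
}
weights = {"interaction": 0.6, "coexpression": 0.4}
ranking = weighted_sum_ranking(scores, weights)
```

The ranking is sensitive to the chosen weights, which is the usual trade-off of the weighted-sum approach: it is simple and transparent, but the weights encode a judgment about the relative importance of each criterion.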

16.
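Embryonic stem cell research has been one of the most prominent topics in biomedicine since the 1990s, and the recently developed RNA interference technique, which can silence gene expression rapidly and effectively, will become a powerful tool for embryonic stem cell biology. This review covers the mechanism of RNA interference, the methods for applying RNA interference to embryonic stem cell research, and the progress of RNA interference in this field, in order to provide a reference for future research in this area.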

17.
A huge amount of important biomedical information is hidden in the bulk of research articles in biomedical fields. At the same time, the publication of databases of biological information and of experimental datasets generated by high-throughput methods is expanding rapidly, and a wealth of annotated gene databases, chemical, genomic (including microarray datasets), clinical and other types of data repositories are now available on the Web. Thus a current challenge of bioinformatics is to develop targeted methods and tools that integrate scientific literature, biological databases and experimental data for reducing the time of database curation and for accessing evidence, either in the literature or in the datasets, useful for the analysis at hand. Under this scenario, this article reviews the knowledge discovery systems that fuse information from the literature, gathered by text mining, with microarray data for enriching the lists of down- and upregulated genes with elements for biological understanding and for generating and validating new biological hypotheses. Finally, an easy to use and freely accessible tool, GeneWizard, that exploits text mining and microarray data fusion for supporting researchers in discovering gene-disease relationships is described.

18.
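Wavelet transform and biomedical signal processing   Cited by: 6 (self-citations: 0; citations by others: 6)
As an important branch of digital signal processing, the theory and technology of biomedical signal processing have long received close attention from scientists and engineers in China and abroad. The wavelet transform is a signal analysis tool developed in recent years. Considering the characteristics of biomedical signals and of the wavelet transform, this paper explores the prospects for applying the wavelet transform in biomedical signal processing.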

19.
Murine models are an essential tool to study human immune responses and related diseases. However, the use of traditional murine models has been challenged by recent systemic surveys that show discordance between human and model immune responses in their gene expression. This is a significant problem in translational biomedical research for human immunity. Here, we describe evidence-based translation (EBT) to improve the analysis of genomic responses of murine models in the translation to human immune responses. Based on evidence from prior experiments, EBT introduces pseudo variances, penalizes gene expression changes in a model experiment, and finally detects false positive translations of model genomic responses that poorly correlate with human responses. Demonstrated over multiple data sets, EBT significantly improves the agreement of overall responses (up to 56%), experiment-specific responses (up to 143%), and enriched biological contexts (up to 100%) between human and model systems. In addition, we provide the category of genes specifically benefiting from EBT and the factors affecting the performance of EBT. The overall result indicates the usefulness of the proposed computational translation in biomedical research for human immunity using murine models.

20.
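Jiang Wei, Li Xia, Guo Zheng, Rao Shaoqi. 《生物信息学》, 2005, 3(3): 112-115
In-depth study of gene expression regulatory networks facilitates the discovery of molecular drug targets and the development of new drugs, and is an important part of future biomedical research. To address time delays in gene expression regulation, we designed and developed a preliminary software package, ITdGR (Identification of Time-delayed Gene Regulations), that identifies time-delayed regulatory relationships from gene expression profile data. The software has been successfully applied to cell-cycle gene expression profile data from Saccharomyces cerevisiae, and the regulatory relationships it identified agree with existing knowledge. The software provides a convenient and fast tool for reconstructing gene regulatory networks and studying gene expression dynamics.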
