Similar articles
 20 similar articles found (search time: 15 ms)
1.
The increasing use of gene expression profiling offers great promise in clinical research into disease biology and its treatment. Along with the ability to measure changing expression levels in thousands of genes at once comes the challenge of analyzing and interpreting the vast sets of data generated. Analysis tools are evolving rapidly to meet such challenges. The next step is to interpret observed changes in terms of the biological properties or relationships underlying them. One powerful approach is to make associations between the genes that are under investigation and well-known biochemical or signaling pathways, and further to assess the significance of such associations. Similarly, genes can be mapped to standardized biological categories via an ontology resource. We discuss these approaches and several web-based resources and tools designed to facilitate such analyses. This information can be used to facilitate understanding and to help design more focused experiments for validating the relevance and importance of these biological pathways and processes in human disease and therapeutics.
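
The abstract above mentions assessing the significance of gene-to-pathway associations but does not say which test is used. One common choice is a hypergeometric (one-sided Fisher) test; the following is a minimal sketch under that assumption. The function name and all parameter names are illustrative and do not come from the article.

```python
from math import comb


def hypergeom_enrichment_p(total_genes, pathway_genes, selected_genes, overlap):
    """Probability of seeing at least `overlap` pathway genes when
    `selected_genes` genes are drawn without replacement from
    `total_genes`, of which `pathway_genes` belong to the pathway.

    Assumption: a hypergeometric null model; the source does not name a test.
    """
    max_overlap = min(pathway_genes, selected_genes)
    denom = comb(total_genes, selected_genes)
    # Sum the upper tail P(X = k) for k = overlap .. max_overlap.
    tail = sum(
        comb(pathway_genes, k)
        * comb(total_genes - pathway_genes, selected_genes - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / denom


if __name__ == "__main__":
    # Hypothetical numbers: 20000 genes measured, 150 in the pathway,
    # 300 differentially expressed, 12 of those in the pathway.
    print(hypergeom_enrichment_p(20000, 150, 300, 12))
```

A small p-value suggests that the selected genes hit the pathway more often than chance would predict. When many pathways are tested, a multiple-testing correction would normally follow.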

2.
MOTIVATION: Protein family databases provide a central focus for scientific communities as well as providing useful resources to aid research. However, such resources require constant curation and often become outdated and discontinued. We have developed an ontology-driven system for capturing and managing protein family data that addresses the problems of maintenance and sustainability. RESULTS: Using protein phosphatases and ABC transporters as model protein families, we constructed two protein family database resources around a central DAML+OIL ontology. Each resource contains specialist information about each protein family, providing specialized domain-specific resources based on the same template structure. The formal structure, combined with the extraction of biological data using GO terms, allows for automated update strategies. Despite the functional differences between the two protein families, the ontology model was equally applicable to both, demonstrating the generic nature of the system. AVAILABILITY: The protein phosphatase resource, PhosphaBase, is freely available on the internet (http://www.bioinf.man.ac.uk/phosphabase). The DAML+OIL ontology for the protein phosphatases and the ABC transporters is available on request from the authors. CONTACT: kwolstencroft@cs.man.ac.uk.

3.
Modelling biological processes using workflow and Petri Net models   Total citations: 4, self-citations: 0, citations by others: 4
MOTIVATION: Biological processes can be considered at many levels of detail, ranging from atomic mechanism to general processes such as cell division, cell adhesion or cell invasion. The experimental study of protein function and gene regulation typically provides information at many levels. The representation of hierarchical process knowledge in biology is therefore a major challenge for bioinformatics. To represent high-level processes in the context of their component functions, we have developed a graphical knowledge model for biological processes that supports methods for qualitative reasoning. RESULTS: We assessed eleven diverse models that were developed in the fields of software engineering, business, and biology, to evaluate their suitability for representing and simulating biological processes. Based on this assessment, we combined the best aspects of two models: Workflow/Petri Net and a biological concept model. The Workflow model can represent nesting and ordering of processes, the structural components that participate in the processes, and the roles that they play. It also maps to Petri Nets, which allow verification of formal properties and qualitative simulation. The biological concept model, TAMBIS, provides a framework for describing biological entities that can be mapped to the workflow model. We tested our model by representing malaria parasites invading host erythrocytes, and composed queries, in five general classes, to discover relationships among processes and structural components. We used reachability analysis to answer queries about the dynamic aspects of the model. AVAILABILITY: The model is available at http://smi.stanford.edu/projects/helix/pubs/process-model/.
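
Reachability analysis on a Petri net, as mentioned in the abstract above, can be illustrated with a breadth-first search over markings. The sketch below is a generic toy, not the authors' model: the place names, the two transitions, and the `limit` safeguard are all hypothetical and chosen only to echo the erythrocyte-invasion example.

```python
from collections import deque


def enabled(marking, inputs):
    """A transition is enabled when every input place holds enough tokens."""
    return all(marking.get(place, 0) >= n for place, n in inputs.items())


def fire(marking, inputs, outputs):
    """Return the marking obtained by consuming inputs and producing outputs."""
    new = dict(marking)
    for place, n in inputs.items():
        new[place] = new.get(place, 0) - n
    for place, n in outputs.items():
        new[place] = new.get(place, 0) + n
    return new


def freeze(marking):
    """Hashable form of a marking; places with zero tokens are dropped."""
    return frozenset((p, n) for p, n in marking.items() if n > 0)


def reachable_markings(initial, transitions, limit=10_000):
    """Breadth-first exploration of markings reachable from `initial`.
    `limit` is a safeguard (an assumption) against unbounded nets."""
    start = freeze(initial)
    seen = {start}
    queue = deque([start])
    while queue and len(seen) < limit:
        current = dict(queue.popleft())
        for inputs, outputs in transitions.values():
            if enabled(current, inputs):
                nxt = freeze(fire(current, inputs, outputs))
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
    return seen


def can_mark(initial, transitions, place):
    """Query: is there a reachable marking with a token in `place`?"""
    return any(
        p == place
        for marking in reachable_markings(initial, transitions)
        for p, _ in marking
    )


# Hypothetical toy net: transition name -> (input places, output places).
TRANSITIONS = {
    "attach": ({"free_merozoite": 1, "uninfected_erythrocyte": 1}, {"attached": 1}),
    "invade": ({"attached": 1}, {"invaded": 1}),
}
INITIAL = {"free_merozoite": 1, "uninfected_erythrocyte": 2}

if __name__ == "__main__":
    print(can_mark(INITIAL, TRANSITIONS, "invaded"))
```

Exhaustive search like this only terminates on its own for bounded nets, which is why the sketch carries an explicit cap on the number of explored markings.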

4.
Modular organization of protein interaction networks   Total citations: 6, self-citations: 0, citations by others: 6
MOTIVATION: Accumulating evidence suggests that biological systems are composed of interacting, separable, functional modules. Identifying these modules is essential to understand the organization of biological systems. RESULT: In this paper, we present a framework to identify modules within biological networks. In this approach, the concept of degree is extended from the single vertex to the sub-graph, and a formal definition of module in a network is used. A new agglomerative algorithm was developed to identify modules from the network by combining the new module definition with the relative edge order generated by the Girvan-Newman (G-N) algorithm. A JAVA program, MoNet, was developed to implement the algorithm. Applying MoNet to the yeast core protein interaction network from the database of interacting proteins (DIP) identified 86 simple modules with sizes larger than three proteins. The modules obtained are significantly enriched in proteins with related biological process Gene Ontology terms. A comparison between the MoNet modules and modules defined by Radicchi et al. (2004) indicates that MoNet modules show stronger co-clustering of related genes and are more robust to ties in betweenness values. Further, the MoNet output retains the adjacent relationships between modules and allows the construction of an interaction web of modules providing insight regarding the relationships between different functional modules. Thus, MoNet provides an objective approach to understand the organization and interactions of biological processes in cellular systems. AVAILABILITY: MoNet is available upon request from the authors.
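
The idea of extending degree from a single vertex to a sub-graph can be sketched as follows. The abstract does not give MoNet's formal module definition, so this example substitutes the "weak" community criterion of Radicchi et al. (summed internal degree greater than summed external degree) as a stand-in. The graph, the function names, and that criterion are assumptions for illustration only, not MoNet's algorithm.

```python
def subgraph_degrees(adjacency, nodes):
    """Return (internal, external) degree sums for the sub-graph on `nodes`.

    internal: neighbours inside the node set, summed over its members,
              so every internal edge is counted twice.
    external: neighbours outside the node set, summed over its members.
    """
    nodes = set(nodes)
    internal = 0
    external = 0
    for u in nodes:
        for v in adjacency[u]:
            if v in nodes:
                internal += 1
            else:
                external += 1
    return internal, external


def is_weak_module(adjacency, nodes):
    """Stand-in criterion (Radicchi-style weak community), assumed here:
    the summed internal degree exceeds the summed external degree."""
    internal, external = subgraph_degrees(adjacency, nodes)
    return internal > external


# Hypothetical undirected graph: two triangles joined by the edge c-d.
GRAPH = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c", "e", "f"},
    "e": {"d", "f"},
    "f": {"d", "e"},
}

if __name__ == "__main__":
    print(subgraph_degrees(GRAPH, {"a", "b", "c"}))
    print(is_weak_module(GRAPH, {"c", "d"}))
```

An agglomerative procedure of the kind the abstract describes would repeatedly merge candidate sub-graphs and apply a test like this one to decide when a merged set qualifies as a module.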

5.

Background

Several types of genetic interactions in humans can be directly or indirectly associated with the causal effects of mutations. These interactions are usually based on their co-associations to biological processes, coexistence in cellular locations, coexpression in cell lines, physical interactions and so on. In addition, pathological processes can present similar phenotypes that have mutations either in the same genomic location or in different genomic regions. Therefore, integrative resources for all of these complex interactions can help us prioritize the gene-disease relationships that most deserve study by researchers and physicians.

Results

PhenUMA is a web application that displays biological networks using information from biomedical and biomolecular data repositories. One of its most innovative features is to combine the benefits of semantic similarity methods with the information taken from databases of genetic diseases and biological interactions. More specifically, this tool is useful in studying novel pathological relationships between functionally related genes, merging diseases into clusters that share specific phenotypes or finding diseases related to reported phenotypes.

Conclusions

This framework builds, analyzes and visualizes networks based on both functional and phenotypic relationships. The integration of this information helps in the discovery of alternative pathological roles of genes, biological functions and diseases. PhenUMA represents an advancement toward the use of new technologies for genomics and personalized medicine.

Electronic supplementary material

The online version of this article (doi:10.1186/s12859-014-0375-1) contains supplementary material, which is available to authorized users.

6.
Systemic analysis of available large-scale biological/biomedical data is critical for studying biological mechanisms, and developing novel and effective treatment approaches against diseases. However, different layers of the available data are produced using different technologies and scattered across individual computational resources without any explicit connections to each other, which hinders extensive and integrative multi-omics-based analysis. We aimed to address this issue by developing a new data integration/representation methodology and its application by constructing a biological data resource. CROssBAR is a comprehensive system that integrates large-scale biological/biomedical data from various resources and stores them in a NoSQL database. CROssBAR is enriched with the deep-learning-based prediction of relationships between numerous data entries, which is followed by the rigorous analysis of the enriched data to obtain biologically meaningful modules. These complex sets of entities and relationships are displayed to users via easy-to-interpret, interactive knowledge graphs within an open-access service. CROssBAR knowledge graphs incorporate relevant genes-proteins, molecular interactions, pathways, phenotypes, diseases, as well as known/predicted drugs and bioactive compounds, and they are constructed on-the-fly based on simple non-programmatic user queries. These intensely processed heterogeneous networks are expected to aid systems-level research, especially to infer biological mechanisms in relation to genes, proteins, their ligands, and diseases.

7.
Geodiversity (diversity of the geosphere) incorporates many of the environmental patterns and processes that are considered drivers of biodiversity. Components of geodiversity (climate, topography, geology and hydrology) can be considered in terms of their resource-giving potential, where resources are taken as energy, water, space and nutrients. The total amount of these resources, along with their spatial and temporal variation, is herein proposed as a compound index of geodiversity that has the potential to model broad-scale biodiversity patterns. This paper outlines potential datasets that could be used to represent geodiversity, and then reviews the theoretical links between each element of the proposed compound index of geodiversity (overall resource availability, temporal variation and spatial variation in those resources) and broad-scale patterns of biodiversity. Support for the influence of each of the elements of geodiversity on overall biodiversity patterns was found in the literature, although the majority of relevant research focuses on resource availability, particularly available energy. The links between temporal and spatial variation in resources and biodiversity have been less thoroughly investigated in the literature. For the most part, it was reported that overall resource availability, temporal variation and spatial variation in those resources do not act in isolation in terms of controlling biodiversity. Overall, there are sufficient datasets to calculate the proposed compound index of geodiversity, and evidence in the literature for links between the geographical distribution of biodiversity and each of the elements of the compound index defined. Since data for measuring geodiversity are more spatially consistent and widely available (thanks to satellite remote sensing), geodiversity has potential as a conservation planning tool, especially where biological data are not available or sparsely distributed.

8.
9.
Comparative maps have been a valuable resource for extrapolating biological information among organisms. The relationship between mouse and human maps provides a framework for integrating information from each species and thereby increasing the utility of all available data such as gene location, structure and function. This review describes the various public resources, both databases and web sites, containing genome-wide mouse-human comparative map information available through the World-Wide Web. We will focus on the use and applicability of these resources in their current form and consider future potential directions.

10.
Tyrosine and serine/threonine kinases are essential regulators of cell processes and are important targets for human therapies. Unfortunately, very little is known about specific kinase-substrate relationships, making it difficult to infer meaning from dysregulated phosphoproteomic datasets or for researchers to identify possible kinases that regulate specific or novel phosphorylation sites. The last two decades have seen an explosion in algorithms to extrapolate from what little is known into the larger unknown: predicting kinase relationships with site-specific substrates using a variety of approaches that include the sequence-specificity of kinase catalytic domains and various other factors, such as evolutionary relationships, co-expression, and protein-protein interaction networks. Unfortunately, a number of limitations prevent researchers from easily harnessing these resources, such as loss of resource accessibility, limited information in publishing that results in a poor mapping to a human reference, and not being updated to match the growth of the human phosphoproteome. Here, we propose a methodological framework for publishing predictions in a unified way, which entails ensuring predictions have been run on a current reference proteome, mapping the same substrates and kinases across resources to a common reference, filtering for the human phosphoproteome, and providing methods for updating the resource easily in the future. We applied this framework to three currently available resources, published in the last decade, which provide kinase-specific predictions in the human proteome. Using the unified datasets, we then explore the role of study bias, the emergent network properties of these predictive algorithms, and comparisons within and between predictive algorithms. The combination of the code for unification and analysis, as well as the unified predictions, is available under the resource we named KinPred. We believe this resource will be useful for a wide range of applications and establishes best practices for long-term usability and sustainability for new and existing predictive algorithms.

11.
Recent years have seen a huge increase in the amount of biomedical information that is available in electronic format. Consequently, for biomedical researchers wishing to relate their experimental results to relevant data lurking somewhere within this expanding universe of on-line information, the ability to access and navigate biomedical information sources in an efficient manner has become increasingly important. Natural language and text processing techniques can facilitate this task by making the information contained in textual resources such as MEDLINE more readily accessible and amenable to computational processing. Names of biological entities such as genes and proteins provide critical links between different biomedical information sources and researchers' experimental data. Therefore, automatic identification and classification of these terms in text is an essential capability of any natural language processing system aimed at managing the wealth of biomedical information that is available electronically. To support term recognition in the biomedical domain, we have developed Termino, a large-scale terminological resource for text processing applications, which has two main components: first, a database into which very large numbers of terms can be loaded from resources such as UMLS, and stored together with various kinds of relevant information; second, a finite state recognizer, for fast and efficient identification and mark-up of terms within text. Since many biomedical applications require this functionality, we have made Termino available to the community as a web service, which allows for its integration into larger applications as a remotely located component, accessed through a standardized interface over the web.

12.
13.
14.
PhosphaBase is an ontology-driven database resource containing information on the protein phosphatase family. It is the first public resource dedicated to protein phosphatases, which are enzymes that perform dephosphorylation reactions. In conjunction with the phosphorylation action of protein kinases, phosphatases are involved in important control and communication mechanisms in the cell. They have also been implicated in many human diseases, including diabetes and obesity, cancers, and neurodegenerative conditions. PhosphaBase aims to centralize the growing base of knowledge in the phosphatase research domain. The resource is built around a formal, domain-specific DAML+OIL ontology, and the data are collected from heterogeneous biological sources using Gene Ontology terms as a means of data extraction. The overall ontology-driven architecture provides a robust structure with distinct advantages for sustainability and provides the potential for the development of diagnostic tools, as well as a data repository.

15.
Clustering and correlation analysis techniques have become popular tools for the analysis of data produced by metabolomics experiments. The results obtained from these approaches provide an overview of the interactions between objects of interest. Often in these experiments, one is more interested in information about the nature of these relationships, e.g., cause-effect relationships, than in the actual strength of the interactions. Finding such relationships is of crucial importance as most biological processes can only be understood in this way. Bayesian networks allow representation of these cause-effect relationships among variables of interest in terms of whether and how they influence each other given that a third, possibly empty, group of variables is known. This technique also allows the incorporation of prior knowledge as established from the literature or from biologists. The representation of these relationships as a directed graph is highly intuitive and helps to understand these processes. This paper describes how constraint-based Bayesian networks can be applied to metabolomics data and can be used to uncover the important pathways which play a significant role in the ripening of fresh tomatoes. We also show here how this method of reconstructing pathways is intuitive and performs better than classical techniques. Methods for learning Bayesian network models are powerful tools for the analysis of data of the magnitude generated by metabolomics experiments. They allow one to model cause-effect relationships and help in understanding the underlying processes.

16.
BIOSILICO, 2003, 1(2): 69-80
The information age has made the electronic storage of large amounts of data effortless. The proliferation of documents available on the Internet, corporate intranets, news wires and elsewhere is overwhelming. Search engines only exacerbate this overload problem by making increasingly more documents available in only a few keystrokes. This information overload also exists in the biomedical field, where scientific publications and other forms of text-based data are produced at an unprecedented rate. Text mining is the combined, automated process of analyzing unstructured, natural language text to discover information and knowledge that are typically difficult to retrieve. Here, we focus on text mining as applied to the biomedical literature. We focus in particular on finding relationships among genes, proteins, drugs and diseases, to facilitate an understanding and prediction of complex biological processes. The LitMiner™ system, developed specifically for this purpose, is described in relation to the Knowledge Discovery and Data Mining Cup 2002, which serves as a formal evaluation of the system.

17.
The cytokines/related receptors system represents a complex regulatory network that is involved in those chronic inflammatory processes which lead to many diseases, such as cancers. We developed a Cytokine Receptor Database (CytReD) to collect information on cytokine receptors related to their biological activity, gene data, protein structures and diseases in which these and their ligands are implicated. This large set of information may be used by researchers as well as by physicians or clinicians to identify which cytokines, reported in the literature, are important in a given disease and, therefore, useful for purposes of diagnosis or prognosis. AVAILABILITY: The database is available for free at http://www.cro-m.eu/CytReD/

18.
19.
MOTIVATION: BioPAX is a standard language for representing and exchanging models of biological processes at the molecular and cellular levels. It is widely used by different pathway databases and genomics data analysis software. Currently, the primary source of BioPAX data is direct exports from the curated pathway databases. It is still uncommon for wet-lab biologists to share and exchange pathway knowledge using BioPAX. Instead, pathways are usually represented as informal diagrams in the literature. In order to encourage formal representation of pathways, we describe a software package that allows users to create pathway diagrams using CellDesigner, a user-friendly graphical pathway-editing tool, and save the pathway data in BioPAX Level 3 format. AVAILABILITY: The plug-in is freely available and can be downloaded at ftp://ftp.pantherdb.org/CellDesigner/plugins/BioPAX/ CONTACT: huaiyumi@usc.edu SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

20.
We examined the cumulative prevalences of 22 symptoms thought to reflect immune system function reported in a questionnaire mailed to 7616 Australian twins. The associations between symptoms and demographic variables were expressed in terms of polychoric or polyserial correlations, and a principal components analysis was performed. Factors representing underlying propensities respectively to allergic disease, various minor infections, diseases associated with aging such as arthritis, skin disease, and respiratory tract infection were extracted. Possible processes underlying these symptom clusters and the relative strengths and weaknesses of this type of analysis are discussed.

