Similar Articles
20 similar articles found (search time: 20 ms)
1.
KEGG Mapper for inferring cellular functions from protein sequences (total citations: 1; self-citations: 0; citations by others: 1)
KEGG is a reference knowledge base for biological interpretation of large-scale molecular datasets, such as genome and metagenome sequences. It accumulates experimental knowledge about high-level functions of the cell and the organism, represented in terms of KEGG molecular networks, including KEGG pathway maps, BRITE hierarchies, and KEGG modules. By the process called KEGG mapping, a set of protein-coding genes in the genome, for example, can be converted to KEGG molecular networks, enabling interpretation of cellular functions and other high-level features. Here we report a new version of KEGG Mapper, a suite of KEGG mapping tools available at the KEGG website (https://www.kegg.jp/ or https://www.genome.jp/kegg/), together with the KOALA family of tools for automatic assignment of KO (KEGG Orthology) identifiers used in the mapping.
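As a rough illustration of the mapping idea described above, the sketch below groups KO-annotated genes by the pathway maps their K numbers belong to. The KO-to-pathway table is a tiny hand-made stand-in (in practice these links come from the KEGG database itself, e.g. via its REST interface), and the gene names are hypothetical.

```python
from collections import defaultdict

# Tiny hand-made KO -> pathway table, a stand-in for real KEGG content.
KO_TO_PATHWAYS = {
    "K00844": ["map00010"],              # hexokinase -> glycolysis
    "K01810": ["map00010", "map00030"],  # GPI -> glycolysis, pentose phosphate
    "K00927": ["map00010"],              # phosphoglycerate kinase -> glycolysis
}

def kegg_map(ko_annotations):
    """Group annotated genes by the pathway maps their KOs belong to."""
    hits = defaultdict(set)
    for gene, ko in ko_annotations.items():
        for pathway in KO_TO_PATHWAYS.get(ko, []):
            hits[pathway].add(gene)
    return dict(hits)

# Hypothetical gene set; K99999 has no pathway link and is simply dropped.
genes = {"geneA": "K00844", "geneB": "K01810", "geneC": "K99999"}
print(kegg_map(genes))
```

In the real workflow the KO assignments themselves would come from a KOALA tool (e.g. BlastKOALA), and the pathway links from KEGG rather than a hard-coded dictionary.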

2.
3.
Applications of the KEGG database in biosynthesis research (total citations: 1; self-citations: 0; citations by others: 1)
KEGG (Kyoto Encyclopedia of Genes and Genomes) provides an operating platform that uses genomic information (GENES) and chemical information (LIGAND) as building blocks, links genomes to biological systems through metabolic networks (PATHWAY), and then classifies entries into functional hierarchies (BRITE). KEGG also provides software for various omics studies, covering metabolic pathway reconstruction, genetic analysis, and compound comparison. As a comprehensive database, KEGG not only guides the synthesis of bio-based chemicals such as biofuels, drugs, and new materials, but is also applied to increasingly serious environmental problems. This article systematically introduces the structure and functions of the KEGG database and recent progress on its associated tools, and discusses prospects for its application in biosynthesis.

4.
The KEGG pathway maps are widely used as a reference data set for inferring high-level functions of the organism or the ecosystem from its genome or metagenome sequence data. The KEGG modules, which are tighter functional units often corresponding to subpathways in the KEGG pathway maps, are designed for better automation of genome interpretation. Each KEGG module is represented by a simple Boolean expression of KEGG Orthology (KO) identifiers (K numbers), enabling automatic evaluation of the completeness of genes in the genome. Here we focus on metabolic functions and introduce reaction modules for improving annotation and signature modules for inferring metabolic capacity. We also describe how genome annotation is performed in KEGG using the manually created KO database and the computationally generated SSDB database. The resulting KEGG GENES database with KO (K number) annotation is a reference sequence database to be compared for automated annotation and interpretation of newly determined genomes.
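The Boolean-expression evaluation mentioned above can be sketched as follows. This is a deliberately simplified reading of KEGG module definitions: space-separated blocks are treated as AND, and comma-separated alternatives inside a block as OR. The real syntax also has '+' (complex components), '-' (optional components), and nesting, all omitted here; the example definition and KO sets are illustrative, not taken from an actual module.

```python
def module_complete(definition, genome_kos):
    """Evaluate a simplified KEGG-module Boolean definition.

    Space-separated blocks are ANDed; commas inside parentheses are
    alternatives (OR). Returns (satisfied_blocks, total_blocks), so the
    module is complete when the two numbers are equal.
    """
    blocks = definition.split()
    satisfied = 0
    for block in blocks:
        alternatives = block.strip("()").split(",")
        if any(ko in genome_kos for ko in alternatives):
            satisfied += 1
    return satisfied, len(blocks)

# Hypothetical 3-step module: step 2 accepts either of two KOs.
print(module_complete("K00844 (K01810,K06859) K00927", {"K00844", "K06859"}))
```

A genome carrying K00844 and K06859 satisfies two of the three blocks, so this module would be reported as incomplete (2 of 3).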

5.
With the rapid development of bioinformatics, databases of many kinds have emerged, among them the KEGG database. This article describes the main contents and functions of the KEGG database in detail, and illustrates its application to the study of gene transcriptional regulation with an example in which the MatInspector V2.2 tool was used to predict enhancer elements in the 5' transcriptional regulatory region of the NGAL gene.

6.
7.
With the rapid growth of proteomics and the accumulating knowledge of the functional mechanisms of biological macromolecules, massive amounts of protein-protein interaction data have emerged. In response, researchers have developed more than 300 protein-protein interaction databases for storing, displaying, and reusing these data. Protein-protein interaction databases are a valuable resource for systems biology, molecular biology, and clinical drug research. This article divides the databases into three categories: (1) comprehensive protein interaction databases; (2) species-specific protein interaction databases; and (3) biological pathway databases. It focuses on commonly used protein interaction databases, including BioGRID, STRING, IntAct, MINT, DIP, IMEx, HPRD, Reactome, and KEGG.

8.

Background  

Large-scale sequence comparison is a powerful tool for biological inference in modern molecular biology. Comparing new sequences to those in annotated databases is a useful source of functional and structural information about these sequences. Using software such as the basic local alignment search tool (BLAST) or HMMPFAM to identify statistically significant matches between newly sequenced segments of genetic material and those in databases is an important task for most molecular biologists. Searching algorithms are intrinsically slow and data-intensive, especially in light of the rapid growth of biological sequence databases due to the emergence of high throughput DNA sequencing techniques. Thus, traditional bioinformatics tools are impractical on PCs and even on dedicated UNIX servers. To take advantage of larger databases and more reliable methods, high performance computation becomes necessary.
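A common HPC strategy implied above is to partition the database across workers and run the search independently on each chunk. The toy sketch below shows round-robin chunking plus an exact k-mer "seed" scan standing in for the seeding stage of a BLAST-like search; it is illustrative only and nothing like a real aligner.

```python
def chunk_database(records, n_chunks):
    """Split a sequence database into roughly equal chunks so each can be
    searched on a separate worker or cluster node (the usual parallel
    strategy for BLAST-like tools)."""
    chunks = [[] for _ in range(n_chunks)]
    for i, rec in enumerate(records):
        chunks[i % n_chunks].append(rec)
    return chunks

def seed_hits(query, subject, k=4):
    """Toy stand-in for the BLAST seeding step: positions in the subject
    where an exact k-mer from the query matches."""
    seeds = {query[i:i + k] for i in range(len(query) - k + 1)}
    return [i for i in range(len(subject) - k + 1) if subject[i:i + k] in seeds]

chunks = chunk_database(["seqA", "seqB", "seqC", "seqD", "seqE"], 2)
print(chunks)
print(seed_hits("ACGTAC", "TTACGTA"))
```

In a real deployment each chunk would be handed to a worker process or MPI rank, and seed hits would be extended into gapped alignments with significance statistics.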

9.
In recent years, high-throughput experimentation combined with quantitative analysis and modelling of cells, recently dubbed systems cell biology, has been harnessed to study the organisation and dynamics of simple biological systems. Here, we suggest that the peroxisome, a fascinating and dynamic organelle, is a good candidate for studying a complete biological system. We discuss several aspects of peroxisomes that can be studied using high-throughput systematic approaches and integrated into a predictive model. Such approaches can be used in the future to study and understand how a more complex biological system, like a cell and perhaps ultimately a whole organism, works.

10.

Background  

The goal of information integration in systems biology is to combine information from a number of databases and data sets, obtained from both high and low throughput experiments, under one data management scheme such that the cumulative information provides greater biological insight than is possible with the individual information sources considered separately.

11.
Toxicogenomics is a rapidly developing discipline that promises to aid scientists in understanding the molecular and cellular effects of chemicals in biological systems. This field encompasses global assessment of biological effects using technologies such as DNA microarrays or high throughput NMR and protein expression analysis. This review provides an overview of multiple advancing approaches (genomic, proteomic, metabonomic) that may extend our understanding of toxicology, and highlights the importance of coupling such approaches with classical toxicity studies.

12.
The restriction of invasion biology to non-native species has been laid down as a founding principle of the discipline by many researchers. However, this split between native and non-native species is highly controversial. Using a phenomenological approach and a more pragmatic examination of biological invasions, the present paper discusses how this dichotomy has restricted the relevance of the field, from both theoretical and practical viewpoints. We advocate the emergence of a broader disciplinary field.

13.
High-throughput -omics techniques have revolutionised biology, allowing for thorough and unbiased characterisation of the molecular states of biological systems. However, cellular decision-making is inherently a unicellular process to which "bulk" -omics techniques are poorly suited, as they capture ensemble averages of cell states. Recently developed single-cell methods bridge this gap, allowing high-throughput molecular surveys of individual cells. In this review, we cover core concepts of the analysis of single-cell gene expression data and highlight areas of developmental biology where single-cell techniques have made important contributions. These include understanding of cell-to-cell heterogeneity, the tracing of differentiation pathways, quantification of gene expression from specific alleles, and the future directions of cell lineage tracing and spatial gene expression analysis.
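As a minimal example of the kind of preprocessing used in single-cell expression analysis, the sketch below applies one common normalisation step (counts-per-million followed by a log2 transform). This is only one of several schemes in use and is not tied to any specific method from the review; the counts are invented.

```python
from math import log2

def log_cpm(counts, pseudocount=1.0):
    """Library-size normalisation for a single cell's gene counts:
    counts-per-million, log2-transformed with a pseudocount so that
    zero counts map to log2(pseudocount)."""
    total = sum(counts)
    return [log2(c / total * 1e6 + pseudocount) for c in counts]

# One hypothetical cell with raw counts over three genes.
print(log_cpm([0, 10, 90]))
```

Normalising library size this way lets expression profiles from cells of very different sequencing depths be compared before clustering or trajectory inference.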

14.
With the mounting availability of genomic and phenotypic databases, data integration and mining become increasingly challenging. While efforts have been put forward to analyze prokaryotic phenotypes, current computational technologies either lack high throughput capacity for genomic-scale analysis, or are limited in their capability to integrate and mine data across different scales of biology. Consequently, simultaneous analysis of associations among genomes, phenotypes, and gene functions is prohibited. Here, we developed a high throughput computational approach, and demonstrated for the first time the feasibility of integrating large quantities of prokaryotic phenotypes along with genomic datasets for mining across multiple scales of biology (protein domains, pathways, molecular functions, and cellular processes). Applying this method over 59 fully sequenced prokaryotic species, we identified the genetic basis and molecular mechanisms underlying phenotypes in bacteria. We identified 3,711 significant correlations between 1,499 distinct Pfam domains and 63 phenotypes, with 2,650 correlations and 1,061 anti-correlations. Manual evaluation of a random sample of these significant correlations showed a minimal precision of 30% (95% confidence interval: 20%-42%; n = 50). We stratified the most significant 478 predictions and subjected 100 to manual evaluation, of which 60 were corroborated in the literature. We furthermore unveiled 10 significant correlations between phenotypes and KEGG pathways, eight of which were corroborated in the evaluation, and 309 significant correlations between phenotypes and 166 GO concepts, evaluated using a random sample (minimal precision = 72%; 95% confidence interval: 60%-80%; n = 50). Additionally, we conducted a novel large-scale phenomic visualization analysis to provide insight into the modular nature of common molecular mechanisms spanning multiple biological scales and reused by related phenotypes (metaphenotypes). We propose that this method elucidates which classes of molecular mechanisms are associated with phenotypes or metaphenotypes and holds promise in facilitating a computable systems biology approach to genomic and biomedical research.
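Domain-phenotype correlation testing of the kind described above is typically grounded in a contingency-table statistic. The sketch below computes a right-tailed hypergeometric (Fisher-style) p-value for over-representation of a Pfam domain among phenotype-positive species; the paper's exact statistic and thresholds may well differ, and the numbers used are purely illustrative.

```python
from math import comb

def hypergeom_pvalue(k, K, n, N):
    """Right-tailed hypergeometric p-value.

    Probability of observing at least k species that both carry a given
    domain and show a given phenotype, when K of the N species show the
    phenotype and n of them carry the domain, under random assignment.
    """
    upper = min(K, n)
    return sum(comb(K, i) * comb(N - K, n - i)
               for i in range(k, upper + 1)) / comb(N, n)

# Hypothetical: 10 species, 5 phenotype-positive, 5 domain carriers,
# and all 5 carriers are phenotype-positive -- a strong association.
print(hypergeom_pvalue(5, 5, 5, 10))
```

With thousands of domain-phenotype pairs tested, p-values like this would then be corrected for multiple testing before calling any correlation significant.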

15.
The outcomes of pathway database computations depend on pathway ontology (total citations: 3; self-citations: 0; citations by others: 3)
Different biological notions of pathways are used in different pathway databases. Those pathway ontologies significantly impact pathway computations. Computational users of pathway databases will obtain different results depending on the pathway ontology used by the databases they employ, and different pathway ontologies are preferable for different end uses. We explore differences in pathway ontologies by comparing the BioCyc and KEGG ontologies. The BioCyc ontology defines a pathway as a conserved, atomic module of the metabolic network of a single organism, i.e. often regulated as a unit, whose boundaries are defined at high-connectivity stable metabolites. KEGG pathways are on average 4.2 times larger than BioCyc pathways, and combine multiple biological processes from different organisms to produce a substrate-centered reaction mosaic. We compared KEGG and BioCyc pathways using genome context methods, which determine the functional relatedness of pairs of genes. For each method we employed, a pair of genes randomly selected from a BioCyc pathway is more likely to be related by that method than is a pair of genes randomly selected from a KEGG pathway, supporting the conclusion that the BioCyc pathway conceptualization is closer to a single conserved biological process than is that of KEGG.
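The genome-context comparison described above boils down to asking, for each pathway, how often a randomly chosen gene pair is judged functionally related. A minimal sketch, assuming the related pairs have already been produced by some genome-context method (gene fusion, conserved neighbourhood, phylogenetic profiles, etc.); pathway contents and gene names are invented:

```python
from itertools import combinations

def related_fraction(pathway_genes, related_pairs):
    """Fraction of within-pathway gene pairs judged functionally related.

    related_pairs is a set of frozensets, each naming one related gene
    pair. Higher values suggest the pathway definition is closer to a
    single conserved biological process.
    """
    pairs = list(combinations(sorted(pathway_genes), 2))
    if not pairs:
        return 0.0
    hits = sum(frozenset(p) in related_pairs for p in pairs)
    return hits / len(pairs)

related = {frozenset({"geneA", "geneB"})}
print(related_fraction({"geneA", "geneB", "geneC"}, related))
```

Comparing this fraction between BioCyc-style and KEGG-style pathway definitions, over the same underlying genomes, is one way to quantify the paper's conclusion.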

16.
The nematode Caenorhabditis elegans is used extensively by scientists to study a wide variety of biological processes and is one of the most thoroughly characterized animals. Over the years, the community of C. elegans researchers has generated a wealth of information on the genetics, development, behaviour, and cellular and molecular biology of the worm. This body of data has grown even larger with the recent application of high throughput screening methodology to study gene function, expression and interactions. WormBase (http://www.wormbase.org) is the primary online source of biological data on C. elegans and related nematodes. Equipped with an assortment of powerful search tools, WormBase allows users to quickly extract a variety of information, including data on individual genes, DNA sequence, cell lineage and literature citations. As the database is well maintained and its functionality constantly modified in response to evolving researcher needs, WormBase has become a vital component of laboratories studying the worm and a model for other biological databases.

17.
18.

Excessive or inappropriate activation of cell surface receptors can mediate the development of disease. Receptors, therefore, are a focus for drug discovery activities. Empirical screening is important in the search for novel compounds acting at receptors. Technical developments and the application of molecular biology have facilitated access to receptors of interest and have provided efficient screening methods capable of very high throughput. Reliability in high throughput screening requires the use of appropriate methodology, good screen design and effective validation and quality control processes. Validation should aim to establish that the basic experimental design is sound. In developing software to handle high throughput screening data, a fundamental requirement is to provide performance monitoring and error trapping facilities. Additional requirements are automatic data capture from instruments, on-line data reduction and analysis and transfer of results to central databases. As data volumes increase through effective high throughput screening, conventional interrogation methods become less appropriate and are being augmented by newer computing techniques referred to as knowledge mapping or database mining. Targeting cell surface receptors has been very successful as an approach to drug discovery. If the challenges of high throughput empirical screening are addressed effectively, cell surface receptors will provide new opportunities for improved therapy in the coming years.
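One standard way to make the validation and quality-control step concrete is the Z'-factor of Zhang et al. (1999), a screen-quality statistic computed from positive- and negative-control wells. The metric is not named in the abstract but is ubiquitous in HTS practice; the control readings below are invented.

```python
from statistics import mean, stdev

def z_prime(pos_controls, neg_controls):
    """Z'-factor screen-quality statistic (Zhang et al. 1999):
    1 - 3 * (sd_pos + sd_neg) / |mean_pos - mean_neg|.
    Values above roughly 0.5 indicate an assay whose control
    separation is robust enough for high throughput screening.
    """
    separation = abs(mean(pos_controls) - mean(neg_controls))
    return 1 - 3 * (stdev(pos_controls) + stdev(neg_controls)) / separation

# Hypothetical plate-control readings: tight, well-separated controls.
print(z_prime([100, 101, 99, 100], [10, 9, 11, 10]))
```

Monitoring Z' per plate is one simple form of the performance monitoring and error trapping the abstract calls for: plates falling below the threshold can be flagged and re-run automatically.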

19.
Enzyme genes organized in operon-like structures in prokaryotic genomes tend to encode enzymes that catalyze a series of consecutive reactions in a metabolic pathway. Our recent analysis shows that this and other genomic units correspond to chemical units reflecting the chemical logic of organic reactions. From all known metabolic pathways in the KEGG database we identified chemical units, called reaction modules, as conserved sequences of chemical structure transformation patterns of small molecules. The extracted patterns suggest co-evolution of genomic units and chemical units. While the core of the metabolic network may have evolved with mechanisms involving individual enzymes and reactions, its extension may have been driven by modular units of enzymes and reactions.

20.
Kebing Yu, Arthur R. Salomon. Proteomics, 2010, 10(11): 2113-2122
Recent advances in the speed and sensitivity of mass spectrometers and in analytical methods, the exponential acceleration of computer processing speeds, and the availability of genomic databases from an array of species and protein information databases have led to a deluge of proteomic data. The development of a lab-based automated proteomic software platform for the automated collection, processing, storage, and visualization of expansive proteomic data sets is critically important. The high-throughput autonomous proteomic pipeline described here is designed from the ground up to provide critically important flexibility for diverse proteomic workflows and to streamline the total analysis of a complex proteomic sample. This tool is composed of software that controls the acquisition of mass spectral data along with automation of post-acquisition tasks such as peptide quantification, clustered MS/MS spectral database searching, statistical validation, and data exploration within a user-configurable lab-based relational database. The software design of the high-throughput autonomous proteomic pipeline focuses on accommodating diverse workflows and providing missing software functionality to a wide range of proteomic researchers to accelerate the extraction of biological meaning from immense proteomic data sets. Although individual software modules in our integrated technology platform may have some similarities to existing tools, the true novelty of the approach described here is the synergistic and flexible combination of these tools to provide an integrated and efficient analysis of proteomic samples.
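A minimal sketch of the lab-based relational storage idea, using SQLite with a hypothetical two-table layout (this is not the authors' actual schema): runs and their quantified peptides are stored, and a q-value cutoff stands in for the statistical-validation step. Sample names, sequences, and numbers are invented.

```python
import sqlite3

# In-memory database; a lab deployment would use a server-backed RDBMS.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE run (
    run_id   INTEGER PRIMARY KEY,
    sample   TEXT,
    acquired TEXT
);
CREATE TABLE peptide (
    pep_id   INTEGER PRIMARY KEY,
    run_id   INTEGER REFERENCES run(run_id),
    sequence TEXT,
    charge   INTEGER,
    quant    REAL,   -- e.g. integrated precursor intensity
    qvalue   REAL    -- identification confidence from validation
);
""")
conn.execute("INSERT INTO run VALUES (1, 'lysate_A', '2010-01-15')")
conn.executemany(
    "INSERT INTO peptide VALUES (?, ?, ?, ?, ?, ?)",
    [(1, 1, "LVNELTEFAK", 2, 1.8e6, 0.003),
     (2, 1, "AEFVEVTK",   2, 9.2e5, 0.010)],
)
# Statistical-validation step: keep identifications below a q-value cutoff.
validated = conn.execute(
    "SELECT sequence FROM peptide WHERE qvalue <= 0.005").fetchall()
print(validated)
```

Keeping results in a relational store like this is what makes the downstream data-exploration and visualization steps queryable rather than file-bound.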
