Similar literature
 A total of 20 similar articles were found.
1.
MOTIVATION: Determining gene function is an important challenge arising from the availability of whole genome sequences. Until recently, approaches based on sequence homology were the only high-throughput method for predicting gene function. Use of high-throughput generated experimental data sets for determining gene function has been limited for several reasons. RESULTS: Here a new approach is presented for integration of high-throughput data sets, leading to prediction of function based on relationships supported by multiple types and sources of data. This is achieved with a database containing 125 different high-throughput data sets describing phenotypes, cellular localizations, protein interactions and mRNA expression levels from Saccharomyces cerevisiae, using a bit-vector representation and information content-based ranking. The approach takes characteristic and qualitative differences between the data sets into account, and is highly flexible, efficient and scalable. Database queries result in predictions for 543 uncharacterized genes, based on multiple functional relationships each supported by at least three types of experimental data. Some of these are experimentally verified, further demonstrating their reliability. The results also generate insights into the relative merits of different data types and provide a coherent framework for functional genomic datamining. AVAILABILITY: Free availability over the Internet. CONTACT: f.c.p.holstege@med.uu.nl SUPPLEMENTARY INFORMATION: http://www.genomics.med.uu.nl/pub/pk/comb_gen_network.
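
The abstract above names a bit-vector representation and information content-based ranking but gives no details. The following is a minimal Python sketch of one way such a scheme could work, under assumptions that are not taken from the paper: each gene is encoded as an integer bit vector over observed features, shared features are found with a bitwise AND, and each shared feature is weighted by -log2 of the fraction of genes that carry it. The gene names, feature names and function names are all hypothetical.

```python
import math


def build_bit_vectors(annotations, features):
    """Encode each gene's feature set as an integer used as a bit vector."""
    index = {feature: i for i, feature in enumerate(features)}
    vectors = {}
    for gene, observed in annotations.items():
        bits = 0
        for feature in observed:
            bits |= 1 << index[feature]
        vectors[gene] = bits
    return vectors


def information_content(vectors, n_features):
    """Assumed weighting: -log2 of the fraction of genes carrying a feature."""
    n_genes = len(vectors)
    weights = []
    for i in range(n_features):
        carriers = sum(1 for bits in vectors.values() if (bits >> i) & 1)
        weights.append(-math.log2(carriers / n_genes) if carriers else 0.0)
    return weights


def rank_related(query, vectors, weights):
    """Rank other genes by the summed weight of features shared with query."""
    scores = []
    for gene, bits in vectors.items():
        if gene == query:
            continue
        shared = bits & vectors[query]  # features present in both genes
        score = sum(w for i, w in enumerate(weights) if (shared >> i) & 1)
        scores.append((gene, score))
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Hypothetical toy data, not from the paper.
features = ["nucleus", "heat_sensitive", "cytoplasm"]
annotations = {
    "GENE_A": {"nucleus", "heat_sensitive"},
    "GENE_B": {"nucleus", "heat_sensitive"},
    "GENE_C": {"nucleus"},
    "GENE_D": {"cytoplasm"},
}
vectors = build_bit_vectors(annotations, features)
weights = information_content(vectors, len(features))
ranked = rank_related("GENE_A", vectors, weights)
print(ranked)
```

In this toy setup a rare shared feature contributes more to the score than a common one, which is the general idea behind information content weighting; the paper's actual scoring may differ.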

2.
The repeated occurrence of genes in each other’s neighbourhood on genomes has been shown to indicate a functional association between the proteins they encode. Here we introduce STRING (search tool for recurring instances of neighbouring genes), a tool to retrieve and display the genes a query gene repeatedly occurs with in clusters on the genome. The tool performs iterative searches and visualises the results in their genomic context. By finding the genomically associated genes for a query, it delineates a set of potentially functionally associated genes. The usefulness of STRING is illustrated with an example that suggests a functional context for an RNA methylase with unknown specificity. STRING is available at http://www.bork.embl-heidelberg.de/STRING

3.
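Accurate annotation of operon structure in prokaryotes is important for the study of gene function and gene regulatory networks, and computational prediction by bioinformatics methods is currently the main source of operon annotation for genomes. Most current prediction algorithms require experimentally confirmed operons as a training set, but the scarcity of experimentally confirmed operon data has long been a bottleneck for algorithm development. Based on an understanding of operon structure, a probabilistic model describing the complex structure of operons was built from features such as intergenic distance, regulatory signals related to transcription and translation, and COG functional annotation, and an iterative self-learning algorithm was proposed that does not rely on operon data from a specific species as a training set. Tests and comparisons on experimentally verified operon data sets show that the algorithm is very effective at predicting operon structure. Without relying on any known operon information, the algorithm outperforms the best current operon prediction methods in overall prediction performance, and this self-learning algorithm is superior to algorithms that depend on training with a specific species. These properties make the algorithm applicable to newly sequenced species and distinguish it from the operon prediction methods in common use. A large-scale comparative analysis of bacterial and archaeal genomes further improved the understanding of the general features and species specificity of genomic operon structure.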

4.
GeConT: gene context analysis (total citations: 5; self-citations: 1; citations by others: 4)
SUMMARY: The fact that adjacent genes in bacteria are often functionally related is widely known. GeConT (Gene Context Tool) is a web interface designed to visualize genome context of a gene or a group of genes and their orthologs in all the completely sequenced genomes. The graphical information of GeConT can be used to analyze genome annotation, functional ortholog identification or to verify the genomic context congruence of any set of genes that share a common property. AVAILABILITY: http://www.ibt.unam.mx/biocomputo/gecont.html

5.
We have developed a web-based system (Pathway Miner) for visualizing gene expression profiles in the context of biological pathways. Pathway Miner catalogs genes based on their role in metabolic, cellular and regulatory pathways. A Fisher exact test is provided as an option to rank pathways. The genes are mapped onto pathways and gene product association networks are extracted for genes that co-occur in pathways. The networks can be filtered for analysis based on user-selected options. AVAILABILITY: Pathway Miner is a freely available, web-accessible tool at http://www.biorag.org/pathway.html
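
The abstract above mentions a Fisher exact test as an option for ranking pathways, without saying how the contingency table is built. As a rough illustration only, the Python sketch below ranks pathways by a one-sided Fisher exact test (the hypergeometric upper tail) for over-representation of a selected gene list. The function names, gene identifiers and the choice of a one-sided test are assumptions, not a description of Pathway Miner's implementation.

```python
from math import comb


def enrichment_p_value(n_background, n_pathway, n_selected, n_overlap):
    """One-sided Fisher exact test: P(overlap >= n_overlap) by chance."""
    upper = min(n_pathway, n_selected)
    total = comb(n_background, n_selected)
    tail = sum(
        comb(n_pathway, i) * comb(n_background - n_pathway, n_selected - i)
        for i in range(n_overlap, upper + 1)
    )
    return tail / total


def rank_pathways(pathways, selected, background):
    """Return (pathway, p-value) pairs, most significant first."""
    selected = selected & background
    scores = []
    for name, members in pathways.items():
        members = members & background  # ignore genes outside the background
        p_value = enrichment_p_value(
            len(background), len(members), len(selected), len(selected & members)
        )
        scores.append((name, p_value))
    return sorted(scores, key=lambda item: item[1])


# Hypothetical toy data: 10 background genes, two pathways of 5 genes each.
background = {f"g{i}" for i in range(10)}
pathways = {
    "pathway_A": {"g0", "g1", "g2", "g3", "g4"},
    "pathway_B": {"g5", "g6", "g7", "g8", "g9"},
}
selected = {"g0", "g1", "g2", "g3", "g4"}
ranking = rank_pathways(pathways, selected, background)
print(ranking)
```

When many pathways are tested at once, the raw p-values would normally be corrected for multiple testing before interpretation; that step is left out here to keep the sketch short.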

6.
Accurate prediction of operons can improve the functional annotation and application of genes within operons in prokaryotes. Here, we review several features: (i) intergenic distance, (ii) metabolic pathways, (iii) homologous genes, (iv) promoters and terminators, (v) gene order conservation, (vi) microarray, (vii) clusters of orthologous groups, (viii) gene length ratio, (ix) phylogenetic profiles, (x) operon length/size and (xi) STRING database scores, as well as some other features, which have been applied in recent operon prediction methods in prokaryotes in the literature. Based on a comparison of the prediction performances of these features, we conclude that other, as yet undiscovered features, or feature selection with a receiver operating characteristic analysis before algorithm processing can improve operon prediction in prokaryotes.

7.
8.
SUMMARY: We describe a tool, called ACE-it (Array CGH Expression integration tool). ACE-it links the chromosomal position of the gene dosage measured by array CGH to the genes measured by the expression array. ACE-it uses this link to statistically test whether gene dosage affects RNA expression. AVAILABILITY: ACE-it is freely available at http://ibivu.cs.vu.nl/programs/acewww/.

9.
GATHER: a systems approach to interpreting genomic signatures (total citations: 1; self-citations: 0; citations by others: 1)
MOTIVATION: Understanding the full meaning of the biology captured in molecular profiles, within the context of the entire biological system, cannot be achieved with a simple examination of the individual genes in the signature. To facilitate such an understanding, we have developed GATHER, a tool that integrates various forms of available data to elucidate biological context within molecular signatures produced from high-throughput post-genomic assays. RESULTS: Analyzing the Rb/E2F tumor suppressor pathway, we show that GATHER identifies critical features of the pathway. We further show that GATHER identifies common biology in a series of otherwise unrelated gene expression signatures that each predict breast cancer outcome. We quantify the performance of GATHER and find that it successfully predicts 90% of the functions over a broad range of gene groups. We believe that GATHER provides an essential tool for extracting the full value from molecular signatures generated from genome-scale analyses. AVAILABILITY: GATHER is available at http://gather.genome.duke.edu/

10.
11.
12.
13.
MOTIVATION: Microarray technology enables large-scale inference of the participation of genes in biological process from similar expression profiles. Our aim is to induce classificatory models from expression data and biological knowledge that can automatically associate genes with novel hypotheses of biological process. RESULTS: We report a systematic supervised learning approach to predicting biological process from time series of gene expression data and biological knowledge. Biological knowledge is expressed using gene ontology and this knowledge is associated with discriminatory expression-based features to form minimal decision rules. The resulting rule model is first evaluated on genes coding for proteins with known biological process roles using cross validation. Then it is used to generate hypotheses for genes for which no knowledge of participation in biological process could be found. The theoretical foundation for the methodology based on rough sets is outlined in the paper, and its practical application demonstrated on a data set previously published by Cho et al. (Nat. Genet., 27, 48-54, 2001). AVAILABILITY: The Rosetta system is available at http://www.idi.ntnu.no/~aleks/rosetta. SUPPLEMENTARY INFORMATION: http://www.lcb.uu.se/~hvidsten/bioinf_cho/

14.
15.
In prokaryotes the twin-arginine translocase (Tat) is a unique transport system for the export of folded proteins. The Tat pathway is usually involved in the export of a small proportion of extracytoplasmic proteins. An exception is found in halophilic archaea, in which the majority of secretory proteins have been predicted to be Tat-dependent. All haloarchaea analysed to date contain two genes encoding homologues of the Tat-component TatC. In all of these cases both genes are located adjacently on the chromosome, indicating that they form a functional unit. We show that this gene cluster is essential for viability in haloarchaea, which is in complete contrast to all other prokaryotes that have been tested thus far.

16.
Comprehensive characterization of a gene's impact on phenotypes requires knowledge of the context of the gene. To address this issue we introduce a systematic data integration method, Candidate Genes and SNPs (CANGES), that links SNP and linkage disequilibrium data to pathway and protein-protein interaction information. It can be used as a knowledge discovery tool to search for disease-associated causative variants from genome-wide studies, as well as to generate new hypotheses on synergistically functioning genes. We demonstrate the utility of CANGES by integrating pathway and protein-protein interaction data to identify putative functional variants for (i) the p53 gene and (ii) three glioblastoma multiforme (GBM) associated risk genes. For the GBM case, we further integrate the CANGES results with clinical and genome-wide data for 209 GBM patients and identify genes having effects on GBM patient survival. Our results show that selecting a focused set of genes can yield information beyond the traditional genome-wide association approaches. Taken together, a holistic approach to identifying possible interacting genes and SNPs with CANGES provides a means to rapidly identify networks for any set of genes and generate novel hypotheses. CANGES is available at http://csbi.ltdk.helsinki.fi/CANGES/

17.
Codon adaptation index as a measure of dominating codon bias (total citations: 9; self-citations: 0; citations by others: 9)
We propose a simple algorithm to detect dominating synonymous codon usage bias in genomes. The algorithm is based on a precise mathematical formulation of the problem that leads us to use the Codon Adaptation Index (CAI) as a 'universal' measure of codon bias. This measure has been previously employed in the specific context of translational bias. With the set of coding sequences as a sole source of biological information, the algorithm provides a reference set of genes which is highly representative of the bias. This set can be used to compute the CAI of genes of prokaryotic and eukaryotic organisms, including those whose functional annotation is not yet available. An important application concerns the detection of a reference set characterizing translational bias, which is known to correlate with expression levels; in this case, the algorithm becomes a key tool to predict gene expression levels, to guide regulatory circuit reconstruction, and to compare species. The algorithm also detects leading-lagging strand bias, GC-content bias, GC3 bias, and horizontal gene transfer. The approach is validated on 12 slow-growing and fast-growing bacteria, Saccharomyces cerevisiae, Caenorhabditis elegans and Drosophila melanogaster. AVAILABILITY: http://www.ihes.fr/~materials.
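
The abstract above relies on computing the CAI of a gene against a reference set. The Python sketch below shows only that scoring step, using the standard definition of CAI as the geometric mean of relative adaptiveness values, where each codon's weight is its count in the reference set divided by the count of the most frequent synonymous codon. It does not attempt the paper's algorithm for detecting the reference set. The function names and toy sequences are hypothetical, and replacing zero counts with 0.5 is a common convention that is assumed here.

```python
import math
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def codons(sequence):
    """Split a coding sequence into complete codons, dropping any remainder."""
    sequence = sequence.upper()
    usable = len(sequence) - len(sequence) % 3
    return [sequence[i:i + 3] for i in range(0, usable, 3)]


def relative_adaptiveness(reference_genes):
    """Weight of each codon relative to the most used synonymous codon."""
    counts = Counter(c for gene in reference_genes for c in codons(gene))
    weights = {}
    # Met, Trp and stop codons are excluded: they carry no synonymous choice.
    for amino_acid in set(AMINO) - set("MW*"):
        family = [c for c, a in CODON_TABLE.items() if a == amino_acid]
        # Assumed convention: unseen codons get a count of 0.5, not zero.
        adjusted = {c: counts[c] or 0.5 for c in family}
        top = max(adjusted.values())
        for c in family:
            weights[c] = adjusted[c] / top
    return weights


def cai(gene, weights):
    """Geometric mean of the weights of the gene's scorable codons."""
    logs = [math.log(weights[c]) for c in codons(gene) if c in weights]
    if not logs:
        raise ValueError("gene contains no scorable codons")
    return math.exp(sum(logs) / len(logs))


# Hypothetical toy reference set: leucine codon CTG three times, CTT once.
weights = relative_adaptiveness(["CTGCTGCTGCTT"])
print(cai("CTGCTG", weights), cai("CTTCTT", weights))
```

A gene that uses only the preferred codons of the reference set scores 1, and the score falls as it uses less favoured synonymous codons.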

18.
A new method to measure the semantic similarity of GO terms (total citations: 4; self-citations: 0; citations by others: 4)

19.
PiNGO is a tool to screen biological networks for candidate genes, i.e. genes predicted to be involved in a biological process of interest. The user can narrow the search to genes with particular known functions or exclude genes belonging to particular functional classes. PiNGO provides support for a wide range of organisms and Gene Ontology classification schemes, and it can easily be customized for other organisms and functional classifications. PiNGO is implemented as a plugin for Cytoscape, a popular network visualization platform. AVAILABILITY: PiNGO is distributed as an open-source Java package under the GNU General Public License (http://www.gnu.org/), and can be downloaded via the Cytoscape plugin manager. A detailed user guide and tutorial are available on the PiNGO website (http://www.psb.ugent.be/esb/PiNGO).

20.
EXProt (database for EXPerimentally verified Protein functions) is a new non-redundant database containing protein sequences for which the function has been experimentally verified. It is a selection of 3976 entries from the Prokaryotes section of the EMBL Nucleotide Sequence Database, Release 66, and 375 entries from the Pseudomonas Community Annotation Project (PseudoCAP). The entries in EXProt all have a unique ID number and provide information about the organism, protein sequence, functional annotation, link to entry in original database, and if known, gene name and link to references in PubMed/Medline. The EXProt web page (http://www.cmbi.nl/EXProt) provides further details of the database and a link to a BLAST search (blastp & blastx) of the database. The EXProt entries are indexed in SRS (http://www.cmbi.nl/srs/) and can be searched by means of keywords. Authors can be reached by email (exprot@cmbi.kun.nl).
