Similar articles
 20 similar articles found (search time: 15 ms)
1.
We present MetaRoute, an efficient search algorithm based on atom mapping rules and path weighting schemes that returns relevant or textbook-like routes between a source and a product metabolite within seconds for genome-scale networks. Its speed allows the algorithm to be used interactively through a web interface to visualize relevant routes and local networks for one or multiple organisms based on data from KEGG. AVAILABILITY: http://www-bs.informatik.uni-tuebingen.de/Services/MetaRoute. SUPPLEMENTARY INFORMATION: Supplementary details are available at http://www-bs.informatik.uni-tuebingen.de/Services/MetaRoute.

2.
3.
The whole genome shotgun approach to genome sequencing results in a collection of contigs that must be ordered and oriented to facilitate efficient gap closure. We present a new tool, OSLay, that uses synteny between matching sequences in a target assembly and a reference assembly to lay out the contigs (or scaffolds) in the target assembly. The underlying algorithm is based on maximum weight matching. The tool provides an interactive visualization of the computed layout, and the result can be imported into the assembly editing tool Consed to support the design of primer pairs for gap closure. MOTIVATION: To enhance efficiency in the gap closure phase of a genome project, it is crucial to know which contigs are adjacent in the target genome. Related genome sequences can be used to lay out contigs in an assembly. AVAILABILITY: OSLay is freely available from: http://www-ab.informatik.unituebingen.de/software/oslay.

4.
Phylogenetic trees based on gene content (total citations: 2; self-citations: 0; citations by others: 2)
Comparing gene content between species can be a useful approach for reconstructing phylogenetic trees. In this paper, we derive a maximum-likelihood estimation of evolutionary distance between species under a simple model of gene genesis and gene loss. Using simulated data on a biological tree with 107 taxa (and on a number of randomly generated trees), we compare the accuracy of tree reconstruction using this ML distance measure to an earlier ad hoc distance. We then compare these distance-based approaches to a character-based tree reconstruction method (Dollo parsimony) which seems well suited to the analysis of gene content data. To simplify simulations, we give a formal proof of the well-known 'fact' that the Dollo parsimony score is independent of the choice of root. Our results show a consistent trend, with the character-based method and ML distance measure outperforming the earlier ad hoc distance method. AVAILABILITY: http://www.ab.informatik.uni-tuebingen.de/software/genecontent/welcome_en.html

5.
An important aspect of the functional annotation of enzymes is not only the type of reaction catalysed by an enzyme, but also the substrate specificity, which can vary widely within the same family. In many cases, prediction of family membership and even substrate specificity is possible from enzyme sequence alone, using a nearest neighbour classification rule. However, the combination of structural information and sequence information can improve the interpretability and accuracy of predictive models. The method presented here, Active Site Classification (ASC), automatically extracts the residues lining the active site from one representative three-dimensional structure and the corresponding residues from sequences of other members of the family. From a set of representatives with known substrate specificity, a Support Vector Machine (SVM) can then learn a model of substrate specificity. Applied to a sequence of unknown specificity, the SVM can then predict the most likely substrate. The models can also be analysed to reveal the underlying structural reasons determining substrate specificities and thus yield valuable insights into mechanisms of enzyme specificity. We illustrate the high prediction accuracy achieved on two benchmark data sets and the structural insights gained from ASC by a detailed analysis of the family of decarboxylating dehydrogenases. The ASC web service is available at http://asc.informatik.uni-tuebingen.de/.

6.
7.
8.
MOTIVATION: Functional annotation of unknown proteins is a major goal in proteomics. A key annotation is the prediction of a protein's subcellular localization. Numerous prediction techniques have been developed, typically focusing on a single underlying biological aspect or predicting a subset of all possible localizations. An important step is taken towards emulating the protein sorting process by capturing and bringing together biologically relevant information, and addressing the clear need to improve prediction accuracy and localization coverage. RESULTS: Here we present a novel SVM-based approach for predicting subcellular localization, which integrates N-terminal targeting sequences, amino acid composition and protein sequence motifs. We show how this approach improves the prediction based on N-terminal targeting sequences, by comparing our method TargetLoc against existing methods. Furthermore, MultiLoc performs considerably better than comparable methods predicting all major eukaryotic subcellular localizations, and shows better or comparable results to methods that are specialized on fewer localizations or for one organism. AVAILABILITY: http://www-bs.informatik.uni-tuebingen.de/Services/MultiLoc/
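
Amino acid composition, one of the three feature types named in this abstract, is commonly encoded as a fixed-length vector of relative residue frequencies before being passed to a classifier such as an SVM. The following is a minimal sketch of that encoding only; the function name and the handling of non-standard residues are assumptions made for illustration and are not taken from MultiLoc or TargetLoc.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence: str) -> list[float]:
    """Return the relative frequency of each standard amino acid.

    Assumption: characters outside the 20 standard residues (e.g. 'X')
    are ignored rather than counted.
    """
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        # Empty or fully non-standard input: return an all-zero vector.
        return [0.0] * len(AMINO_ACIDS)
    return [counts[aa] / total for aa in AMINO_ACIDS]
```

The resulting 20-dimensional vector could be concatenated with other sequence-derived features before training.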

9.
MOTIVATION: Knowing the localization of a protein within the cell helps elucidate its role in biological processes, its function and its potential as a drug target. Thus, subcellular localization prediction is an active research area. Numerous localization prediction systems are described in the literature; some focus on specific localizations or organisms, while others attempt to cover a wide range of localizations. RESULTS: We introduce SherLoc, a new comprehensive system for predicting the localization of eukaryotic proteins. It integrates several types of sequence and text-based features. While applying the widely used support vector machines (SVMs), SherLoc's main novelty lies in the way in which it selects its text sources and features, and integrates those with sequence-based features. We test SherLoc on previously used datasets, as well as on a new set devised specifically to test its predictive power, and show that SherLoc consistently improves on previously reported results. We also report the results of applying SherLoc to a large set of yet-unlocalized proteins. AVAILABILITY: SherLoc, along with Supplementary Information, is available at: http://www-bs.informatik.uni-tuebingen.de/Services/SherLoc/

10.
Faspad is a user-friendly tool that detects candidates for linear signaling pathways in protein interaction networks based on an approach by Scott et al. (Journal of Computational Biology, 2006). Using recent algorithmic insights, it can solve the underlying NP-hard problem quickly: for protein networks of typical size (several thousand nodes), pathway candidates of length up to 13 proteins can be found within seconds and with a 99.9% probability of optimality. Faspad graphically displays all candidates that are found; for evaluation and comparison purposes, an overlay of several candidates and the surrounding network context can also be shown. Availability: Faspad is available as free software under the GPL at http://theinf1.informatik.uni-jena.de/faspad/ and runs under Linux and Windows.

11.
MOTIVATION: Typesetting, shading and labeling of nucleotide and peptide alignments using standard word processing or graphics software is time consuming. Available automatic sequence shading programs usually do not allow manual application of additional shadings or labels. Hence, a flexible alignment shading package was designed for both calculated and manual shading, using the macro language of the scientific typesetting software LaTeX2e. RESULTS: TeXshade is the first TeX-based alignment shading software featuring, in addition to standard identity and similarity shading, special modes for the display of functional aspects such as charge, hydropathy or solvent accessibility. A wide range of commands for manual shading, graphical labels, re-arrangements of the sequence order, numbering, legends etc. is implemented. Further, TeXshade allows the inclusion and display of secondary structure predictions in the DSSP, STRIDE and PHD formats. AVAILABILITY: From http://homepages.uni-tuebingen.de/beitz/tse.html (macro package and on-line documentation). CONTACT: eric.beitz@uni-tuebingen.de

12.
Genome-wide association studies (GWAS) have led to the identification of numerous novel loci for a number of complex diseases. Pathway-based approaches using genotypic data provide tangible leads which cannot be identified by single marker approaches as implemented in GWAS. The available pathway analysis approaches mainly differ in the employed databases and in the applied statistics for determining the significance of the associated disease markers. So far, pathway-based approaches using GWAS data have failed to consider the overlapping of genes among different pathways or the influence of protein interactions. We performed a multistage integrative pathway (MIP) analysis on three common diseases, Crohn's disease (CD), rheumatoid arthritis (RA) and type 1 diabetes (T1D), incorporating genotypic, pathway, protein- and domain-interaction data to identify novel associations between these diseases and pathways. Additionally, we assessed the sensitivity of our method by removing the most significant SNPs and comparing the resulting pathway analysis with the original one. Apart from confirming many previously published associations between pathways and RA, CD and T1D, our MIP approach was able to identify three new associations between disease phenotypes and pathways. These include a relation between the influenza-A pathway and RA, as well as a relation between T1D and the phagosome and toxoplasmosis pathways. These results provide new leads to understand the molecular underpinnings of these diseases. The software developed for and used in this study is available at http://www.cogsys.cs.uni-tuebingen.de/software/GWASPathwayIdentifier/index.htm.

13.
Rational design of epitope-driven vaccines is a key goal of immunoinformatics. Typically, candidate selection relies on the prediction of MHC-peptide binding only, as this is known to be the most selective step in the MHC class I antigen processing pathway. However, proteasomal cleavage and transport by the transporter associated with antigen processing (TAP) are essential steps in antigen processing as well. While prediction methods exist for the individual steps, no method has yet offered an integrated prediction of all three major processing events. Here we present WAPP, a method combining prediction of proteasomal cleavage, TAP transport, and MHC binding into a single prediction system. The proteasomal cleavage site prediction employs a new matrix-based method that is based on experimentally verified proteasomal cleavage sites. Support vector regression is used for predicting peptides transported by TAP. MHC binding is the last step in the antigen processing pathway and was predicted using a support vector machine method, SVMHC. The individual methods are combined in a filtering approach mimicking the natural processing pathway. WAPP thus predicts peptides that are cleaved by the proteasome at the C terminus, transported by TAP, and show significant affinity to MHC class I molecules. This results in a decrease in false positive rates compared to MHC binding prediction alone. Compared to prediction of MHC binding only, we report an increased overall accuracy and a lower rate of false positive predictions for the HLA-A*0201, HLA-B*2705, HLA-A*01, and HLA-A*03 alleles using WAPP. The method is available online through our prediction server at http://www-bs.informatik.uni-tuebingen.de/WAPP
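
The filtering approach described in this abstract, in which a peptide must pass each processing step in order, can be illustrated with a generic staged filter. This is a sketch under stated assumptions, not WAPP's implementation: `filter_candidates` and the three `toy_*` scoring functions are hypothetical stand-ins invented for this example, and their rules and thresholds carry no biological meaning.

```python
from typing import Callable, Iterable, Sequence, Tuple

# A stage pairs a (hypothetical) predictor with a threshold;
# a peptide passes the stage if its score is >= the threshold.
Stage = Tuple[Callable[[str], float], float]


def filter_candidates(peptides: Iterable[str], stages: Sequence[Stage]) -> list[str]:
    """Keep only peptides that pass every stage, applied in order."""
    survivors = list(peptides)
    for predict, threshold in stages:
        survivors = [p for p in survivors if predict(p) >= threshold]
    return survivors


# Toy stand-ins for the three predictors (illustrative rules only).
def toy_cleavage(peptide: str) -> float:
    return 1.0 if peptide[-1] in "LVI" else 0.0


def toy_tap(peptide: str) -> float:
    return 1.0 if len(peptide) == 9 else 0.0


def toy_mhc(peptide: str) -> float:
    return 1.0 if peptide[1] == "L" else 0.0


candidates = ["ALAKAAAAV", "GLAKAAAAK", "AKAKAAAAL", "ALAKAAAL"]
selected = filter_candidates(
    candidates,
    [(toy_cleavage, 0.5), (toy_tap, 0.5), (toy_mhc, 0.5)],
)
```

Because each stage only sees the survivors of the previous one, a peptide rejected early is never scored by later stages, which mirrors the sequential character of the pathway described above.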

14.
15.
Central to Clustered Regularly Interspaced Short Palindromic Repeat (CRISPR)-Cas systems are repeated RNA sequences that serve as Cas-protein–binding templates. Classification is based on the architectural composition of associated Cas proteins, but considering repeat evolution is essential to complete the picture. We compiled the largest data set of CRISPRs to date, performed comprehensive, independent clustering analyses and identified a novel set of 40 conserved sequence families and 33 potential structure motifs for Cas-endoribonucleases with some distinct conservation patterns. Evolutionary relationships are presented as a hierarchical map of sequence and structure similarities, offering both quick and detailed insight into the diversity of CRISPR-Cas systems. In a comparison with Cas-subtypes, I-C, I-E, I-F and type II were strongly coupled and the remaining type I and type III subtypes were loosely coupled to repeat and Cas1 evolution, respectively. Subtypes with a strong link to CRISPR evolution were almost exclusive to bacteria; nevertheless, we identified rare examples of potential horizontal transfer of I-C and I-E systems into archaeal organisms. Our easy-to-use web server provides an automated assignment of newly sequenced CRISPRs to our classification system and enables more informed choices on future hypotheses in CRISPR-Cas research: http://rna.informatik.uni-freiburg.de/CRISPRmap.

16.
MOTIVATION: Microarray experiments generate vast amounts of data. The unknown or only partially known functional context of differentially expressed genes may be assessed by querying the Gene Ontology database via GOMiner. Resulting tree representations are difficult to interpret and are not suited for visualization of this type of data. Methods are needed to effectively visualize these complex set relationships. RESULTS: We present a visualization approach for set relationships based on Venn diagrams. The proposed extension enhances the usual notion of Venn diagrams by incorporating set size information. The cardinality of the sets and intersection sets is represented by their corresponding circle (polygon) sizes. To avoid local minima, solutions to this problem are sought by evolutionary optimization. This generalized Venn diagram approach has been implemented as an interactive Java application (VennMaster) specifically designed for use with GOMiner in the context of the Gene Ontology database. AVAILABILITY: VennMaster is platform-independent (Java 1.4.2) and has been tested on Windows (XP, 2000), Mac OS X, and Linux. Supplementary information and the software (free for non-commercial use) are available at http://www.informatik.uni-ulm.de/ni/mitarbeiter/HKestler/vennm together with user documentation. CONTACT: hans.kestler@medizin.uni-ulm.de.
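
To make the idea of size-proportional circles concrete, the sketch below computes a circle radius from a set's cardinality and the overlap area of two circles, the kind of quantity an optimizer could compare against a target intersection size. It assumes that area (rather than radius) is proportional to cardinality, which the abstract does not state explicitly; the function names are hypothetical and this is not VennMaster's code.

```python
import math


def radius_for_cardinality(n: float) -> float:
    """Radius of a circle whose area equals n (assumes area ~ set size)."""
    return math.sqrt(n / math.pi)


def circle_overlap_area(r1: float, r2: float, d: float) -> float:
    """Area of the intersection of two circles whose centres are d apart."""
    if d >= r1 + r2:
        return 0.0  # circles are disjoint (or touch in a single point)
    if d <= abs(r1 - r2):
        return math.pi * min(r1, r2) ** 2  # smaller circle lies inside the larger
    # Standard lens-area formula; clamp to guard against rounding error.
    cos1 = max(-1.0, min(1.0, (d * d + r1 * r1 - r2 * r2) / (2 * d * r1)))
    cos2 = max(-1.0, min(1.0, (d * d + r2 * r2 - r1 * r1) / (2 * d * r2)))
    kite = max(0.0, (-d + r1 + r2) * (d + r1 - r2) * (d - r1 + r2) * (d + r1 + r2))
    return r1 * r1 * math.acos(cos1) + r2 * r2 * math.acos(cos2) - 0.5 * math.sqrt(kite)
```

An optimizer could, for example, move circle centres to reduce the difference between `circle_overlap_area(...)` and the cardinality of the corresponding intersection set.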

17.
MOTIVATION: DNA copy number aberrations are frequently found in different types of cancer. Recent developments of microarray-based approaches have broadened the knowledge on number and structure of such aberrations. High-density single nucleotide polymorphism (SNP) microarrays provide an extremely high resolution with up to 500,000 SNPs per genome. Owing to the enormous amount of data, the detection of common aberrations in large datasets is a great challenge. We describe a novel open source software tool, IdeogramBrowser, which was specifically designed for use with the Affymetrix SNP arrays. It provides an interactive karyotypic visualization of multiple aberration profiles and direct links to GeneCards. Visualization of consensus regions together with gene representation allows the explorative assessment of the data. AVAILABILITY: IdeogramBrowser and its source code are freely available under a Creative Commons license and can be obtained from http://www.informatik.uni-ulm.de/ni/staff/HKestler/ideo/. IdeogramBrowser is a platform-independent Java application.

18.
Mayday is a workbench for visualization, analysis and storage of microarray data. It features a graphical user interface and supports the development and integration of existing and new analysis methods. Besides the infrastructural core functionality, Mayday offers a variety of plug-ins, such as various interactive viewers, a connection to the R statistical environment, a connection to SQL-based databases and different data mining methods, including WEKA-library based methods for classification and various clustering methods. In addition, so-called meta information objects are provided for annotating the microarray data; they allow data from different sources to be integrated, a feature that is used, for instance, in the enhanced heatmap visualization. Supplementary information: The software and more detailed information including screenshots and a user guide as well as test data can be found on the Mayday home page http://www.zbit.uni-tuebingen.de/pas/mayday. The core is published under the GPL (GNU General Public License) and the associated plug-ins under the LGPL (GNU Lesser General Public License).

19.
The biomedical literature contains a wealth of information on associations between many different types of objects, such as protein-protein interactions, gene-disease associations and subcellular locations of proteins. When searching for such information using conventional search engines, e.g. PubMed, users see the data only one abstract at a time and 'hidden' in natural language text. AliBaba is an interactive tool for graphical summarization of search results. It parses the set of abstracts that fit a PubMed query and presents extracted information on biomedical objects and their relationships as a graphical network. AliBaba extracts associations between cells, diseases, drugs, proteins, species and tissues. Several filter options allow for a more focused search. Thus, researchers can grasp complex networks described in various articles at a glance. AVAILABILITY: http://alibaba.informatik.hu-berlin.de/

20.
SUMMARY: We present the web-based program CREx for heuristically determining pairwise rearrangement events in unichromosomal genomes. CREx considers transpositions, reverse transpositions, reversals and tandem-duplication-random-loss (TDRL) events. It supports the user in finding parsimonious rearrangement scenarios given a phylogenetic hypothesis. CREx is based on common intervals, which reflect genes that appear consecutively in several of the input gene orders. AVAILABILITY: CREx is freely available at http://pacosy.informatik.uni-leipzig.de/crex
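
The notion of a common interval can be made concrete for two gene orders: a set of at least two genes is a common interval if it occupies a contiguous block in both orders. The sketch below enumerates such sets with a simple quadratic scan. It assumes unsigned gene orders over the same gene set with no duplicates, ignores gene orientation, and is not CREx's algorithm; `common_intervals` is a hypothetical name chosen for this example.

```python
def common_intervals(order_a: list, order_b: list) -> list[frozenset]:
    """Return all gene sets (size >= 2) that are contiguous in both orders."""
    if len(set(order_a)) != len(order_a) or set(order_a) != set(order_b) \
            or len(order_a) != len(order_b):
        raise ValueError("both orders must be permutations of the same genes")

    # Relabel order_b by each gene's position in order_a.
    position_in_a = {gene: i for i, gene in enumerate(order_a)}
    mapped = [position_in_a[gene] for gene in order_b]

    found = []
    for start in range(len(mapped)):
        lowest = highest = mapped[start]
        for end in range(start + 1, len(mapped)):
            lowest = min(lowest, mapped[end])
            highest = max(highest, mapped[end])
            # A block of order_b is contiguous in order_a exactly when the
            # positions it covers there span as many slots as it has genes.
            if highest - lowest == end - start:
                found.append(frozenset(order_b[start:end + 1]))
    return found
```

For the orders `[1, 2, 3, 4, 5]` and `[3, 1, 2, 5, 4]`, for instance, the genes 1 and 2 are adjacent in both and therefore form a common interval, as do 4 and 5.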
