Similar literature
Found 20 similar articles (search time: 62 ms)
1.
F Corpet, J Gouzy, D Kahn. Nucleic Acids Research, 1999, 27(1): 263-267
The ProDom database contains protein domain families generated from the SWISS-PROT database by automated sequence comparisons. The current version was built with a new improved procedure based on recursive PSI-BLAST homology searches. ProDom can be searched on the World Wide Web to study domain arrangements within either known families or new proteins, with the help of a user-friendly graphical interface (http://www.toulouse.inra.fr/prodom.html). Recent improvements to the ProDom server include: ProDom queries under the SRS Sequence Retrieval System; links to the PredictProtein server; phylogenetic trees and condensed multiple alignments for a better representation of large domain families, with zooming in and out capabilities. In addition, a similar server was set up to display the outcome of whole genome domain analysis as applied to 17 completed microbial genomes (http://www.toulouse.inra.fr/prodomCG.html).

2.
ProDom contains all protein domain families automatically generated from the SWISS-PROT and TrEMBL sequence databases (http://www.toulouse.inra.fr/prodom.html). ProDom-CG results from a similar domain analysis as applied to completed genomes (http://www.toulouse.inra.fr/prodomCG.html). Recent improvements to the ProDom database and its server include: scaling up to include sequences from TrEMBL, addition of Pfam-A entries to the set of expert validated families, assignment of stable accession numbers, consistency indicators for domain families, domain arrangements of sub-families and links to Pfam-A.

3.
As the number of complete microbial genomes publicly available is still growing, the problem of annotation quality in these very large sequences remains unsolved. Indeed, the number of annotations associated with complete genomes is usually lower than that of the shorter entries encountered in the repository collections. Moreover, classical sequence database management systems have difficulties in handling entries of such size. In this context, the Enhanced Microbial Genomes Library (EMGLib) was developed to try to alleviate these problems. This library contains all the complete genomes from prokaryotes (bacteria and archaea) already sequenced and the yeast genome in GenBank format. The annotations are improved by the introduction of data on codon usage, gene orientation on the chromosome and gene families. It is possible to access EMGLib through two database systems set up on WWW servers: the PBIL server at http://pbil.univ-lyon1.fr/emglib.html and the MICADO server at http://locus.jouy.inra.fr/micado.

4.
MOTIVATION: Identifying differentially regulated genes in experiments comparing two experimental conditions is often a key step in the microarray data analysis process. Many different approaches and methodological developments have been put forward, yet the question remains open. RESULTS: Varmixt is a powerful and efficient novel methodology for this task. It is based on a flexible and realistic variance modelling strategy. It compares favourably with other popular techniques (standard t-test, SAM and Cyber-T). The relevance of the approach is demonstrated with real-world and simulated datasets. The analysis strategy was successfully applied to both a 'two-colour' cDNA microarray and an Affymetrix Genechip. Strong control of false positive and false negative rates is proven in large simulation studies. AVAILABILITY: The R package is freely available at http://www.inapg.inra.fr/ens_rech/mathinfo/recherche/mathematique/outil.html. CONTACT: delmar@inapg.inra.fr. SUPPLEMENTARY INFORMATION: http://www.inapg.inra.fr/ens_rech/mathinfo/recherche/mathematique/outil.html.

5.
Database interconnection requires the development of links between related objects from different databases. We built a database of links, called Virgil, to manage and distribute rich (documented) links between GDB genes and GenBank human sequences. Virgil contains 18 667 unique links. In addition to a simple Web form for ad-hoc queries, we propose a generic Web interface and a prototype CORBA server for link distribution. Materials described in this paper are available from http://www.infobiogen.fr/services/virgil/home.html.

6.
MOTIVATION: Most multiple sequence alignment programs use heuristics that sometimes introduce errors into the alignment. The most commonly used methods to correct these errors use iterative techniques to maximize an objective function. We present here an alternative, knowledge-based approach that combines a number of recently developed methods into a two-step refinement process. The alignment is divided horizontally and vertically to form a 'lattice' in which well aligned regions can be differentiated. Alignment correction is then restricted to the less reliable regions, leading to a more reliable and efficient refinement strategy. RESULTS: The accuracy and reliability of RASCAL is demonstrated using: (i) alignments from the BAliBASE benchmark database, where significant improvements were often observed, with no deterioration of the existing high-quality regions; (ii) a large scale study involving 946 alignments from the ProDom protein domain database, where alignment quality was increased in 68% of the cases; and (iii) an automatic pipeline to obtain a high-quality alignment of 695 full-length nuclear receptor proteins, which took 11 min on a DEC Alpha 6100 computer. AVAILABILITY: RASCAL is available at ftp://ftp-igbmc.u-strasbg.fr/pub/RASCAL. SUPPLEMENTARY INFORMATION: http://bioinfo-igbmc.u-strasbourg.fr/BioInfo/RASCAL/paper/rascal_supp.html

7.
The HuGeMap database stores the major genetic and physical maps of the human genome. HuGeMap is accessible on the Web at http://www.infobiogen.fr/services/Hugemap and through a CORBA server. A standard genome map data format for the interconnection of genome map databases was defined in collaboration with the EBI. The HuGeMap CORBA server provides this interconnection using the interface definition language IDL. Two graphical user interfaces were developed for the visualization of the HuGeMap data: ZoomMap (http://www.infobiogen.fr/services/zomit/ZoomMap.html) for navigation by zooming and data transformation via magic lenses, and MappetShow (http://www.infobiogen.fr/services/Mappet) for visualizing and comparing maps.

8.
9.
SUMMARY: The Kinase Sequence Database (KSD) located at http://kinase.ucsf.edu/ksd contains information on 290 protein kinase families derived by profile-based clustering of the non-redundant list of sequences obtained from a GenBank-wide search. Included in the database are a total of 5,041 protein kinases from over 100 organisms. Clustering into families is based on the extent of homology within the kinase catalytic domain (250-300 residues in length). Alignments of the families are viewed by interactive Excel-based sequence spreadsheets. In addition, KSD features evolutionary trees derived for each family and detailed information on each sequence as well as links to the corresponding GenBank entries. Sequence manipulation tools, such as evolutionary tree generation, novel sequence assignment, and statistical analysis, are also provided. AVAILABILITY: The kinase sequence database is a web-based service accessible at http://kinase.ucsf.edu/ksd. CONTACT: buzko@cmp.ucsf.edu; shokat@cmp.ucsf.edu

10.
The Pfam Protein Families Database   Total citations: 17, self-citations: 0, citations by others: 17
Pfam is a large collection of protein multiple sequence alignments and profile hidden Markov models. Pfam is available on the World Wide Web in the UK at http://www.sanger.ac.uk/Software/Pfam/, in Sweden at http://www.cgb.ki.se/Pfam/, in France at http://pfam.jouy.inra.fr/ and in the US at http://pfam.wustl.edu/. The latest version (6.6) of Pfam contains 3071 families, which match 69% of proteins in SWISS-PROT 39 and TrEMBL 14. Structural data, where available, have been utilised to ensure that Pfam families correspond with structural domains, and to improve domain-based annotation. Predictions of non-domain regions are now also included. In addition to secondary structure, Pfam multiple sequence alignments now contain active site residue mark-up. New search tools, including taxonomy search and domain query, greatly add to the functionality and usability of the Pfam resource.

11.
ProClass is a protein family database that organizes non-redundant sequence entries into families defined collectively by PIR superfamilies and PROSITE patterns. By combining global similarities and functional motifs into a single classification scheme, ProClass helps to reveal domain and family relationships and classify multi-domain proteins. The database currently consists of >155 000 sequence entries retrieved from both PIR-International and SWISS-PROT databases. Approximately 92 000 or 60% of the ProClass entries are classified into approximately 6000 families, including a large number of new members detected by our GeneFIND family identification system. The ProClass motif collection contains approximately 72 000 motif sequences and >1300 multiple alignments for all PROSITE patterns, including >21 000 matches not listed in PROSITE and mostly detected from unique PIR sequences. To maximize family information retrieval, the database provides links to various protein family, domain, alignment and structural class databases. With its high classification rate and comprehensive family relationships, ProClass can be used to support full-scale genomic annotation. The database, now being implemented in an object-relational database management system, is available for online sequence search and record retrieval from our WWW server at http://pir.georgetown.edu/gfserver/proclass.html.

12.
The ProDom database is a comprehensive set of protein domain families automatically generated from the SWISS-PROT and TrEMBL sequence databases. An associated database, ProDom-CG, has been derived as a restriction of ProDom to completely sequenced genomes. The ProDom construction method is based on iterative PSI-BLAST searches and multiple alignments are generated for each domain family. The ProDom web server provides the user with a set of tools to visualise multiple alignments, phylogenetic trees and domain architectures of proteins, as well as a BLAST-based server to analyse new sequences for homologous domains. The comprehensive nature of ProDom makes it particularly useful to help sustain the growth of InterPro.

13.
HuGeMap: a distributed and integrated Human Genome Map database.   Total citations: 1, self-citations: 0, citations by others: 1
The HuGeMap database stores the major genetic and physical maps of the human genome. It is also interconnected with the gene radiation hybrid mapping database RHdb. HuGeMap is accessible through a Web server for interactive browsing at URL http://www.infobiogen.fr/services/Hugemap, as well as through a CORBA server for effective programming. HuGeMap is intended as an attempt to build open, interconnected databases, that is, databases that distribute their objects worldwide in compliance with a recognized standard of distribution. Maps can be displayed and compared with a Java applet (http://babbage.infobiogen.fr:15000/Mappet/Show.html) that queries the HuGeMap ORB server as well as the RHdb ORB server at the EBI.

14.
SUMMARY: CRH_Server is an online Comparative and Radiation Hybrid mapping Server dedicated to canine genomics. CRH_Server allows users to compute their own RH data using the current canine RH map, and allows comparative dog/human mapping analyses. Finally, it suggests multiple options for storage and queries of the dog RH database. AVAILABILITY: http://idefix.univ-rennes1.fr:8080/Dogs/rh-server.html. SUPPLEMENTARY INFORMATION: All information is available at http://idefix.univ-rennes1.fr:8080/Dogs/help_rh-server.html.

15.
We have built a database of sequences phylogenetically related to cholinesterases, ESTHER (for esterases, alpha/beta hydrolase enzymes and relatives). These sequences define a homogeneous group of enzymes (carboxylesterases, lipases and hormone-sensitive lipases) with some related proteins devoid of enzymatic activity. The purpose of ESTHER is to help comparison and alignment of any new sequence appearing in the field, to favour mutation analysis of structure-function relationships and to allow structural data recovery. ESTHER is a World Wide Web server with the URL http://www.montpellier.inra.fr:70/cholinesterase.

16.
The CluSTr (Clusters of SWISS-PROT and TrEMBL proteins) database offers an automatic classification of SWISS-PROT and TrEMBL proteins into groups of related proteins. The clustering is based on analysis of all pairwise comparisons between protein sequences. Analysis has been carried out for different levels of protein similarity, yielding a hierarchical organisation of clusters. The database provides links to InterPro, which integrates information on protein families, domains and functional sites from PROSITE, PRINTS, Pfam and ProDom. Links to the InterPro graphical interface allow users to see at a glance whether proteins from the cluster share particular functional sites. CluSTr also provides cross-references to HSSP and PDB. The database is available for querying and browsing at http://www.ebi.ac.uk/clustr.

17.
InterProScan is a tool that scans given protein sequences against the protein signatures of the InterPro member databases, currently PROSITE, PRINTS, Pfam, ProDom and SMART. The number of signature databases and their associated scanning tools as well as the further refinement procedures make the problem complex. InterProScan is designed to be a scalable and extensible system with a robust internal architecture. AVAILABILITY: The Perl-based InterProScan implementation is available from the EBI ftp server (ftp://ftp.ebi.ac.uk/pub/software/unix/iprscan/) and the SRS-based InterProScan is available upon request. We provide the public web interface (http://www.ebi.ac.uk/interpro/scan.html) as well as an email submission server (interproscan@ebi.ac.uk).

18.
Structural comparison reveals remote homology that often fails to be detected by sequence comparison. The DALI web server (http://ekhidna2.biocenter.helsinki.fi/dali) is a platform for structural analysis that provides database searches and interactive visualization, including structural alignments annotated with secondary structure, protein families and sequence logos, and 3D structure superimposition supported by color-coded sequence and structure conservation. Here, we are using DALI to mine the AlphaFold Database version 1, which increased the structural coverage of protein families by 20%. We found 100 remote homologous relationships hitherto unreported in the current reference database for protein domains, Pfam 35.0. In particular, we linked 35 domains of unknown function (DUFs) to the previously characterized families, generating a functional hypothesis that can be explored downstream in structural biology studies. Other findings include gene fusions, tandem duplications, and adjustments to domain boundaries. The evidence for homology can be browsed interactively through live examples on DALI's website.

19.
The structure of many proteins consists of a combination of discrete modules that have been shuffled during evolution. Such modules can frequently be recognized from the analysis of homology. Here we present a systematic analysis of the modular organization of all sequenced proteins. To achieve this we have developed an automatic method to identify protein domains from sequence comparisons. Homologous domains can then be clustered into consistent families. The method was applied to all 21,098 nonfragment protein sequences in SWISS-PROT 21.0, which was automatically reorganized into a comprehensive protein domain database, ProDom. We have constructed multiple sequence alignments for each domain family in ProDom, from which consensus sequences were generated. These nonredundant domain consensuses are useful for fast homology searches. Domain organization in ProDom is exemplified for proteins of the phosphoenolpyruvate:sugar phosphotransferase system (PEP:PTS) and for bacterial 2-component regulators. We provide 2 examples of previously unrecognized domain arrangements discovered with the help of ProDom.

20.
SUMMARY: Emerging web-services technology allows interoperability between multiple distributed architectures. Here, we present REMORA, a web server implemented according to the BioMoby web-service specifications, providing life science researchers with an easy-to-use workflow generator and launcher, a repository of predefined workflows and a survey system. CONTACT: Jerome.Gouzy@toulouse.inra.fr. AVAILABILITY: The REMORA web server is freely available at http://bioinfo.genopole-toulouse.prd.fr/remora; sources are available upon request from the authors.
