Similar literature
1.
2.
The automated sequence annotation pipeline (ASAP) is designed to ease routine investigation of new functional annotations on unknown sequences, such as expressed sequence tags (ESTs), through querying of web-accessible resources and maintenance of a local database. The system allows easy use of the output from one search as the input for a new search, as well as the filtering of results. The database stores formats, parameters, and the information needed to parse data from web sites. It permits easy updating of format information should a site modify the format of a query or of a returned web page.

3.
UniSave: the UniProtKB sequence/annotation version database   Total citations: 1, self-citations: 0, citations by others: 1
SUMMARY: The UniProtKB Sequence/Annotation Version database (UniSave) is a comprehensive archive of UniProtKB/Swiss-Prot and UniProtKB/TrEMBL entry versions. All changed Swiss-Prot and TrEMBL entries are loaded into the UniSave as part of the public bi-weekly UniProtKB releases. Unlike the UniProtKB, which contains only the latest Swiss-Prot and TrEMBL entry versions, the UniSave provides access to previous versions of these entries. AVAILABILITY: http://www.ebi.ac.uk/uniprot/unisave

4.
In the recent past, there has been a resurgence of interest in Chikungunya virus (CHIKV), attributed to massive outbreaks of Chikungunya fever in the South-East Asia Region. This is reflected in a substantial increase in submissions of CHIKV genome sequences to the NCBI (National Center for Biotechnology Information) database. Here we present a database, "CHIKVPRO", containing structural and functional annotation of Chikungunya virus proteins (25 strains) submitted to the NCBI repository. The CHIKV genome encodes 9 proteins: 4 non-structural and 5 structural. The CHIKVPRO database aims to provide the virology community with a single-access authoritative resource for the CHIKV proteome, covering physicochemical and molecular properties, proteolytic cleavage sites, hydrophobicity, transmembrane prediction, and classification into functional families using SVMProt and other Expasy tools. AVAILABILITY: The database is freely available at http://www.chikvpro.info/

5.
Genome-scale sequencing projects have provided the essential information required for the construction of entire genome chips or microarrays for RNA expression studies. The Arabidopsis and rice genomes have been sequenced and whole-genome oligonucleotide arrays are being manufactured. These should soon become available to researchers. Expression studies using genomic-scale expression arrays are providing us with a vast quantity of information at a rapid pace. The rate-limiting step in this type of experiment is not data generation but data analysis. We report improvements that should facilitate the analysis of Affymetrix GeneChip expression data.

6.
MOTIVATION: To be fully and efficiently exploited, data coming from sequencing projects, together with specific sequence analysis tools, need to be integrated within reliable data management systems. Systems designed to manage genome data and analysis tend to give greater importance either to data storage or to the methodological aspect, but lack a complete integration of both components. RESULTS: This paper presents a co-operative computer environment (called Imagene™) dedicated to genomic sequence analysis and annotation. Imagene has been developed by using an object-based model. Thanks to this representation, the user can directly manipulate familiar data objects through icons or lists. Imagene also incorporates a solving engine in order to manage analysis tasks. A global task is solved by successive divisions into smaller sub-tasks. During program execution, these sub-tasks are graphically displayed to the user and may be restarted at any point after task completion. In this sense, Imagene is more transparent to the user than a traditional menu-driven package. Imagene also provides a user interface to display, on the same screen, the results produced by several tasks, together with the capability to annotate these results easily. In its current form, Imagene has been designed particularly for use in microbial sequencing projects. AVAILABILITY: Imagene runs best on SGI (Irix 6.3 or higher) workstations. It is distributed free of charge on a CD-ROM, but requires some Ilog licensed software to run. Some modules also require separate license agreements. Please contact the authors for specific academic conditions and other Unix platforms. CONTACT: Imagene home page: http://wwwabi.snv.jussieu.fr/imagene

7.
Flavonoids are polyphenolic compounds that occur ubiquitously in foods of plant origin. Some of these molecules exhibit various physiological activities. Among existing drugs, there are a huge number of compounds bearing a flavonoid-related skeleton. Because of the relevance for pharmaceutical research, it would be beneficial to collect these compounds into a database. Recently, various databases of chemicals were compiled to help biological and/or chemical research, but no comprehensive database of flavonoids with chemical structures and physicochemical parameters, supposedly related to their activity, is available yet. The aim of this research was to merge the information about flavonoids of plant origin and flavonoids used as medicines into a database. Moreover, predictions of activities against various targets were performed using a virtual screening procedure to demonstrate a possible application of the database for pharmaceutical research.

8.
Databases containing proteomic information have become indispensable for virology studies. As the gap between the amount of sequence information and functional characterization widens, increasing efforts are being directed to the development of databases. For virologists, it is therefore desirable to have a single data collection point that integrates research-related data from different domains. CHPVDB is our effort to provide virologists with such a one‐step information center. We describe herein the creation of CHPVDB, a new database that integrates information on different proteins into a single resource. For basic curation of protein information, the database relies on features from other selected databases, servers and published reports. The database relates molecular analysis, cleavage sites, and the possible protein functional families assigned to different proteins of Chandipura virus (CHPV) by SVMProt and related tools.

9.
We developed a semi-automated genome analysis system called GAMBLER to support the current whole-genome sequencing project focusing on alkaliphilic Bacillus halodurans C-125. GAMBLER was designed to reduce the human intervention required and the complications involved in annotating thousands of ORFs in the microbial genome. GAMBLER automates three major routines: analyzing assembly results provided by genome assembler software, assigning ORFs, and homology searching. GAMBLER is equipped with an interface for convenience of annotation. All processes and options can be controlled through a WWW browser, which enables scientists to share their genome analysis results regardless of computer platform.

10.

Background  

Automated protein function prediction methods are needed to keep pace with high-throughput sequencing. With the existence of many programs and databases for inferring different protein functions, a pipeline that properly integrates these resources will benefit from the advantages of each method. However, integrated systems usually do not provide mechanisms to generate customized databases to predict particular protein functions. Here, we describe a tool termed PIPA (Pipeline for Protein Annotation) that has these capabilities.

11.
Kumar D  Mittal Y 《Bioinformation》2011,6(3):134-136
Lectins are a class of carbohydrate-binding proteins widely recognized to play crucial roles in many cell-cell recognition events that trigger several important cellular processes. The class encompasses members that are diverse in their protein structures, carbohydrate affinities and specificities, larger biological roles and potential applications. To make effective use of all these diverse data, an animal lectin database, 'AnimalLectinDb', has been developed as a first step, with information pertaining to taxonomy, structure, domain architecture, molecular sequence, carbohydrate structure and blood group specificity. It is expected to be of high value not only for basic study in lectin biology but also for advanced research in pursuing several applications in biotechnology, immunology, and clinical practice. AVAILABILITY: The database is available for free at http://www.research-bioinformatics.in.

12.
Kumar D  Mittal Y 《Bioinformation》2012,8(6):281-283
Studies of diverse bacterial lectins and lectin data hold considerable promise for helping biotechnologists and geneticists gain a deeper understanding in proteomics and genomics research, find the molecular basis of infectious diseases, devise new approaches to their prevention, and develop new bacterial vaccines. Hence we developed a bacterial lectin database named 'BacterialLectinDb'. An organized database schema for BacterialLectinDb was designed to collate all the available information about bacterial lectins in a central repository. The database was designed using HTML and XML. AVAILABILITY: The database is available for free at http://www.research-bioinformatics.in.

13.
SELEX_DB is a novel curated database on selected randomized DNA/RNA sequences designed for accumulation of experimental data on functional site sequences obtained by using SELEX and SELEX-like technologies from the pools of random sequences. This database also contains the programs for DNA/RNA functional site recognition within arbitrary nucleotide sequences. The first release of SELEX_DB has been installed under SRS and is available through the WWW at http://wwwmgs.bionet.nsc.ru/mgs/systems/selex/

14.
15.
We present a bacterial genome computational analysis pipeline, called GenVar. The pipeline, based on the program GeneWise, is designed to analyze an annotated genome and automatically identify missed gene calls and sequence variants such as genes with disrupted reading frames (split genes) and those with insertions and deletions (indels). For a given genome to be analyzed, GenVar relies on a database containing closely related genomes (such as other species or strains) as well as a few additional reference genomes. GenVar also helps identify gene disruptions probably caused by sequencing errors. We exemplify GenVar's capabilities by presenting results from the analysis of four Brucella genomes. Brucella is an important human pathogen and zoonotic agent. The analysis revealed hundreds of missed gene calls, new split genes and indels, several of which are species specific and hence provide valuable clues to the understanding of the genome basis of Brucella pathogenicity and host specificity.

16.
BACKGROUND: Major histocompatibility complex (MHC) class I molecules play key roles in host immunity against pathogens by presenting peptide antigens to CD8+ T-cells. Many variants of MHC molecules exist, and each has a unique preference for certain peptide ligands. Both experimental approaches and computational algorithms have been utilized to analyze these peptide MHC binding characteristics. Traditionally, MHC binding specificities have been described in terms of binding motifs. Such motifs classify certain peptide positions as primary and secondary anchors according to their impact on binding, and they list the preferred and deleterious residues at these positions. This provides a concise and easily communicatable summary of MHC binding specificities. However, so far there has been no algorithm to generate such binding motifs in an automated and uniform fashion. In this paper, we present a computational pipeline that takes peptide MHC binding data as input and produces a concise MHC binding motif. We tested our pipeline on a set of 18 MHC class I molecules and showed that the derived motifs are consistent with historic expert assignments. We have implemented a pipeline that formally codifies rules to generate MHC binding motifs. The pipeline has been incorporated into the immune epitope database and analysis resource (IEDB) and motifs can be visualized while browsing MHC alleles in the IEDB.

17.
Kim C  Yoon U  Lee G  Park S  Seol YJ  Lee H  Hahn J 《Bioinformation》2009,4(6):269-270
The National Academy of Agricultural Science (NAAS) has developed a web-based marker database to provide information about SNP markers in rice. The database consists of three major functional categories: map viewing, marker searching and gene annotation. It provides information on 12,829 SNP markers, including gene location information on the 12 chromosomes of rice. The annotation of each SNP marker provides information such as marker name, EST number, gene definition and general marker information. Users are assisted in tracing any new structures of the chromosomes and gene positional functions using specific SNP markers. AVAILABILITY: The database is available for free at http://nabic.niab.go.kr/SNP/

18.
Artemis: sequence visualization and annotation   Total citations: 31, self-citations: 0, citations by others: 31
SUMMARY: Artemis is a DNA sequence visualization and annotation tool that allows the results of any analysis or sets of analyses to be viewed in the context of the sequence and its six-frame translation. Artemis is especially useful in analysing the compact genomes of bacteria, archaea and lower eukaryotes, and will cope with sequences of any size from small genes to whole genomes. It is implemented in Java, and can be run on any suitable platform. Sequences and annotation can be read and written directly in EMBL, GenBank and GFF format. AVAILABILITY: Artemis is available under the GNU General Public License from http://www.sanger.ac.uk/Software/Artemis
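The six-frame translation mentioned in this abstract can be illustrated with a minimal Python sketch. It is not Artemis code (Artemis itself is written in Java), and the function names `reverse_complement`, `translate` and `six_frame_translation` are hypothetical. The sketch assumes the standard genetic code and an input containing only A, C, G and T; codons it cannot resolve are reported as "X".

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G base order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons from position 0; '*' marks a stop codon."""
    seq = seq.upper()
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X")  # "X" for unresolvable codons
        for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translation(seq):
    """Translate three forward (+1..+3) and three reverse (-1..-3) frames."""
    frames = {}
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(strand_seq[offset:])
    return frames


if __name__ == "__main__":
    for name, protein in six_frame_translation("ATGGCCTAA").items():
        print(name, protein)
```

Running the script prints one line per frame for the toy sequence ATGGCCTAA; frame +1 reads ATG GCC TAA and so yields "MA*".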

19.
SMART (Simple Modular Architecture Research Tool, http://smart.embl-heidelberg.de) is a web-based resource used for the annotation of protein domains and the analysis of domain architectures, with particular emphasis on mobile eukaryotic domains. Extensive annotation for each domain family is available, providing information relating to function, subcellular localization, phyletic distribution and tertiary structure. The January 2002 release has added more than 200 hand-curated domain models. This brings the total to over 600 domain families that are widely represented among nuclear, signalling and extracellular proteins. Annotation now includes links to the Online Mendelian Inheritance in Man (OMIM) database in cases where a human disease is associated with one or more mutations in a particular domain. We have implemented new analysis methods and updated others. New advanced queries provide direct access to the SMART relational database using SQL. This database now contains information on intrinsic sequence features such as transmembrane regions, coiled-coils, signal peptides and internal repeats. SMART output can now be easily included in users’ documents. A SMART mirror has been created at http://smart.ox.ac.uk.

20.
Automated genome sequence analysis and annotation.   Total citations: 5, self-citations: 0, citations by others: 5
MOTIVATION: Large-scale genome projects generate a rapidly increasing number of sequences, most of them biochemically uncharacterized. Research in bioinformatics contributes to the development of methods for the computational characterization of these sequences. However, the installation and application of these methods require experience and are time consuming. RESULTS: We present here an automatic system for preliminary functional annotation of protein sequences that has been applied to the analysis of sets of sequences from complete genomes, both to refine overall performance and to make new discoveries comparable to those made by human experts. The GeneQuiz system includes a Web-based browser that allows examination of the evidence leading to an automatic annotation and offers additional information, views of the results, and links to biological databases that complement the automatic analysis. System structure and operating principles concerning the use of multiple sequence databases, underlying sequence analysis tools, lexical analyses of database annotations and decision criteria for functional assignments are detailed. The system makes automatic quality assessments of results based on prior experience with the underlying sequence analysis tools; overall error rates in functional assignment are estimated at 2.5-5% for cases annotated with highest reliability ('clear' cases). Sources of over-interpretation of results are discussed with proposals for improvement. A conservative definition for reporting 'new findings' that takes account of database maturity is presented along with examples of possible kinds of discoveries (new function, family and superfamily) made by the system. System performance in relation to sequence database coverage, database dynamics and database search methods is analysed, demonstrating the inherent advantages of an integrated automatic approach using multiple databases and search methods applied in an objective and repeatable manner. 
AVAILABILITY: The GeneQuiz system is publicly available for analysis of protein sequences through a Web server at http://www.sander.ebi.ac.uk/gqsrv/submit
