Similar literature
1.
HOWDY: an integrated database system for human genome research
HOWDY is an integrated database system for accessing and analyzing human genomic information (http://www-alis.tokyo.jst.go.jp/HOWDY/). HOWDY stores information about relationships between genetic objects and the data extracted from a number of databases. HOWDY consists of an Internet accessible user interface that allows thorough searching of the human genomic databases using the gene symbols and their aliases. It also permits flexible editing of the sequence data. The database can be searched using simple words and the search can be restricted to a specific cytogenetic location. Linear maps displaying markers and genes on contig sequences are available, from which an object can be chosen. Any search starting point identifies all the information matching the query. HOWDY provides a convenient search environment of human genomic data for scientists unsure which database is most appropriate for their search.

2.
Mycobacteriophage genome database (MGDB) is an exclusive repository of the 64 completely sequenced mycobacteriophages with annotated information. It is a comprehensive compilation of the various gene parameters captured from several databases pooled together to empower mycobacteriophage researchers. The MGDB (Version No. 1.0) comprises 6086 genes from 64 mycobacteriophages classified into 72 families based on the ACLAME database. Manual curation was aided by information available from public databases, which was enriched further by analysis. Its web interface allows browsing as well as querying the classification. The main objective is to collect and organize the complexity inherent to mycobacteriophage protein classification in a rational way. The other objective is to browse the existing and new genomes and describe their functional annotation. AVAILABILITY: The database is available for free at http://mpgdb.ibioinformatics.org/mpgdb.php.

3.
Current research of gene regulatory mechanisms is increasingly dependent on the availability of high-quality information from manually curated databases. Biocurators undertake the task of extracting knowledge claims from scholarly publications, organizing these claims in a meaningful format and making them computable. In doing so, they enhance the value of existing scientific knowledge by making it accessible to the users of their databases. In this capacity, biocurators are well positioned to identify and weed out information that is of insufficient quality. The criteria that define information quality are typically outlined in curation guidelines developed by biocurators. These guidelines have been prudently developed to reflect the needs of the user community the database caters to. The guidelines depict the standard evidence that this community recognizes as sufficient justification for trustworthy data. Additionally, these guidelines determine the process by which data should be organized and maintained to be valuable to users. Following these guidelines, biocurators assess the quality, reliability, and validity of the information they encounter. In this article we explore to what extent different use cases agree with the inclusion criteria that define positive and negative data, implemented by the database. What are the drawbacks to users who have queries that would be well served by results that fall just short of the criteria used by a database? Finally, how can databases (and biocurators) accommodate the needs of such more explorative use cases?

4.
5.
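An introduction to the UniProt protein database
Luo Jingchu 《生物信息学 (Chinese Journal of Bioinformatics)》2019,17(3):131-144
UniProt (https://www.uniprot.org/) is an internationally renowned protein database comprising three main parts: the UniProtKB knowledgebase, the UniParc archive, and the UniRef reference sequence sets. UniProtKB is the core of UniProt; in addition to protein sequence data, it contains a large amount of annotation. UniProtKB is divided into two sections, Swiss-Prot and TrEMBL. The more than 500,000 sequences in Swiss-Prot are all manually reviewed and annotated, whereas the more than 140 million sequences in TrEMBL are translated from protein-coding sequences in the EMBL nucleotide sequence database and annotated by computer according to defined rules. UniParc merges entries for the same protein held in different databases into a single record to avoid redundancy, and assigns each sequence a unique identifier. UniRef groups the sequences in UniProtKB and UniParc by degree of similarity into three data sets: UniRef100, UniRef90, and UniRef50. The UniProt website offers users an efficient and practical advanced search system and extensive help documentation. UniProt publishes a new release every 4 weeks together with statistical reports, from which users can learn basic information such as the size and update status of the database, data categories, and species distribution, and can view statistics on general annotation, sequence feature annotation, and database cross-references. UniProt is currently the non-redundant protein sequence database with the most complete sequence data and the richest annotation in the world, and since its creation at the beginning of this century it has provided a valuable resource for the life sciences.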

6.
7.
With the explosive growth of biological data, the development of new means of data storage was needed. More and more often biological information is no longer published in the conventional way via a publication in a scientific journal, but only deposited into a database. In the last two decades these databases have become essential tools for researchers in biological sciences. Biological databases can be classified according to the type of information they contain. There are basically three types of sequence-related databases (nucleic acid sequences, protein sequences and protein tertiary structures) as well as various specialized data collections. It is important to provide the users of biomolecular databases with a degree of integration between these databases, as by nature all of these databases are connected in a scientific sense and each one of them is an important piece of biological complexity. In this review we will highlight our effort in connecting biological information as demonstrated in the SWISS-PROT protein database.

8.
The Homeodomain Resource is an annotated collection of non-redundant protein sequences, three-dimensional structures and genomic information for the homeodomain protein family. Release 2.0 contains 765 full-length homeodomain-containing sequences, 29 experimentally derived structures and 116 homeobox loci implicated in human genetic disorders. Entries are fully hyperlinked to facilitate easy retrieval of the original records from source databases. A simple search engine with a graphical user interface is provided to query the component databases and assemble customized data sets. A new feature for this release is the addition of more automated methods for database searching, maintenance and implementation of efficient data management. The Homeodomain Resource is freely available through the WWW at http://genome.nhgri.nih.gov/homeodomain

9.
GABAagent: a system for integrating data on GABA receptors

10.
Over the past few years, large amounts of data linking gene-expression (GE) patterns and other genetic data with the development of the mouse kidney have been published, and the next task will be to integrate these data with the molecular networks responsible for the emergence of the kidney phenotype. This paper discusses how a start to this task can be made by using the kidney database and its associated search tools, and shows how the data generated by such an approach can be used as a guide to future experimentation. Many of the events taking place as the kidney develops do, of course, also take place in other tissues and organisms and it will soon be possible to incorporate relevant information from these systems into analyses of kidney data as well as the new information from microarray technology. The key to success here will be the ability to access over the internet data from the textual and graphical databases for the mouse and other organisms now being established. In order to do this, informatic tools will be needed that will allow a user working with one database to query another. This paper also considers both the types of tools that will be necessary and the databases on which they will operate.

11.
MOTIVATION: Protein sequence clustering has been widely exploited to facilitate in-depth analysis of protein functions and families. For some applications of protein sequence clustering, it is highly desirable that a hierarchical structure, also referred to as dendrogram, which shows how proteins are clustered at various levels, is generated. However, as the sizes of contemporary protein databases continue to grow at rapid rates, it is of great interest to develop some summarization mechanisms so that the users can browse the dendrogram and/or search for the desired information more effectively. RESULTS: In this paper, the design of a novel incremental clustering algorithm aimed at generating summarized dendrograms for analysis of protein databases is described. The proposed incremental clustering algorithm employs a statistics-based model to summarize the distributions of the similarity scores among the proteins in the database and to control formation of clusters. Experimental results reveal that, due to the summarization mechanism incorporated, the proposed incremental clustering algorithm offers the users highly concise dendrograms for analysis of protein clusters with biological significance. Another distinction of the proposed algorithm is its incremental nature. As the sizes of the contemporary protein databases continue to grow at fast rates, due to the concern of efficiency, it is desirable that cluster analysis of a protein database can be carried out incrementally, when the protein database is updated. Experimental results with the Swiss-Prot protein database reveal that the time complexity for carrying out incremental clustering with k new proteins added into the database containing n proteins is O(n^(2β) log n), where β ≅ 0.865, provided that k < n. AVAILABILITY: The Linux executable is available on the following supplementary page.

12.
CyanoPhyChe is a user-friendly database that one can browse for physico-chemical properties, structure and biochemical pathway information of cyanobacterial proteins. We downloaded all the protein sequences from the cyanobacterial genome database for calculating the physico-chemical properties, such as molecular weight, net charge of protein, isoelectric point, molar extinction coefficient, canonical variable for solubility, grand average hydropathy, aliphatic index, and number of charged residues. Based on the physico-chemical properties, we provide the polarity, structural stability and probability of a protein entering into an inclusion body (PEPIB). We used the data generated on physico-chemical properties, structure and biochemical pathway information of all cyanobacterial proteins to construct CyanoPhyChe. The data can be used for optimizing methods of expression and characterization of cyanobacterial proteins. Moreover, the ‘Search’ and data export options provided will be useful for proteome analysis. Secondary structure was predicted for all the cyanobacterial proteins using the PSIPRED tool, and the data generated are made accessible to researchers working on cyanobacteria. In addition, external links are provided to biological databases such as PDB and KEGG for molecular structure and biochemical pathway information, respectively. External links are also provided to different cyanobacterial databases. CyanoPhyChe can be accessed from the following URL: http://bif.uohyd.ac.in/cpc.

13.
DNA barcoding is based on the use of short DNA sequences to provide taxonomic tags for rapid, efficient identification of biological specimens. Currently, reference databases are being compiled. In the future, it will be important to facilitate access to these databases, especially for nonspecialist users. The method described here provides a rapid, web-based, user-friendly link between the DNA sequence from an unidentified biological specimen and various types of biological information, including the species name. Specifically, we use a customized, Google-type search algorithm to quickly match an unknown DNA sequence to a list of verified DNA barcodes in the reference database. In addition to retrieving the species name, our web tool also provides automatic links to a range of other information about that species. As the DNA barcode database becomes more populated, it will become increasingly important for the broader user community to be able to exploit it for the rapid identification of unknown specimens and to easily obtain relevant biological information about these species. The application presented here meets that need.

14.
With the exponentially increasing amount of information in the biomedical field, the significance of advanced information retrieval and information extraction, as well as the role of databases, has been increasing. PRIME is an integrated gene/protein informatics database based on natural language processing. It provides automatically extracted protein/family/gene/compound interaction information including both physical and genetic interactions, gene ontology based functions, and graphic pathway viewers. Gene/protein/family names and functional terms are recognized based on dictionaries developed in our laboratory. The interaction and functional information are extracted by syntactic dependencies and various phrase patterns. We have included about 920,000 (non-redundant) protein interactions and 360,000 annotated gene-function relationships for major eukaryotes. By combining the sequence and text information, the pathway comparison between two organisms and simple pathway deduction based on other organism interaction data, and pathway filtering using tissue expression data, are also available. This database is accessible at http://prime.ontology.ims.u-tokyo.ac.jp:8081.

15.
iSPOT (http://cbm.bio.uniroma2.it/ispot) is a web tool developed to infer the recognition specificity of protein module families; it is based on the SPOT procedure that utilizes information from position-specific contacts, derived from the available domain/ligand complexes of known structure, and experimental interaction data to build a database of residue-residue contact frequencies. iSPOT is available to infer the interaction specificity of PDZ, SH3 and WW domains. For each family of protein domains, iSPOT evaluates the probability of interaction between a query domain of the specified families and an input protein/peptide sequence and makes it possible to search for potential binding partners of a given domain within the SWISS-PROT database. The experimentally derived interaction data utilized to build the PDZ, SH3 and WW databases of residue-residue contact frequencies are also accessible. Here we describe the application to the WW family of protein modules.

16.
H B Jenson 《BioTechniques》1989,7(6):590-592
A novel computer database program dedicated to storing, cataloging, and accessing information about recombinant clones and libraries has been developed for the IBM (or compatible) personal computer. This program, named CLONES, also stores information about bacterial strains and plasmid and bacteriophage vectors used in molecular biology. The advantages of this method are improved organization of data, fast and easy assimilation of new data, automatic association of new data with existing data, and rapid retrieval of desired records using search criteria specified by the user. Individual records are indexed in the database using B-trees, which automatically index new entries and expedite later access. The use of multiple windows, pull-down menus, scrolling pick-lists, and field-input techniques makes the program intuitive to understand and easy to use. Daughter databases can be created to include all records of a particular type, or only those records matching user-specified search criteria. Separate databases can also be merged into a larger database. This computer program provides an easy-to-use and accurate means to organize, maintain, access, and share information about recombinant clones and other laboratory products of molecular biology technology.

17.
The Protein Data Bank Japan (PDBj) curates, edits and distributes protein structural data as a member of the worldwide Protein Data Bank (wwPDB) and currently processes approximately 25-30% of all deposited data in the world. Structural information is enhanced by the addition of biological and biochemical functional data as well as experimental details extracted from the literature and other databases. Several applications have been developed at PDBj for structural biology and biomedical studies: (i) a Java-based molecular graphics viewer, jV; (ii) display of electron density maps for the evaluation of structure quality; (iii) an extensive database of molecular surfaces for functional sites, eF-site, as well as a search service for similar molecular surfaces, eF-seek; (iv) identification of sequence and structural neighbors; (v) a graphical user interface to all known protein folds with links to the above applications, Protein Globe. Recent examples are shown that highlight the utility of these tools in recognizing remote homologies between pairs of protein structures and in assigning putative biochemical functions to newly determined targets from structural genomics projects.

18.
Thermodynamic data regarding proteins and their interactions are important for understanding the mechanisms of protein folding, protein stability, and molecular recognition. Although there are several structural databases available for proteins and their complexes with other molecules, databases for experimental thermodynamic data on protein stability and interactions are rather scarce. Thus, we have developed two electronically accessible thermodynamic databases. ProTherm, Thermodynamic Database for Proteins and Mutants, contains numerical data of several thermodynamic parameters of protein stability, experimental methods and conditions, along with structural, functional, and literature information. ProNIT, Thermodynamic Database for Protein-Nucleic Acid Interactions, contains thermodynamic data for protein-nucleic acid binding, experimental conditions, structural information of proteins, nucleic acids and the complex, and literature information. These data have been incorporated into 3DinSight, an integrated database for structure, function, and properties of biomolecules. A WWW interface allows users to search for data based on various conditions, with different display and sorting options, and to visualize molecular structures and their interactions. These thermodynamic databases, together with structural databases, help researchers gain insight into the relationship among structure, function, and thermodynamics of proteins and their interactions, and will become useful resources for studying proteins in the postgenomic era.

19.
20.
Schlamp K  Weinmann A  Krupp M  Maass T  Galle P  Teufel A 《Gene》2008,427(1-2):47-50
With the availability of high-throughput gene expression analysis, multiple public expression databases emerged, mostly based on microarray expression data. Although these databases are of significant biomedical value, they do have significant drawbacks, especially concerning the reliability of single gene expression profiles obtained from microarray data. At the same time, reliable data on an individual gene's expression are often published as single northern blots in individual publications. These data were not yet available for high-throughput screening. To reduce the gap between high-throughput expression data and individual highly reliable expression data, we designed a novel database, "BlotBase", a freely and easily accessible database currently containing approximately 700 published northern blots of human or mouse origin (http://www.medicalgenomics.org/Databases/BlotBase). As the database is open for public data submission, we expect this database to quickly become a large expression profiling resource, eventually providing higher reliability in high-throughput gene expression analysis. To build BlotBase, PubMed was searched manually and by computer-based text mining methods to obtain publications containing northern blot results. Subsequently, northern blots were extracted and expression values of different tissues calculated using ImageJ. All data were made available through a user-friendly web front end. The data may be searched by either full text search or list of available northern blots of a specific tissue. Northern blot expression profiles were displayed by three expression states as well as a bar chart, allowing for automated evaluation. Furthermore, we integrated additional features, e.g. instant access to the corresponding RNA sequence or primer design tools, making further expression analysis more convenient. Finally, through a semiautomatic submission system this database was opened to the bioinformatics community.
