Similar literature
 20 similar articles found
1.
Three-dimensional structures are now known within most protein families and it is likely, when searching a sequence database, that one will identify a homolog of known structure. The goal of Entrez's 3D-structure database is to make structure information and the functional annotation it can provide easily accessible to molecular biologists. To this end, Entrez's search engine provides several powerful features: (i) links between databases, for example between a protein's sequence and structure; (ii) pre-computed sequence and structure neighbors; and (iii) structure and sequence/structure alignment visualization. Here, we focus on a new feature of Entrez's Molecular Modeling Database (MMDB): Graphical summaries of the biological annotation available for each 3D structure, based on the results of automated comparative analysis. MMDB is available at: http://www.ncbi.nlm.nih.gov/Entrez/structure.html.

2.
MOTIVATION: SAGE enables the determination of genome-wide mRNA expression profiles. A comprehensive analysis of SAGE data requires software that integrates (statistical) data analysis methods with a database system. Furthermore, to facilitate data sharing between users, the application should reside on a central server and be accessed via the internet. Since such an application was not available, we developed the USAGE package. RESULTS: USAGE is a web-based application that comprises an integrated set of tools offering many functions for analysing and comparing SAGE data. Additionally, USAGE includes a statistical method for the planning of new SAGE experiments. USAGE is available in a multi-user environment giving users the option of sharing data. USAGE is interfaced to a relational database to store data and analysis results. The USAGE query editor allows the composition of queries for searching this database. Several database functions have been included which enable the selection and combination of data. USAGE provides the biologist increased functionality and flexibility for analysing SAGE data. AVAILABILITY: USAGE is freely accessible for academic institutions at http://www.cmbi.kun.nl/usage/. The source code of USAGE is freely available for academic institutions on request from the first author.

3.
4.
Statistical analysis of antigen receptor spectratype data   Total citations: 3 (self-citations: 0, citations by others: 3)
MOTIVATION: The effectiveness of vertebrate adaptive immunity depends crucially on the establishment and maintenance of extreme diversity in the antigen receptor repertoire. Spectratype analysis is a method used in clinical and basic immunological settings in which antigen receptor length diversity is assessed as a surrogate for functional diversity. The purpose of this paper is to describe the systematic derivation and application of statistical methods for the analysis of spectratype data. RESULTS: The basic probability model used for spectratype analysis is the multinomial model with n, the total number of counts, indeterminate. We derive the appropriate statistics and statistical procedures for testing hypotheses regarding differences in antigen receptor distributions and variable repertoire diversity in different treatment groups. We then apply these methods to spectratype data obtained from several healthy donors to examine the differences between normal CD4+ and CD8+ T cell repertoires, and to data from a thymus transplant patient to examine the development of repertoire diversity following the transplant.

5.
MOTIVATION: There is an imperative need to integrate functional genomics data to obtain a more comprehensive systems-biology view of the results. We believe that this is best achieved through the visualization of data within the biological context of metabolic pathways. Accordingly, metabolic pathway reconstruction was used to predict the metabolic composition for Medicago truncatula and these pathways were engineered to enable the correlated visualization of integrated functional genomics data. RESULTS: Metabolic pathway reconstruction was used to generate a pathway database for M. truncatula (MedicCyc), which currently features more than 250 pathways with related genes, enzymes and metabolites. MedicCyc was assembled from more than 225,000 M. truncatula ESTs (MtGI Release 8.0) and available genomic sequences using the Pathway Tools software and the MetaCyc database. The predicted pathways in MedicCyc were verified through comparison with other plant databases such as AraCyc and RiceCyc. This comparison provided crucial information concerning enzymes still missing from the ongoing but currently incomplete M. truncatula genome sequencing project. MedicCyc was further manually curated to remove non-plant pathways, and Medicago-specific pathways including isoflavonoid, lignin and triterpene saponin biosynthesis were modified or added based upon available literature and in-house expertise. Additional metabolites identified in metabolic profiling experiments were also used for pathway predictions. Once the metabolic reconstruction was completed, MedicCyc was engineered to visualize M. truncatula functional genomics datasets within the biological context of metabolic pathways. AVAILABILITY: MedicCyc is freely accessible at http://www.noble.org/MedicCyc/

6.
The Ontologizer is a Java application that can be used to perform statistical analysis for overrepresentation of Gene Ontology (GO) terms in sets of genes or proteins derived from an experiment. The Ontologizer implements the standard approach to statistical analysis based on the one-sided Fisher's exact test, the novel parent-child method, as well as topology-based algorithms. A number of multiple-testing correction procedures are provided. The Ontologizer allows users to visualize data as a graph including all significantly overrepresented GO terms and to explore the data by linking GO terms to all genes/proteins annotated to the term and by linking individual terms to child terms. AVAILABILITY: The Ontologizer application is available under the terms of the GNU GPL. It can be started as a WebStart application from the project homepage, where source code is also provided: http://compbio.charite.de/ontologizer. REQUIREMENTS: Ontologizer requires a Java SE 5.0 compliant Java runtime engine and GraphViz for the optional graph visualization tool.
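
As a rough illustration of the one-sided Fisher's exact test mentioned in this abstract, the overrepresentation p-value for a single GO term can be computed as the upper tail of a hypergeometric distribution. The sketch below is not taken from the Ontologizer source; the function name and parameter names are hypothetical and chosen only for this example.

```python
from math import comb


def overrepresentation_p(pop_size, pop_annotated, study_size, study_annotated):
    """Upper-tail hypergeometric probability (one-sided Fisher's exact test).

    Hypothetical helper: returns the probability of seeing at least
    `study_annotated` genes annotated to a term in a study set of
    `study_size` genes drawn from a population of `pop_size` genes,
    of which `pop_annotated` carry the term.
    """
    upper = min(pop_annotated, study_size)
    total = comb(pop_size, study_size)
    tail = sum(
        comb(pop_annotated, k) * comb(pop_size - pop_annotated, study_size - k)
        for k in range(study_annotated, upper + 1)
    )
    return tail / total
```

For example, `overrepresentation_p(10, 5, 5, 5)` considers a population of 10 genes, 5 of them annotated, and a study set of 5 genes that are all annotated. In practice such p-values would still need one of the multiple-testing corrections the abstract refers to.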

7.
MMDB: Entrez's 3D structure database.   Total citations: 5 (self-citations: 1, citations by others: 4)
The three-dimensional structures for representatives of nearly half of all protein families are now available in public databases. Thus, no matter which protein one investigates, it is increasingly likely that the 3D structure of a homolog will be known and may reveal unsuspected structure-function relationships. The goal of Entrez's 3D-structure database is to make this information accessible and usable by molecular biologists (http://www.ncbi.nlm.nih.gov/Entrez). To this end Entrez provides two major analysis tools, a search engine based on sequence and structure 'neighboring' and an integrated visualization system for sequence and structure alignments. From a protein's sequence 'neighbors' one may rapidly identify other members of a protein family, including those where 3D structure is known. By comparing aligned sequences and/or structures in detail, using the visualization system, one may identify conserved features and perhaps infer functional properties. Here we describe how these analysis tools may be used to investigate the structure and function of newly discovered proteins, using the PTEN gene product as an example.

8.
SUMMARY: Lipoxygenases are a family of enzymes involved in a variety of human diseases such as inflammation, asthma, atherosclerosis and cancer. The lipoxygenases database (LOX-DB) aims to be a web-accessible compendium of information, in particular on the mammalian members of this multigene family. This resource includes molecular structures, reference data, tools for structural and computational analysis as well as links to related information maintained by others. The data can be retrieved using various search options and analyzed with publicly available visualization tools. AVAILABILITY: LOX-DB is available at http://www.dkfz-heidelberg.de/spec/lox-db/

9.
The Synergizer is a database and web service that provides translations of biological database identifiers. It is accessible both programmatically and interactively. AVAILABILITY: The Synergizer is freely available to all users interactively via a web application (http://llama.med.harvard.edu/synergizer/translate) and programmatically via a web service. Clients implementing the Synergizer application programming interface (API) are also freely available. Please visit http://llama.med.harvard.edu/synergizer/doc for details.

10.
11.
T Conway  B Kraus  D L Tucker  D J Smalley  A F Dorman  L McKibben 《BioTechniques》2002,32(1):110, 112-4, 116, 118-9
Microsoft Windows-based computers have evolved to the point that they provide sufficient computational and visualization power for robust analysis of DNA array data. In fact, smaller laboratories might prefer to carry out some or all of their analyses and visualization in a Windows environment, rather than on alternative platforms such as UNIX. We have developed a series of manually executed macros, written in Visual Basic for Microsoft Excel spreadsheets, that allow for rapid and comprehensive gene expression data analysis. The first macro assigns gene names to spots on the DNA array and normalizes individual hybridizations by expressing the signal intensity for each gene as a percentage of the sum of all gene intensities. The second macro streamlines statistical consideration of the confidence in individual gene measurements for sets of experimental replicates by calculating probability values with the Student's t test. The third macro introduces a threshold value, calculates expression ratios between experimental conditions, and calculates the standard deviation of the mean of the log ratio values. Selected columns of data are copied by a fourth macro to create a processed data set suitable for entry into a Microsoft Access database. An Access database structure is described that allows simple queries across multiple experiments and export of data into third-party data visualization software packages. These analysis tools can be used in their present form by others working with commercial E. coli membrane arrays, or they may be adapted for use with other systems. The Excel spreadsheets with embedded Visual Basic macros and detailed instructions for their use are available at http://www.ou.edu/microarray.
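
The normalization and ratio steps described in this abstract can be sketched in a few lines. This is a Python illustration, not the authors' Visual Basic macros; the function names are hypothetical, and the way the threshold is applied (flooring both intensities before taking the ratio) is an assumption, since the abstract does not say how the threshold is used.

```python
from math import log2


def normalize_percent(intensities):
    """Express each gene's signal as a percentage of the summed signal."""
    total = sum(intensities.values())
    return {gene: 100.0 * value / total for gene, value in intensities.items()}


def log_ratios(control, treated, threshold):
    """log2(treated / control) per gene.

    Assumption: values below `threshold` are raised to `threshold`
    so that weak spots do not produce extreme or undefined ratios.
    """
    return {
        gene: log2(max(treated[gene], threshold) / max(control[gene], threshold))
        for gene in control
    }
```

A typical use would normalize each hybridization with `normalize_percent` first and then pass the two normalized dictionaries to `log_ratios`.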

12.
The ever-evolving Next Generation Sequencing technology is calling for new and innovative ways of data processing and visualization. Following a detailed survey of the current needs of researchers and service providers, the authors have developed GenoViewer: a highly user-friendly, easy-to-operate SAM/BAM viewer and aligner tool. GenoViewer enables fast and efficient NGS assembly browsing, analysis and read mapping. It is highly customized, making it suitable for a wide range of NGS-related tasks. Due to its relatively simple architecture, it is easy to add specialised visualization functionalities, facilitating further customised data analysis. The software's source code is freely available; it is open for project and task-specific modifications. AVAILABILITY: The database is available for free at http://www.genoviewer.com/

13.
The LCB Data Warehouse   Total citations: 2 (self-citations: 0, citations by others: 2)

14.
The Horizontal Gene Transfer DataBase (HGT-DB) is a genomic database that includes statistical parameters such as G+C content, codon and amino-acid usage, as well as information about which genes deviate in these parameters for prokaryotic complete genomes. Under the hypothesis that genes from distantly related species have different nucleotide compositions, these deviating genes may have been acquired by horizontal gene transfer. The current version of the database contains 88 bacterial and archaeal complete genomes, including multiple chromosomes and strains. For each genome, the database provides statistical parameters for all the genes, as well as averages and standard deviations of G+C content, codon usage, relative synonymous codon usage and amino-acid content. It also provides information about correspondence analyses of the codon usage, plus lists of extraneous groups of genes in terms of G+C content and lists of putatively acquired genes. With this information, researchers can explore the G+C content and codon usage of a gene when they find incongruities in sequence-based phylogenetic trees. A search engine that allows searches for gene names or keywords for a specific organism is also available. HGT-DB is freely accessible at http://www.fut.es/~debb/HGT.
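
The idea of flagging genes whose G+C content deviates from the genome average can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the cutoff of 1.5 standard deviations is an arbitrary example value, and the actual criteria used by HGT-DB (which also consider codon and amino-acid usage) are not reproduced here.

```python
from statistics import mean, pstdev


def gc_content(seq):
    """Fraction of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def deviating_genes(genes, n_sd=1.5):
    """Names of genes whose G+C content lies more than n_sd standard
    deviations from the mean over all genes (hypothetical cutoff)."""
    gc = {name: gc_content(seq) for name, seq in genes.items()}
    mu = mean(gc.values())
    sd = pstdev(gc.values())
    return [name for name, value in gc.items() if abs(value - mu) > n_sd * sd]
```

Here `genes` is assumed to be a dictionary mapping gene names to their nucleotide sequences for one genome.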

15.
Recent technological advances have made it possible to identify and quantify thousands of proteins in a single proteomics experiment. As a result of these developments, the analysis of data has become the bottleneck of proteomics experiments. To provide the proteomics community with a user-friendly platform for comprehensive analysis, inspection and visualization of quantitative proteomics data, we developed the Graphical Proteomics Data Explorer (GProX). The program requires no special bioinformatics training, as all functions of GProX are accessible within its user-friendly graphical interface, which will be intuitive to most users. Basic features facilitate the uncomplicated management and organization of large data sets and complex experimental setups as well as the inspection and graphical plotting of quantitative data. These are complemented by readily available high-level analysis options such as database querying, clustering based on abundance ratios, feature enrichment tests for e.g. GO terms and pathway analysis tools. A number of plotting options for visualization of quantitative proteomics data are available and most analysis functions in GProX create customizable high quality graphical displays in both vector and bitmap formats. The generic import requirements allow data originating from essentially all mass spectrometry platforms, quantitation strategies and software to be analyzed in the program. GProX represents a powerful approach to proteomics data analysis providing proteomics experimenters with a toolbox for bioinformatics analysis of quantitative proteomics data. The program is released as open-source and can be freely downloaded from the project webpage at http://gprox.sourceforge.net.

16.
SpotWhatR is a user-friendly microarray data analysis tool that runs under the widely and freely available R statistical language (http://www.r-project.org) for Windows and Linux operating systems. The aim of SpotWhatR is to help the researcher to analyze microarray data by providing basic tools for data visualization, normalization, determination of differentially expressed genes, summarization by Gene Ontology terms, and clustering analysis. SpotWhatR allows researchers who are not familiar with computational programming to choose the most suitable analysis for their microarray dataset. Along with well-known procedures used in microarray data analysis, we have introduced a stand-alone implementation of the HTself method, especially designed to find differentially expressed genes in low-replication contexts. This approach is more compatible with our local reality than the usual statistical methods. We provide several examples derived from the Blastocladiella emersonii and Xylella fastidiosa Microarray Projects. SpotWhatR is freely available at http://blasto.iq.usp.br/~tkoide/SpotWhatR, in English and Portuguese versions. In addition, the user can choose between "single experiment" and "batch processing" versions.

17.
SUMMARY: The Kinase Sequence Database (KSD) located at http://kinase.ucsf.edu/ksd contains information on 290 protein kinase families derived by profile-based clustering of the non-redundant list of sequences obtained from a GenBank-wide search. Included in the database are a total of 5,041 protein kinases from over 100 organisms. Clustering into families is based on the extent of homology within the kinase catalytic domain (250-300 residues in length). Alignments of the families are viewed by interactive Excel-based sequence spreadsheets. In addition, KSD features evolutionary trees derived for each family and detailed information on each sequence as well as links to the corresponding GenBank entries. Sequence manipulation tools, such as evolutionary tree generation, novel sequence assignment, and statistical analysis, are also provided. AVAILABILITY: The kinase sequence database is a web-based service accessible at http://kinase.ucsf.edu/ksd. CONTACT: buzko@cmp.ucsf.edu; shokat@cmp.ucsf.edu

18.
19.
20.
We present ProtaBank, a repository for storing, querying, analyzing, and sharing protein design and engineering data in an actively maintained and updated database. ProtaBank provides a format to describe and compare all types of protein mutational data, spanning a wide range of properties and techniques. It features a user-friendly web interface and programming layer that streamlines data deposition and allows for batch input and queries. The database schema design incorporates a standard format for reporting protein sequences and experimental data that facilitates comparison of results across different data sets. A suite of analysis and visualization tools is provided to facilitate discovery, to guide future designs, and to benchmark and train new predictive tools and algorithms. ProtaBank will provide a valuable resource to the protein engineering community by storing and safeguarding newly generated data, allowing for fast searching and identification of relevant data from the existing literature, and exploring correlations between disparate data sets. ProtaBank invites researchers to contribute data to the database to make it accessible for search and analysis. ProtaBank is available at https://protabank.org.
