Similar literature
 20 similar documents found (search time: 15 ms)
1.
We have constructed a non-homologous database, termed the Integrated Sequence-Structure Database (ISSD), which comprises the coding sequences of genes, the amino acid sequences of the corresponding proteins, their secondary structure and straight phi, psi angle assignments, and polypeptide backbone coordinates. Each protein entry in the database holds the alignment of nucleotide sequence, amino acid sequence and the PDB three-dimensional structure data. The nucleotide and amino acid sequences for each entry are selected on the basis of exact matches of the source organism and cell environment. The current version 1.0 of ISSD is available on the WWW at http://www.protein.bio.msu.su/issd/ and includes 107 non-homologous mammalian proteins, of which 80 are human proteins. We have used the database to analyse synonymous codon usage patterns in mRNA sequences, showing their correlation with three-dimensional structure features in the encoded proteins. Possible ISSD applications include optimisation of protein expression, improvement of protein structure prediction accuracy, and analysis of evolutionary aspects of the nucleotide sequence-protein structure relationship.

2.
In spite of many efforts, the prediction of the location of proteins in eukaryotic cells (cytoplasm, mitochondrion or chloroplast) is still far from straightforward. In some cases (e.g. ribosomal proteins and aminoacyl-tRNA synthetases) both the cytoplasmic proteins and their organellar counterparts are encoded by the nuclear genome. A factorial correspondence analysis of the codon usage in yeast and Caenorhabditis elegans shows that the codon usage of those nuclear genes encoding ribosomal proteins or aminoacyl-tRNA synthetases is markedly different, depending on the final location of the proteins (cytoplasmic or mitochondrial). As a consequence, the location of such proteins, whose sequences are now frequently determined by systematic genomic sequencing, can be easily and quickly predicted. A WWW interface has been developed, aimed at providing a user-friendly tool for codon usage pattern analysis. It is available from http://www.genetique.uvsq.fr/afc.html.

3.
TESE is a web server for the generation of test sets of protein sequences and structures fulfilling a number of different criteria. At least three different use cases can be envisaged: (i) benchmarking of novel methods; (ii) test sets tailored for special needs and (iii) extending available datasets. The CATH structure classification is used to control structural/sequence redundancy and a variety of structural quality parameters can be used to interactively select protein subsets with specific characteristics, e.g. all X-ray structures of alpha-helical repeat proteins with more than 120 residues and resolution <2.0 Å. The output includes FASTA-formatted sequences, PDB files and a clickable HTML index file containing images of the selected proteins. Multiple subsets for cross-validation are also supported. AVAILABILITY: The TESE server is available for non-commercial use at URL: http://protein.bio.unipd.it/tese/.

4.
Codon usage frequencies for each of the 257 468 complete protein coding sequences (CDSs) have been compiled from the taxonomical divisions of the GenBank DNA sequence database. The sum of the codons used by 8792 organisms has also been calculated. The data files can be obtained from the anonymous ftp sites of DDBJ, Kazusa and EBI. A list of the codon usage of genes and the sum of the codons used by each organism can be obtained through the web site http://www.kazusa.or.jp/codon/. The present study also reports recent developments on the WWW site. The new web interface provides data in the CodonFrequency-compatible format as well as in the traditional table format. The use of the database is facilitated by keyword-based search analysis and the availability of codon usage tables for selected genes from each species. These new tools will allow users to further analyze variations in codon usage among different genomes.
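
The tabulation described in this abstract (codon counts per CDS, then summed per organism) can be sketched in a few lines. This is only an illustrative sketch, not the database's own pipeline: the function names `codon_counts` and `per_thousand`, the toy sequences, and the per-thousand scaling are assumptions made for the example.

```python
from collections import Counter


def codon_counts(cds: str) -> Counter:
    """Count in-frame codons of one coding sequence (reading frame 0 assumed)."""
    seq = cds.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    return Counter(seq[i:i + 3] for i in range(0, usable, 3))


def per_thousand(counts: Counter) -> dict:
    """Convert raw codon counts to frequencies per 1000 codons."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {codon: 1000.0 * n / total for codon, n in counts.items()}


# Hypothetical toy CDSs standing in for one organism's entries.
toy_cds = ["ATGGCTGCTAAATAA", "ATGGCCAAATGA"]

# Summing the per-gene Counters gives the organism-level codon totals.
organism_total = sum((codon_counts(s) for s in toy_cds), Counter())
table = per_thousand(organism_total)
```

Keeping per-gene counts separate from the organism-level sum mirrors the two kinds of tables the abstract mentions.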

5.
ACUA: a software tool for automated codon usage analysis   (total citations: 1; self-citations: 0; citations by others: 1)
Currently available codon usage analysis tools lack an intuitive graphical user interface and are limited to inbuilt calculations. ACUA (Automated Codon Usage Tool) has been developed to perform high-throughput sequence analysis aiding statistical profiling of codon usage. The results of ACUA are presented in a spreadsheet with all prerequisite codon usage data required for statistical analysis, displayed in a graphical interface. The package is also capable of on-click sequence retrieval from the results interface, and this feature is unique to ACUA. AVAILABILITY: The package is available for non-commercial purposes and can be downloaded from: http://www.bioinsilico.com/acua.

6.
Maintained at the University of Texas Health Science Center at Tyler, Texas, the tmRNA database (tmRDB) is accessible at the URL http://psyche.uthct.edu/dbs/tmRDB/tmRDB.html with mirror sites located at Auburn University, Auburn, Alabama (http://www.ag.auburn.edu/mirror/tmRDB/) and the Bioinformatics Research Center, Aarhus, Denmark (http://www.bioinf.au.dk/tmRDB/). The tmRDB collects and distributes information relevant to the study of tmRNA. In trans-translation, this molecule combines properties of tRNA and mRNA and binds several proteins to form the tmRNP. Related RNPs are likely to be functional in all bacteria. In this release of tmRDB, 186 new entries from 10 bacterial groups for a total of 274 tmRNA sequences have been added. Lists of the tmRNAs and the corresponding tmRNA-encoded tag-peptides are presented in alphabetical and phylogenetic order. The tmRNA sequences are aligned manually, assisted by computational tools, to determine base pairs supported by comparative sequence analysis. The tmRNA alignment, available in a variety of formats, provides the basis for the secondary and tertiary structure of each tmRNA molecule. Three-dimensional models of the tmRNAs and their associated proteins in PDB format give evidence for the recent progress that has been made in the understanding of tmRNP structure and function.

7.
8.
MHCPred 2.0     
The accurate computational prediction of T-cell epitopes can greatly reduce the experimental overhead implicit in candidate epitope identification within genomic sequences. In this article we present MHCPred 2.0, an enhanced version of our online, quantitative T-cell epitope prediction server. The previous version of MHCPred included mostly alleles from the human leukocyte antigen A (HLA-A) locus. In MHCPred 2.0, mouse models are added and computational constraints removed. Currently the server includes 11 human HLA class I, three human HLA class II, and three mouse class I models. Additionally, a binding model for the human transporter associated with antigen processing (TAP) is incorporated into the new MHCPred. A tool for the design of heteroclitic peptides is also included within the server. To refine the veracity of binding affinity prediction, a confidence percentage is now also calculated for each peptide predicted. AVAILABILITY: As previously, MHCPred 2.0 is freely available at the URL http://www.jenner.ac.uk/MHCPred/. CONTACT: Darren R. Flower (darren.flower@jenner.ac.uk).

9.
We have built a database of sequences phylogenetically related to cholinesterases, ESTHER (for esterases, alpha/beta hydrolase enzymes and relatives). These sequences define a homogeneous group of enzymes (carboxylesterases, lipases and hormone-sensitive lipases) with some related proteins devoid of enzymatic activity. The purpose of ESTHER is to help comparison and alignment of any new sequence appearing in the field, to favour mutation analysis of structure-function relationships and to allow structural data recovery. ESTHER is a World Wide Web server with the URL http://www.montpellier.inra.fr:70/cholinesterase.

10.
SUMMARY: INteractive Codon usage Analysis (INCA) provides an array of features useful in analysis of synonymous codon usage in whole genomes. In addition to computing codon frequencies and several usage indices, such as 'codon bias', effective Nc and CAI, the primary strength of INCA is its numerous options for the interactive graphical display of calculated values, thus allowing visual detection of various trends in codon usage. Finally, INCA includes a specific unsupervised neural network algorithm, the self-organizing map, used for gene clustering according to the preferred utilization of codons. AVAILABILITY: INCA is available for the Win32 platform and is free of charge for academic use. For details, visit the web page http://www.bioinfo-hr.org/inca or contact the author directly. SUPPLEMENTARY INFORMATION: Software is accompanied by a user manual and a short tutorial.
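
Of the indices named in this abstract, the codon adaptation index (CAI) is the simplest to illustrate: each codon gets a relative adaptiveness w (its count in a reference gene set divided by the count of the most used synonymous codon), and the CAI of a gene is the geometric mean of w over its codons. The sketch below is not INCA's code. The function names, the toy reference set, and the 0.5 pseudo-count for codons absent from the reference are assumptions; as a side effect of that pseudo-count, amino acids never seen in the reference get w = 1 for all their codons.

```python
import math
from collections import Counter

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, with codons enumerated in TCAG order.
CODE = {a + b + c: AMINO[16 * i + 4 * j + k]
        for i, a in enumerate(BASES)
        for j, b in enumerate(BASES)
        for k, c in enumerate(BASES)}


def codons(cds):
    """Split a coding sequence into in-frame codons (frame 0 assumed)."""
    seq = cds.upper()
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]


def relative_adaptiveness(reference_cds, pseudo=0.5):
    """w for every codon of the 18 amino acids with synonymous codons."""
    counts = Counter(c for s in reference_cds for c in codons(s))
    w = {}
    for aa in set(AMINO) - set("*MW"):  # skip stops, Met and Trp
        family = [c for c, a in CODE.items() if a == aa]
        best = max(max(counts[c], pseudo) for c in family)
        for c in family:
            w[c] = max(counts[c], pseudo) / best
    return w


def cai(cds, w):
    """Geometric mean of w over the scored codons of one gene."""
    logs = [math.log(w[c]) for c in codons(cds) if c in w]
    return math.exp(sum(logs) / len(logs)) if logs else float("nan")


# Hypothetical one-gene reference set, for illustration only.
w = relative_adaptiveness(["ATGGCTGCTGCCAAAAAGAAATAA"])
```

In this toy reference GCT and AAA are the preferred codons, so a gene built only from them scores the maximum CAI of 1.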

11.
SUMMARY: Correspondence analysis of codon usage data is a widely used method in sequence analysis, but the variability in amino acid composition between proteins is a confounding factor when one wants to analyse synonymous codon usage variability. A simple and natural way to cope with this problem is to use within-group correspondence analysis. There is, however, no user-friendly implementation of this method available for genomic studies. Our motivation was to provide the community with a Web facility to easily study synonymous codon usage on a subset of data available in public genomic databases. AVAILABILITY: Available through the Pôle Bioinformatique Lyonnais (PBIL) Web server at http://pbil.univ-lyon1.fr/datasets/charif04/ with a demo that allows the figure in the present application note to be reproduced. All underlying software is distributed under a GPL licence. CONTACT: http://pbil.univ-lyon1.fr/members/lobry.

12.
CUTG (codon usage tabulated from GenBank) is a comprehensive database for codon usage. The codon usage for each full-length protein gene has been calculated using the nucleotide sequence obtained from the GenBank sequence database. The sum of the codon use of each organism has also been calculated. The data files can be obtained from anonymous ftp sites of DDBJ, DISC and EBI. The list of codon usage of genes in organisms was made searchable by name of organism through a web site, http://www.dna.affrc.go.jp/~nakamura/CUTG.html. The compilation is synchronized with major releases of GenBank.

13.
Sample classification and class prediction are the aim of many gene expression studies. We present a web-based application, Prophet, which builds prediction rules and allows using them for further sample classification. Prophet automatically chooses the best classifier, along with the optimal selection of genes, using a strategy that renders unbiased cross-validated errors. Prophet is linked to different microarray data analysis modules, and includes a unique feature: the possibility of performing the functional interpretation of the molecular signature found. Availability: Prophet can be found at the URL http://prophet.bioinfo.cipf.es/ or within the GEPAS package at http://www.gepas.org/. Supplementary information: http://gepas.bioinfo.cipf.es/tutorial/prophet.html.

14.
Coding information is the main source of heterogeneity (non-randomness) in the sequences of microbial genomes. The heterogeneity corresponds to a cluster structure in triplet distributions of relatively short genomic fragments (200-400 bp). We found a universal 7-cluster structure in microbial genomic sequences and explained its properties. We show that codon usage of bacterial genomes is a multi-linear function of their genomic G+C-content with high accuracy. Based on the analysis of 143 completely sequenced bacterial genomes available in Genbank in August 2004, we show that there are four "pure" types of the 7-cluster structure observed. All 143 cluster animated 3D-scatters are collected in a database which is made available on our web-site (http://www.ihes.fr/~zinovyev/7clusters). The findings can be readily introduced into software for gene prediction, sequence alignment or microbial genome classification.
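
The "triplet distributions of relatively short genomic fragments" mentioned in this abstract can be illustrated as follows: slide a window along the sequence and, for each fragment, record the frequencies of the 64 non-overlapping triplets read from the fragment start. This is a rough sketch under stated assumptions, not the authors' procedure: the window width of 300 bp (inside the 200-400 bp range quoted), the step size, and the function names are all chosen for the example, and the clustering step itself is not shown.

```python
from collections import Counter
from itertools import product

# The 64 triplets in a fixed order, used as vector coordinates.
TRIPLETS = ["".join(p) for p in product("ACGT", repeat=3)]


def triplet_vector(fragment):
    """Frequencies of non-overlapping triplets read from the fragment start."""
    usable = len(fragment) - len(fragment) % 3
    counts = Counter(fragment[i:i + 3] for i in range(0, usable, 3))
    total = sum(counts[t] for t in TRIPLETS)  # triplets with N etc. are ignored
    return [counts[t] / total if total else 0.0 for t in TRIPLETS]


def fragment_vectors(genome, width=300, step=1):
    """One 64-dimensional vector per sliding-window fragment."""
    seq = genome.upper()
    return [triplet_vector(seq[start:start + width])
            for start in range(0, len(seq) - width + 1, step)]


# Toy periodic sequence: shifting the window by one base changes the phase.
vectors = fragment_vectors("ATG" * 200, width=300, step=1)
```

In the toy sequence the first fragment contains only ATG, the next only TGA, and the third only GAT, which shows why fragments covering a gene in different reading phases give different triplet distributions.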

15.
Frequencies for each of the 206 526 complete protein-coding genes (CDSs) have been compiled from taxonomical divisions of the GenBank DNA sequence database. The sum of the codon use of 7434 organisms has also been calculated. These data files can be obtained from anonymous ftp sites of DDBJ, DISC and EBI. The list of the codon usage of genes in an organism as well as the sum of the codon usage of the organism was made searchable by the name of organism through a web site http://www.dna.affrc.go.jp//CUTG.html

16.
SENTRA, available via URL http://wit.mcs.anl.gov/WIT2/Sentra/, is a database of proteins associated with microbial signal transduction. The database currently includes the classical two-component signal transduction pathway proteins and methyl-accepting chemotaxis proteins, but will be expanded to also include other classes of signal transduction systems that are modulated by phosphorylation or methylation reactions. Although the majority of database entries are from prokaryotic systems, eukaryotic proteins with bacterial-like signal transduction domains are also included. Currently SENTRA contains signal transduction proteins in 34 complete and almost completely sequenced prokaryotic genomes, as well as sequences from 243 organisms available in public databases (SWISS-PROT and EMBL). The analysis was carried out within the framework of the WIT2 system, which is designed and implemented to support genetic sequence analysis and comparative analysis of sequenced genomes.

17.
GeneBuilder: interactive in silico prediction of gene structure.   (total citations: 2; self-citations: 0; citations by others: 2)
MOTIVATION: Prediction of gene structure in newly sequenced DNA becomes very important in large genome sequencing projects. This problem is complicated due to the exon-intron structure of eukaryotic genes and because gene expression is regulated by many different short nucleotide domains. In order to be able to analyse the full gene structure in different organisms, it is necessary to combine information about potential functional signals (promoter region, splice sites, start and stop codons, 3' untranslated region) together with the statistical properties of coding sequences (coding potential), information about homologous proteins, ESTs and repeated elements. RESULTS: We have developed the GeneBuilder system which is based on prediction of functional signals and coding regions by different approaches in combination with similarity searches in protein and EST databases. The potential gene structure models are obtained by using a dynamic programming method. The program permits the use of several parameters for gene structure prediction and refinement. During gene model construction, selecting different exon homology levels with a protein sequence selected from a list of homologous proteins can improve the accuracy of the gene structure prediction. In the case of low homology, GeneBuilder is still able to predict the gene structure. The GeneBuilder system has been tested by using the standard set (Burset and Guigo, Genomics, 34, 353-367, 1996) and the performances are: 0.89 sensitivity and 0.91 specificity at the nucleotide level. The total correlation coefficient is 0.88. AVAILABILITY: The GeneBuilder system is implemented as a part of WebGene at the URL: http://www.itba.mi.cnr.it/webgene and the TRADAT (TRAnscription Database and Analysis Tools) launcher at the URL: http://www.itba.mi.cnr.it/tradat.
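
The nucleotide-level figures quoted in this abstract follow the usual gene-prediction conventions: sensitivity Sn = TP/(TP+FN), specificity Sp = TP/(TP+FP) (i.e. the fraction of predicted coding nucleotides that are truly coding), and a correlation coefficient over the four counts. The sketch below shows how such values are computed from two per-nucleotide coding masks. The function name and the toy masks are assumptions for illustration and are unrelated to the GeneBuilder test set.

```python
import math


def nucleotide_accuracy(actual, predicted):
    """Return (Sn, Sp, CC) from two equal-length coding/non-coding masks."""
    if len(actual) != len(predicted):
        raise ValueError("masks must have the same length")
    tp = fn = fp = tn = 0
    for a, p in zip(actual, predicted):
        if a and p:
            tp += 1      # coding, predicted coding
        elif a:
            fn += 1      # coding, missed
        elif p:
            fp += 1      # non-coding, predicted coding
        else:
            tn += 1      # non-coding, predicted non-coding
    sn = tp / (tp + fn) if tp + fn else float("nan")
    sp = tp / (tp + fp) if tp + fp else float("nan")
    denom = math.sqrt((tp + fn) * (tn + fp) * (tp + fp) * (tn + fn))
    cc = (tp * tn - fn * fp) / denom if denom else float("nan")
    return sn, sp, cc


# Hypothetical 10-nucleotide example: 1 = coding, 0 = non-coding.
actual = [0, 0, 1, 1, 1, 1, 0, 0, 0, 0]
predicted = [0, 1, 1, 1, 1, 0, 0, 0, 0, 0]
sn, sp, cc = nucleotide_accuracy(actual, predicted)
```

Note that "specificity" here is what other fields call precision, which is why it is computed from TP and FP and not from TN.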

18.
ProTherm: Thermodynamic Database for Proteins and Mutants.   (total citations: 2; self-citations: 1; citations by others: 1)
The first release of the Thermodynamic Database for Proteins and Mutants (ProTherm) contains more than 3300 data entries for several thermodynamic parameters of wild type and mutant proteins. Each entry includes numerical data for unfolding Gibbs free energy change, enthalpy change, heat capacity change, transition temperature, activity etc., which are important for understanding the mechanism of protein stability. ProTherm also includes structural information such as secondary structure and solvent accessibility of wild type residues, and experimental methods and other conditions. A WWW interface enables users to search data based on various conditions with different sorting options for outputs. Further, ProTherm is cross-linked with the NCBI PUBMED literature database, Protein Mutant Database, Enzyme Code and Protein Data Bank structural database. Moreover, all the mutation sites associated with each PDB structure are automatically mapped and can be directly viewed through 3DinSight developed in our laboratory. The database is available at the URL http://www.rtc.riken.go.jp/protherm.html
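
The quantities listed in this abstract (unfolding free energy, enthalpy change, heat capacity change, transition temperature) are commonly related through the Gibbs-Helmholtz equation for a two-state transition. The abstract itself does not give this relation; the sketch below is an added illustration that assumes a temperature-independent heat capacity change, and the function name, units and parameter values are hypothetical.

```python
import math


def unfolding_free_energy(temp_k, tm_k, dh_m, dcp):
    """Gibbs-Helmholtz estimate of the unfolding free energy at temp_k.

    temp_k, tm_k : temperature and transition midpoint in kelvin
    dh_m         : unfolding enthalpy change at tm_k (e.g. kJ/mol)
    dcp          : heat capacity change, assumed constant (e.g. kJ/(mol K))
    """
    return (dh_m * (1.0 - temp_k / tm_k)
            + dcp * ((temp_k - tm_k) - temp_k * math.log(temp_k / tm_k)))


# Hypothetical parameter values, for illustration only.
dg_room = unfolding_free_energy(298.15, 333.15, 400.0, 6.0)
dg_at_tm = unfolding_free_energy(333.15, 333.15, 400.0, 6.0)
```

By construction the free energy change is zero at the transition temperature, where folded and unfolded states are equally populated.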

19.
Guide RNAs (gRNAs) are small, metabolically stable RNA molecules which perform a pivotal, template-like function during the RNA editing process in kinetoplastid protozoa. The gRNA database currently contains 250 guide RNA sequences as well as secondary and tertiary structure models and other relevant information. The database is made available as a hypertext document accessible via the World Wide Web (WWW) at the URL: http://www.biochem.mpg.de/ goeringe/

20.
tmRDB (tmRNA database)   (total citations: 2; self-citations: 0; citations by others: 2)
The tmRNA database (tmRDB) is maintained at the University of Texas Health Science Center at Tyler, Texas, and is accessible on the WWW at URL http://psyche.uthct.edu/dbs/tmRDB/tmRDB.html. A tmRDB mirror site is located on the campus of Auburn University, Auburn, Alabama, reachable at the URL http://www.ag.auburn.edu/mirror/tmRDB/. Since April 1997, the tmRDB has provided sequences of tmRNA (previously called 10Sa RNA), a molecule present in most bacteria and some organelles. This release adds 17 new sequences for a total of 60 tmRNAs. Sequences and corresponding tmRNA-encoded tag peptides are tabulated in alphabetical and phylogenetic order. The updated tmRNA alignment improves the secondary structures of known tmRNAs on the level of individual basepairs. tmRDB also provides an introduction to tmRNA function in trans-translation (with links to relevant literature), a limited number of tmRNA secondary structure diagrams, and numerous three-dimensional models generated interactively with the program ERNA-3D.
