Similar articles
 20 similar articles found (search time: 15 ms)
1.
We have re-evaluated the information used in the Garnier-Osguthorpe-Robson (GOR) method of secondary structure prediction with the currently available database. The framework of information theory provides a rigorous means to formulate the influence of local sequence upon the conformation of a given residue. However, the existing database does not allow the evaluation of the parameters required for an exact treatment of the problem. The validity of the approximations drawn from the theory is examined. It is shown that the first-level approximation, involving single-residue parameters, is only marginally improved by an increase in the database. The second-level approximation, involving pairs of residues, provides a better model. However, in this case the database is not big enough, and this method might lead to parameters with deficiencies. Attention is therefore given to overcoming this lack of data. We have determined the significant pairs and the number of dummy observations necessary to obtain the best result for the prediction. This new version of the GOR method increases the accuracy of prediction by 7%, bringing the proportion of residues correctly predicted to 63% for three states and 68 proteins, each protein to be predicted being removed from the database and the parameters derived from the other proteins. If the protein to be predicted is kept in the database, the accuracy goes up to 69.7%.
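
The single-residue (first-level) parameters with dummy observations mentioned in this abstract can be illustrated with a minimal sketch. It assumes a simplified log-odds form, I(S; R) = ln(P(S|R) / P(S)), and assumes that the dummy observations are distributed according to the background state frequencies; neither detail is confirmed by the abstract, and the function name and toy data are hypothetical.

```python
import math
from collections import Counter

def single_residue_information(pairs, dummy=1.0):
    """Estimate I(S; R) = ln(P(S|R) / P(S)) from (residue, state) observations.

    `dummy` is the number of dummy observations added per residue type,
    spread over the states in proportion to their background frequency
    (an assumed smoothing scheme, not necessarily the one used in the paper).
    """
    pairs = list(pairs)
    joint = Counter(pairs)
    residue_counts = Counter(r for r, _ in pairs)
    state_counts = Counter(s for _, s in pairs)
    total = len(pairs)

    info = {}
    for r, n_r in residue_counts.items():
        for s, n_s in state_counts.items():
            p_s = n_s / total
            # Smoothed conditional frequency of state s given residue r.
            p_s_given_r = (joint[(r, s)] + dummy * p_s) / (n_r + dummy)
            info[(r, s)] = math.log(p_s_given_r / p_s)
    return info

# Toy observations: alanine mostly helical (H), glycine mostly coil (C).
observations = [("A", "H")] * 3 + [("A", "C")] + [("G", "C")] * 3 + [("G", "H")]
info = single_residue_information(observations, dummy=1.0)
```

With more dummy observations, every value is pulled toward zero, which is one way sparse counts can be kept from producing extreme parameters.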

2.
PhosphoBase, a database of phosphorylation sites: release 2.0.   Total citations: 16, self-citations: 0, citations by others: 16
PhosphoBase contains information about phosphorylated residues in proteins and data about peptide phosphorylation by a variety of protein kinases. The data are collected from the literature and compiled into a common format. The current release of PhosphoBase (October 1998, version 2.0) comprises 414 phosphoprotein entries covering 1052 phosphorylatable serine, threonine and tyrosine residues. The kinetic data from peptide phosphorylation assays for approximately 330 oligopeptides are also included. The database entries are cross-referenced to the corresponding records in the Swiss-Prot protein database, and literature references are linked to MedLine records. PhosphoBase is available via the WWW at http://www.cbs.dtu.dk/databases/PhosphoBase/

3.
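Heme is an important and commonly used ligand that plays key roles in electron transfer, catalysis, signal transduction and gene expression. Accurately predicting the residues through which proteins bind heme is one of the major challenges in structural bioinformatics. In this work, we downloaded and curated information on the binding of the HEME ligand to proteins from the Biolip database, statistically analyzed the amino acid composition and positional conservation of binding and non-binding residues, and used these as feature parameters for prediction. HEME-binding residues were identified with the Fisher-PSSM discriminant method, and the results show that the Fisher-PSSM discriminant method with optimized feature parameters achieves good predictive performance.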

4.
Recently, we reported a database (Noncoded Amino acids Database; http://recerca.upc.edu/imem/index.htm) that was built to compile information about the intrinsic conformational preferences of nonproteinogenic residues determined by quantum mechanical calculations, as well as bibliographic information about their synthesis, physical and spectroscopic characterization, the experimentally established conformational propensities, and applications (Revilla-López et al., J Phys Chem B 2010;114:7413-7422). The database initially contained the information available for α-tetrasubstituted α-amino acids. In this work, we extend NCAD to three families of compounds, which can be used to engineer peptides and proteins incorporating modifications at the -NHCO- peptide bond. Such families are: N-substituted α-amino acids, thio-α-amino acids, and diamines and diacids used to build retropeptides. The conformational preferences of these compounds have been analyzed and described based on the information captured in the database. In addition, we provide an example of the utility of the database and of the compounds it compiles in protein and peptide engineering. Specifically, the symmetry of a sequence engineered to stabilize the 3(10)-helix with respect to the α-helix has been broken without perturbing significantly the secondary structure through targeted replacements using the information contained in the database.

5.
SMART: a web-based tool for the study of genetically mobile domains   Total citations: 61, self-citations: 2, citations by others: 59
SMART (a Simple Modular Architecture Research Tool) allows the identification and annotation of genetically mobile domains and the analysis of domain architectures (http://SMART.embl-heidelberg.de). More than 400 domain families found in signalling, extra-cellular and chromatin-associated proteins are detectable. These domains are extensively annotated with respect to phyletic distributions, functional class, tertiary structures and functionally important residues. Each domain found in a non-redundant protein database as well as search parameters and taxonomic information are stored in a relational database system. User interfaces to this database allow searches for proteins containing specific combinations of domains in defined taxa.

6.
ProTherm 2.0 is the second release of the Thermodynamic Database for Proteins and Mutants, which includes numerical data for several thermodynamic parameters, structural information, experimental methods and conditions, and functional and literature information. The present release contains >5500 entries, an approximately 67% increase over the previous version. In addition, we have included information about reversibility of data, details about buffer and ion concentrations and the surrounding residues in space for all mutants. A WWW interface enables users to search data based on various conditions with different sorting options for outputs. Further, ProTherm has links with other structural and literature databases, and the mutation sites and surrounding residues are automatically mapped on the structures and can be directly viewed through 3DinSight developed in our laboratory. The ProTherm database is freely available through the WWW at http://www.rtc.riken.go.jp/protherm.html

7.
The bulk hydrophobic character of the 20 natural amino acid residues has been obtained from a database of 60 protein structures, grouped into the four structural classes alpha alpha, beta beta, alpha + beta and alpha/beta. The hydrophobicity coefficients thus obtained are compared with Ponnuswamy's original values using scales normalized to average = 0.0 and standard deviation = 1.0. Even though most of the amino acid residues do not change their hydropathic character in the different structural classes, their behaviour suggests that averaging methods should consider only proteins of the same structural class and that this information should be included in secondary structure prediction methods.
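
The normalization to average = 0.0 and standard deviation = 1.0 described in this abstract is a standard z-score rescaling, sketched below. The scale values are made up for illustration, and the use of the population standard deviation (rather than the sample one) is an assumption, since the abstract does not say which was used.

```python
from statistics import mean, pstdev

def normalize_scale(scale):
    """Rescale a {residue: coefficient} mapping to mean 0.0 and std 1.0."""
    values = list(scale.values())
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation (assumed)
    return {aa: (v - mu) / sigma for aa, v in scale.items()}

# Hypothetical coefficients for three residues, not taken from any paper.
toy_scale = {"A": 1.0, "G": 2.0, "L": 3.0}
z = normalize_scale(toy_scale)
```

Once two scales are both expressed this way, their coefficients can be compared residue by residue regardless of the original units.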

8.
Structure prediction methods aim to identify the relationship between the amino acid sequence of an unknown protein and information comprised in databases of known protein structures. Towards this end, we created a database by combining the amino acid sequences and the corresponding three-dimensional atomic coordinates for all the 25% non-redundant protein chains available in the Protein Data Bank. It contains information about the peptide fragments that are 5 to 10 residues long. In addition, options are provided for the users to visualize the individual motifs and the superposed fragments in the client machine. Further, useful functionalities are provided to look for similar sequence motifs in all the sequence databases, such as PDB, 90% non-redundant protein chains, Genome database, PIR and Swiss-Prot. The database is updated at regular intervals and can be accessed over the World Wide Web at the following URL: http://pranag.physics.iisc.ernet.in/sms/.

9.
A relational database of protein structure has been developed to enable rapid and flexible enquiries about the occurrence of many aspects of protein architecture. The coordinates of 294 proteins from the Brookhaven Data Bank have been processed by standard computer programs to generate many additional terms that quantify aspects of protein structure. These terms include solvent accessibility, main-chain and side-chain dihedral angles, and secondary structure. In a relational database, the information is stored in tables with columns holding the different terms and rows holding the different entries for the terms. The different relational base tables store the information about the protein coordinate set, the different chains in the protein, the amino acid residues and ligands, the atomic coordinates, the salt bridges, the hydrogen bonds, the disulphide bridges and the close tertiary contacts. The database was established under the ORACLE management system. Enquiries are constructed in ORACLE using SQL (structured query language), which is simple to use and alleviates the need for extensive computer programs. A single table can be searched for entries that meet various criteria, e.g. all proteins solved to better than a given resolution. The power of the database becomes apparent when several tables, or the entries in a single table, are cross-correlated. For example, the dihedral angles of proline in the fourth position in an alpha-helix in high resolution structures can be rapidly obtained. The structural database provides a powerful tool to obtain empirical rules about protein conformation. This database of protein structures is part of a joint project between Birkbeck College and Leeds University to establish an integrated data resource of protein sequences and structures (ISIS) that encodes the complex patterns of residues and coordinates that define protein conformation. The entire data resource (ISIS) will provide a system to guide all areas of protein modelling including structure prediction, site-directed mutagenesis and de novo protein design. The availability of ISIS is described in the paper.
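
The cross-table enquiry given as an example in this abstract (dihedral angles of proline at the fourth position of an alpha-helix in high-resolution structures) might look roughly like the sketch below. It uses Python's built-in `sqlite3` as a stand-in for ORACLE, and the table names, column names and rows are all hypothetical; the actual schema is not given in the abstract.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE protein (code TEXT PRIMARY KEY, resolution REAL);
CREATE TABLE residue (
    code TEXT REFERENCES protein(code),
    seq_no INTEGER,
    name TEXT,
    phi REAL,
    psi REAL,
    sec_struct TEXT,     -- 'H' marks alpha-helix (assumed encoding)
    helix_pos INTEGER    -- 1-based position within the helix, else NULL
);
INSERT INTO protein VALUES ('P1', 1.5), ('P2', 2.8);
INSERT INTO residue VALUES
    ('P1', 10, 'PRO', -60.0, -40.0, 'H', 4),
    ('P1', 11, 'ALA', -63.0, -42.0, 'H', 5),
    ('P2', 20, 'PRO', -70.0, -35.0, 'H', 4);
""")

def proline_helix4_dihedrals(conn, max_resolution=2.0):
    """Join the two tables and return (code, seq_no, phi, psi) rows."""
    cursor = conn.execute(
        """
        SELECT r.code, r.seq_no, r.phi, r.psi
        FROM residue AS r
        JOIN protein AS p ON p.code = r.code
        WHERE r.name = 'PRO'
          AND r.sec_struct = 'H'
          AND r.helix_pos = 4
          AND p.resolution <= ?
        ORDER BY r.code, r.seq_no
        """,
        (max_resolution,),
    )
    return cursor.fetchall()

rows = proline_helix4_dihedrals(conn)
```

The resolution cutoff lives in the protein table and the dihedral angles in the residue table, so the join is what lets a single query combine both criteria.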

10.
The analysis of disulphide bond containing proteins in the Protein Data Bank (PDB) revealed that out of 27,209 protein structures analyzed, 12,832 proteins contain at least one intra-chain disulphide bond and 811 proteins contain at least one inter-chain disulphide bond. The intra-chain disulphide bond containing proteins can be grouped into 256 categories based on the number of disulphide bonds and the disulphide bond connectivity patterns (DBCPs) that were generated according to the position of half-cystine residues along the protein chain. The PDB entries corresponding to these 256 categories represent 509 unique SCOP superfamilies. A simple web-based computational tool is made freely available at the website http://www.ccmb.res.in/bioinfo/dsbcp that allows flexible queries to be made on the database in order to retrieve useful information on the disulphide bond containing proteins in the PDB. The database is useful to identify the different SCOP superfamilies associated with a particular disulphide bond connectivity pattern or vice versa. It is possible to define a query based either on a single field or on a combination of the following fields: PDB code, protein name, SCOP superfamily name, number of disulphide bonds, disulphide bond connectivity pattern and the number of amino acid residues in a protein chain, and to retrieve information that matches the criteria. Thereby, the database may be useful to select suitable protein structural templates in order to model the more distantly related protein homologs/analogs using comparative modeling methods.
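
One plausible way to derive a connectivity pattern from the positions of half-cystine residues along the chain is sketched below: the half-cystines are ranked by sequence position, and each bond is written as a pair of ranks. The output format and the function name are assumptions for illustration; the abstract does not specify how the database encodes its patterns.

```python
def connectivity_pattern(bonds):
    """Turn disulphide bonds, given as pairs of residue positions,
    into a rank-based pattern string such as '1-5,2-6,3-4'."""
    # Rank every half-cystine by its position along the chain (1-based).
    positions = sorted({pos for bond in bonds for pos in bond})
    rank = {pos: i + 1 for i, pos in enumerate(positions)}
    # Express each bond by the ranks of its two half-cystines.
    pairs = sorted(tuple(sorted((rank[a], rank[b]))) for a, b in bonds)
    return ",".join(f"{i}-{j}" for i, j in pairs)

# Hypothetical chain with three intra-chain disulphide bonds.
pattern = connectivity_pattern([(26, 84), (5, 55), (30, 51)])
```

Because only ranks are kept, two chains of different length with the same bonding topology map to the same pattern, which is what makes grouping into categories possible.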

11.
A computer-assisted approach to the prediction of the primary structures of regular glycopolymers is described. The analysis is based on comparing the calculated 13C NMR spectra of all the possible structures of the repeating unit (for the given monomeric composition) to an experimental 13C NMR spectrum. The spectra generation is based on a spectral database containing information on the 13C chemical shifts of monomers and of di- and trimeric fragments. If the required data are missing from this database, a special database of average glycosylation effects is used. The analysis reveals the structures whose calculated 13C NMR spectra are closest to the observed spectrum. The structures of repeating units of any topology containing up to six residues linked by glycosidic, amidic or phospho-diester bridges can be predicted. Unambiguous selection of the proper structure from the output list of possible structures may require additional experimental data. Testing the program and databases on bacterial polysaccharides and their derivatives containing up to three non-sugar residues (alditols, amino acids, phosphate groups etc.) per repeating unit showed good agreement of the predictions with independently obtained structural data.

12.
13.
Protein export from the nucleus is often mediated by a Leucine-rich Nuclear Export Signal (NES). NESbase is a database of experimentally validated Leucine-rich NESs curated from the literature. These signals are not annotated in databases such as SWISS-PROT, PIR or PROSITE. Each NESbase entry contains information on whether the NES was shown to be necessary and/or sufficient for export, and whether the export was shown to be mediated by the export receptor CRM1. The compiled information was used to make a sequence logo of the Leucine-rich NESs, displaying the conservation of amino acids within a window of 25 residues. Surprisingly, only 36% of the sequences used for the logo fit the widely accepted NES consensus L-x(2,3)-[LIVFM]-x(2,3)-L-x-[LI]. The database is available online at http://www.cbs.dtu.dk/databases/NESbase/.
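
The consensus quoted in this abstract is written in PROSITE-style notation, where `x(2,3)` means two or three residues of any kind. As a minimal sketch, it can be translated by hand into a regular expression and scanned against a sequence; the function name and the test sequence are illustrative only.

```python
import re

# L-x(2,3)-[LIVFM]-x(2,3)-L-x-[LI] rewritten as a regular expression.
# The lookahead wrapper lets overlapping candidate sites be reported.
NES_CONSENSUS = re.compile(r"(?=(L.{2,3}[LIVFM].{2,3}L.[LI]))")

def find_nes_candidates(sequence):
    """Return (0-based start, matched peptide) for each consensus hit."""
    return [(m.start(), m.group(1)) for m in NES_CONSENSUS.finditer(sequence.upper())]

# Illustrative sequence containing one consensus-like stretch.
hits = find_nes_candidates("NSNELALKLAGLDINK")
```

Given that only 36% of the curated signals fit this consensus according to the abstract, a scan like this is best read as a rough filter rather than a detector.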

14.
15.
The contribution of individual Trp residues to alpha-actin fluorescence was evaluated by means of an analysis of their microenvironment, which was done on the basis of PIR-International protein sequence database information. The contribution of Trp79 and Trp86 was shown to be low due to an effective nonradiating energy transfer according to the inductive resonance mechanism between the Trp residues and the fluorescence quenching of Trp86 by Sγ of Cys10, an efficient fluorescence quencher. The intrinsic fluorescence of actin was found to be determined mainly by Trp340 and Trp356, which are internal, inaccessible to solvent, and have a high density microenvironment formed mainly by nonpolar groups of protein. It is possible that the side chain conformation of Trp340 (t-isomer; χ1 = 190°, χ2 = 89°), aromatic rings of Tyr and Phe residues, and Pro residues in the microenvironment of Trp340 and Trp356 substantially contribute to the short-wavelength fluorescence spectrum of actin.

16.
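A novel scheme is proposed for visualizing protein structural information. Building on the sliding-window method, each natural amino acid is described by 48 property parameters selected from the amino acid index database, so that the parameters of all amino acid residues within a given window form a matrix. A matrix transformation yields a square matrix, and sliding the window along the sequence then gives the eigenvalue matrix of all such window matrices for the whole protein. Plotting the elements of the eigenvalue matrix produces a series of eigenvalue curves whose profile does not change with the window; these curves are called the characteristic curves of the protein. To select a suitable window width, the eigenvalue matrices were compared for proteins of the same type at different window widths and for proteins of different types at the same window width, and potential applications are discussed.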

17.
We present the development of a Comprehensive database of 12 076 invariant Peptide Signatures (CoPS) derived from 52 bacterial genomes with a minimum occurrence in at least seven organisms. These peptides were observed in functionally similar proteins and are distributed over nearly 1250 different functional proteins. The database provides function, structure and occurrence in biochemical pathways of the proteins containing these signature peptides. It houses additional information on the signature peptides, such as identical match in other motif/pattern (e.g. PROSITE, BLOCKS, PRINTS and Pfam) databases and the database of interacting proteins, human proteome and mutation effect on these signature peptides. There is a wide applicability of this database in the identification of critical functional residues in proteins. The database also facilitates the identification of folding nucleus/structural determinants in proteins and functional assignment to yet unknown proteins. We demonstrate functional assignment to 2605 hypothetical proteins in bacterial genomes and 112 unknown proteins in human using this database. AVAILABILITY: The database can be freely accessed through the following URL: http://203.195.151.46/copsv2/index.html or http://203.90.127.70/copsv2/index.html

18.
A database of 452 two-domain proteins with less than 25% homology was constructed. One half of the database was used to obtain statistics on the appearance of amino acid residues at domain boundaries. Small and hydrophilic residues (proline, glycine, asparagine, glutamic acid, arginine, etc.) occurred more often at domain boundaries than in total proteins. Hydrophobic residues (tryptophan, methionine, phenylalanine, etc.) were rarer at domain boundaries than in total proteins. Probability scales of amino acid appearance in boundary-flanking regions were constructed with these statistics and used to predict the domain boundaries in proteins of the other half of the database. The probability scale obtained by averaging the appearance of amino acids over an 8-residue region (±4 residues from the real domain boundaries) yielded the best results: domain boundaries were predicted within 40 residues of the real boundary in 57% of proteins and within 20 residues of the real boundary in 41% of proteins. The probability scale was used to predict the domain boundaries in proteins with unknown structures (CASP6).
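
The window-averaging step described in this abstract can be sketched as follows: a per-residue scale is averaged over the 8 residues flanking each candidate boundary (4 on each side), and the highest-scoring position is taken as the prediction. The scale values here are invented for illustration, and picking the single maximum is an assumed decision rule; the abstract does not state how the profile was turned into a prediction.

```python
def boundary_profile(sequence, scale, half_window=4):
    """Average the scale over the residues flanking each candidate boundary.

    A boundary at index i lies between residues i-1 and i (0-based), so the
    window covers sequence[i - half_window : i + half_window].
    """
    profile = {}
    for i in range(half_window, len(sequence) - half_window + 1):
        window = sequence[i - half_window : i + half_window]
        profile[i] = sum(scale.get(aa, 0.0) for aa in window) / len(window)
    return profile

def predict_boundary(sequence, scale, half_window=4):
    """Return the boundary index with the highest averaged score."""
    profile = boundary_profile(sequence, scale, half_window)
    return max(profile, key=profile.get)

# Hypothetical scale: positive for residues favoured at boundaries,
# negative for hydrophobic residues that are rarer there.
toy_scale = {"G": 1.0, "P": 1.0, "W": -1.0, "F": -1.0}
toy_sequence = "WFWFWFWF" + "GPGPGPGP" + "FWFWFWFW"
predicted = predict_boundary(toy_sequence, toy_scale)
```

Sequences shorter than the full window yield an empty profile, so `predict_boundary` is only meaningful for chains of at least `2 * half_window` residues.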

19.
SUMMARY: The Kinase Sequence Database (KSD), located at http://kinase.ucsf.edu/ksd, contains information on 290 protein kinase families derived by profile-based clustering of the non-redundant list of sequences obtained from a GenBank-wide search. Included in the database are a total of 5,041 protein kinases from over 100 organisms. Clustering into families is based on the extent of homology within the kinase catalytic domain (250-300 residues in length). Alignments of the families are viewed by interactive Excel-based sequence spreadsheets. In addition, KSD features evolutionary trees derived for each family and detailed information on each sequence as well as links to the corresponding GenBank entries. Sequence manipulation tools, such as evolutionary tree generation, novel sequence assignment, and statistical analysis, are also provided. AVAILABILITY: The kinase sequence database is a web-based service accessible at http://kinase.ucsf.edu/ksd CONTACT: buzko@cmp.ucsf.edu; shokat@cmp.ucsf.edu

20.
We present a novel method that combines protein structure information with protein interaction data to identify residues that form part of an interaction interface. Our prediction method can retrieve interaction hotspots with an accuracy of 60% (at a 20% false positive rate). The method was applied to all mutations in the Online Mendelian Inheritance in Man (OMIM) database, predicting 1,428 mutations to be related to an interaction defect. Combining predicted and hand-curated sets, we discuss how mutations affect protein interactions in general.
