Similar articles
 Found 20 similar articles (search time: 31 ms)
1.
MOTIVATION: Structural genomics projects are beginning to produce protein structures with unknown function; therefore, accurate, automated predictors of protein function are required if all these structures are to be properly annotated in reasonable time. Identifying the interface between two interacting proteins provides important clues to the function of a protein and can reduce the search space required by docking algorithms to predict the structures of complexes. RESULTS: We have combined a support vector machine (SVM) approach with surface patch analysis to predict protein-protein binding sites. Using a leave-one-out cross-validation procedure, we were able to successfully predict the location of the binding site on 76% of our dataset made up of proteins with both transient and obligate interfaces. With heterogeneous cross-validation, where we trained the SVM on transient complexes to predict on obligate complexes (and vice versa), we still achieved success rates comparable to those of the leave-one-out cross-validation, suggesting that sufficient properties are shared between transient and obligate interfaces. AVAILABILITY: A web application based on the method can be found at http://www.bioinformatics.leeds.ac.uk/ppi_pred. The dataset of 180 proteins used in this study is also available via the same web site. CONTACT: westhead@bmb.leeds.ac.uk SUPPLEMENTARY INFORMATION: http://www.bioinformatics.leeds.ac.uk/ppi-pred/supp-material.
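The leave-one-out procedure mentioned in this abstract can be illustrated with a minimal sketch. It assumes a nearest-centroid classifier as a stand-in for the SVM (the abstract's actual features and kernel are not given here), and the data, feature values and function names are hypothetical.

```python
from statistics import mean

def nearest_centroid_predict(train, query):
    """Predict the label whose class centroid is closest to `query`.

    Stand-in classifier: the abstract uses an SVM, which is not reproduced here.
    """
    centroids = {}
    for label in {lbl for _, lbl in train}:
        rows = [feat for feat, lbl in train if lbl == label]
        centroids[label] = [mean(col) for col in zip(*rows)]

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(centroids, key=lambda lbl: sq_dist(centroids[lbl], query))

def leave_one_out_accuracy(samples):
    """Hold out each sample once, train on the rest, and score the prediction."""
    hits = 0
    for i, (features, label) in enumerate(samples):
        train = samples[:i] + samples[i + 1:]
        if nearest_centroid_predict(train, features) == label:
            hits += 1
    return hits / len(samples)

# Hypothetical toy data: (feature vector, label) pairs, not taken from the study.
toy_patches = [
    ([0.9, 0.8], "interface"),
    ([0.8, 0.9], "interface"),
    ([0.85, 0.75], "interface"),
    ([0.1, 0.2], "surface"),
    ([0.2, 0.1], "surface"),
    ([0.15, 0.25], "surface"),
]
loo_accuracy = leave_one_out_accuracy(toy_patches)
```

Each sample is predicted by a model that never saw it, so the resulting fraction is the kind of estimate that a figure such as the reported 76% represents.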

2.
Software is presented for the calculation of packing angles and geometry of helical secondary structure elements in protein structures. AVAILABILITY: C language source code and documentation are available from http://www.bioinformatics.leeds.ac.uk.

3.
We describe a fast fold-level protein comparison and motif matching facility based on the TOPS representation of structure. This provides an update to a previous service at the EBI, with better graph matching, faster results, and visualization of both the structures being compared against and the common pattern of each with the target domain. AVAILABILITY: Web service at http://balabio.dcs.gla.ac.uk/tops or via the main TOPS site at http://www.tops.leeds.ac.uk. Software is also available for download from these sites.

4.
PROMISE: a database of bioinorganic motifs.   Cited by: 1 (self-citations: 1; citations by others: 0)
The PROMISE (prosthetic centres and metal ions in protein active sites) database aims to present comprehensive sequence, structural, functional and bibliographic information on metalloproteins and other complex proteins, with an emphasis on active site structure and function. The database is available on the World Wide Web at http://bioinf.leeds.ac.uk/promise/

5.
The PROMISE (prosthetic centres and metal ions in protein active sites) database aims to gather together comprehensive sequence, structural, functional and bibliographic information on proteins which possess prosthetic centres, with an emphasis on active site structure and function. The database is available on the World Wide Web at http://bioinf.leeds.ac.uk/promise/

6.
SHARP2: protein-protein interaction predictions using patch analysis   Cited by: 2 (self-citations: 0; citations by others: 2)
SHARP2 is a flexible web-based bioinformatics tool for predicting potential protein-protein interaction sites on protein structures. It implements a predictive algorithm that calculates multiple parameters for overlapping patches of residues on the surface of a protein. Six parameters are calculated: solvation potential, hydrophobicity, accessible surface area, residue interface propensity, planarity and protrusion (SHARP2). Parameter scores for each patch are combined, and the patch with the highest combined score is predicted as a potential interaction site. SHARP2 enables users to upload 3D protein structure files in PDB format, to obtain information on potential interaction sites as downloadable HTML tables and to view the location of the sites on the 3D structure using Jmol. The server allows for the input of multiple structures and multiple combinations of parameters. Therefore, predictions can be made for complete datasets, as well as individual structures. AVAILABILITY: http://www.bioinformatics.sussex.ac.uk/SHARP2.
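The combine-and-rank step described in this abstract can be sketched as follows. This is only an illustration: the abstract does not say how scores are combined, so plain summation is assumed, and the patch IDs, score values and function names are hypothetical rather than part of SHARP2.

```python
PARAMETERS = (
    "solvation_potential", "hydrophobicity", "accessible_surface_area",
    "interface_propensity", "planarity", "protrusion",
)

def combined_score(patch_scores, selected=PARAMETERS):
    """Sum the selected parameter scores for one patch (summation is an assumption)."""
    return sum(patch_scores[name] for name in selected)

def predict_site(patches, selected=PARAMETERS):
    """Return the ID of the patch with the highest combined score."""
    return max(patches, key=lambda pid: combined_score(patches[pid], selected))

# Hypothetical, already-normalised scores for three surface patches.
patches = {
    "patch_A": dict(zip(PARAMETERS, (0.2, 0.4, 0.5, 0.3, 0.6, 0.1))),
    "patch_B": dict(zip(PARAMETERS, (0.7, 0.6, 0.8, 0.9, 0.5, 0.4))),
    "patch_C": dict(zip(PARAMETERS, (0.3, 0.9, 0.2, 0.1, 0.4, 0.2))),
}
best_all = predict_site(patches)
best_hydro = predict_site(patches, selected=("hydrophobicity",))
```

The `selected` argument mirrors the abstract's point that different combinations of parameters can be tried; here, restricting the score to hydrophobicity changes which patch ranks first.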

7.
We present a large test set of protein-ligand complexes for the purpose of validating algorithms that rely on the prediction of protein-ligand interactions. The set consists of 305 complexes with protonation states assigned by manual inspection. The following checks have been carried out to identify unsuitable entries in this set: (1) assessing the involvement of crystallographically related protein units in ligand binding; (2) identification of bad clashes between protein side chains and ligand; and (3) assessment of structural errors, and/or inconsistency of ligand placement with crystal structure electron density. In addition, the set has been pruned to ensure diversity in terms of protein-ligand structures, and subsets are supplied for different protein-structure resolution ranges. A classification of the set by protein type is available. As an illustration, validation results are shown for GOLD and SuperStar. GOLD is a program that performs flexible protein-ligand docking, and SuperStar is used for the prediction of favorable interaction sites in proteins. The new CCDC/Astex test set is freely available to the scientific community (http://www.ccdc.cam.ac.uk).

8.
BACKGROUND: Mixture model on graphs (MMG) is a probabilistic model that integrates network topology with (gene, protein) expression data to predict the regulation state of genes and proteins. It is remarkably robust to missing data, a feature particularly important for its use in quantitative proteomics. A new implementation, written in C and interfaced with R, makes MMG extremely fast and easy to use and to extend. AVAILABILITY: The original implementation (Matlab) is still available from http://www.dcs.shef.ac.uk/~guido/; the new implementation is available from http://wrightlab.group.shef.ac.uk/people_noirel.htm, from CRAN, and has been submitted to BioConductor, http://www.bioconductor.org/.

9.
Doramapimod (BIRB-796) is widely recognized as one of the most potent and selective type II inhibitors of human p38α mitogen-activated protein kinase (MAPK); however, the understanding of its binding mechanism remains incomplete. Previous studies indicated high affinity of the ligand to a so-called allosteric pocket revealed only in the ‘out’ state of the DFG motif (i.e. Asp168-Phe169-Gly170) when Phe169 becomes fully exposed to the solvent. The possibility of alternative binding in the DFG-in state was hypothesized, but the molecular mechanism was not known. Methods of bioinformatics, docking and long-time scale classical and accelerated molecular dynamics have been applied to study the interaction of Doramapimod with the human p38α MAPK. It was shown that Doramapimod can bind to the protein even when Phe169 is fully buried inside the allosteric pocket and the kinase activation loop is in the DFG-in state. The orientation of the inhibitor in such a complex is significantly different from that in the known crystallographic complex formed by the kinase in the DFG-out state; however, Doramapimod binding is followed by ligand-induced conformational changes, which finally improve accommodation of the inhibitor. Molecular modelling has confirmed that Doramapimod combines the features of type I and II inhibitors of p38α MAPK, i.e. it can directly and indirectly compete with ATP binding. It can be concluded that optimization of the initial binding in the DFG-in state and the final accommodation in the DFG-out state should both be considered when designing novel efficient type II inhibitors of MAPK and homologous proteins.

Communicated by Ramaswamy H. Sarma


10.
The ability to search sequence datasets for membrane spanning proteins is an important requirement for genome annotation. However, the development of algorithms to identify novel types of transmembrane beta-barrel (TMB) protein has proven substantially harder than for transmembrane helical proteins, owing to a shorter TM domain in which only alternate residues are hydrophobic. Although recent reports have described important improvements in the development of such algorithms, there is still concern over their ability to confidently screen genomes. Here we describe a new algorithm combining composition and hidden Markov model topology based classifiers (called TMB-Hunt2), which achieves a cross-validation accuracy of >95%, with 96.7% precision and 94.2% recall. An overview is given of the algorithm design, with a thorough assessment of performance and application to a number of genomes. Of particular note is that TMB/extracellular protein discrimination is significantly more difficult than TMB/cytoplasmic protein discrimination, with the predictor correctly rejecting just 74% of extracellular proteins, in comparison to 98% of cytoplasmic proteins. Focus is given to directions for further improvements in TMB/non-TMB protein discrimination, with a call for the development of standardized tests and assessments of such algorithms. Tools and datasets are made available through a website called TMB-Web (http://www.bioinformatics.leeds.ac.uk/TMB-Web/TMB-Hunt2).
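For readers comparing the figures quoted in this abstract, the sketch below shows how accuracy, precision, recall and a rejection rate are conventionally computed from confusion-matrix counts. The counts are hypothetical and chosen only to illustrate the formulas; they are not the TMB-Hunt2 evaluation data.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return precision, recall, and accuracy from confusion-matrix counts."""
    precision = tp / (tp + fp)          # fraction of predicted positives that are real
    recall = tp / (tp + fn)             # fraction of real positives that are found
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return precision, recall, accuracy

def rejection_rate(tn, fp):
    """Fraction of true negatives correctly rejected (specificity)."""
    return tn / (tn + fp)

# Hypothetical counts, for illustration only.
precision, recall, accuracy = classification_metrics(tp=90, fp=10, tn=180, fn=20)

# 74 of 100 hypothetical negatives rejected gives a 74% rejection rate,
# the same kind of quantity as the extracellular figure quoted above.
extracellular_rejection = rejection_rate(tn=74, fp=26)
```

Reporting the rejection rate per negative class, as the abstract does for extracellular and cytoplasmic proteins, exposes weaknesses that a single pooled accuracy would hide.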

11.
The three-dimensional environments of ligand binding sites have been derived from the parsing and loading of the PDB entries into a relational database. For each bound molecule the biological assembly of the quaternary structure has been used to determine all contact residues and a fast interactive search and retrieval system has been developed. Prosite pattern and short sequence search options are available together with a novel graphical query generator for inter-residue contacts. The database and its query interface are accessible from the Internet through a web server located at: http://www.ebi.ac.uk/msd-srv/msdsite.

12.
The PDBsum web server provides structural analyses of the entries in the Protein Data Bank (PDB). Two recent additions are described here. The first is the detailed analysis of the SARS‐CoV‐2 virus protein structures in the PDB. These include the variants of concern, which are shown both on the sequences and 3D structures of the proteins. The second addition is the inclusion of the available AlphaFold models for human proteins. The pages allow a search of the protein against existing structures in the PDB via the Sequence Annotated by Structure (SAS) server, so one can easily compare the predicted model against experimentally determined structures. The server is freely accessible to all at http://www.ebi.ac.uk/pdbsum.

13.
The Proteome Analysis database (http://www.ebi.ac.uk/proteome/) has been developed by the Sequence Database Group at EBI utilizing existing resources and providing comparative analysis of the predicted protein coding sequences of the complete genomes of bacteria, archaea and eukaryotes. Three main projects, InterPro, CluSTr and GO Slim, are used to give an overview of families, domains, sites, and functions of the proteins from each of the complete genomes. Complete proteome analysis is available for a total of 89 proteome sets. A specifically designed application enables InterPro proteome comparisons for any one proteome against any other one or more of the proteomes in the database.

14.
Yeast Exploration Tool Integrator (YETI) is a novel bioinformatics tool for the integrated visualization and analysis of functional genomic data sets from the budding yeast Saccharomyces cerevisiae. AVAILABILITY: YETI is freely available for use over the WWW, or download under license, at http://www.bru.ed.ac.uk/~orton/yeti.html

15.
The Pfam protein families database   Cited by: 105 (self-citations: 12; citations by others: 93)
Pfam is a large collection of protein multiple sequence alignments and profile hidden Markov models. Pfam is available on the WWW in the UK at http://www.sanger.ac.uk/Software/Pfam/, in Sweden at http://www.cgr.ki.se/Pfam/ and in the US at http://pfam.wustl.edu/. The latest version (4.3) of Pfam contains 1815 families. These Pfam families match 63% of proteins in SWISS-PROT 37 and TrEMBL 9. For complete genomes Pfam currently matches up to half of the proteins. Genomic DNA can be directly searched against the Pfam library using the Wise2 package.

16.
17.
The CluSTr (Clusters of SWISS-PROT and TrEMBL proteins) database offers an automatic classification of SWISS-PROT and TrEMBL proteins into groups of related proteins. The clustering is based on analysis of all pairwise comparisons between protein sequences. Analysis has been carried out for different levels of protein similarity, yielding a hierarchical organisation of clusters. The database provides links to InterPro, which integrates information on protein families, domains and functional sites from PROSITE, PRINTS, Pfam and ProDom. Links to the InterPro graphical interface allow users to see at a glance whether proteins from the cluster share particular functional sites. CluSTr also provides cross-references to HSSP and PDB. The database is available for querying and browsing at http://www.ebi.ac.uk/clustr.
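The idea of clustering from pairwise comparisons at several similarity levels can be illustrated with a small sketch. It assumes single-linkage grouping implemented with a union-find structure, which may differ from the method CluSTr actually uses; the protein IDs, scores and thresholds are hypothetical.

```python
def cluster_at_threshold(items, similarities, threshold):
    """Group items linked by any pairwise similarity >= threshold (single linkage)."""
    parent = {item: item for item in items}

    def find(x):
        # Follow parent links to the cluster root, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), score in similarities.items():
        if score >= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for item in items:
        clusters.setdefault(find(item), set()).add(item)
    return sorted(sorted(members) for members in clusters.values())

# Hypothetical proteins and similarity scores, for illustration only.
proteins = ["P1", "P2", "P3", "P4"]
scores = {("P1", "P2"): 0.9, ("P2", "P3"): 0.6, ("P3", "P4"): 0.3, ("P1", "P3"): 0.5}

# Lowering the threshold merges clusters, giving a nested hierarchy.
levels = {t: cluster_at_threshold(proteins, scores, t) for t in (0.8, 0.5, 0.2)}
```

Because every cluster at a strict threshold is contained in a cluster at a looser one, the per-threshold results stack into the kind of hierarchical organisation the abstract describes.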

18.
EBI databases and services   Cited by: 2 (self-citations: 0; citations by others: 2)
The EMBL Outstation-European Bioinformatics Institute (EBI) is a center for research and services in bioinformatics. It serves researchers in molecular biology, genetics, medicine, and agriculture from academia, and the agricultural, biotechnology, chemical, and pharmaceutical industries. The Institute manages and makes available databases of biological data including nucleic acid, protein sequences, and macromolecular structures. It provides to this community bioinformatics services relevant to molecular biology free of charge over the Internet. Some of these databases and services are described in this review. For more information, visit the EBI Web server at http://www.ebi.ac.uk/.

19.
The SWISS-PROT group at EBI has developed the Proteome Analysis Database utilising existing resources and providing comparative analysis of the predicted protein coding sequences of the complete genomes of bacteria, archaea and eukaryotes (http://www.ebi.ac.uk/proteome/). The two main projects used, InterPro and CluSTr, give a new perspective on families, domains and sites and cover 31-67% (InterPro statistics) of the proteins from each of the complete genomes. CluSTr covers the three complete eukaryotic genomes and the incomplete human genome data. The Proteome Analysis Database is accompanied by a program that has been designed to carry out InterPro proteome comparisons for any one proteome against any other one or more of the proteomes in the database.

20.
MOTIVATION: Due to the limitations in experimental methods for determining binary interactions and structure determination of protein complexes, the need exists for computational models to fill the increasing gap between genome sequence information and protein annotation. Here we describe a novel method that uses structural models to reduce a large number of in silico predictions to a high confidence subset that is amenable to experimental validation. RESULTS: A two-stage evaluation procedure was developed. First, a sequence-based method assessed the conservation of protein interface patches used in the original in silico prediction method, both in terms of position within the primary sequence, and in terms of sequence conservation. When applying the most stringent conditions it was found that 20.5% of the data set being assessed passed this test. Secondly, a high-throughput structure-based docking evaluation procedure assessed the soundness of three dimensional models produced for the putative interactions. Of the data set being assessed, 8264 interactions or over 70% could be modelled in this way, and 27% of these can be considered 'valid' by the applied criteria. In all, 6.9% of the interactions passed both the tests and can be considered to be a high confidence set of predicted interactions, several of which are described. AVAILABILITY: http://bioinformatics.leeds.ac.uk/~bmb4sjc. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
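The two-stage evaluation described in this abstract amounts to keeping only the predictions that pass both a sequence-based test and a structure-based test. The sketch below illustrates that logic under stated assumptions: the scores, thresholds, identifiers and function names are hypothetical, and the real criteria used in the study are not reproduced.

```python
# Hypothetical predicted interactions: (id, conservation score, docking score or None).
predictions = [
    ("int1", 0.95, 0.80),
    ("int2", 0.40, 0.90),
    ("int3", 0.92, None),   # no 3D model could be built for this pair
    ("int4", 0.97, 0.30),
    ("int5", 0.91, 0.75),
]

def passes_sequence_stage(pred, min_conservation=0.9):
    """Stage 1: interface patch conservation must reach an assumed cut-off."""
    return pred[1] >= min_conservation

def passes_structure_stage(pred, min_docking=0.7):
    """Stage 2: a model must exist and its docking score reach an assumed cut-off."""
    return pred[2] is not None and pred[2] >= min_docking

def high_confidence(preds):
    """Return the IDs of predictions that pass both stages."""
    return [p[0] for p in preds
            if passes_sequence_stage(p) and passes_structure_stage(p)]

selected = high_confidence(predictions)
fraction = len(selected) / len(predictions)
```

Requiring both tests to pass is what shrinks the set so sharply: in the abstract, 20.5% pass the first test but only 6.9% pass both.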


Copyright © Beijing Qinyun Technology Development Co., Ltd.  ICP license: 京ICP备09084417号