Similar articles
 Found 20 similar articles (search time: 15 ms)
1.
Owing to their large size and complex nature, few large macromolecular complexes have been solved to atomic resolution. This has led to an under-representation of these structures, which are composed of novel and/or homologous folds, in the library of known structures and folds. While it is often difficult to achieve a high-resolution model for these structures, X-ray crystallography and electron cryomicroscopy are capable of determining structures of large assemblies at low to intermediate resolutions. To aid in the interpretation and analysis of such structures, we have developed two programs: helixhunter and foldhunter. Helixhunter is capable of reliably identifying helix position, orientation and length using a five-dimensional cross-correlation search of a three-dimensional density map followed by feature extraction. Helixhunter's results can in turn be used to probe a library of secondary structure elements derived from the structures in the Protein Data Bank (PDB). From this analysis, it is then possible to identify potential homologous folds or suggest novel folds based on the arrangement of alpha helix elements, resulting in a structure-based recognition of folds containing alpha helices. Foldhunter uses a six-dimensional cross-correlation search allowing a probe structure to be fitted within a region or component of a target structure. The structural fitting therefore provides a quantitative means to further examine the architecture and organization of large, complex assemblies. These two methods have been successfully tested with simulated structures modeled from the PDB at resolutions between 6 and 12 Å. With the integration of helixhunter and foldhunter into sequence and structural informatics techniques, we have the potential to deduce or confirm known or novel folds in domains or components within large complexes.

2.
Protein function is a dynamic property closely related to the conformational mechanisms of protein structure in its physiological environment. To understand and control the function of target proteins, it becomes increasingly important to develop methods and tools for predicting collective motions at the molecular level. In this article, we review computational methods for predicting conformational dynamics and discuss software tools for data analysis. In particular, we discuss a high-throughput, web-based system called iGNM for protein structural dynamics. iGNM contains a database of protein motions for more than 20 000 PDB structures and supports online calculations for newly deposited PDB structures or user-modified structures. iGNM allows dynamics analysis of protein structures ranging from enzymes to large complexes and assemblies, and enables the exploration of protein sequence-structure-dynamics-function relations.
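The abstract above does not describe how the motions are computed. As a rough illustration only, the Gaussian network model that gives iGNM its name starts from a connectivity (Kirchhoff) matrix built from residue contacts. The sketch below assumes one node per residue and a 7.0 Å contact cutoff; the function name `kirchhoff_matrix`, the cutoff value and the toy coordinates are assumptions for illustration and are not taken from iGNM itself.

```python
import math


def kirchhoff_matrix(coords, cutoff=7.0):
    """Build a GNM-style connectivity matrix for a list of (x, y, z) nodes.

    Off-diagonal entries are -1 for node pairs within `cutoff` (angstroms);
    each diagonal entry counts the contacts of that node.
    The 7.0 angstrom default is an assumed, commonly quoted value.
    """
    n = len(coords)
    gamma = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                gamma[i][j] = gamma[j][i] = -1
                gamma[i][i] += 1
                gamma[j][j] += 1
    return gamma


# Hypothetical toy chain: four nodes spaced 3.8 angstroms apart on a line.
toy_coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
toy_gamma = kirchhoff_matrix(toy_coords)
for row in toy_gamma:
    print(row)
```

In a full analysis the slow collective modes would come from the eigen-decomposition of this matrix, which is left out here to keep the sketch dependency-free.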

3.
An overview of the structures of protein-DNA complexes   Total citations: 1 (self-citations: 0, citations by others: 1)
Luscombe NM, Austin SE, Berman HM, Thornton JM. Genome Biology, 2000, 1(1): reviews001.1-reviews001.37
On the basis of a structural analysis of 240 protein-DNA complexes contained in the Protein Data Bank (PDB), we have classified the DNA-binding proteins involved into eight different structural/functional groups, which are further classified into 54 structural families. Here we present this classification and review the functions, structures and binding interactions of these protein-DNA complexes.

4.
5.
The database NPIDB (Nucleic Acids-Protein Interaction DataBase) contains information derived from structures of DNA-protein and RNA-protein complexes extracted from the PDB (1834 complexes in July 2007). It is organized as a collection of files in PDB format and is equipped with a web interface and a set of tools for extracting biologically meaningful characteristics of complexes. The content of the database is updated weekly. AVAILABILITY: http://monkey.belozersky.msu.ru/NPIDB/

6.
The large number of macromolecular structures deposited with the Protein Data Bank (PDB) describing complexes between proteins and either physiological compounds or synthetic drugs has made possible a systematic analysis of the interactions occurring between proteins and their ligands. In this work, the binding pockets of about 4000 PDB protein‐ligand complexes were investigated and amino acid and interaction types were analyzed. The residues observed with lowest frequency in protein sequences, Trp, His, Met, Tyr, and Phe, turned out to be the most abundant in binding pockets. Significant differences between drug‐like and physiological compounds were found. On average, physiological compounds form about twice as many hydrogen bonds with protein atoms as drugs do, whereas drugs rely more on hydrophobic interactions to establish target selectivity. The large number of PDB structures describing homologous proteins in complex with the same ligand made it possible to analyze the conservation of binding pocket residues among homologous protein structures bound to the same ligand, showing that Gly, Glu, Arg, Asp, His, and Thr are more conserved than other amino acids. Even in cases in which the same ligand is bound to unrelated proteins, the binding pockets showed significant conservation in the residue types. In these cases, the probability of co‐occurrence of the same amino acid type in the binding pockets could be up to thirteen times higher than that expected on a random basis. The trends identified in this study may provide a useful guideline in the process of drug design and lead optimization. Copyright © 2014 John Wiley & Sons, Ltd.
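The comparison of pocket composition with overall sequence composition described above can be sketched as a frequency ratio. This is a minimal illustration and not the authors' procedure: the function name `residue_enrichment` and the toy residue lists are made up, and a real analysis would first need a definition of which residues line a pocket.

```python
from collections import Counter


def residue_enrichment(pocket_residues, background_residues):
    """Return {residue: pocket frequency / background frequency}.

    A value above 1 means the residue type is over-represented in pockets
    relative to the background sequences.
    """
    pocket = Counter(pocket_residues)
    background = Counter(background_residues)
    n_pocket = sum(pocket.values())
    n_background = sum(background.values())
    enrichment = {}
    for res, count in pocket.items():
        if background[res] == 0:
            continue  # no background estimate, so no ratio can be formed
        enrichment[res] = (count / n_pocket) / (background[res] / n_background)
    return enrichment


# Made-up toy data: Trp is rare in the background but common in the pocket.
toy_background = ["ALA"] * 8 + ["TRP"] * 2
toy_pocket = ["ALA"] * 2 + ["TRP"] * 2
toy_enrichment = residue_enrichment(toy_pocket, toy_background)
print(toy_enrichment)
```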

7.
MOTIVATION: Modeling of protein interactions is often possible from known structures of related complexes. It is often time-consuming to find the most appropriate template. Hypothesized biological units (BUs) often differ from the asymmetric units and it is usually preferable to model from the BUs. RESULTS: ProtBuD is a database of BUs for all structures in the Protein Data Bank (PDB). We use both the PDB's BUs and those from the Protein Quaternary Server (PQS). ProtBuD is searchable by PDB entry, the Structural Classification of Proteins (SCOP) designation or pairs of SCOP designations. The database provides the asymmetric and BU contents of related proteins in the PDB as identified in SCOP and Position-Specific Iterated BLAST (PSI-BLAST). The asymmetric unit is different from PDB and/or PQS BUs for 52% of X-ray structures, and the PDB and PQS BUs disagree on 18% of entries. AVAILABILITY: The database is provided as a standalone program and a web server from http://dunbrack.fccc.edu/ProtBuD.php.

8.
We describe the current status of the Java molecular graphics tool, MolSurfer. MolSurfer has been designed to assist the analysis of the structures and physico-chemical properties of macromolecular interfaces. MolSurfer provides a coupled display of two-dimensional (2D) maps of the interfaces generated with the ADS software and a three-dimensional (3D) view of the macromolecular structure in the Java PDB viewer, WebMol. The interfaces are analytically defined and properties such as electrostatic potential or hydrophobicity are projected on to them. MolSurfer has been applied previously to analyze a set of 39 protein-protein complexes, with structures available from the Protein Data Bank (PDB). A new application, described here, is the visualization of 75 interfaces in structures of protein-DNA and protein-RNA complexes. Another new feature is that the MolSurfer web server is now able to compute and map Poisson-Boltzmann electrostatic potentials of macromolecules onto interfaces. The MolSurfer web server is available at http://projects.villa-bosch.de/mcm/software/molsurfer.

9.
MOTIVATION: Integral membrane proteins play important roles in living cells. Although these proteins are estimated to constitute 25% of proteins at a genomic scale, the Protein Data Bank (PDB) contains only a few hundred membrane proteins due to the difficulties with experimental techniques. The presence of transmembrane proteins in the structure data bank, however, is hard to see, as the annotation of these entries is rather poor. Even if a protein is identified as a transmembrane one, the possible location of the lipid bilayer is not indicated in the PDB because these proteins are crystallized without their natural lipid bilayer, and currently no method is publicly available to detect the possible membrane plane using the atomic coordinates of membrane proteins. RESULTS: Here, we present a new geometrical approach to distinguish between transmembrane and globular proteins using structural information only and to locate the most likely position of the lipid bilayer. An automated algorithm (TMDET) is given to determine the membrane planes relative to the position of atomic coordinates, together with a discrimination function which is able to separate transmembrane and globular proteins even in cases of low resolution or incomplete structures such as fragments or parts of large multi-chain complexes. This method can be used for the proper annotation of protein structures containing transmembrane segments and paves the way to an up-to-date database containing the structure of all known transmembrane proteins and fragments (PDB_TM) which can be automatically updated. The algorithm is equally important for the purpose of constructing databases purely of globular proteins.

10.

Background  

In research on protein functional sites, researchers often need to identify binding-site residues on a protein. A commonly used strategy is to find a complex structure in the Protein Data Bank (PDB) that consists of the protein of interest and its interacting partner(s) and to calculate binding-site residues from that complex structure. However, since a protein may participate in multiple interactions, the binding-site residues calculated from one complex structure usually do not reveal all binding sites on the protein. Researchers therefore have to find all PDB complexes that contain the protein of interest and combine the binding-site information gleaned from them. This process is very time-consuming. In particular, combining binding-site information obtained from different PDB structures requires tedious work to align protein sequences. The process becomes overwhelmingly difficult when researchers have a large set of proteins to analyze, which is usually the case in practice.
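The combination step described above can be sketched as a union of binding-site residues after mapping each structure's residue numbers onto a common reference sequence. This is only an illustration under stated assumptions: the function name `merge_binding_sites`, the complex identifiers and the numbering maps are hypothetical, and the maps are assumed to come from a separate sequence alignment that is not shown.

```python
def merge_binding_sites(per_complex_sites, to_reference):
    """Union binding-site residues from several complexes in reference numbering.

    per_complex_sites: {complex_id: set of residue numbers in that structure}
    to_reference: {complex_id: {structure residue number: reference position}}
    Returns {reference position: set of complex_ids supporting it}.
    Residues without a mapping (for example, unaligned ones) are skipped.
    """
    merged = {}
    for complex_id, residues in per_complex_sites.items():
        mapping = to_reference[complex_id]
        for res in residues:
            ref_pos = mapping.get(res)
            if ref_pos is not None:
                merged.setdefault(ref_pos, set()).add(complex_id)
    return merged


# Hypothetical inputs: two complexes with different residue numbering.
toy_sites = {"cplxA": {10, 11}, "cplxB": {5, 6, 99}}
toy_maps = {"cplxA": {10: 10, 11: 11}, "cplxB": {5: 10, 6: 30}}
merged_sites = merge_binding_sites(toy_sites, toy_maps)
for position in sorted(merged_sites):
    print(position, sorted(merged_sites[position]))
```

Keeping the supporting complex identifiers for each position, rather than a bare set of positions, makes it possible to tell which interaction partner each binding site belongs to.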

11.
The rapidly increasing amount of information on three-dimensional (3D) structures of biological macromolecules still has an insufficient impact on genome analysis, functional genomics and proteomics as well as on many other fields in biomedicine including disease-related research. There are, however, attempts to make structural data more easily accessible to the bench biologist. As members of the world-wide Protein Data Bank (wwPDB), the RCSB Protein Data Bank (PDB), the Protein Data Bank Japan and the Macromolecular Structure Database are the primary information resources for 3D structures of proteins, nucleic acids, carbohydrates and complexes thereof. In addition, a number of secondary resources have been set up that also provide information on all currently known structures in a relatively comprehensive manner rather than focusing on specific features only. They include PDBsum, the OCA browser-database for protein structure/function, the Molecular Modeling Database and the Jena Library of Biological Macromolecules (JenaLib). Both the primary and secondary resources often merge the information in the PDB files with data from other resources and offer additional analysis tools, thereby adding value to the original PDB data. Here, we briefly describe these resources from a user's point of view and from a comparative perspective. It is our aim to guide researchers outside the structural biology field in getting the most out of the 3D structure resources.

12.
MOTIVATION: The Protein Data Bank (PDB) contains over 43,800 experimentally determined 3D models of macromolecular structures and their complexes. Each 3D model reveals something interesting and important about the given molecule's function and biological significance. Usually the best source of this information is the original article describing it, and it is often possible to discern the key aspects of the structure from just one or two of the figures in that article. RESULTS: Here we describe how, with the permission of the journals and their publishers, we have endeavoured to make these key figures publicly available to enhance the functional information relating to each PDB entry in our PDBsum database. AVAILABILITY: http://www.ebi.ac.uk/pdbsum.

13.
In the study of protein complexes, is there a computational method for inferring which combinations of proteins in an organism are likely to form a crystallizable complex? Here we attempt to answer this question, using the Protein Data Bank (PDB) to assess the usefulness of inferred functional protein linkages from the Prolinks database. We find that of the 242 nonredundant prokaryotic protein complexes shared between the current PDB and Prolinks, 44% (107/242) contain proteins linked at high confidence by one or more methods of computed functional linkages. Similarly, high-confidence linkages detect 47% of known Escherichia coli protein complexes, with 45% accuracy. Together these findings suggest that functional linkages will be useful in defining protein complexes for structural studies, including for structural genomics. We offer a database of inferred linkages corresponding to likely protein complexes for some 629,952 pairs of proteins in 154 prokaryotes and archaea.

14.
The Protein Data Bank (PDB) is the worldwide repository of 3D structures of proteins, nucleic acids and complex assemblies. The PDB’s large corpus of data (> 100,000 structures) and related citations provide a well-organized and extensive test set for developing and understanding data citation and access metrics. In this paper, we present a systematic investigation of how authors cite the PDB as a data repository. We describe a novel metric, based on an information cascade constructed by exploring the citation network, to measure influence between competing works, and apply it to analyze different data citation practices for the PDB. Based on this new metric, we found that the original publication of RCSB PDB in the year 2000 continues to attract the most citations even though many follow-up updates were published. None of these follow-up publications by members of the wwPDB organization can compete with the original publication in terms of citations and influence. Meanwhile, authors increasingly choose to use URLs of the PDB in the text instead of citing PDB papers, disrupting the growth of literature citations. A comparison of data usage statistics and paper citations shows that PDB Web access is highly correlated with URL mentions in the text. The results reveal the trend in how authors cite a biomedical data repository and may provide useful insight into how to measure the impact of a data repository.

15.
MOTIVATION: Public resources for studying protein interfaces are necessary for better understanding of molecular recognition and for developing intermolecular potentials, search procedures and scoring functions for the prediction of protein complexes. RESULTS: The first release of the DOCKGROUND resource implements a comprehensive database of co-crystallized (bound-bound) protein-protein complexes, providing a foundation for the upcoming expansion to unbound (experimental and simulated) protein-protein complexes, modeled protein-protein complexes and systematic sets of docking decoys. The bound-bound part of DOCKGROUND is a relational database of annotated structures based on the Biological Unit file (Biounit) provided by the RCSB as a separate file containing the probable biological molecule. DOCKGROUND is automatically updated to reflect the growth of the PDB. It contains 67,220 pairwise complexes that rely on 14,913 Biounit entries from 34,778 PDB entries (January 30, 2006). The database includes dynamic generation of non-redundant datasets of pairwise complexes based either on structural similarity (SCOP classification) or on user-defined sequence identity. The growing DOCKGROUND resource is designed to become a comprehensive public environment for developing and validating new methodologies for modeling of protein interactions. AVAILABILITY: DOCKGROUND is available at http://dockground.bioinformatics.ku.edu. The current first release implements the bound-bound part.

16.
Structure‐based drug design utilizes apoprotein or complex structures retrieved from the PDB. More than 57% of crystallographic PDB entries were obtained with polyethylene glycols (PEGs) as precipitant and/or as cryoprotectant, but fewer than 6% of these report the presence of individual ethyleneglycol oligomers. We report a case in which the presence of ethyleneglycol oligomers in a crystal structure markedly affected the bound ligand's position. Specifically, we compared the positions of methylene blue and decamethonium in acetylcholinesterase complexes obtained using isomorphous crystals precipitated with PEG200 or ammonium sulfate. The ligands' positions within the active‐site gorge in complexes obtained using PEG200 are influenced by the presence of ethyleneglycol oligomers, in both cases bound to W84 at the gorge's bottom, preventing interaction of the ligand's proximal quaternary group with its indole. Consequently, both ligands are ~3.0 Å further up the gorge than in complexes obtained using crystals precipitated with ammonium sulfate, in which the quaternary groups make direct π‐cation interactions with the indole. These findings have implications for structure‐based drug design, since data for ligand‐protein complexes with polyethylene glycol as precipitant may not reflect the ligand's position in its absence, and could result in selecting incorrect drug discovery leads. Docking methylene blue into the structure obtained with PEG200, but omitting the ethyleneglycols, yields results agreeing poorly with the crystal structure; excellent agreement is obtained if they are included. Many proteins display features in which precipitants might lodge. It will be important to investigate the presence of precipitants in published crystal structures, and whether it has resulted in misinterpreting electron density maps, adversely affecting drug design.

17.
Protein docking procedures carry out the task of predicting the structure of a protein–protein complex starting from the known structures of the individual protein components. More often than not, however, the structure of one or both components is not known, but can be derived by homology modeling on the basis of known structures of related proteins deposited in the Protein Data Bank (PDB). Thus, the problem is to develop methods that optimally integrate homology modeling and docking with the goal of predicting the structure of a complex directly from the amino acid sequences of its component proteins. One possibility is to use the best available homology modeling and docking methods. However, the models built for the individual subunits often differ to a significant degree from the bound conformation in the complex, often much more so than the differences observed between free and bound structures of the same protein, and therefore additional conformational adjustments, both at the backbone and side chain levels need to be modeled to achieve an accurate docking prediction. In particular, even homology models of overall good accuracy frequently include localized errors that unfavorably impact docking results. The predicted reliability of the different regions in the model can also serve as a useful input for the docking calculations. Here we present a benchmark dataset that should help to explore and solve combined modeling and docking problems. This dataset comprises a subset of the experimentally solved ‘target’ complexes from the widely used Docking Benchmark from the Weng Lab (excluding antibody–antigen complexes). This subset is extended to include the structures from the PDB related to those of the individual components of each complex, and hence represent potential templates for investigating and benchmarking integrated homology modeling and docking approaches. 
Template sets can be dynamically customized by specifying ranges in sequence similarity and in PDB release dates, or using other filtering options, such as excluding sets of specific structures from the template list. Multiple sequence alignments, as well as structural alignments of the templates to their corresponding subunits in the target are also provided. The resource is accessible online or can be downloaded at http://cluspro.org/benchmark , and is updated on a weekly basis in synchrony with new PDB releases. Proteins 2016; 85:10–16. © 2016 Wiley Periodicals, Inc.

18.
Simonson T, Calimet N. Proteins, 2002, 49(1): 37-48
In zinc proteins, the Zn2+ cation frequently binds with a tetrahedral coordination to cysteine and histidine side chains, for example, in many DNA-binding proteins, where it plays primarily a structural role. We examine the possibility of thiolate protonation in Cys(x)His(y)-Zn2+ groups, both in proteins and in solution, through a combination of theoretical calculations and database analysis. Seventy-five percent of the thiolate-coordinated zincs in the Cambridge Structural Database are tetrahedral, while di-alkanethiol coordination always involves five or more ligands. Ab initio quantum calculations are performed on (ethanethiol/thiolate)(3)imidazole-Zn2+ complexes in vacuum, yielding geometries and gas phase basicities. Protonating one (respectively two) thiolates increases the Zn-S(thiol) distance by 0.4 Å (respectively 0.3 Å), providing a structural marker for protonation. The stabilities of the complexes in solution are compared by combining the gas phase basicities with continuum dielectric solvation calculations. In a continuum solvent with permittivity epsilon = 4, 20, or 80, one of three thiolates is predicted to be protonated at neutral pH. By extension, Cys4-Zn2+ groups are expected to be protonated in the same conditions. In contrast, most Cys3His and Cys4 geometries in the Protein Data Bank (PDB) appear consistent with all-thiolate Zn2+ coordination. This apparent discrepancy is resolved by two recent surveys of zinc protein structures, which suggest that these all-thiolate sites are stabilized by charged and polar groups nearby in the protein, thus overcoming their intrinsic instability. However, the experimental resolution is not sufficient in all the PDB structures to rule out a thiol/thiolate mixture, and protonated thiolates may occur in some proteins not solved at high resolution or not represented in the PDB, as suggested by recent mass spectrometry experiments; this possibility should be allowed for in X-ray structure refinement.

19.
Biomolecular structures at atomic resolution present a valuable resource for the understanding of biology. NMR spectroscopy accounts for 11% of all structures in the PDB repository. In response to serious problems with the accuracy of some of the NMR-derived structures and in order to facilitate proper analysis of the experimental models, a number of program suites are available. We discuss nine of these tools in this review: PROCHECK-NMR, PSVS, GLM-RMSD, CING, Molprobity, Vivaldi, ResProx, NMR constraints analyzer and QMEAN. We evaluate these programs for their ability to assess the structural quality, restraints and their violations, chemical shifts, peaks and the handling of multi-model NMR ensembles. We document both the input required by the programs and output they generate. To discuss their relative merits we have applied the tools to two representative examples from the PDB: a small, globular monomeric protein (Staphylococcal nuclease from S. aureus, PDB entry 2kq3) and a small, symmetric homodimeric protein (a region of human myosin-X, PDB entry 2lw9).

20.
The recent accumulation of large amounts of 3D structural data warrants a sensitive and automatic method to compare and classify these structures. We developed a web server for comparing protein 3D structures using the program Matras (http://biunit.aist-nara.ac.jp/matras). An advantage of Matras is its structure similarity score, which is defined as the log-odds of the probabilities, similar to Dayhoff's substitution model of amino acids. This score is designed to detect evolutionarily related (homologous) structural similarities. Our web server has three main services. The first is a pairwise 3D alignment, which simply aligns two structures. A user can specify structures either by entering PDB codes or by uploading PDB-format files from the local machine. The second service is a multiple 3D alignment, which compares several protein structures. This program employs the progressive alignment algorithm, in which pairwise 3D alignments are assembled in the proper order. The third service is a 3D library search, which compares one query structure against a large number of library structures. We hope this server provides useful tools for insights into protein 3D structures.
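The log-odds idea mentioned above can be illustrated generically: a score is the logarithm of how much more often a pair of states is seen in related structures than would be expected by chance. The sketch below is not the Matras scoring function. The name `log_odds`, the natural-log base and the toy probabilities are assumptions for illustration, and the structural states Matras actually uses are not given in this abstract.

```python
import math


def log_odds(p_pair_related, p_a, p_b):
    """Log-odds score for observing states a and b paired together.

    p_pair_related: probability of the (a, b) pair among related structures
    p_a, p_b: background probabilities of a and b taken independently
    Positive scores mean the pairing is more common than chance.
    """
    if min(p_pair_related, p_a, p_b) <= 0:
        raise ValueError("probabilities must be positive")
    return math.log(p_pair_related / (p_a * p_b))


# Made-up numbers: the pair occurs twice as often as chance would predict.
print(log_odds(0.04, 0.1, 0.2))
# Made-up numbers: the pair occurs half as often as chance would predict.
print(log_odds(0.01, 0.1, 0.2))
```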
