Similar literature
1.

Background

Protein surfaces comprise only a fraction of the total residues but are the most conserved functional features of proteins. Surfaces performing identical functions are found in proteins absent of any sequence or fold similarity. While biochemical activity can be attributed to a few key residues, the broader surrounding environment plays an equally important role.

Results

We describe a methodology that attempts to optimize two components, global shape and local physicochemical texture, for evaluating the similarity between a pair of surfaces. Surface shape similarity is assessed using a three-dimensional object recognition algorithm and physicochemical texture similarity is assessed through a spatial alignment of conserved residues between the surfaces. The comparisons are used in tandem to efficiently search the Global Protein Surface Survey (GPSS), a library of annotated surfaces derived from structures in the PDB, for studying evolutionary relationships and uncovering novel similarities between proteins.

Conclusion

We provide an assessment of our method using library retrieval experiments for identifying functionally homologous surfaces binding different ligands, functionally diverse surfaces binding the same ligand, and binding surfaces of ubiquitous and conformationally flexible ligands. Results using surface similarity to predict function for proteins of unknown function are reported. Additionally, an automated analysis of the ATP binding surface landscape is presented to provide insight into the correlation between surface similarity and function for structures in the PDB and for the subset of protein kinases.

2.
Structural properties of carbohydrate surface binding sites (SBSs) were investigated with computational methods. Eighty‐five SBSs of 44 enzymes in 119 Protein Data Bank (PDB) files were collected as a dataset. On the basis of SBS shape, they were divided into 3 categories: flat surfaces, clefts, and cavities (types A, B, and C, respectively). Ligand varieties showed a correlation between the shape of SBSs and ligand size. To reduce cut‐off differences among SBSs with different ligand sizes, molecular docking was performed. Molecular docking results were used to refine the SBS classification and the binding site cut‐offs. Docking results predicted putative ligand positions and showed that the ligand binding mode depends on the structural type of the SBS. Physicochemical properties of SBSs were calculated for all docking results with YASARA Structure. The results showed that all SBSs are hydrophilic, while their charges could vary and depended on ligand size and the defined cut‐off. Type B surface binding sites had the highest average values of solvent accessible surface area. Analysis of interactions showed that hydrophobic interactions occur more often than hydrogen bonds, which is related to the presence of interactions between aromatic residues and carbohydrates.

3.
4.
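Studying the structure and affinity of protein-ligand interactions not only helps in understanding protein function but is also of great importance for drug development and for research into mechanisms of drug action. To date, a large amount of protein-ligand affinity information and biologically relevant ligand information has been obtained from the literature and the Protein Data Bank (PDB) through manual and semi-automated retrieval, and many databases of protein-ligand interaction information have been built. This article introduces 3 protein-ligand affinity databases and 6 databases on the biological relevance of ligands in protein crystal structures, and briefly describes their main applications, in the hope of providing some help toward efficient and accurate drug screening and design.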

5.
In the study of protein complexes, is there a computational method for inferring which combinations of proteins in an organism are likely to form a crystallizable complex? Here we attempt to answer this question, using the Protein Data Bank (PDB) to assess the usefulness of inferred functional protein linkages from the Prolinks database. We find that of the 242 nonredundant prokaryotic protein complexes shared between the current PDB and Prolinks, 44% (107/242) contain proteins linked at high confidence by one or more methods of computed functional linkages. Similarly, high-confidence linkages detect 47% of known Escherichia coli protein complexes, with 45% accuracy. Together these findings suggest that functional linkages will be useful in defining protein complexes for structural studies, including for structural genomics. We offer a database of inferred linkages corresponding to likely protein complexes for some 629,952 pairs of proteins in 154 prokaryotes and archaea.

6.
A Protein Data Bank (PDB) file contains atomic data for the protein and ligand in protein-ligand complexes. A structure data file (SDF) contains data for the atoms, bonds, connectivity and coordinates of ligand molecules. We describe PDBToSDF, a tool that separates the ligand data from a PDB file so that ligand properties such as molecular weight, number of hydrogen bond acceptors, and hydrogen bond receptors can be calculated easily.
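
The separation step described in this abstract can be illustrated with a minimal Python sketch. It is not the PDBToSDF implementation, whose internals the abstract does not describe; it only assumes the standard fixed-column PDB layout, in which ligand atoms appear as HETATM records. The function name `extract_ligands` and the `skip_residues` parameter are hypothetical, introduced here for illustration.

```python
from collections import defaultdict


def extract_ligands(pdb_text, skip_residues=("HOH",)):
    """Group HETATM atoms by (residue name, chain ID, residue number).

    Hypothetical helper, not part of PDBToSDF. Water (HOH) is skipped by
    default because it is stored as HETATM but is rarely the ligand of interest.
    """
    ligands = defaultdict(list)
    for line in pdb_text.splitlines():
        # Ligands, ions, and solvent are stored as HETATM records.
        if not line.startswith("HETATM"):
            continue
        res_name = line[17:20].strip()          # columns 18-20: residue name
        if res_name in skip_residues:
            continue
        key = (res_name, line[21:22], int(line[22:26]))  # chain ID, residue number
        ligands[key].append({
            "name": line[12:16].strip(),        # columns 13-16: atom name
            "element": line[76:78].strip(),     # columns 77-78: element symbol
            "xyz": (float(line[30:38]),         # columns 31-54: x, y, z in angstroms
                    float(line[38:46]),
                    float(line[46:54])),
        })
    return dict(ligands)
```

Writing a valid SDF file would additionally require bond information, which plain HETATM records do not carry; a real tool has to take it from CONECT records or infer it from geometry.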

7.
The Protein Data Bank (PDB) has been processed to extract a screening protein library (sc-PDB) of 2148 entries. A knowledge-based detection algorithm has been applied to 18,000 PDB files to find regular expressions corresponding to either protein, ions, co-factors, solvent, or ligand atoms. The sc-PDB database comprises high-resolution X-ray structures of proteins for which (i) a well-defined active site exists, (ii) the bound-ligand is a small molecular weight molecule. The database has been screened by an inverse docking tool derived from the GOLD program to recover the known target of four unrelated ligands. Both the database and the inverse screening procedures are accurate enough to rank the true target of the four investigated ligands among the top 1% scorers, with 70-100 fold enrichment with respect to random screening. Applying the proposed screening procedure to a small-sized generic ligand was much less accurate, suggesting that inverse screening should be reserved for rather selective compounds.

8.
The fast heuristic graph match algorithm for small molecules, COMPLIG, was improved by adding a structural superposition process to verify the atom–atom matching. The modified method was used to classify the small molecule ligands in the Protein Data Bank (PDB) by their three-dimensional structures, and 16,660 types of ligands in the PDB were classified into 7561 clusters. In contrast, a classification by a previous method (without structure superposition) generated 3371 clusters from the same ligand set. The characteristic feature in the current classification system is the increased number of singleton clusters, which contained only one ligand molecule in a cluster. Inspections of the singletons in the current classification system but not in the previous one implied that the major factors for the isolation were differences in chirality, cyclic conformations, separation of substructures, and bond length. Comparisons between current and previous classification systems revealed that the superposition-based classification was effective in clustering functionally related ligands, such as drugs targeted to specific biological processes, owing to the strictness of the atom–atom matching.

9.
The identification and modelling of ligands into macromolecular models is important for understanding a molecule's function and for designing inhibitors to modulate its activities. We describe new algorithms for the automated building of ligands into electron density maps in crystal structure determination. Location of the ligand-binding site is achieved by matching numerical shape features describing the ligand to those of density clusters using a "fragmentation-tree" density representation. The ligand molecule is built using two distinct algorithms exploiting free atoms with inter-atomic connectivity and Metropolis-based optimisation of the conformational state of the ligand, producing an ensemble of structures from which the final model is derived. The method was validated on several thousand entries from the Protein Data Bank. In the majority of cases, the ligand-binding site could be correctly located and the ligand model built with a coordinate accuracy of better than 1 Å. We anticipate that the method will be of routine use to anyone modelling ligands, lead compounds or even compound fragments as part of protein functional analyses or drug design efforts.

10.
We report here the first crystal structure of the N-terminal domain of an A-type Lon protease. Lon proteases are ubiquitous, multidomain, ATP-dependent enzymes with both highly specific and non-specific protein binding, unfolding, and degrading activities. We expressed and purified a stable, monomeric 119-amino acid N-terminal subdomain of the Escherichia coli A-type Lon protease and determined its crystal structure at 2.03 Å (Protein Data Bank [PDB] code 2ANE). The structure was solved in two crystal forms, yielding 14 independent views. The domain exhibits a unique fold consisting primarily of three twisted beta-sheets and a single long alpha-helix. Analysis of recent PDB depositions identified a similar fold in BPP1347 (PDB code 1ZBO), a 203-amino acid protein of unknown function from Bordetella parapertussis, crystallized as part of a structural genomics effort. BPP1347 shares sequence homology with Lon N-domains and with a family of other independently expressed proteins of unknown functions. We postulate that, as is the case in Lon proteases, this structural domain represents a general protein and polypeptide interaction domain.

11.
We developed a new computational algorithm for the accurate identification of ligand binding envelopes rather than surface binding sites. We performed a large scale classification of the identified envelopes according to their shape and physicochemical properties. The predicting algorithm, called PocketFinder, uses a transformation of the Lennard-Jones potential calculated from a three-dimensional protein structure and does not require any knowledge about a potential ligand molecule. We validated this algorithm using two systematically collected data sets of ligand binding pockets from complexed (bound) and uncomplexed (apo) structures from the Protein Data Bank, 5616 and 11,510, respectively. As many as 96.8% of experimental binding sites were predicted at better than 50% overlap level. Furthermore, 95.0% of the asserted sites from the apo receptors were predicted at the same level. We demonstrate that conformational differences between the apo and bound pockets do not dramatically affect the prediction results. The algorithm can be used to predict ligand binding pockets of uncharacterized protein structures, suggest new allosteric pockets, evaluate feasibility of protein-protein interaction inhibition, and prioritize molecular targets. Finally, the database of the known and predicted binding pockets for the human proteome structures, the human pocketome, was collected and classified. The pocketome can be used for rapid evaluation of possible binding partners of a given chemical compound.

12.
13.
A shape-based Gaussian docking function is constructed which uses Gaussian functions to represent the shapes of individual atoms. A set of 20 trypsin ligand-protein complexes is drawn from the Protein Data Bank (PDB); the ligands are separated from the proteins and then docked back into the active sites using numerical optimization of this function. It is found that by employing this docking function, quasi-Newton optimization is capable of moving ligands great distances [on average 7 Å root mean square distance (RMSD)] to locate the correctly docked structure. It is also found that a ligand drawn from one PDB file can be docked into a trypsin structure drawn from any of the trypsin PDB files. This implies that this scoring function is not limited to more accurate x-ray structures, as is the case for many of the conventional docking methods, but could be extended to homology models.
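
The idea of representing each atom by a Gaussian can be illustrated with a short Python sketch. The abstract does not give the form of its docking function, so what follows is only the standard closed-form overlap integral of two spherical Gaussians, summed over ligand-protein atom pairs. The names `gaussian_overlap` and `total_overlap` and the default values of `alpha` and `p` are assumptions made for illustration, not parameters from the paper.

```python
import math


def gaussian_overlap(a, b, alpha=0.5, p=1.0):
    """Overlap integral of two identical spherical Gaussians.

    Each atom is modelled as p * exp(-alpha * |r - R|^2). For equal widths the
    integral of the product over all space is
    p^2 * (pi / (2 * alpha))^(3/2) * exp(-alpha * d^2 / 2),
    where d is the distance between the centres a and b.
    """
    d2 = sum((x - y) ** 2 for x, y in zip(a, b))
    return p * p * (math.pi / (2.0 * alpha)) ** 1.5 * math.exp(-alpha * d2 / 2.0)


def total_overlap(ligand_xyz, protein_xyz, alpha=0.5, p=1.0):
    """Sum of pairwise overlaps between ligand atoms and protein atoms."""
    return sum(
        gaussian_overlap(l, q, alpha, p)
        for l in ligand_xyz
        for q in protein_xyz
    )
```

Because such a sum is smooth and differentiable in the ligand coordinates, gradient-based optimizers such as the quasi-Newton method mentioned in the abstract can follow it from distant starting poses; how the overlap terms are weighted into an actual docking score is not specified in the abstract and is left out here.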

14.

Background  

The rapid development of structural genomics has resulted in many "unknown function" proteins being deposited in the Protein Data Bank (PDB); thus, the functional prediction of these proteins has become a challenge for structural bioinformatics. Several sequence-based and structure-based methods have been developed to predict protein function, but these methods need further improvement in accuracy, sensitivity, and computational speed. Here, an accurate algorithm, the CMASA (Contact MAtrix based local Structural Alignment algorithm), has been developed to predict unknown functions of proteins based on the local protein structural similarity. This algorithm has been evaluated by building a test set including 164 enzyme families, and has also been compared to other methods.

15.
Vicinity analysis (VA) is a new methodology developed to identify similarities between protein binding sites based on their three-dimensional structure and the chemical similarity of matching residues. The major objective is to enable searching of the Protein Data Bank (PDB) for similar sub-pockets, especially in proteins from different structural and biochemical series. Inspection of the ligands bound in these pockets should allow ligand functionality to be identified, thus suggesting novel monomers for use in library synthesis. VA has been developed initially using the ATP binding site in kinases, an important class of protein targets involved in cell signalling and growth regulation. This paper defines the VA procedure and describes matches to the phosphate binding sub-pocket of cyclin-dependent protein kinase 2 that were found by searching a small test database that has also been used to parameterise the methodology.

16.
The Protein Data Bank (PDB; http://www.pdb.org/) continues to be actively involved in various aspects of the informatics of structural genomics projects--developing and maintaining the Target Registration Database (TargetDB), organizing data dictionaries that will define the specification for the exchange and deposition of data with the structural genomics centers and creating software tools to capture data from standard structure determination applications.

17.
As a discipline, structural biology has been transformed by the three-dimensional electron microscopy (3DEM) “Resolution Revolution” made possible by convergence of robust cryo-preservation of vitrified biological materials, sample handling systems, and measurement stages operating at liquid nitrogen temperature, improvements in electron optics that preserve phase information at the atomic level, direct electron detectors (DEDs), high-speed computing with graphics processing units, and rapid advances in data acquisition and processing software. 3DEM structure information (atomic coordinates and related metadata) is archived in the open-access Protein Data Bank (PDB), which currently holds more than 11,000 3DEM structures of proteins and nucleic acids, and their complexes with one another and small-molecule ligands (~ 6% of the archive). Underlying experimental data (3DEM density maps and related metadata) are stored in the Electron Microscopy Data Bank (EMDB), which currently holds more than 21,000 3DEM density maps. After describing the history of the PDB and the Worldwide Protein Data Bank (wwPDB) partnership, which jointly manages both the PDB and EMDB archives, this review examines the origins of the resolution revolution and analyzes its impact on structural biology viewed through the lens of PDB holdings. Six areas of focus exemplifying the impact of 3DEM across the biosciences are discussed in detail (icosahedral viruses, ribosomes, integral membrane proteins, SARS-CoV-2 spike proteins, cryogenic electron tomography, and integrative structure determination combining 3DEM with complementary biophysical measurement techniques), followed by a review of 3DEM structure validation by the wwPDB that underscores the importance of community engagement.

18.
The rapidly increasing amount of information on three-dimensional (3D) structures of biological macro-molecules still has an insufficient impact on genome analysis, functional genomics and proteomics as well as on many other fields in biomedicine including disease-related research. There are, however, attempts to make structural data more easily accessible to the bench biologist. As members of the world-wide Protein Data Bank (wwPDB), the RCSB Protein Data Bank (PDB), the Protein Data Bank Japan and the Macromolecular Structure Database are the primary information resources for 3D structures of proteins, nucleic acids, carbohydrates and complexes thereof. In addition, a number of secondary resources have been set up that also provide information on all currently known structures in a relatively comprehensive manner and not focusing on specific features only. They include PDBsum, the OCA browser-database for protein structure/function, the Molecular Modeling Database and the Jena Library of Biological Macromolecules--JenaLib. Both the primary and secondary resources often merge the information in the PDB files with data from other resources and offer additional analysis tools thereby adding value to the original PDB data. Here, we briefly describe these resources from a user's point of view and from a comparative perspective. It is our aim to guide researchers outside the structural biology field in getting the most out of the 3D structure resources.

19.
Identification of protein biochemical functions based on their three-dimensional structures is now required in the post-genome-sequencing era. Ligand binding is one of the major biochemical functions of proteins, and thus the identification of ligands and their binding sites is the starting point for the function identification. Previously we reported our first trial on structure-based function prediction, based on the similarity searches of molecular surfaces against the functional site database. Here we describe the extension of our first trial by expanding the search database to whole heteroatom binding sites appearing within the Protein Data Bank (PDB) with the new analysis protocol. In addition, we have determined the similarity threshold line, by using 10 structure pairs with solved free and complex structures. Finally, we extensively applied our method to newly determined hypothetical proteins, including some without annotations, and evaluated the performance of our methods.

20.
Nucleoside triphosphate (NTP) ligands are of high biological importance and are essential for all life forms. A pre‐requisite for them to participate in diverse biochemical processes is their recognition by diverse proteins. It is thus of great interest to understand the basis for such recognition in different proteins. Towards this, we have used a structural bioinformatics approach and analyzed structures of 4677 NTP complexes available in the Protein Data Bank (PDB). Binding sites were extracted and compared exhaustively using PocketMatch, a sensitive in‐house site comparison algorithm, which resulted in grouping the entire dataset into 27 site‐types. Each of these site‐types represents a structural motif comprising two or more residue conservations, derived using another in‐house tool for superposing binding sites, PocketAlign. The 27 site‐types could be grouped further into 9 super‐types by considering partial similarities in the sites, which indicated that the individual site‐types comprise different combinations of one or more site features. A scan across the PDB using the 27 structural motifs determined the motifs to be specific to NTP binding sites, and a computational alanine mutagenesis indicated that residues identified to be highly conserved in the motifs also contribute most to binding. Alternate orientations of the ligand in several site‐types were observed and rationalized, indicating the possibility of some residues serving as anchors for NTP recognition. The presence of multiple site‐types and the grouping of multiple folds into each site‐type is strongly suggestive of convergent evolution. Knowledge of determinants obtained from this study will be useful for detecting function in unknown proteins. Proteins 2017; 85:1699–1712. © 2017 Wiley Periodicals, Inc.
