Similar articles
 20 similar articles found (search time: 15 ms)
1.
Using a data set of 454 crystal structures of peptides and 80 crystal structures of non-homologous proteins solved at ultra-high resolution (1.2 Å or better), we have analyzed the occurrence of disallowed Ramachandran (phi, psi) angles. Of the 1492 non-glycyl residues in the peptides and 13508 in the proteins, 12 and 76 residues, respectively, adopt clearly disallowed combinations of Ramachandran angles. These examples include a number of conformational points that are far from any of the allowed regions of the Ramachandran map. According to the Ramachandran map, a given (phi, psi) combination is considered disallowed when two non-bonded atoms in a system of two linked peptide units with ideal geometry are prohibitively close in space. However, analysis of the disallowed conformations in peptide and protein structures reveals that none of the observed disallowed conformations in the crystal structures corresponds to a short contact between non-bonded atoms. Further analysis of the deviations of bond lengths and angles from ideal peptide geometry at the residue positions with disallowed conformations suggests that individual bond lengths and angles are all within acceptable limits. Thus, it appears that the rare tolerance of disallowed conformations is made possible by gentle, acceptable deviations of a number of bond lengths and angles from ideal geometry over a series of bonds, with the net effect of acceptable non-bonded inter-atomic distances.
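The (phi, psi) angles analyzed in this abstract are backbone dihedral angles. As a minimal sketch (not the authors' code), assuming each atom is given as a plain (x, y, z) tuple in Å, they could be computed as follows. The helper names `dihedral`, `phi` and `psi` are illustrative only.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) about the p1-p2 bond."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)
    # Project b0 and b2 onto the plane perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


def phi(c_prev, n, ca, c):
    """phi of residue i: C(i-1), N(i), CA(i), C(i)."""
    return dihedral(c_prev, n, ca, c)


def psi(n, ca, c, n_next):
    """psi of residue i: N(i), CA(i), C(i), N(i+1)."""
    return dihedral(n, ca, c, n_next)
```

Classifying a residue as allowed or disallowed would additionally require a definition of the allowed regions of the map, which the abstract does not specify.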

2.
Knowledge of the 3D structure of glycans is a prerequisite for a complete understanding of the biological processes in which glycoproteins are involved. However, owing to the lack of a standardised nomenclature, carbohydrate compounds are difficult to locate within the Protein Data Bank (PDB). Using an algorithm that detects carbohydrate structures from element types and atom coordinates alone, we detected 1663 entries containing a total of 5647 carbohydrate chains. The majority of chains are N-glycosidically bound. Noncovalently bound ligands are also frequent, while O-glycans form a minority. About 30% of all carbohydrate-containing PDB entries contain one or more errors. The automatic assignment of carbohydrate structures in PDB entries will improve the cross-linking of glycobiology resources with genomic and proteomic data collections, which will be an important issue for upcoming glycomics projects. By aiding the detection of erroneous annotations and structures, the algorithm might also help to increase database quality.

3.
Analysis of the basic geometry of amino acid residues in protein structures has demonstrated the invariability of all bond lengths and bond angles except for tau, the backbone N-Calpha-C' angle. This angle can widen or contract significantly from tetrahedral geometry to accommodate various other strains in the structure. To determine the cause of this deviation accurately, a survey was made of the tau angles in peptide structures and ultrahigh-resolution protein structures. The average deviation of the N-Calpha-C' angle from tetrahedral geometry was calculated for each amino acid in all categories and then correlated with forty-eight physicochemical, energetic and conformational properties of amino acids. Linear and multiple regression analyses were carried out between the amino acid deviations and the 48 properties. This study confirms the deviation of tau angles in both peptide and protein structures, but the two are not influenced by the same forces. The peptide structures are influenced by physical properties whereas, as expected, the protein structures are influenced by conformational properties. No single property dominates the deviation; rather, a combination of different factors contributes to the tau angle deviation.
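The tau deviation described here is the difference between the measured N-Calpha-C' angle and the ideal tetrahedral angle, arccos(-1/3), about 109.47 degrees. A minimal sketch, assuming atom positions are (x, y, z) tuples in Å; the names `bond_angle` and `tau_deviation` are illustrative and not taken from the study.

```python
import math

# Ideal tetrahedral angle, arccos(-1/3), about 109.47 degrees.
TETRAHEDRAL_DEG = math.degrees(math.acos(-1.0 / 3.0))


def bond_angle(a, b, c):
    """Angle a-b-c in degrees, with b at the vertex."""
    ba = [p - q for p, q in zip(a, b)]
    bc = [p - q for p, q in zip(c, b)]
    dot = sum(p * q for p, q in zip(ba, bc))
    norm = math.sqrt(sum(p * p for p in ba)) * math.sqrt(sum(p * p for p in bc))
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_angle = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_angle))


def tau_deviation(n, ca, c):
    """Signed deviation of the N-CA-C' angle from tetrahedral geometry."""
    return bond_angle(n, ca, c) - TETRAHEDRAL_DEG
```

A positive value indicates a widened tau angle and a negative value a contracted one.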

4.
The Protein Data Bank (PDB) is the global archive for structural information on macromolecules, and a popular resource for researchers, teachers, and students, amassing more than one million unique users each year. Crystallographic structure models in the PDB (more than 100,000 entries) are optimized against the crystal diffraction data and geometrical restraints. This process of crystallographic refinement has typically ignored hydrogen bond (H-bond) distances as a source of information. However, H-bond restraints can improve structures at low resolution, where diffraction data are limited. To improve low-resolution structure refinement, we present methods for deriving H-bond information either globally from well-refined high-resolution structures from the PDB-REDO databank, or specifically from sets of homologous high-resolution structures constructed on the fly. Refinement incorporating HOmology DErived Restraints (HODER) improves geometrical quality and the fit to the diffraction data for many low-resolution structures. To make these improvements readily available to the general public, we applied our new algorithms to all crystallographic structures in the PDB: using massively parallel computing, we constructed a new instance of the PDB-REDO databank (https://pdb-redo.eu). This resource is useful for researchers to gain insight into individual structures, into specific protein families (as we demonstrate with examples), and into general features of protein structure using data mining approaches on a uniformly treated dataset.

5.
6.
Refined crystal structure of carboxypeptidase A at 1.54 Å resolution   (total citations: 19; self-citations: 0; citations by others: 19)
The crystal structure of bovine carboxypeptidase A (Cox) has been refined at 1.54 Å resolution using the restrained least-squares algorithm of Hendrickson & Konnert (1981). The crystallographic R factor (formula; see text) for structure factors calculated from the final model is 0.190. Bond lengths and bond angles in the carboxypeptidase A model have root-mean-square deviations from ideal values of 0.025 Å and 3.6 degrees, respectively. Four examples of a reverse-turn-like structure (the "Asx" turn) requiring an aspartic acid or asparagine residue are observed in this structure. The Asx turn has the same number of atoms as a reverse turn but only one peptide bond, and the hydrogen bond that closes the turn is between the Asx side-chain CO group and a main-chain NH group. The distributions of CO-N and NH-O hydrogen bond angles in the alpha-helices and beta-sheet structures of carboxypeptidase A are centered about 156 degrees. A total of 192 water molecules per molecule of enzyme are included in the final model. Unlike the hydrogen bonding geometry observed in the secondary structure of the enzyme, the CO-O(wat) hydrogen bond angle is distributed about 131 degrees, indicating the role of the lone-pair electrons of the carbonyl oxygen in the hydrogen bond interaction. Twenty-four solvent molecules are observed buried within the protein. Several of these waters are organized into hydrogen-bonded chains containing up to five waters. The average temperature factor for atoms in carboxypeptidase A is 8 Å², and it varies from 5 Å² in the center of the protein to over 30 Å² at the surface.

7.
Many proteins function as homo-oligomers and are regulated via their oligomeric state. For some proteins, the stoichiometry of homo-oligomeric states under various conditions has been studied using gel filtration or analytical ultracentrifugation experiments. The interfaces involved in these assemblies may be identified using cross-linking and mass spectrometry, solution-state NMR, and other experiments. However, for most proteins, the actual interfaces that are involved in oligomerization are inferred from X-ray crystallographic structures using assumptions about interface surface areas and physical properties. Examination of interfaces across different Protein Data Bank (PDB) entries in a protein family reveals several important features. First, similarities in space group, asymmetric unit size, and cell dimensions and angles (within 1%) do not guarantee that two crystals are actually the same crystal form, containing similar relative orientations and interactions within the crystal. Conversely, two crystals in different space groups may be quite similar in terms of all the interfaces within each crystal. Second, NMR structures and an existing benchmark of PDB crystallographic entries consisting of 126 dimers as well as larger structures and 132 monomers were used to determine whether the existence or lack of common interfaces across multiple crystal forms can be used to predict whether a protein is an oligomer or not. Monomeric proteins tend to have common interfaces across only a minority of crystal forms, whereas higher-order structures exhibit common interfaces across a majority of available crystal forms. The data can be used to estimate the probability that an interface is biological if two or more crystal forms are available. Finally, the Protein Interfaces, Surfaces, and Assemblies (PISA) database available from the European Bioinformatics Institute is more consistent in identifying interfaces observed in many crystal forms compared with the PDB and the European Bioinformatics Institute's Protein Quaternary Server (PQS). The PDB, in particular, is missing highly likely biological interfaces in its biological unit files for about 10% of PDB entries.

8.
9.
An analysis of disulphide-bond-containing proteins in the Protein Data Bank (PDB) revealed that, of the 27,209 protein structures analyzed, 12,832 contain at least one intra-chain disulphide bond and 811 contain at least one inter-chain disulphide bond. The proteins containing intra-chain disulphide bonds can be grouped into 256 categories based on the number of disulphide bonds and the disulphide bond connectivity patterns (DBCPs), which were generated according to the positions of half-cystine residues along the protein chain. The PDB entries corresponding to these 256 categories represent 509 unique SCOP superfamilies. A simple web-based computational tool is made freely available at http://www.ccmb.res.in/bioinfo/dsbcp that allows flexible queries to be made on the database in order to retrieve useful information on the disulphide-bond-containing proteins in the PDB. The database is useful for identifying the different SCOP superfamilies associated with a particular disulphide bond connectivity pattern, or vice versa. A query can be defined on a single field or on a combination of the following fields: PDB code, protein name, SCOP superfamily name, number of disulphide bonds, disulphide bond connectivity pattern, and number of amino acid residues in a protein chain. The query retrieves the information that matches the criteria. The database may therefore be useful for selecting suitable protein structural templates to model more distantly related protein homologs/analogs using comparative modeling methods.

10.
The process of deducing the catalytic mechanism of an enzyme from its structure is highly complex and requires extensive experimental work to validate a proposed mechanism. As one step towards improving the reliability of this process, we have gathered statistics describing the typical geometry of catalytic residues with regard to the substrate and one another. In order to analyse residue-substrate interactions, we have assembled a dataset of structures of enzymes of known mechanism bound to substrate, product, or a substrate analogue. Despite the challenges presented in obtaining such experimental data, we were able to include 42 enzyme structures. We have also assembled a separate dataset of catalytic residues which act upon other catalytic residues, using a set of 60 enzyme structures. For both datasets, we have extracted the distances between residues with a given catalytic function and their target moieties. The geometry of residues whose function involves the transfer or sharing of hydrogens (either with substrate or another residue) was analysed more closely. The results showed that the geometry for such productive interactions (prior to the transition state) closely resembles that seen in non-catalytic hydrogen bonds, with distances and angles in the normal expected range. Such statistics provide limits on "expected geometries" for catalytic residues, which will help to identify these residues and elucidate enzyme mechanisms.

11.
Of the roughly 20,000 canonical human protein sequences, as of January 20, 2021, 7,077 proteins have had their full or partial, medium- to high-resolution structures determined by X-ray crystallography or other methods. Which of these proteins dominate the Protein Data Bank (PDB), and why? In this paper, we list the 273 top human protein structures based on the number of their PDB entries. This set of proteins accounts for more than 40% of all available human PDB entries and represents past trends as well as the current status of protein structural biology. We briefly discuss the relationship that some of the prominent protein structures have with protein research as a whole and mention their relevance to human diseases. The top-10 soluble and membrane proteins are all well known (most of their first structures were deposited more than 30 years ago). Overall, there is no dramatic change in recent trends in the PDB. Remarkably, the number of structure depositions has grown nearly exponentially over the last 10 or more years (with a doubling time of 7 years for proteins from any organism). Growth in human protein structures is slightly faster (a doubling time of 5.9 years). The information in this paper may be informative to senior scientists and may also inspire researchers who are new to protein science, providing a year-2021 snapshot of the state of protein structural biology.
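The doubling times quoted in this abstract can be converted into annual growth rates under the assumption of purely exponential growth, N(t) = N0 * 2^(t/T). The sketch below is only that arithmetic; the function names are illustrative.

```python
def annual_growth_rate(doubling_time_years):
    """Fractional growth per year implied by a doubling time."""
    return 2.0 ** (1.0 / doubling_time_years) - 1.0


def growth_factor(years, doubling_time_years):
    """Multiplicative increase after `years` of exponential growth."""
    return 2.0 ** (years / doubling_time_years)


if __name__ == "__main__":
    for label, t_double in (("all proteins", 7.0), ("human proteins", 5.9)):
        rate = annual_growth_rate(t_double)
        print(f"{label}: {rate:.1%} per year")
```

Under this assumption, a doubling time of 7 years corresponds to about 10.4% growth per year, and 5.9 years to about 12.5% per year.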

12.
Receptor activity modifying proteins (RAMPs) are a family of single-pass transmembrane proteins that dimerize with G-protein-coupled receptors. They may alter the ligand recognition properties of the receptors (particularly for the calcitonin receptor-like receptor, CLR). Very little structural information is available about RAMPs. Here, an ab initio model has been generated for the extracellular domain of RAMP1. The disulfide bond arrangement (Cys27-Cys82, Cys40-Cys72, and Cys57-Cys104) was determined by site-directed mutagenesis. The secondary structure (alpha-helices from residues 29-51, 60-80, and 87-100) was established from a consensus of predictive routines. Using these constraints, an assemblage of 25,000 structures was constructed, and these were ranked using an all-atom statistical potential. The best 1000 conformations were energy minimized. The lowest scoring model was refined by molecular dynamics simulation. To validate our strategy, the same methods were applied to three proteins of known structure: PDB:1HP8, PDB:1V54 chain H (residues 21-85), and PDB:1T0P. When compared to the crystal structures, the models had root mean-square deviations of 3.8 Å, 4.1 Å, and 4.0 Å, respectively. The model of RAMP1 suggested that Phe93, Tyr100, and Phe101 form a binding interface for CLR, whereas Trp74 and Phe92 may interact with ligands that bind to the CLR/RAMP1 heterodimer.
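The root mean-square deviation used to validate the models can be illustrated with a minimal sketch. It assumes the two structures are already superposed and contain the same atoms in the same order; the abstract does not state which atoms were compared or how the superposition was done, so the function below is a generic illustration rather than the authors' procedure.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD (same units as the input) between two paired coordinate lists.

    Assumes the structures are already superposed and that atoms are
    listed in the same order in both inputs.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(
        (xa - xb) ** 2
        for atom_a, atom_b in zip(coords_a, coords_b)
        for xa, xb in zip(atom_a, atom_b)
    )
    return math.sqrt(squared / len(coords_a))
```

In practice an optimal superposition (for example with the Kabsch algorithm) would be applied before calling such a function.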

13.
The program HBAT is a tool to automate the analysis of potential hydrogen bonds and similar types of weak interactions, such as halogen bonds and non-canonical interactions, in macromolecular structures available in Brookhaven Protein Data Bank (PDB) file format. HBAT is written in Perl and Tk. The program generates a Microsoft Excel-compatible output file for statistical analysis. HBAT identifies potential interactions based on geometrical criteria. A series of analysis reports is generated, including frequency tables, geometry distribution tables, and lists of furcations. A user-friendly GUI offers freedom to select several parameters and options. Graphviz-based visualization of hydrogen bond networks in 2D helps in studying cooperativity and anticooperativity geometry in hydrogen bonds. HBAT supports post-docking interaction analysis between any target/receptor (in PDB files) and docked ligands/poses (in SDF). This tool can be applied to active site interaction analysis, structure-based drug design, and molecular dynamics simulations.
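Identifying potential hydrogen bonds "based on geometrical criteria" typically means testing a distance and an angle against cutoffs. The sketch below illustrates the idea only; the function name and the default cutoffs (H...A distance of at most 2.5 Å, D-H...A angle of at least 120 degrees) are assumptions chosen for illustration and are not HBAT's actual parameters.

```python
import math


def angle_deg(a, b, c):
    """Angle a-b-c in degrees, with b at the vertex."""
    ba = [p - q for p, q in zip(a, b)]
    bc = [p - q for p, q in zip(c, b)]
    dot = sum(p * q for p, q in zip(ba, bc))
    cos_angle = dot / (math.hypot(*ba) * math.hypot(*bc))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_angle))))


def is_potential_hbond(donor, hydrogen, acceptor,
                       max_h_acceptor=2.5, min_dha_angle=120.0):
    """Apply simple distance and angle cutoffs (illustrative values)."""
    close_enough = math.dist(hydrogen, acceptor) <= max_h_acceptor
    linear_enough = angle_deg(donor, hydrogen, acceptor) >= min_dha_angle
    return close_enough and linear_enough
```

Keeping the cutoffs as keyword arguments mirrors the abstract's point that users can select the parameters themselves.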

14.
15.
The ability to determine the structure of a protein in solution is a critical tool for structural biology, as proteins in their native state are found in aqueous environments. Using a physical-chemistry-based prediction protocol, we demonstrate the ability to reproduce protein loop geometries in experimentally derived solution structures. Predictions were run on loops drawn from (1) NMR entries in the Protein Data Bank (PDB), and (2) the RECOORD database, in which NMR entries from the PDB have been standardized and re-refined in explicit solvent. The predicted structures are validated by comparison with experimental distance restraints, a test of structural quality as defined by the WHAT IF structure validation program, root mean square deviation (RMSD) of the predicted loops from the original structural models, and comparison of the precision of the original and predicted ensembles. Results show that for the RECOORD ensembles, the predicted loops are consistent with an average of 95%, 91%, and 87% of experimental restraints for the short, medium and long loops, respectively. Prediction accuracy is strongly affected by the quality of the original models: in the PDB-derived ensembles, the percentage of experimental restraints violated increases by 2% for the short loops and by 9% for both the medium and long loops. We anticipate the application of our protocol to theoretical modeling of protein structures, such as fold recognition methods, as well as to experimental determination of protein structures, or segments, for which only sparse NMR restraint data are available.

16.
Mapping PDB chains to UniProtKB entries   (total citations: 2; self-citations: 0; citations by others: 2)
MOTIVATION: UniProtKB/SwissProt is the main resource for detailed annotations of protein sequences. This database provides a jumping-off point to many other resources through the links it provides. Among others, these include other primary databases, secondary databases, the Gene Ontology and OMIM. While a large number of links are provided to Protein Data Bank (PDB) files, obtaining a regularly updated mapping between UniProtKB entries and PDB entries at the chain or residue level is not straightforward. In particular, there is no regularly updated resource which allows a UniProtKB/SwissProt entry to be identified for a given residue of a PDB file. RESULTS: We have created a fully automatically maintained database which maps PDB residues to residues in UniProtKB/SwissProt and UniProtKB/TrEMBL entries. The protocol uses links from PDB to UniProtKB, links from UniProtKB to PDB, and a brute-force sequence scan to resolve PDB chains for which no annotated link is available. Finally, the sequences from PDB and UniProtKB are aligned to obtain a residue-level mapping. AVAILABILITY: The resource may be queried interactively or downloaded from http://www.bioinf.org.uk/pdbsws/.

17.
Normal mode analysis (NMA) can facilitate quick and systematic investigation of protein dynamics using data from the Protein Data Bank (PDB). We developed an elastic network model-based NMA program using dihedral angles as independent variables. Compared to the NMA programs that use Cartesian coordinates as independent variables, key attributes of the proposed program are as follows: (1) chain connectivity related to the folding pattern of a polypeptide chain is naturally embedded in the model; (2) the full-atom system is acceptable, and owing to a considerably smaller number of independent variables, the PDB data can be used without further manipulation; (3) the number of variables can be easily reduced by some of the rotatable dihedral angles; (4) the PDB data for any molecule besides proteins can be considered without coarse-graining; and (5) individual motions of constituent subunits and ligand molecules can be easily decomposed into external and internal motions to examine their mutual and intrinsic motions. Its performance is illustrated with an example of a DNA-binding allosteric protein, a catabolite activator protein. In particular, the focus is on the conformational change upon cAMP and DNA binding, and on the communication between their binding sites remotely located from each other. In this illustration, NMA creates a vivid picture of the protein dynamics at various levels of the structures, i.e., atoms, residues, secondary structures, domains, subunits, and the complete system, including DNA and cAMP. Comparative studies of the specific protein in different states, e.g., apo- and holo-conformations, and free and complexed configurations, provide useful information for studying structurally and functionally important aspects of the protein.

18.
Refined structure of spinach glycolate oxidase at 2 Å resolution   (total citations: 11; self-citations: 0; citations by others: 11)
The amino acid sequence of glycolate oxidase from spinach has been fitted to an electron density map of 2.0 Å nominal resolution, and the structure has been refined using the restrained parameter least-squares refinement of Hendrickson and Konnert. A final crystallographic R-factor of 18.9% was obtained for 32,888 independent reflections from 5.5 to 2 Å resolution. The geometry of the model, consisting of 350 amino acid residues, the cofactor flavin mononucleotide and 298 solvent molecules, is close to ideal, with root-mean-square deviations of 0.015 Å in bond lengths and 2.6 degrees in bond angles. The expected trimodal distribution with preference for staggered conformation is obtained for the side-chain chi 1-angles. The core of the subunit is built up from the eight beta-strands in the beta/alpha-barrel. This core consists of two hydrophobic layers. One, in the center, is made up of residues pointing in from the beta-strands towards the barrel axis; the second consists of two segments of residues, one pointing out from the beta-strands towards the eight alpha-helices of the barrel and the other pointing from the helices towards the strands. The hydrogen bond pattern for the beta-strands in the beta/alpha-barrel is described. There are a number of residues with 3(10)-helix conformation; in particular, there is one left-handed helix. The ordered solvent molecules are organized mainly in clusters. The average isotropic temperature factor is quite high, 27.1 Å², perhaps a reflection of the high solvent content in the crystal. The octameric glycolate oxidase molecule, which has 422 symmetry, makes strong interactions around the 4-fold axis, forming a tight tetramer, but only weak interactions between the two tetramers forming the octamer.

19.
Biomolecular NMR chemical shift data are key information for the functional analysis of biomolecules and for the development of new techniques for NMR studies that utilize chemical shift statistical information. Structural genomics projects are major contributors to the accumulation of protein chemical shift information. Managing the large quantities of NMR data generated by each project in a local database and transferring the data to the public databases are still formidable tasks because of the complicated nature of NMR data. Here we report an automated and efficient system developed for the deposition and annotation of a large number of data sets, including the ¹H, ¹³C and ¹⁵N resonance assignments used for the structure determination of proteins. We have demonstrated the feasibility of our system by applying it to the transfer of over 600 entries from the internal database generated by the RIKEN Structural Genomics/Proteomics Initiative (RSGI) to the public database, BioMagResBank (BMRB). We have assessed the quality of the deposited chemical shifts by comparing them with those predicted from the PDB coordinate entry for the corresponding protein. The same comparison has been carried out for other matched BMRB/PDB entries deposited from 2001 to 2011, and the results suggest that the RSGI entries greatly improved the quality of the BMRB database. Since the entries include chemical shifts acquired under strikingly similar experimental conditions, these NMR data can be expected to be a promising resource for improving current technologies as well as for developing new NMR methods for protein studies.

20.
Geometric (HOMA) and magnetic (NICS) indices of aromaticity were estimated for aromatic rings of amino acids and nucleobases. Cartesian coordinates were taken directly either from PDB files deposited in public databases at the finest resolution available (≤1.5 Å), or from structures resulting from full gradient geometry optimization in a hybrid QM/MM approach. Significant environmental effects imposing alterations of HOMA values were noted for all aromatic rings analysed. Furthermore, even extra fine resolution (≤1.0 Å) is not sufficient for direct estimation of HOMA values based on Cartesian coordinates provided by PDB files. The values of mean bond errors seem to be much higher than the 0.05 Å often reported for PDB files. The use of quantum chemistry geometry optimization is strongly advised; even a simple QM/MM model comprising only the aromatic substructure within the QM region, with the rest of the biomolecule treated classically within the MM framework, proved to be a promising means of describing aromaticity inside native environments. According to the results presented, three consequences of the interaction with the environment can be observed that induce changes in structural and magnetic indices of aromaticity. First, broad ranges of HOMA or NICS values are usually obtained for different conformations of the nearest neighborhood. Next, these values and their means can differ significantly from those characterising isolated monomers. The most significant increase in aromaticity is expected for the six-membered rings of guanine, thymine and cytosine. The same trend was also noticed for all amino acids inside proteins, but this effect was much smaller, reaching the highest value for the five-membered ring of tryptophan. Explicit water solutions impose similar changes on HOMA and NICS distributions. Thus, environmental effects of protein, DNA and even explicit water molecules are non-negligible sources of aromaticity changes appearing in the rings of nucleobases and aromatic amino acid residues.
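HOMA (harmonic oscillator model of aromaticity) is commonly defined as 1 - (alpha/n) * sum((R_opt - R_i)^2) over the n bonds of a ring, which explains why small coordinate errors in bond lengths translate into large changes in the index. The sketch below assumes a ring made only of C-C bonds and uses commonly cited literature constants (R_opt = 1.388 Å, alpha = 257.7 Å⁻²); these values are an assumption here, not taken from the abstract, and rings containing heteroatoms need bond-type-specific parameters.

```python
# Commonly cited HOMA constants for C-C bonds (assumed, not from the abstract).
R_OPT_CC = 1.388    # optimal aromatic C-C bond length, in Å
ALPHA_CC = 257.7    # normalisation constant, in Å^-2


def homa(bond_lengths, r_opt=R_OPT_CC, alpha=ALPHA_CC):
    """HOMA index for a ring, given its bond lengths in Å."""
    if not bond_lengths:
        raise ValueError("at least one bond length is required")
    n = len(bond_lengths)
    return 1.0 - (alpha / n) * sum((r_opt - r) ** 2 for r in bond_lengths)


if __name__ == "__main__":
    ideal_ring = [1.388] * 6
    # The same ring with every bond perturbed by 0.05 Å.
    perturbed_ring = [1.438, 1.338] * 3
    print(homa(ideal_ring), homa(perturbed_ring))
```

With these constants, an error of 0.05 Å in every bond already lowers HOMA by about 0.64, consistent with the abstract's warning against computing HOMA directly from PDB coordinates.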
