Similar articles
Found 20 similar articles; search took 31 ms.
1.
The Protein Data Bank: unifying the archive   Total citations: 9, self-citations: 3, citations by others: 6
The Protein Data Bank (PDB; http://www.pdb.org/) is the single worldwide archive of structural data of biological macromolecules. This paper describes the progress that has been made in validating all data in the PDB archive and in releasing a uniform archive for the community. We have now produced a collection of mmCIF data files for the PDB archive (ftp://beta.rcsb.org/pub/pdb/uniformity/data/mmCIF/). A utility application that converts the mmCIF data files to the PDB format (called CIFTr) has also been released to provide support for existing software.

2.
PDBML: the representation of archival macromolecular structure data in XML   Total citations: 2, self-citations: 1, citations by others: 1
Summary: The Protein Data Bank (PDB) has recently released versions of the PDB Exchange dictionary and the PDB archival data files in XML format collectively named PDBML. The automated generation of these XML files is driven by the data dictionary infrastructure in use at the PDB. The correspondences between the PDB dictionary and the XML schema metadata are described as well as the XML representations of PDB dictionaries and data files. Availability: The current software-translated XML schema file is located at http://deposit.pdb.org/pdbML/pdbx-v1.000.xsd, and on the PDB mmCIF resource page at http://deposit.pdb.org/mmcif/. PDBML files are stored on the PDB beta ftp site at ftp://beta.rcsb.org/pub/pdb/uniformity/data/XML. Contact: jwest{at}rcsb.rutgers.edu

3.
Paget disease of bone (PDB) is a common disorder characterized by focal and disorganized increases of bone turnover. Genetic factors are important in the pathogenesis of PDB. We and others recently mapped the third locus associated with the disorder, PDB3, at 5q35-qter. In the present study, by use of 24 French Canadian families and 112 unrelated subjects with PDB, the PDB3 locus was confined to approximately 300 kb. Within this interval, two disease-related haplotype signatures were observed in 11 families and 18 unrelated patients. This region encoded the ubiquitin-binding protein sequestosome 1 (SQSTM1/p62), which is a candidate gene for PDB because of its association with the NF-kappaB pathway. Screening SQSTM1/p62 for mutations led to the identification of a recurrent nonconservative change (P392L) flanking the ubiquitin-associated domain (UBA) (position 394-440) of the protein that was not present in 291 control individuals. Our data demonstrate that two independent mutational events at the same position in SQSTM1/p62 caused PDB in a high proportion of French Canadian patients.

4.
Mutations of the p62/Sequestosome 1 gene (p62/SQSTM1) account for both sporadic and familial forms of Paget's disease of bone (PDB). We originally described a methionine-to-valine substitution at codon 404 (M404V) of exon 8, in the ubiquitin protein-binding domain of the p62/SQSTM1 gene in an Italian PDB patient. The collection of data from the patient's pedigree provided evidence for a familial form of PDB. Extension of the genetic analysis to other relatives in this family demonstrated segregation of the M404V mutation with the polyostotic PDB phenotype and provided the identification of six asymptomatic gene carriers. DNA for mutational analysis of the exon 8 coding sequence was obtained from 22 subjects, 4 PDB patients and 18 clinically unaffected members. Of the five clinically ascertained affected members of the family, four possessed the M404V mutation and exhibited the polyostotic form of PDB, except one patient with a single X-ray-assessed skeletal localization and one with a polyostotic disease who had died several years before the DNA analysis. By both reconstitution and mutational analysis of the pedigree, six unaffected subjects were shown to bear the M404V mutation, representing potential asymptomatic gene carriers whose circulating levels of alkaline phosphatase were recently assessed as still within the normal range. Taken together, these results support a genotype-phenotype correlation between the M404V mutation in the p62/SQSTM1 gene and a polyostotic form of PDB in this family. The high penetrance of the PDB trait in this family together with the study of the asymptomatic gene carriers will allow us to confirm the proposed genotype-phenotype correlation and to evaluate the potential use of mutational analysis of the p62/SQSTM1 gene in the early detection of relatives at risk for PDB.

5.
6.
The Protein Data Bank (PDB; http://www.rcsb.org/pdb/) is the single worldwide archive of structural data of biological macromolecules. This paper describes the data uniformity project that is underway to address the inconsistency in PDB data.

7.
The Protein Data Bank (PDB) is the single most important repository of structural data for proteins and other biologically relevant molecules. Therefore, it is critically important to keep the PDB data, as much as possible, error-free. In this study, we have analyzed PDB crystal structures possessing oligonucleotide/oligosaccharide binding (OB)-fold, one of the highly populated folds, for the presence of sequence-structure mapping errors. Using energy-based structure quality assessment coupled with sequence analyses, we have found that there are at least five OB-structures in the PDB that have regions where sequences have been incorrectly mapped onto the structure. We have demonstrated that the combination of these computation techniques is effective not only in detecting sequence-structure mapping errors, but also in providing guidance to correct them. Namely, we have used results of computational analysis to direct a revision of X-ray data for one of the PDB entries containing a fairly inconspicuous sequence-structure mapping error. The revised structure has been deposited with the PDB. We suggest use of computational energy assessment and sequence analysis techniques to facilitate structure determination when homologs having known structure are available to use as a reference. Such computational analysis may be useful in either guiding the sequence-structure assignment process or verifying the sequence mapping within poorly defined regions.

8.
Analyses of publicly available structural data reveal interesting insights into the impact of the three‐dimensional (3D) structures of protein targets important for discovery of new drugs (e.g., G‐protein‐coupled receptors, voltage‐gated ion channels, ligand‐gated ion channels, transporters, and E3 ubiquitin ligases). The Protein Data Bank (PDB) archive currently holds > 155,000 atomic‐level 3D structures of biomolecules experimentally determined using crystallography, nuclear magnetic resonance spectroscopy, and electron microscopy. The PDB was established in 1971 as the first open‐access, digital‐data resource in biology, and is now managed by the Worldwide PDB partnership (wwPDB; wwPDB.org). US PDB operations are the responsibility of the Research Collaboratory for Structural Bioinformatics PDB (RCSB PDB). The RCSB PDB serves millions of RCSB.org users worldwide by delivering PDB data integrated with ~40 external biodata resources, providing rich structural views of fundamental biology, biomedicine, and energy sciences. Recently published work showed that the PDB archival holdings facilitated discovery of ~90% of the 210 new drugs approved by the US Food and Drug Administration 2010–2016. We review user‐driven development of RCSB PDB services, examine growth of the PDB archive in terms of size and complexity, and present examples and opportunities for structure‐guided drug discovery for challenging targets (e.g., integral membrane proteins).

9.
Based on the experimental data and homologous sites in the Protein Data Bank (PDB), a model for metal binding sites in the D1/D2 heterodimer has been proposed. On searching for tetranuclear and binuclear Mn binding sites in the PDB, a suitable sequence homology in thermolysin and D1 could be observed. From the homology and site-directed mutagenesis data, a model for binuclear Mn-Ca or Mn-Mn has been built and it is extended to a tetranuclear Mn centre.

10.
The ability of the phorbol ester tumor promoter, PDB, to activate contraction and stimulate calcium influx was investigated in rabbit thoracic aorta. PDB caused a strong, slowly developing sustained contraction in physiological salt solution which was concentration-related (0.01 to 10.0 µM). PDB-induced contractions (0.1 µM) in calcium-free medium were attenuated but not prevented. PDB (1.0 µM) maximally stimulated Ca influx above basal (control, vehicle = 39.2 ± 2.2; PDB 1.0 µM = 70.7 ± 6.7 µmol Ca/kg tissue; N = 16, p < 0.01). These data suggest that PDB activates rabbit thoracic aorta by a combination of intracellular and extracellular calcium-dependent mechanisms.

11.
Structural data as collated in the Protein Data Bank (PDB) have been widely applied in the study and prediction of protein-protein interactions. However, since the basic PDB Entries contain only the contents of the asymmetric unit rather than the biological unit, some key interactions may be missed by analysing only the PDB Entry. A total of 69,054 SCOP (Structural Classification of Proteins) domains were examined systematically to identify the number of additional novel interacting domain pairs and interfaces found by considering the biological unit as stored in the PQS (Protein Quaternary Structure) database. The PQS data adds 25,965 interacting domain pairs to those seen in the PDB Entries to give a total of 61,783 redundant interacting domain pairs. Redundancy filtering at the level of the SCOP family shows PQS to increase the number of novel interacting domain-family pairs by 302 (13.3%) from 2277, but only 16/302 (1.4%) of the interacting domain pairs have the two domains in different SCOP families. This suggests the biological units add little to the elucidation of novel biological interaction networks. However, when the orientation of the domain pairs is considered, the PQS data increases the number of novel domain-domain interfaces observed by 1455 (34.5%) to give 5677 non-redundant domain-domain interfaces. In all, 162/1455 novel domain-domain interfaces are between domains from different families, an increase of 8.9% over the PDB Entries. Overall, the PQS biological units provide a rich source of novel domain-domain interfaces that are not seen in the studied PDB Entries, and so PQS domain-domain interaction data should be exploited wherever possible in the analysis and prediction of protein-protein interactions.

12.
13.
The Protein Data Bank (PDB) is the worldwide repository of 3D structures of proteins, nucleic acids and complex assemblies. The PDB’s large corpus of data (> 100,000 structures) and related citations provide a well-organized and extensive test set for developing and understanding data citation and access metrics. In this paper, we present a systematic investigation of how authors cite PDB as a data repository. We describe a novel metric based on information cascade constructed by exploring the citation network to measure influence between competing works and apply that to analyze different data citation practices to PDB. Based on this new metric, we found that the original publication of RCSB PDB in the year 2000 continues to attract most citations though many follow-up updates were published. None of these follow-up publications by members of the wwPDB organization can compete with the original publication in terms of citations and influence. Meanwhile, authors increasingly choose to use URLs of PDB in the text instead of citing PDB papers, leading to disruption of the growth of the literature citations. A comparison of data usage statistics and paper citations shows that PDB Web access is highly correlated with URL mentions in the text. The results reveal the trend of how authors cite a biomedical data repository and may provide useful insight of how to measure the impact of a data repository.

14.
Mapping PDB chains to UniProtKB entries   Total citations: 2, self-citations: 0, citations by others: 2
MOTIVATION: UniProtKB/SwissProt is the main resource for detailed annotations of protein sequences. This database provides a jumping-off point to many other resources through the links it provides. Among others, these include other primary databases, secondary databases, the Gene Ontology and OMIM. While a large number of links are provided to Protein Data Bank (PDB) files, obtaining a regularly updated mapping between UniProtKB entries and PDB entries at the chain or residue level is not straightforward. In particular, there is no regularly updated resource which allows a UniProtKB/SwissProt entry to be identified for a given residue of a PDB file. RESULTS: We have created a completely automatically maintained database which maps PDB residues to residues in UniProtKB/SwissProt and UniProtKB/trEMBL entries. The protocol uses links from PDB to UniProtKB, from UniProtKB to PDB and a brute-force sequence scan to resolve PDB chains for which no annotated link is available. Finally the sequences from PDB and UniProtKB are aligned to obtain a residue-level mapping. AVAILABILITY: The resource may be queried interactively or downloaded from http://www.bioinf.org.uk/pdbsws/.
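The last step described in the abstract above, turning a pairwise alignment into a residue-level mapping, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `residue_mapping`, the `-` gap character, the 1-based sequential numbering, and the toy sequences are all assumptions made for illustration, and the alignment itself is taken as given.

```python
def residue_mapping(aligned_pdb: str, aligned_uniprot: str,
                    pdb_start: int = 1, uniprot_start: int = 1) -> dict[int, int]:
    """Map PDB sequence positions to UniProtKB positions.

    Both inputs are rows of one pairwise alignment: equal length,
    with '-' marking a gap (an assumed convention).
    """
    if len(aligned_pdb) != len(aligned_uniprot):
        raise ValueError("aligned sequences must have equal length")

    mapping: dict[int, int] = {}
    pdb_pos, uni_pos = pdb_start, uniprot_start
    for p, u in zip(aligned_pdb, aligned_uniprot):
        # Only columns with a residue in both rows yield a mapping entry.
        if p != "-" and u != "-":
            mapping[pdb_pos] = uni_pos
        # Advance each counter only when its row holds a residue.
        if p != "-":
            pdb_pos += 1
        if u != "-":
            uni_pos += 1
    return mapping


# Toy alignment (made-up sequences): the PDB chain lacks the UniProtKB 'G'
# and has an extra C-terminal 'E'.
print(residue_mapping("AC-DE", "ACGD-"))  # {1: 1, 2: 2, 3: 4}
```

Real PDB files number residues non-sequentially and may carry insertion codes, so a production mapping would key on the residue identifiers in the file instead of a running counter.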

15.
The genomes of more than 100 species have been sequenced, and the biological functions of encoded proteins are now actively being researched. Protein function is based on interactions between proteins and other molecules. One approach to assuming protein function based on genomic sequence is to predict interactions between an encoded protein and other molecules. As a data source for such predictions, knowledge regarding known protein-small molecule interactions needs to be compiled. We have, therefore, surveyed interactions between proteins and other molecules in Protein Data Bank (PDB), the protein three-dimensional (3D) structure database. Among 20,685 entries in PDB (April, 2003), 4,189 types of small molecules were found to interact with proteins. Biologically relevant small molecules most often found in PDB were metal ions, such as calcium, zinc, and magnesium. Sugars and nucleotides were the next most common. These molecules are known to act as cofactors for enzymes and/or stabilizers of proteins. In each case of interactions between a protein and small molecule, we found preferred amino acid residues at the interaction sites. These preferences can be the basis for predicting protein function from genomic sequence and protein 3D structures. The data pertaining to these small molecules were collected in a database named Het-PDB Navi., which is freely available at http://daisy.nagahama-i-bio.ac.jp/golab/hetpdbnavi.html and linked to the official PDB home page.
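A survey like the one in the abstract above starts from the heterogen groups recorded in each entry. The Python sketch below shows one plausible first step, tallying distinct heterogen groups from the `HETATM` records of a PDB-format file. It is an illustration under stated assumptions and not the Het-PDB Navi. pipeline: the function name `count_het_groups` and the choice to skip water (`HOH`) are hypothetical, while the column positions follow the fixed-width PDB format (residue name in columns 18-20, chain in column 22, residue number in columns 23-26).

```python
from collections import Counter
from typing import Iterable


def count_het_groups(pdb_lines: Iterable[str],
                     skip: tuple[str, ...] = ("HOH",)) -> Counter:
    """Count distinct heterogen groups per residue name in PDB-format lines."""
    seen: set[tuple[str, str, str]] = set()
    counts: Counter = Counter()
    for line in pdb_lines:
        if not line.startswith("HETATM"):
            continue
        name = line[17:20].strip()       # columns 18-20: residue name
        chain = line[21:22]              # column 22: chain identifier
        res_seq = line[22:26].strip()    # columns 23-26: residue number
        key = (name, chain, res_seq)
        # Many atoms share one group; count each group once, and skip
        # names listed in `skip` (water, by assumption).
        if name in skip or key in seen:
            continue
        seen.add(key)
        counts[name] += 1
    return counts
```

Called on the lines of one entry, for example `count_het_groups(open(path))` with a hypothetical `path`, it returns a `Counter` keyed by heterogen name. Deciding which groups actually contact the protein would additionally require a distance calculation against the `ATOM` records, which this sketch leaves out.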

16.
Enlarged FAMSBASE is a relational database of comparative protein structure models for the whole genome of 41 species, presented in the GTOP database. The models are calculated by Full Automatic Modeling System (FAMS). Enlarged FAMSBASE provides a wide range of query keys, such as name of ORF (open reading frame), ORF keywords, Protein Data Bank (PDB) ID, PDB heterogen atoms and sequence similarity. Heterogen atoms in PDB include cofactors, ligands and other factors that interact with proteins, and are a good starting point for analyzing interactions between proteins and other molecules. The data may also work as a template for drug design. The present number of ORFs with protein 3D models in FAMSBASE is 183 805, and the database includes an average of three models for each ORF. FAMSBASE is available at http://famsbase.bio.nagoya-u.ac.jp/famsbase/.

17.
STING Millennium Suite (SMS) is a new web-based suite of programs and databases providing visualization and a complex analysis of molecular sequence and structure for the data deposited at the Protein Data Bank (PDB). SMS operates with a collection of both publicly available data (PDB, HSSP, Prosite) and its own data (contacts, interface contacts, surface accessibility). Biologists find SMS useful because it provides a variety of algorithms and validated data, wrapped up in a user-friendly web interface. Using SMS it is now possible to analyze sequence to structure relationships, the quality of the structure, nature and volume of atomic contacts of intra and inter chain type, relative conservation of amino acids at the specific sequence position based on multiple sequence alignment, indications of folding essential residue (FER) based on the relationship of the residue conservation to the intra-chain contacts and Calpha-Calpha and Cbeta-Cbeta distance geometry. Specific emphasis in SMS is given to interface forming residues (IFR), amino acids that define the interactive portion of the protein surfaces. SMS may simultaneously display and analyze previously superimposed structures. PDB updates trigger SMS updates in a synchronized fashion. SMS is freely accessible for public data at http://www.cbi.cnptia.embrapa.br, http://mirrors.rcsb.org/SMS and http://trantor.bioc.columbia.edu/SMS.

18.
MOTIVATION: The rapid increase in the number of structures in the Protein Databank (PDB) makes it difficult to find all structures in a given protein class. Automatically-maintained web-based summaries are one solution to this problem. RESULTS: Summary of Antibody Crystal Structures (SACS), a self-maintaining web-site containing summary information on antibody structures in the PDB, is described. Mirrored PDB data are processed automatically using a Make-based system to identify new antibody structures. The PDB header records and sequence data are then parsed to identify a number of features of the structure and the data are stored using eXtensible Markup Language (XML). eXtensible Stylesheet Language: Transformations (XSLT), a new style sheet language for XML, is used to generate Hypertext Markup Language (HTML) pages containing either a one-line summary of every structure or a more detailed page describing a single antibody.
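The parse-then-store step in the abstract above (PDB header records parsed, results kept as XML) can be sketched in a few lines of Python. This is a hedged illustration and not the SACS code: the function names `parse_header` and `to_xml`, the XML element names, and the entry ID `1XYZ` are invented for the example. The column positions are those of the fixed-width PDB `HEADER` record (classification in columns 11-50, deposition date in columns 51-59, ID code in columns 63-66).

```python
import xml.etree.ElementTree as ET


def parse_header(line: str) -> dict[str, str]:
    """Extract the fields of a fixed-width PDB HEADER record."""
    if not line.startswith("HEADER"):
        raise ValueError("not a HEADER record")
    return {
        "classification": line[10:50].strip(),  # columns 11-50
        "deposited": line[50:59].strip(),       # columns 51-59
        "id": line[62:66].strip(),              # columns 63-66
    }


def to_xml(record: dict[str, str]) -> str:
    """Serialise one parsed record as XML (element names are hypothetical)."""
    root = ET.Element("structure", id=record["id"])
    ET.SubElement(root, "classification").text = record["classification"]
    ET.SubElement(root, "deposited").text = record["deposited"]
    return ET.tostring(root, encoding="unicode")


# Build a made-up HEADER line column by column so the widths are explicit.
example = "HEADER".ljust(10) + "IMMUNOGLOBULIN".ljust(40) + "01-JAN-00" + "   " + "1XYZ"
print(to_xml(parse_header(example)))
```

Keeping the parsed data in XML, as the abstract describes, lets a separate stylesheet produce both the one-line summaries and the detailed pages from the same stored record.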

19.
This report presents the conclusions of the X-ray Validation Task Force of the worldwide Protein Data Bank (PDB). The PDB has expanded massively since current criteria for validation of deposited structures were adopted, allowing a much more sophisticated understanding of all the components of macromolecular crystals. The size of the PDB creates new opportunities to validate structures by comparison with the existing database, and the now-mandatory deposition of structure factors creates new opportunities to validate the underlying diffraction data. These developments highlighted the need for a new assessment of validation criteria. The Task Force recommends that a small set of validation data be presented in an easily understood format, relative to both the full PDB and the applicable resolution class, with greater detail available to interested users. Most importantly, we recommend that referees and editors judging the quality of structural experiments have access to a concise summary of well-established quality indicators.

20.
The Protein Data Bank (PDB) is the global archive for structural information on macromolecules, and a popular resource for researchers, teachers, and students, amassing more than one million unique users each year. Crystallographic structure models in the PDB (more than 100,000 entries) are optimized against the crystal diffraction data and geometrical restraints. This process of crystallographic refinement typically ignored hydrogen bond (H‐bond) distances as a source of information. However, H‐bond restraints can improve structures at low resolution where diffraction data are limited. To improve low‐resolution structure refinement, we present methods for deriving H‐bond information either globally from well‐refined high‐resolution structures from the PDB‐REDO databank, or specifically from on‐the‐fly constructed sets of homologous high‐resolution structures. Refinement incorporating HOmology DErived Restraints (HODER) improves geometrical quality and the fit to the diffraction data for many low‐resolution structures. To make these improvements readily available to the general public, we applied our new algorithms to all crystallographic structures in the PDB: using massively parallel computing, we constructed a new instance of the PDB‐REDO databank (https://pdb-redo.eu). This resource is useful for researchers to gain insight on individual structures, on specific protein families (as we demonstrate with examples), and on general features of protein structure using data mining approaches on a uniformly treated dataset.
