Similar literature
 Found 20 similar documents; search took 15 ms
1.
《Journal of molecular biology》2019,431(13):2442-2448
At present, about half of the protein domain families lack a structural representative. However, in the last decade, predicting contact maps and using these to model the tertiary structure of these protein families has become an alternative approach to gaining structural insight. Reliable models for several hundred protein families have been created using this approach. To increase the use of this approach, we present PconsFam, an intuitive and interactive database of predicted contact maps and tertiary structure models for the entire Pfam database. By modeling all possible families, both with and without a representative structure, using the PconsFold2 pipeline, and running a quality assessment estimator on the models, we provide an estimate of how reliable the contact maps and structures are for each family.
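To make the term "contact map" in the abstract above concrete, here is a minimal sketch that derives a binary contact map from 3D coordinates. It assumes the common convention that two residues are in contact when their representative atoms are closer than 8 Å; the function name, the threshold default, and the toy coordinates are illustrative assumptions and are not taken from PconsFam or PconsFold2, which predict contacts from sequence data rather than computing them from a known structure.

```python
import math


def contact_map(coords, threshold=8.0):
    """Return a symmetric boolean matrix of residue-residue contacts.

    coords    -- list of (x, y, z) tuples, one representative atom per residue
    threshold -- distance cutoff in angstroms (8.0 is an assumed convention)
    """
    n = len(coords)
    contacts = [[False] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) < threshold:
                contacts[i][j] = True
                contacts[j][i] = True
    return contacts


# Hypothetical toy chain: four residues spaced 3.8 angstroms apart on a line.
toy_coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
toy_map = contact_map(toy_coords)
```

In this toy chain, residues 0 and 2 are 7.6 Å apart and count as a contact, while residues 0 and 3 are 11.4 Å apart and do not.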

2.
The Pfam protein families database   Cited by: 105 (self-citations: 12; by others: 93)
Pfam is a large collection of protein multiple sequence alignments and profile hidden Markov models. Pfam is available on the WWW in the UK at http://www.sanger.ac.uk/Software/Pfam/, in Sweden at http://www.cgr.ki.se/Pfam/ and in the US at http://pfam.wustl.edu/. The latest version (4.3) of Pfam contains 1815 families. These Pfam families match 63% of proteins in SWISS-PROT 37 and TrEMBL 9. For complete genomes Pfam currently matches up to half of the proteins. Genomic DNA can be directly searched against the Pfam library using the Wise2 package.

3.
The Protein Mutant Database.   Cited by: 3 (self-citations: 0; by others: 3)
Currently the protein mutant database (PMD) contains over 81 000 mutants, including artificial as well as natural mutants of various proteins extracted from about 10 000 articles. We recently developed a powerful viewing and retrieving system (http://pmd.ddbj.nig.ac.jp), which is integrated with the sequence and tertiary structure databases. The system has the following features: (i) mutated sequences are generated automatically from the information in each entry, combined with the integrated sequence data of the wild-type proteins, and then displayed. This is convenient because it shows the position of altered amino acids (in a different color) within the entire sequence of a wild-type protein; (ii) for those proteins whose 3D structures have been experimentally determined, a 3D structure is displayed to show mutation sites in a different color; (iii) a sequence homology search against PMD can be carried out with any query sequence; (iv) a summary of mutations of homologous sequences can be displayed, which shows all the mutations at a certain site of a protein recorded throughout the PMD.

4.
The PIR-International Protein Sequence Database.   Cited by: 1 (self-citations: 1; by others: 0)
PIR-International is an association of macromolecular sequence data collection centers dedicated to fostering international cooperation as an essential element in the development of scientific databases. A major objective of PIR-International is to continue the development of the Protein Sequence Database as an essential public resource for protein sequence information. This paper briefly describes the architecture of the Protein Sequence Database and how it and associated data sets are distributed and can be accessed electronically.

5.
The PIR-International Protein Sequence Database.   Cited by: 1 (self-citations: 0; by others: 1)
From its origin the Protein Sequence Database has been designed to support research and has focused on comprehensive coverage, quality control and organization of the data in accordance with biological principles. Since 1988 the database has been maintained collaboratively within the framework of PIR-International, an association of macromolecular sequence data collection centers dedicated to fostering international cooperation as an essential element in the development of scientific databases. The database is widely distributed and is available on the World Wide Web, via ftp, email server, on CD-ROM and magnetic media. It is widely redistributed and incorporated into many other protein sequence data compilations, including SWISS-PROT and the Entrez system of the NCBI.

6.
7.
Protein–protein interactions are challenging targets for modulation by small molecules. Here, we propose an approach that harnesses the increasing structural coverage of protein complexes to identify small molecules that may target protein interactions. Specifically, we identify ligand and protein binding sites that overlap upon alignment of homologous proteins. Of the 2,619 protein structure families observed to bind proteins, 1,028 also bind small molecules (250–1000 Da), and 197 exhibit a statistically significant (p<0.01) overlap between ligand and protein binding positions. These “bi-functional positions”, which bind both ligands and proteins, are particularly enriched in tyrosine and tryptophan residues, similar to “energetic hotspots” described previously, and are significantly less conserved than mono-functional and solvent exposed positions. Homology transfer identifies ligands whose binding sites overlap at least 20% of the protein interface for 35% of domain–domain and 45% of domain–peptide mediated interactions. The analysis recovered known small-molecule modulators of protein interactions as well as predicted new interaction targets based on the sequence similarity of ligand binding sites. We illustrate the predictive utility of the method by suggesting structural mechanisms for the effects of sanglifehrin A on HIV virion production, bepridil on the cellular entry of anthrax edema factor, and fusicoccin on vertebrate developmental pathways. The results, available at http://pibase.janelia.org, represent a comprehensive collection of structurally characterized modulators of protein interactions, and suggest that homologous structures are a useful resource for the rational design of interaction modulators.
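The overlap criterion described in the abstract above can be sketched as a simple set computation. This is only an illustration under stated assumptions: binding positions are represented as alignment column indices, the 20% cutoff is the figure quoted in the abstract, and the function names and example columns are hypothetical rather than taken from the PIBASE code. The statistical significance test mentioned in the abstract is not reproduced here.

```python
def interface_overlap(ligand_positions, interface_positions):
    """Fraction of protein-interface positions that also bind a ligand."""
    ligand = set(ligand_positions)
    interface = set(interface_positions)
    if not interface:
        return 0.0
    return len(ligand & interface) / len(interface)


def bifunctional_positions(ligand_positions, interface_positions):
    """Alignment positions that bind both a ligand and a protein partner."""
    return sorted(set(ligand_positions) & set(interface_positions))


# Hypothetical alignment columns for one aligned family.
ligand_cols = [12, 15, 16, 40, 41]
interface_cols = [15, 16, 17, 18, 41, 60, 61, 62, 63, 64]

overlap = interface_overlap(ligand_cols, interface_cols)
shared = bifunctional_positions(ligand_cols, interface_cols)
meets_cutoff = overlap >= 0.20  # 20% of the interface, as quoted in the abstract
```

Here three of the ten interface columns (15, 16 and 41) also bind the ligand, so the overlap is 0.3 and the example passes the 20% cutoff.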

8.
9.
The RESID Database is a comprehensive collection of annotations and structures for protein pre-, co- and post-translational modifications including amino-terminal, carboxyl-terminal and peptide chain cross-link modifications. The RESID Database includes: systematic and alternate names, atomic formulas and masses, enzyme activities generating the modifications, keywords, literature citations, Gene Ontology cross-references, Protein Information Resource (PIR) and SWISS-PROT protein sequence database feature table annotations, structure diagrams and molecular models. This database is freely accessible on the Internet through the European Bioinformatics Institute at http://srs.ebi.ac.uk/srs6bin/cgi-bin/wgetz?-page+LibInfo+-lib+RESID, through the National Cancer Institute - Frederick Advanced Biomedical Computing Center at http://www.ncifcrf.gov/RESID, or through the Protein Information Resource at http://pir.georgetown.edu/pirwww/dbinfo/resid.html.

10.
Cysteine residues have a rich chemistry and play a critical role in the catalytic activity of a plethora of enzymes. However, cysteines are susceptible to oxidation by Reactive Oxygen and Nitrogen Species, leading to a loss of their catalytic function. Therefore, cysteine oxidation is emerging as a relevant physiological regulatory mechanism. Formation of a cyclic sulfenyl amide residue at the active site of redox-regulated proteins has been proposed as a protection mechanism against irreversible oxidation, as the sulfenyl amide intermediate has been identified in several proteins. However, how and why only some specific cysteine residues in particular proteins react to form this intermediate is still unknown. In the present work, using in-silico tools, we have identified a constrained conformation that accelerates sulfenyl amide formation. By means of combined MD and QM/MM calculations we show that this conformation positions the backbone NH towards the sulfenic acid and promotes the reaction to yield the sulfenyl amide intermediate in one step, with the concomitant release of a water molecule. Moreover, in a large subset of the proteins we found a conserved beta sheet-loop-helix motif, present across different protein folds, that is key for sulfenyl amide production as it promotes the prior formation of sulfenic acid. For catalytic activity, in several cases, proteins need the cysteine to be in the cysteinate form, i.e. a low-pKa Cys. We found that the conserved motif stabilizes the cysteinate by hydrogen bonding to several backbone NH moieties. As cysteinate is also more reactive toward ROS, we propose that the sheet-loop-helix motif and the constrained conformation have been selected by evolution for proteins that need a reactive Cys protected from irreversible oxidation. Our results also highlight how fold conservation can be correlated to redox chemistry regulation of protein function.

11.
From its origin, the PIR has aspired to support research in computational biology and genomics through the compilation of a comprehensive, quality controlled and well-organized protein sequence information resource. The resource originated with the pioneering work of the late Margaret O. Dayhoff in the early 1960s. Since 1988, the Protein Sequence Database has been maintained collaboratively by PIR-International, an association of macromolecular sequence data collection centers dedicated to fostering international cooperation as an essential element in the development of scientific databases. The work of the resource is widely distributed and is available on the World Wide Web, via FTP, E-mail server, CD-ROM and magnetic media. It is widely redistributed and incorporated into many other protein sequence data compilations including SWISS-PROT and the Entrez system of the NCBI.

12.
13.

Background  

Pfam is a comprehensive collection of protein domains and families, with a range of well-established information including genome annotation. Pfam has two large series of functionally uncharacterized families, known as Domains of Unknown Function (DUFs) and Uncharacterized Protein Families (UPFs).

14.
PSPDB: Plant Stress Protein Database   Cited by: 1 (self-citations: 0; by others: 1)
Plants produce various proteins to overcome biotic and abiotic stresses. Current plant stress databases report plant genes without protein annotations specific to these stresses. To date, according to our findings, no dedicated plant stress protein database covering both biotic and abiotic stresses, with links out to other related databases, is available to plant biologists. This need prompted us to build a distinctive database that includes important resources for stress-based factors. We developed the Plant Stress Protein Database (PSPDB), a web-accessible resource that covers 2,064 manually curated plant stress proteins from a wide array of 134 plant species with 30 different types of biotic and abiotic stresses. Functional and experimental validation of proteins associated with biotic and abiotic stresses has been employed as the sole criterion for inclusion in the database. The database is available at http://www.bioclues.org/pspdb/.

15.
Predicting active site residue annotations in the Pfam database   Cited by: 1 (self-citations: 0; by others: 1)

Background

The recent increase in the use of high-throughput two-hybrid analysis has generated large quantities of data on protein interactions. Specifically, the availability of information about experimental protein-protein interactions and other protein features on the Internet enables human protein-protein interactions to be computationally predicted from co-evolution events (interolog). This study also considers other protein interaction features, including sub-cellular localization, tissue-specificity, the cell-cycle stage and domain-domain combination. Computational methods need to be developed to integrate these heterogeneous biological data to facilitate the maximum accuracy of the human protein interaction prediction.

Results

This study proposes a relative conservation score by finding maximal quasi-cliques in protein interaction networks, and considering other interaction features to formulate a scoring method. The scoring method can be adopted to discover which protein pairs are the most likely to interact among multiple protein pairs. The predicted human protein-protein interactions associated with confidence scores are derived from six eukaryotic organisms – rat, mouse, fly, worm, thale cress and baker's yeast.

Conclusion

Evaluation results of the proposed method using functional keyword and Gene Ontology (GO) annotations indicate that some confidence is justified in the accuracy of the predicted interactions. Comparisons among existing methods also reveal that the proposed method predicts human protein-protein interactions more accurately than other interolog-based methods.

16.
17.
Negative examples – genes that are known not to carry out a given protein function – are rarely recorded in genome and proteome annotation databases, such as the Gene Ontology database. Negative examples are required, however, for several of the most powerful machine learning methods for integrative protein function prediction. Most protein function prediction efforts have relied on a variety of heuristics for the choice of negative examples. Determining the accuracy of methods for negative example prediction is itself a non-trivial task, given that the Open World Assumption as applied to gene annotations rules out many traditional validation metrics. We present a rigorous comparison of these heuristics, utilizing a temporal holdout, and a novel evaluation strategy for negative examples. We add to this comparison several algorithms adapted from Positive-Unlabeled learning scenarios in text-classification, which are the current state of the art methods for generating negative examples in low-density annotation contexts. Lastly, we present two novel algorithms of our own construction, one based on empirical conditional probability, and the other using topic modeling applied to genes and annotations. We demonstrate that our algorithms achieve significantly fewer incorrect negative example predictions than the current state of the art, using multiple benchmarks covering multiple organisms. Our methods may be applied to generate negative examples for any type of method that deals with protein function, and to this end we provide a database of negative examples in several well-studied organisms, for general use (The NoGO database, available at: bonneaulab.bio.nyu.edu/nogo.html).
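The idea of choosing negative examples from empirical conditional probabilities can be illustrated with a small sketch. This is not the authors' algorithm, which the abstract does not specify; it is one plausible reading in which P(target | term) is estimated from annotation co-occurrence and genes whose existing annotations rarely co-occur with the target function are ranked as the most likely negatives. The gene names, term labels, and the choice to average the per-term probabilities are all assumptions made for illustration.

```python
from collections import Counter


def conditional_probs(annotations, target):
    """Estimate P(target | term) for every non-target term by co-occurrence."""
    term_count = Counter()
    joint_count = Counter()
    for terms in annotations.values():
        for term in terms:
            term_count[term] += 1
            if target in terms and term != target:
                joint_count[term] += 1
    return {t: joint_count[t] / n for t, n in term_count.items() if t != target}


def rank_negative_candidates(annotations, target):
    """Rank genes lacking the target term, most likely negative first."""
    probs = conditional_probs(annotations, target)
    scores = {}
    for gene, terms in annotations.items():
        if target in terms:
            continue  # known positives are never negative candidates
        observed = [probs[t] for t in terms if t in probs]
        if observed:  # unannotated genes carry no evidence, so skip them
            scores[gene] = sum(observed) / len(observed)
    return sorted(scores, key=lambda g: (scores[g], g))


# Hypothetical annotation table (gene -> set of function labels).
toy_annotations = {
    "geneA": {"kinase activity", "ATP binding"},
    "geneB": {"kinase activity", "ATP binding", "membrane"},
    "geneC": {"ATP binding"},
    "geneD": {"membrane", "ribosome"},
    "geneE": {"ribosome"},
}
ranked = rank_negative_candidates(toy_annotations, "kinase activity")
```

In the toy table, geneE is ranked first because its only annotation never co-occurs with the target, while geneC is ranked last because "ATP binding" co-occurs with the target in two of three genes.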

18.
19.
HPID: the Human Protein Interaction Database   Cited by: 1 (self-citations: 0; by others: 1)
The Human Protein Interaction Database (http://www.hpid.org) was designed (1) to provide human protein interaction information pre-computed from existing structural and experimental data, (2) to predict potential interactions between proteins submitted by users and (3) to provide a depository for new human protein interaction data from users. Two types of interaction are available from the pre-computed data: (1) interactions at the protein superfamily level and (2) those transferred from the interactions of yeast proteins. Interactions at the superfamily level were obtained by locating known structural interactions of the PDB in the SCOP domains and identifying homologs of the domains in the human proteins. Interactions transferred from yeast proteins were obtained by identifying homologs of the yeast proteins in the human proteins. For each human protein in the database and each query submitted by users, the protein superfamilies and yeast proteins assigned to the protein are shown, along with their interacting partners. We have also developed a set of web-based programs so that users can visualize and analyze protein interaction networks in order to explore the networks further. AVAILABILITY: http://www.hpid.org.
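The transfer of interactions from yeast to human described in the abstract above can be sketched as follows. This is a minimal illustration, assuming a precomputed homolog mapping from yeast proteins to human proteins; the identifiers and the function name are hypothetical, and the sketch does not reproduce how HPID detects homologs or scores the resulting pairs.

```python
def transfer_interactions(yeast_interactions, homologs):
    """Predict human protein pairs from yeast interactions via homology.

    yeast_interactions -- iterable of (yeast_a, yeast_b) interacting pairs
    homologs           -- dict mapping a yeast protein to a set of human homologs
    """
    predicted = set()
    for yeast_a, yeast_b in yeast_interactions:
        for human_a in homologs.get(yeast_a, ()):
            for human_b in homologs.get(yeast_b, ()):
                # Store each pair in sorted order so (x, y) and (y, x) coincide.
                predicted.add(tuple(sorted((human_a, human_b))))
    return predicted


# Hypothetical identifiers; Y3 has no human homolog, so Y2-Y3 transfers nothing.
toy_yeast_pairs = [("Y1", "Y2"), ("Y2", "Y3")]
toy_homologs = {"Y1": {"H1"}, "Y2": {"H2", "H3"}}
toy_predicted = transfer_interactions(toy_yeast_pairs, toy_homologs)
```

Because a yeast protein may have several human homologs, one yeast interaction can yield several candidate human pairs, which is why the result is collected in a set.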

20.
A method of noise decomposition has been developed. This method allows for the identification of a latent periodicity with symbol insertions and deletions that is specific for all or most amino acid sequences belonging to the same protein family or protein domain. The latent periodicity has been identified in catalytic domains of 85% of serine/threonine and tyrosine protein kinases. Similar results have been obtained for 22 other protein families. The possible role of latent periodicity in protein families is discussed.
Translated from Molekulyarnaya Biologiya, Vol. 39, No. 3, 2005, pp. 420–436. Original Russian Text Copyright © 2005 by Laskin, Kudryashov, Skryabin, Korotkov.
