Similar articles
 Found 20 similar articles (search time: 31 ms)
1.
In the postgenomic era, one of the most interesting and important challenges is to understand protein interactions on a large scale. The physical interactions between protein domains are fundamental to the workings of a cell: in multi-domain polypeptide chains, in multi-subunit proteins and in transient complexes between proteins that also exist independently. Thus experimental investigation of protein-protein interactions has been extensive, including recent large-scale screens using mass spectrometry. The role of computational research on protein-protein interactions encompasses not only prediction, but also understanding the nature of the interactions and their three-dimensional structures. I will discuss properties such as sequence conservation and co-regulation of genes and proteins involved in different types of physical interactions. Given that all proteins consist of their evolutionary units, the domains, all interactions occur between these domains. The interactions between domains belonging to different protein families will be the second topic of my talk.

2.
The FSSP database of structurally aligned protein fold families.   Cited by: 17 (self-citations: 0, citations by others: 17)
L Holm, C Sander. Nucleic Acids Research, 1994, 22(17): 3600-3609
FSSP (families of structurally similar proteins) is a database of structural alignments of proteins in the Protein Data Bank (PDB). The database currently contains an extended structural family for each of 330 representative protein chains. Each data set contains structural alignments of one search structure with all other structurally significantly similar proteins in the representative set (remote homologs, < 30% sequence identity), as well as all structures in the Protein Data Bank with 70-30% sequence identity relative to the search structure (medium homologs). Very close homologs (above 70% sequence identity) are excluded as they rarely have marked structural differences. The alignments of remote homologs are the result of pairwise all-against-all structural comparisons in the set of 330 representative protein chains. All such comparisons are based purely on the 3D co-ordinates of the proteins and are derived by automatic (objective) structure comparison programs. The significance of structural similarity is estimated based on statistical criteria. The FSSP database is available electronically from the EMBL file server and by anonymous ftp (file transfer protocol).
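
The identity tiers named in this abstract can be sketched as a small binning function. This is an illustration only, not code from FSSP: the function name `homolog_class` is hypothetical, and the treatment of the exact boundary values (30% and 70% both counted as "medium") is an assumption read off the abstract's wording. The abstract also requires significant structural similarity for remote homologs, which this sketch does not model.

```python
def homolog_class(percent_identity: float) -> str:
    """Bin a pairwise sequence identity (0-100) into the tiers named in the abstract.

    Hypothetical helper; boundary handling at exactly 30 and 70 is an assumption.
    """
    if not 0.0 <= percent_identity <= 100.0:
        raise ValueError("percent identity must lie in [0, 100]")
    if percent_identity < 30.0:
        return "remote"   # below 30% identity
    if percent_identity <= 70.0:
        return "medium"   # 30-70% identity
    return "close"        # above 70%: excluded from the data sets per the abstract


if __name__ == "__main__":
    for value in (12.5, 30.0, 55.0, 70.0, 92.0):
        print(value, homolog_class(value))
```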

3.
Intensive growth in 3D structure data on DNA-protein complexes as reflected in the Protein Data Bank (PDB) demands new approaches to the annotation and characterization of these data and will lead to a new understanding of critical biological processes involving these data. These data and those from other protein structure classifications will become increasingly important for the modeling of complete proteomes. We propose a fully automated classification of DNA-binding protein domains based on existing 3D-structures from the PDB. The classification, by domain, relies on the Protein Domain Parser (PDP) and the Combinatorial Extension (CE) algorithm for structural alignment. The approach involves the analysis of 3D-interaction patterns in DNA-protein interfaces, assignment of structural domains interacting with DNA, clustering of domains based on structural similarity and DNA-interacting patterns. Comparison with existing resources on describing structural and functional classifications of DNA-binding proteins was used to validate and improve the approach proposed here. In the course of our study we defined a set of criteria and heuristics allowing us to automatically build a biologically meaningful classification and define classes of functionally related protein domains. It was shown that taking into consideration interactions between protein domains and DNA considerably improves the classification accuracy. Our approach provides a high-throughput and up-to-date annotation of DNA-binding protein families which can be found at http://spdc.sdsc.edu.

4.
Most of the proteins in a cell assemble into complexes to carry out their function. It is therefore crucial to understand the physicochemical properties as well as the evolution of interactions between proteins. The Protein Data Bank represents an important source of information for such studies, because more than half of the structures are homo- or heteromeric protein complexes. Here we propose the first hierarchical classification of whole protein complexes of known 3-D structure, based on representing their fundamental structural features as a graph. This classification provides the first overview of all the complexes in the Protein Data Bank and allows nonredundant sets to be derived at different levels of detail. This reveals that between one-half and two-thirds of known structures are multimeric, depending on the level of redundancy accepted. We also analyse the structures in terms of the topological arrangement of their subunits and find that they form a small number of arrangements compared with all theoretically possible ones. This is because most complexes contain four subunits or less, and the large majority are homomeric. In addition, there is a strong tendency for symmetry in complexes, even for heteromeric complexes. Finally, through comparison of Biological Units in the Protein Data Bank with the Protein Quaternary Structure database, we identified many possible errors in quaternary structure assignments. Our classification, available as a database and Web server at http://www.3Dcomplex.org, will be a starting point for future work aimed at understanding the structure and evolution of protein complexes.

5.
Irregular protein secondary structures are believed to be important structural domains involved in molecular recognition processes between proteins, in interactions between peptide substrates and receptors, and in protein folding. In these respects tight turns are being studied in detail. They also represent template structures for the design of new molecules such as drugs, pesticides, or antigens. Isolated α-turns, not participating in α-helical structures, have received little attention due to the overwhelming presence of other types of tight turns in peptide and protein structures. The growing number of protein X-ray structures allowed us to undertake a systematic search into the Protein Data Bank of this uncharacterized protein secondary structure. A classification of isolated α-turns into different types, based on conformational similarity, is reported here. A preliminary analysis on the occurrence of some particular amino acids in certain positions of the turned structure is also presented. © 1996 John Wiley & Sons, Inc.

6.
7.
Protein Structural Interactome map (PSIMAP) is a global interaction map that describes domain-domain and protein-protein interaction information for known Protein Data Bank structures. It calculates the Euclidean distance to determine interactions between possible pairs of structural domains in proteins. PSIbase is a database and file server for protein structural interaction information calculated by the PSIMAP algorithm. PSIbase also provides an easy-to-use protein domain assignment module, interaction navigation and visual tools. Users can retrieve possible interaction partners of their proteins of interests if a significant homology assignment is made with their query sequences. AVAILABILITY: http://psimap.org and http://psibase.kaist.ac.kr/
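
A distance-based interaction test of the kind this abstract describes might look roughly like the sketch below. The abstract gives no thresholds, so the 5.0 Å cutoff and the minimum of 5 contacts are illustrative assumptions, as are the function names `count_contacts` and `domains_interact`; this is not the PSIMAP implementation.

```python
import math
from typing import Sequence, Tuple

Coord = Tuple[float, float, float]


def count_contacts(domain_a: Sequence[Coord], domain_b: Sequence[Coord],
                   cutoff: float = 5.0) -> int:
    """Count atom pairs (one atom from each domain) within `cutoff` of each other.

    The default cutoff is an illustrative assumption, not a value from the abstract.
    """
    return sum(1 for a in domain_a for b in domain_b if math.dist(a, b) <= cutoff)


def domains_interact(domain_a: Sequence[Coord], domain_b: Sequence[Coord],
                     cutoff: float = 5.0, min_contacts: int = 5) -> bool:
    """Call two domains interacting when enough atom pairs are in contact."""
    return count_contacts(domain_a, domain_b, cutoff) >= min_contacts


if __name__ == "__main__":
    # Toy coordinates, invented for illustration only.
    a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    b = [(0.0, 0.0, 3.0), (10.0, 10.0, 10.0)]
    print(count_contacts(a, b), domains_interact(a, b, min_contacts=2))
```

The all-pairs loop is quadratic in the number of atoms; for whole-PDB scans a spatial index would be the usual choice, but the brute-force form keeps the criterion easy to read.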

8.
SUMMARY: The database of structural motifs in proteins (DSMP) contains data relevant to helices, beta-turns, gamma-turns, beta-hairpins, psi-loops, beta-alpha-beta motifs, beta-sheets, beta-strands and disulphide bridges extracted from all proteins in the Protein Data Bank primarily using the PROMOTIF program and implemented as a web-based network service using the SRS. The data corresponding to the structural motifs includes: sequence, position in polypeptide chain, geometry, type, unique code, keywords and resolution of crystal structure. This data is available for a representative data set of 1028 protein chains and also for all 10 213 proteins in the Protein Data Bank. The three-dimensional coordinates for all structural motifs (except sheet and disulphide bridge) are also available for the representative data set. Using features in SRS, DSMP can be queried to extract information from one or more structural motifs that may be useful for sequence-structure analysis, prediction, modelling or design. AVAILABILITY: http://www.cdfd.org.in/dsmp.html

9.
The relationship between synonymous codon usage and different protein secondary structural classes was investigated using 401 Homo sapiens proteins extracted from the Protein Data Bank (PDB). A simple Chi-square test was used to assess the significance of deviation of the observed and expected frequencies of 59 codons at the level of individual synonymous families in the four different protein secondary structural classes. It was observed that synonymous codon families show non-randomness in codon usage in four different secondary structural classes. However, when the genes were classified according to their GC3 levels, there was an increase in non-randomness in the high-GC3 group of genes. The non-randomness in codon usage was further tested among the same protein secondary structures belonging to four different protein folding classes of the high-GC3 group of genes. The results show that in each protein secondary structural unit there exist some synonymous families that show class-specific codon-usage patterns. Moreover, there is an increased non-random behaviour of synonymous codons in the sheet structure of all secondary structural classes in the high-GC3 group of genes. Biological implications of these results have been discussed.
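
The Chi-square comparison of observed and expected codon counts within one synonymous family can be sketched as follows. The counts are invented for illustration, the function name `chi_square_statistic` is hypothetical, and how the authors derived their expected frequencies is not stated in the abstract; the sketch computes only the Pearson statistic, not a p-value.

```python
from typing import Dict


def chi_square_statistic(observed: Dict[str, float],
                         expected: Dict[str, float]) -> float:
    """Pearson chi-square statistic: sum over codons of (O - E)^2 / E."""
    if observed.keys() != expected.keys():
        raise ValueError("observed and expected must cover the same codons")
    if any(e <= 0 for e in expected.values()):
        raise ValueError("expected counts must be positive")
    return sum((observed[c] - expected[c]) ** 2 / expected[c] for c in observed)


if __name__ == "__main__":
    # Invented counts for a two-codon family (Phe: TTT, TTC); the expected
    # counts assume equal usage of the two codons.
    observed = {"TTT": 30, "TTC": 10}
    expected = {"TTT": 20, "TTC": 20}
    print(chi_square_statistic(observed, expected))
```

The statistic would then be compared against a Chi-square distribution with degrees of freedom equal to the number of codons in the family minus one.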

10.
Multidomain proteins form in evolution through the concatenation of domains, but structural domains may comprise multiple segments of the chain. In this work, we demonstrate that new multidomain architectures can evolve by an apparent three-dimensional swap of segments between structurally similar domains within a single-chain monomer. By a comprehensive structural search of the current Protein Data Bank (PDB), we identified 32 well-defined segment-swapped proteins (SSPs) belonging to 18 structural families. Nearly 13% of all multidomain proteins in the PDB may have a segment-swapped evolutionary precursor as estimated by more permissive searching criteria. The formation of SSPs can be explained by two principal evolutionary mechanisms: (i) domain swapping and fusion (DSF) and (ii) circular permutation (CP). By large-scale comparative analyses using structural alignment and hidden Markov model methods, it was found that the majority of SSPs have evolved via the DSF mechanism, and a much smaller fraction, via CP. Functional analyses further revealed that segment swapping, which results in two linkers connecting the domains, may impart directed flexibility to multidomain proteins and contributes to the development of new functions. Thus, inter-domain segment swapping represents a novel general mechanism by which new protein folds and multidomain architectures arise in evolution, and SSPs have structural and functional properties that make them worth defining as a separate group.

11.
There are several different families of repeat proteins. In each, a distinct structural motif is repeated in tandem to generate an elongated structure. The nonglobular, extended structures that result are particularly well suited to present a large surface area and to function as interaction domains. Many repeat proteins have been demonstrated experimentally to fold and function as independent domains. In tetratricopeptide (TPR) repeats, the repeat unit is a helix-turn-helix motif. The majority of TPR motifs occur as three to over 12 tandem repeats in different proteins. The majority of TPR structures in the Protein Data Bank are of isolated domains. Here we present the high-resolution structure of NlpI, the first structure of a complete TPR-containing protein. We show that in this instance the TPR motifs do not fold and function as an independent domain, but are fully integrated into the three-dimensional structure of a globular protein. The NlpI structure is also the first TPR structure from a prokaryote. It is of particular interest because it is a membrane-associated protein, and mutations in it alter septation and virulence.

12.
A method is presented that uses beta-strand interactions to predict the parallel right-handed beta-helix super-secondary structural motif in protein sequences. A program called BetaWrap implements this method and is shown to score known beta-helices above non-beta-helices in the Protein Data Bank in cross-validation. It is demonstrated that BetaWrap learns each of the seven known SCOP beta-helix families, when trained primarily on beta-structures that are not beta-helices, together with structural features of known beta-helices from outside the family. BetaWrap also predicts many bacterial proteins of unknown structure to be beta-helices; in particular, these proteins serve as virulence factors, adhesins, and toxins in bacterial pathogenesis and include cell surface proteins from Chlamydia and the intestinal bacterium Helicobacter pylori. The computational method used here may generalize to other beta-structures for which strand topology and profiles of residue accessibility are well conserved.

13.
Disulfides are conventionally viewed as structurally stabilizing elements in proteins but emerging evidence suggests two disulfide subproteomes exist. One group mediates the well known role of structural stabilization. A second redox‐active group are best known for their catalytic functions but are increasingly being recognized for their roles in regulation of protein function. Redox‐active disulfides are, by their very nature, more susceptible to reduction than structural disulfides; and conversely, the Cys pairs that form them are more susceptible to oxidation. In this study, we searched for potentially redox‐active Cys Pairs by scanning the Protein Data Bank for structures of proteins in alternate redox states. The PDB contains over 1134 unique redox pairs of proteins, many of which exhibit conformational differences between alternate redox states. Several classes of structural changes were observed, proteins that exhibit: disulfide oxidation following expulsion of metals such as zinc; major reorganisation of the polypeptide backbone in association with disulfide redox‐activity; order/disorder transitions; and changes in quaternary structure. Based on evidence gathered supporting disulfide redox activity, we propose disulfides present in alternate redox states are likely to have physiologically relevant redox activity.

14.
With a growing number of structures available in the Brookhaven Protein Data Bank, automatic methods for domain identification are required for the construction of databases. Domains are considered to be clusters of secondary structure elements. Thus, helices and strands are first clustered using intersecondary structural distances between C alpha positions, and dendrograms based on this distance measure are used to identify domains. Individual domains are recognized by a disjoint factor, which enables the automatic identification and classification into disjoint, interacting, and conjoint domains. Application to a database of 83 protein families and 18 unique structures shows that the approach provides an effective delineation of boundaries and identifies those proteins that can be considered as a single domain. A quantitative estimate of the interaction between domains has been proposed. The database of protein domains is a useful tool for understanding protein folding, for recognizing protein folds, and for understanding structure-activity relationships.
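
The idea of grouping secondary structure elements by distance can be illustrated with the sketch below. It makes several assumptions that the abstract does not confirm: each helix or strand is reduced to a single representative point, single linkage is used, and the dendrogram is cut at a fixed distance `cutoff` (equivalent to taking connected components of the "closer than cutoff" graph). The disjoint factor from the abstract is not modelled, and `cluster_elements` is a hypothetical name.

```python
import math
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float, float]


def cluster_elements(centres: Sequence[Point], cutoff: float) -> List[List[int]]:
    """Single-linkage clusters of element indices, cut at distance `cutoff`.

    Assumes one representative point per secondary structure element.
    """
    parent = list(range(len(centres)))

    def find(i: int) -> int:
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Merge every pair of elements that lies within the cutoff.
    for i in range(len(centres)):
        for j in range(i + 1, len(centres)):
            if math.dist(centres[i], centres[j]) <= cutoff:
                parent[find(j)] = find(i)

    groups: Dict[int, List[int]] = {}
    for i in range(len(centres)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


if __name__ == "__main__":
    # Invented element centres: three close together, two far away.
    centres = [(0, 0, 0), (4, 0, 0), (8, 0, 0), (30, 0, 0), (33, 0, 0)]
    print(cluster_elements(centres, cutoff=5.0))
```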

15.
Denessiouk KA, Johnson MS. Proteins, 2000, 38(3): 310-326
ATP is a ligand common to many proteins, yet it is unclear whether common recognition patterns do exist among the many different folds that bind ATP. Previously, it was shown that cAMP-dependent protein kinase, D-Ala:D-Ala ligase and the alpha-subunit of the alpha 2 beta 2 ribonucleotide reductase do share extensive common structural elements for ATP recognition although their folds are different. Here, we have made a survey of structures that bind ATP and compared them with the key features seen in these three proteins. Our survey shows that 12 different fold types share a specific recognition pattern for the adenine moiety, and 8 of these folds have a common structural framework for recognition of the AMP moiety of the ligand. The common framework consists of a tripeptide segment plus three additional residues, which provides similar polar and hydrophobic interactions between the protein and mononucleotide. Consensus interactions are represented by four key hydrogen bonds present in each fold type. Two of these four hydrogen bonds, together with three aliphatic residues, form a specific recognition pattern for the adenine moiety in all 12 folds. These similarities point to a structural-functional requirement shared by these different mononucleotide-binding proteins that represent at this time 28% of the adenine mononucleotide complexes found in the Brookhaven Protein Data Bank.

16.
We have updated the Protein Sequence-Structure Analysis Relational Database (PSSARD) first published in the Int. J. Biol. Macromol. 36 (2005) 259-262 corresponding to 1573 representative protein chains selected from the Protein Data Bank (PDB). In this, the updated and revised PSSARD (Version 2.0), we have included all proteins in the Protein Data Bank available at the time of developing this database including the NMR PDB entries. The current database corresponds to 22,752 XRAY PDB entries and 3977 NMR PDB entries and is separated accordingly in order to facilitate the appropriate database search. The representative protein chains can also be separately accessed within the current database. We have made a provision to combine more than one field to query the database and the results of any search can be used to carry out further nested searches using a combination of queries. We have provided hyperlinks to the individual PDB entries obtained as the result of any search in PSSARD in order to obtain additional details relevant to the protein structure. Certain applications useful to identify domains and structural motifs are discussed.

17.
Selection of representative protein data sets.   Cited by: 37 (self-citations: 17, citations by others: 20)
The Protein Data Bank currently contains about 600 data sets of three-dimensional protein coordinates determined by X-ray crystallography or NMR. There is considerable redundancy in the data base, as many protein pairs are identical or very similar in sequence. However, statistical analyses of protein sequence-structure relations require nonredundant data. We have developed two algorithms to extract from the data base representative sets of protein chains with maximum coverage and minimum redundancy. The first algorithm focuses on optimizing a particular property of the selected proteins and works by successive selection of proteins from an ordered list and exclusion of all neighbors of each selected protein. The other algorithm aims at maximizing the size of the selected set and works by successive thinning out of clusters of similar proteins. Both algorithms are generally applicable to other data bases in which criteria of similarity can be defined and relate to problems in graph theory. The largest nonredundant set extracted from the current release of the Protein Data Bank has 155 protein chains. In this set, no two proteins have sequence similarity higher than a certain cutoff (30% identical residues for aligned subsequences longer than 80 residues), yet all structurally unique protein families are represented. Periodically updated lists of representative data sets are available by electronic mail from the file server "netserv@embl-heidelberg.de." The selection may be useful in statistical approaches to protein folding as well as in the analysis and documentation of the known spectrum of three-dimensional protein structures.
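
The first algorithm described here (select from an ordered list, then exclude all neighbors of each selected protein) can be sketched as a greedy pass. This is a minimal illustration under stated assumptions: `select_representatives` and the `similar` predicate are hypothetical names, and in practice `similar` would encode the sequence-similarity cutoff, which this sketch replaces with a toy lookup table.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def select_representatives(ordered: Sequence[T],
                           similar: Callable[[T, T], bool]) -> List[T]:
    """Greedy selection: keep a candidate unless it neighbors an earlier pick.

    `ordered` is assumed to be pre-sorted by the property being optimized
    (best first), so preferred chains win over their neighbors.
    """
    selected: List[T] = []
    for candidate in ordered:
        if not any(similar(candidate, kept) for kept in selected):
            selected.append(candidate)
    return selected


if __name__ == "__main__":
    # Toy similarity table, invented for illustration: A~B and C~D.
    pairs = {frozenset(("A", "B")), frozenset(("C", "D"))}

    def is_similar(x: str, y: str) -> bool:
        return frozenset((x, y)) in pairs

    print(select_representatives(["A", "B", "C", "D"], is_similar))
```

Checking each candidate against the already selected chains is equivalent to striking out the neighbors of every pick, and it guarantees that no two selected chains are similar. The order of the input list decides which member of each similar group survives.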

18.
MOTIVATION: Assignment of putative protein functional annotation by comparative analysis using pre-defined experimental annotations is performed routinely by molecular biologists. The number and statistical significance of these assignments remains a challenge in this era of high-throughput proteomics. A combined statistical method that enables robust, automated protein annotation by reliably expanding existing annotation sets is described. An existing clustering scheme, based on relevant experimental information (e.g. sequence identity, keywords or gene expression data) is required. The method assigns new proteins to these clusters with a measure of reliability. It can also provide human reviewers with a reliability score for both new and previously classified proteins. RESULTS: A dataset of 27 000 annotated Protein Data Bank (PDB) polypeptide chains (of 36 000 chains currently in the PDB) was generated from 23 000 chains classified a priori. AVAILABILITY: PDB annotations and sample software implementation are freely accessible on the Web at http://pmr.sdsc.edu/go

19.
HPID: the Human Protein Interaction Database   Cited by: 1 (self-citations: 0, citations by others: 1)
The Human Protein Interaction Database (http://www.hpid.org) was designed (1) to provide human protein interaction information pre-computed from existing structural and experimental data, (2) to predict potential interactions between proteins submitted by users and (3) to provide a depository for new human protein interaction data from users. Two types of interaction are available from the pre-computed data: (1) interactions at the protein superfamily level and (2) those transferred from the interactions of yeast proteins. Interactions at the superfamily level were obtained by locating known structural interactions of the PDB in the SCOP domains and identifying homologs of the domains in the human proteins. Interactions transferred from yeast proteins were obtained by identifying homologs of the yeast proteins in the human proteins. For each human protein in the database and each query submitted by users, the protein superfamilies and yeast proteins assigned to the protein are shown, along with their interacting partners. We have also developed a set of web-based programs so that users can visualize and analyze protein interaction networks in order to explore the networks further. AVAILABILITY: http://www.hpid.org.

20.
Statistical analysis of domains in interacting protein pairs   Cited by: 10 (self-citations: 0, citations by others: 10)
MOTIVATION: Several methods have recently been developed to analyse large-scale sets of physical interactions between proteins in terms of physical contacts between the constituent domains, often with a view to predicting new pairwise interactions. Our aim is to combine genomic interaction data, in which domain-domain contacts are not explicitly reported, with the domain-level structure of individual proteins, in order to learn about the structure of interacting protein pairs. Our approach is driven by the need to assess the evidence for physical contacts between domains in a statistically rigorous way. RESULTS: We develop a statistical approach that assigns p-values to pairs of domain superfamilies, measuring the strength of evidence within a set of protein interactions that domains from these superfamilies form contacts. A set of p-values is calculated for SCOP superfamily pairs, based on a pooled data set of interactions from yeast. These p-values can be used to predict which domains come into contact in an interacting protein pair. This predictive scheme is tested against protein complexes in the Protein Quaternary Structure (PQS) database, and is used to predict domain-domain contacts within 705 interacting protein pairs taken from our pooled data set.
