Similar articles
Found 20 similar articles (search time: 31 ms)
1.
The group of proteins that contain a thioredoxin (Trx) fold is huge and diverse. Assessment of the variation in catalytic machinery of Trx fold proteins is essential in providing a foundation for understanding their functional diversity and predicting the function of the many uncharacterized members of the class. The proteins of the Trx fold class retain common features, including variations on a dithiol CxxC active site motif, that lead to delivery of function. We use protein similarity networks to guide an analysis of how structural and sequence motifs track with catalytic function and taxonomic categories for 4,082 representative sequences spanning the known superfamilies of the Trx fold. Domain structure in the fold class is varied and modular, with 2.8% of sequences containing more than one Trx fold domain. Most member proteins are bacterial. The fold class exhibits many modifications to the CxxC active site motif: only 56.8% of proteins have both cysteines, and no functional groupings have absolute conservation of the expected catalytic motif. Only a small fraction of Trx fold sequences have been functionally characterized. This work provides a global view of the complex distribution of domains and catalytic machinery throughout the fold class, showing that each superfamily contains remnants of the CxxC active site. The unifying context provided by this work can guide the comparison of members of different Trx fold superfamilies to gain insight about their structure-function relationships, illustrated here with the thioredoxins and peroxiredoxins.
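
As a rough illustration of how the CxxC motif and its single-cysteine variants could be located in a sequence, the following Python sketch scans for C-x-x-C windows and labels which motif cysteines a four-residue window retains. It is not the similarity-network method used in the study. The function names `find_cxxc` and `classify_site` are hypothetical, and the example fragment in the usage note is made up apart from the CGPC tetrapeptide, the classic thioredoxin active-site sequence.

```python
import re

# A lookahead is used so that overlapping C-x-x-C windows are all reported.
CXXC = re.compile(r"(?=(C..C))")


def find_cxxc(seq):
    """Return (0-based start, motif) for every C-x-x-C window in seq."""
    return [(m.start(), m.group(1)) for m in CXXC.finditer(seq.upper())]


def classify_site(window):
    """Label a 4-residue window by which of the two motif cysteines it keeps."""
    window = window.upper()
    first, last = window[0] == "C", window[3] == "C"
    if first and last:
        return "both cysteines"
    if first:
        return "N-terminal cysteine only"
    if last:
        return "C-terminal cysteine only"
    return "no cysteine"
```

For instance, `find_cxxc("MWCGPCKA")` reports a single CGPC window starting at index 2, and `classify_site("CGPS")` labels a window that keeps only the first cysteine.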

2.
The first application of a novel technique for the identification of common folding motifs in proteins is presented. Using techniques derived from graph theory, developed in order to compare secondary structure motifs in proteins, we have established that there is a striking resemblance in the tertiary fold of the Salmonella typhimurium CheY chemotaxis protein and that of the GDP-binding domain of Escherichia coli elongation factor Tu (EF Tu). These two protein structures are representatives of two major macromolecular classes: CheY is a signal-transduction protein with sequence homologies to a wide range of bacterial proteins involved in regulation of chemotaxis, membrane synthesis and sporulation; whilst EF Tu is one of a family of guanosine-nucleotide-binding proteins which include the ras oncogene proteins and signal-transducing G proteins. The similarity we have found extends far beyond the previously recognized resemblances of each protein's fold to that of a generic nucleotide-binding domain. The lack of significant sequence homology between the two classes of proteins may mean that the common fold of the two proteins constitutes a particularly stable folding motif. However, an alternative possibility is that the strong three-dimensional structural resemblance may be indicative of a remote shared common ancestry between the bacterial signal-transduction proteins and the GDP-binding proteins.

3.
Thiol-dependent peroxidase systems are reviewed with special emphasis on their potential use as drug targets. The basic catalytic mechanisms of the two major thiol-peroxidase families, the glutathione peroxidases and the peroxiredoxins, are reasonably well understood. Sequence-based predictions of substrate specificities are still unsatisfactory. GPx-type enzymes are not generally specific for GSH but may specifically react with CXXC motifs as present in thioredoxins or tryparedoxins. Conversely, the peroxiredoxin family, which was believed to be specific for CXXC-type proteins, also comprises glutathione peroxidases. Since structure-based predictions of function are also limited by small databases, the increasing number of sequences emerging from genome projects requires enzymatic characterization and genetic proof of relevance before they can be classified as drug targets.

4.
We have collected a set of 44 Arabidopsis proteins with similarity to the USPA (universal stress protein A of Escherichia coli) domain of bacteria. The USPA domain is found either in small proteins, or it makes up the N-terminal portion of a larger protein, usually a protein kinase. Phylogenetic tree analysis based upon a multiple sequence alignment of the USPA domains shows that these domains of protein kinases 1.3.1 and 1.3.2 form distinct groups, as do the protein kinases 1.4.1. This indicates that their USPA domain structures have diverged appreciably and suggests that they may subserve distinct cellular functions. Two USPA fold classes have been proposed: one based on Methanococcus jannaschii MJ0577 (1MJH) that binds ATP, and the other based on the Haemophilus influenzae universal stress protein (1JMV), highly similar to E. coli UspA, which does not bind ATP. A set of common residues involved in ATP binding in 1MJH and conserved in similar bacterial sequences is also found in a distinct cluster of Arabidopsis sequences. Threading analysis, which examines aspects of secondary and tertiary structure, confirms this Arabidopsis sequence cluster as highly similar to 1MJH. This structural approach can distinguish between the characteristic fold differences of 1MJH-like and 1JMV-like bacterial proteins and was used to assign the complete set of candidate Arabidopsis proteins to one of these fold classes. It is clear that all the plant sequences have arisen from a 1MJH-like ancestor.

5.
Thiol-dependent peroxidase systems are reviewed with special emphasis on their potential use as drug targets. The basic catalytic mechanisms of the two major thiol-peroxidase families, the glutathione peroxidases and the peroxiredoxins, are reasonably well understood. Sequence-based predictions of substrate specificities are still unsatisfactory. GPx-type enzymes are not generally specific for GSH but may specifically react with CXXC motifs as present in thioredoxins or tryparedoxins. Conversely, the peroxiredoxin family, which was believed to be specific for CXXC-type proteins, also comprises glutathione peroxidases. Since structure-based predictions of function are also limited by small databases, the increasing number of sequences emerging from genome projects requires enzymatic characterization and genetic proof of relevance before they can be classified as drug targets.

6.
Comparison of ARM and HEAT protein repeats (total citations: 18; self-citations: 0; citations by others: 18)
ARM and HEAT motifs are tandemly repeated sequences of approximately 50 amino acid residues that occur in a wide variety of eukaryotic proteins. An exhaustive search of sequence databases detected new family members and revealed that at least 1 in 500 eukaryotic protein sequences contain such repeats. It also rendered the similarity between ARM and HEAT repeats, believed to be evolutionarily related, readily apparent. All the proteins identified in the database searches could be clustered by sequence similarity into four groups: canonical ARM-repeat proteins and three groups of the more divergent HEAT-repeat proteins. This allowed us to build improved sequence profiles for the automatic detection of repeat motifs. Inspection of these profiles indicated that the individual repeat motifs of all four classes share a common set of seven highly conserved hydrophobic residues, which in proteins of known three-dimensional structure are buried within or between repeats. However, the motifs differ at several specific residue positions, suggesting important structural or functional differences among the classes. Our results illustrate that ARM and HEAT-repeat proteins, while having a common phylogenetic origin, have since diverged significantly. We discuss evolutionary scenarios that could account for the great diversity of repeats observed.

7.
Prediction of short linear protein binding regions (total citations: 1; self-citations: 0; citations by others: 1)
Short linear motifs in proteins (typically 3-12 residues in length) play key roles in protein-protein interactions by frequently binding specifically to peptide binding domains within interacting proteins. Their tendency to be found in disordered segments of proteins has meant that they have often been overlooked. Here we present SLiMPred (short linear motif predictor), the first general de novo method designed to computationally predict such regions in protein primary sequences independent of experimentally defined homologs and interactors. The method applies machine learning techniques to predict new motifs based on annotated instances from the Eukaryotic Linear Motif database, as well as structural, biophysical, and biochemical features derived from the protein primary sequence. We have integrated these data sources and benchmarked the predictive accuracy of the method, and found that it performs equivalently to a predictor of protein binding regions in disordered regions, in addition to having predictive power for other classes of motif sites such as polyproline II helix motifs and short linear motifs lying in ordered regions. It will be useful in predicting peptides involved in potential protein associations and will aid in the functional characterization of proteins, especially of proteins lacking experimental information on structures and interactions. We conclude that, despite the diversity of motif sequences and structures, SLiMPred is a valuable tool for prioritizing potential interaction motifs in proteins.

8.
Zinc fingers--folds for many occasions (total citations: 1; self-citations: 0; citations by others: 1)
Matthews JM, Sunde M. IUBMB Life 2002, 54(6):351-355

9.
In the postgenomic era it is essential that protein sequences are annotated correctly in order to help in the assignment of their putative functions. Over 1300 proteins in current protein sequence databases are predicted to contain a PAS domain based upon amino acid sequence alignments. One of the problems with the current annotation of the PAS domain is that this domain exhibits limited similarity at the amino acid sequence level. It is therefore essential, when using proteins with low sequence similarity, to apply profile hidden Markov model searches for the PAS domain-containing proteins, as for the PFAM database. From recent 3D X-ray and NMR structures, however, PAS domains appear to have a conserved 3D fold as shown here by structural alignment of the six representative 3D structures from the PDB database. Large-scale modelling of the PAS sequences from the PFAM database against the 3D structures of these six structural prototypes was performed. All 3D models generated (>5700) were evaluated using PROSA II. We conclude from our large-scale modelling studies that the PAS and PAC motifs (which are separately defined in the PFAM database) are directly linked and that these two motifs form the PAS fold. The existing subdivision in PAS and PAC motifs, as used by the PFAM and SMART databases, appears to be caused by major differences in sequences in the region connecting these two motifs. This region, as has been shown by Gardner and coworkers for human PAS kinase (Amezcua, C.A., Harper, S.M., Rutter, J. & Gardner, K.H. (2002) Structure 10, 1349-1361, [1]), is very flexible and adopts different conformations depending on the bound ligand. Some PAS sequences present in the PFAM database did not produce a good structural model, even after realignment using a structure-based alignment method, suggesting that these representatives are unlikely to have a fold resembling any of the structural prototypes of the PAS domain superfamily.

10.
11.
Chelation therapy is one of the most appreciated methods in the treatment of metal-induced disease predisposition. Coordination chemistry provides a way to understand metal association in biological structures. In this work we have applied coordination chemistry to study nickel coordination, owing to the metal's heavy industrial usage and the resulting health consequences. This paper reports the analysis of nickel coordination from a large dataset of nickel-bound structures and sequences. Coordination patterns predicted from the structures are reported in terms of donors, chelate length, coordination number, chelate geometry, structural fold and architecture. The analysis revealed histidine as the most favored residue in nickel coordination. The most common chelates identified were histidine-based, namely HHH, HDH, HEH and HH spaced at specific intervals. Though a maximum coordination number of 8 was observed, the presence of a single protein donor was noted to be mandatory in nickel coordination. The coordination pattern did not reveal any specific fold; nevertheless, we report preferable residue spacing for specific structural architectures. In contrast, the analysis of nickel-binding proteins from bacterial and archaeal species revealed no common coordination patterns. Nickel-binding sequence motifs were noted to be organism specific and protein class specific. As a result we identified about 13 signatures derived from 13 classes of nickel-binding proteins. The specifications on nickel coordination presented in this paper will prove beneficial for developing better chelation strategies.

12.
We have used GRATH, a graph-based structure comparison algorithm, to map the similarities between the different folds observed in the CATH domain structure database. Statistical analysis of the distributions of the fold similarities has allowed us to assess the significance of any similarity. We have therefore examined whether it is best to represent folds as discrete entities or whether, in fact, a more accurate model would be a continuum wherein folds overlap via common motifs. To do this we have introduced a new statistical measure of fold similarity, termed gregariousness. For a particular fold, gregariousness measures how many other folds have a significant structural overlap with that fold, typically comprising 40% or more of the larger structure. Gregarious folds often contain commonly occurring super-secondary structural motifs, such as beta-meanders, greek keys, alpha-beta plait motifs or alpha-hairpins, which match similar motifs in other folds. Apart from one example, all the most gregarious folds, matching 20% or more of the other folds in the database, are alpha-beta proteins. They also occur in highly populated architectural regions of fold space, adopting sandwich-like arrangements containing two or more layers of alpha-helices and beta-strands. Domains that exhibit low gregariousness are those that have very distinctive folds, with few common motifs or motifs that are packed in unusual arrangements. Most of the superhelices exhibit low gregariousness despite containing some commonly occurring super-secondary structural motifs. In these folds, these common motifs are combined in an unusual way and represent a small proportion of the fold (<10%). Our results suggest that fold space may be considered as continuous for some architectural arrangements (e.g. alpha-beta sandwiches), in that super-secondary motifs can be used to link neighbouring fold groups. However, in other regions of fold space much more discrete topologies are observed with little similarity between folds.
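
The gregariousness idea can be sketched in a few lines of Python. This is a simplified illustration only: it replaces the statistical significance test described in the abstract with a plain overlap threshold, and the function name, the `overlap` dictionary layout and the toy fold labels in the usage note are all assumptions, not part of GRATH or CATH.

```python
def gregariousness(fold, folds, overlap, min_fraction=0.4):
    """Fraction of the other folds that overlap `fold` by at least min_fraction.

    `overlap` maps frozenset({fold_a, fold_b}) to the matched fraction of the
    larger structure; pairs that are missing are treated as having no overlap.
    """
    others = [f for f in folds if f != fold]
    if not others:
        return 0.0
    hits = sum(
        1 for f in others
        if overlap.get(frozenset((fold, f)), 0.0) >= min_fraction
    )
    return hits / len(others)
```

With four toy folds `a` to `d` where `a` overlaps `b` by 0.5 and `c` by 0.45 but `d` by only 0.1, `gregariousness("a", ...)` is 2/3, while `d` scores 0.0. The 0.4 default mirrors the "40% or more of the larger structure" figure quoted above.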

13.
Qi Y, Grishin NV. Proteins 2005, 58(2):376-388
Protein structure classification is necessary to comprehend the rapidly growing structural data for better understanding of protein evolution and sequence-structure-function relationships. Thioredoxins are important proteins that ubiquitously regulate cellular redox status and various other crucial functions. We define the thioredoxin-like fold using the structure consensus of thioredoxin homologs and consider all circular permutations of the fold. The search for thioredoxin-like fold proteins in the PDB database identified 723 protein domains. These domains are grouped into eleven evolutionary families based on combined sequence, structural, and functional evidence. Analysis of the protein-ligand structure complexes reveals two major active site locations for the thioredoxin-like proteins. Comparison to existing structure classifications reveals that our thioredoxin-like fold group is broader and more inclusive, unifying proteins from five SCOP folds, five CATH topologies and seven DALI domain dictionary globular folding topologies. Considering these structurally similar domains together sheds new light on the relationships between sequence, structure, function and evolution of thioredoxins.

14.
The 'immunoglobulin-like' fold is one of the most common structural motifs observed in proteins. This topology is found in more than 80 superfamilies of proteins, including Cu,Zn-superoxide dismutase (SOD) and cupredoxin. Evolutionary relationships have not been identified, but may exist. The challenge remains, therefore, of resolving the issue of whether the diverse distribution of the fold is accounted for by divergent evolution of function or convergent evolution of structure following multiple independent origins of function. Since the early studies that revealed conformational similarity of immunoglobulins and other proteins, the number of primary structures available for comparison has dramatically increased and new computational approaches for analysis of sequences have been developed. It now appears that a hypothesis of a common evolutionary origin for cupredoxins, Cu,Zn-SOD, and immunoglobulins may be credible. The distinction between protein homology and protein analogy is fundamental. The immunoglobulin-like fold may represent a robust system within which to examine again the issue of protein homology versus analogy.

15.
Human SCO1 and SCO2 are copper-binding proteins involved in the assembly of mitochondrial cytochrome c oxidase (COX). We have determined the crystal structure of the conserved, intermembrane space core portion of apo-hSCO1 to 2.8 Å. It is similar to redox active proteins, including thioredoxins (Trx) and peroxiredoxins (Prx), with putative copper-binding ligands located at the same positions as the conserved catalytic residues in Trx and Prx. SCO1 does not have disulfide isomerization or peroxidase activity, but both hSCO1 and a sco1 null in yeast show extreme sensitivity to hydrogen peroxide. Of the six missense mutations in SCO1 and SCO2 associated with fatal mitochondrial disorders, one lies in a highly conserved exposed surface away from the copper-binding region, suggesting that this region is involved in protein-protein interactions. These data suggest that SCO functions not as a COX copper chaperone, but rather as a mitochondrial redox signaling molecule.

16.
Experimental proteome analysis was combined with a genome-wide prediction screen to characterize the protein content of the thylakoid lumen of Arabidopsis chloroplasts. Soluble thylakoid proteins were separated by two-dimensional electrophoresis and identified by mass spectrometry. The identities of 81 proteins were established, and N termini were sequenced to validate localization prediction. Gene annotation of the identified proteins was corrected by experimental data, and an interesting case of alternative splicing was discovered. Expression of a surprising number of paralogs was detected. Expression of five isomerases of different classes suggests strong (un)folding activity in the thylakoid lumen. These isomerases possibly are connected to a network of peripheral and lumenal proteins involved in antioxidative response, including peroxiredoxins, m-type thioredoxins, and a lumenal ascorbate peroxidase. Characteristics of the experimentally identified lumenal proteins and their orthologs were used for a genome-wide prediction of the lumenal proteome. Lumenal proteins with a typical twin-arginine translocation motif were predicted with good accuracy and sensitivity and included additional isomerases and proteases. Thus, prime functions of the lumenal proteome include assistance in the folding and proteolysis of thylakoid proteins as well as protection against oxidative stress. Many of the predicted lumenal proteins must be present at concentrations at least 10,000-fold lower than proteins of the photosynthetic apparatus.

17.
ATM/ATR-like protein kinases play central roles in the maintenance of genome stability and phosphorylate numerous substrates in response to DNA damage, preferentially on SQ or TQ motifs. ATM/ATR substrates often contain several closely spaced SQ/TQ motifs in regions that have been termed SQ/TQ cluster domains (SCDs). SCDs are now considered a structural hallmark of DNA-damage-response proteins. Mutational analyses of a number of SCD-containing proteins indicate that multisite phosphorylation of SQ/TQ motifs is required for normal DNA-damage responses, most commonly by mediating protein-protein interactions in the formation of DNA-damage-induced complexes. SCD sequences are highly diverse and these domains may be largely unfolded in their native state rather than adopting a common three-dimensional fold. Structural disorder of SCDs could be advantageous for efficient phosphorylation by ATM/ATR kinases and also enable them to be molded into distinct conformations to facilitate flexible interactions with multiple binding partners.
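
To make "several closely spaced SQ/TQ motifs" concrete, here is a minimal Python sketch that lists SQ/TQ dipeptides and checks for a cluster with a sliding window. The abstract gives no numerical definition of a cluster, so the defaults below (`min_sites=3`, `window=100`) are illustrative assumptions, and both function names are hypothetical.

```python
import re


def sq_tq_positions(seq):
    """Return the 0-based start position of every SQ or TQ dipeptide in seq."""
    return [m.start() for m in re.finditer(r"[ST]Q", seq.upper())]


def has_sq_tq_cluster(seq, min_sites=3, window=100):
    """True if any `min_sites` consecutive SQ/TQ sites start within `window` residues.

    Both thresholds are assumed values chosen for illustration only.
    """
    pos = sq_tq_positions(seq)
    for i in range(len(pos) - min_sites + 1):
        # Distance between the first and last site of this candidate group.
        if pos[i + min_sites - 1] - pos[i] < window:
            return True
    return False
```

On the made-up peptide `"ASQAATQAASQA"` the sites start at positions 1, 5 and 9, so the default thresholds report a cluster, while tightening `window` to 5 does not.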

18.
Small hemoproteins displaying amino acid sequences 20-40 residues shorter than (non-)vertebrate hemoglobins (Hbs) have recently been identified in several pathogenic and non-pathogenic unicellular organisms, and named 'truncated hemoglobins' (trHbs). They have been proposed to be involved not only in oxygen transport but also in other biological functions, such as protection against reactive nitrogen species, photosynthesis or to act as terminal oxidases. Crystal structures of trHbs from the ciliated protozoan Paramecium caudatum and the green unicellular alga Chlamydomonas eugametos show that the tertiary structure of both proteins is based on a 'two-over-two' alpha-helical sandwich, reflecting an unprecedented editing of the classical 'three-over-three' alpha-helical globin fold. Based on specific Gly-Gly motifs the tertiary structure accommodates the deletion of the N-terminal A-helix and replacement of the crucial heme-binding F-helix with an extended polypeptide loop. Additionally, concerted structural modifications allow burying of the heme group and define the distal site, which hosts a TyrB10, GlnE7 residue pair. A set of structural and amino acid sequence consensus rules for stabilizing the fold and the bound heme in the trHbs homology subfamily is deduced.

19.
Detection of similarity is particularly difficult for small proteins and thus connections between many of them remain unnoticed. Structure and sequence analysis of several metal-binding proteins reveals unexpected similarities in structural domains classified as different protein folds in SCOP and suggests unification of seven folds that belong to two protein classes. The common motif, termed treble clef finger in this study, forms the protein structural core and is 25-45 residues long. The treble clef motif is assembled around the central zinc ion and consists of a zinc knuckle, loop, beta-hairpin and an alpha-helix. The knuckle and the first turn of the helix each incorporate two zinc ligands. Treble clef domains constitute the core of many structures such as ribosomal proteins L24E and S14, RING fingers, protein kinase cysteine-rich domains, nuclear receptor-like fingers, LIM domains, phosphatidylinositol-3-phosphate-binding domains and His-Me finger endonucleases. The treble clef finger is a uniquely versatile motif adaptable for various functions. This small domain with a 25-residue structural core can accommodate eight different metal-binding sites and can have many types of functions, from binding of nucleic acids, proteins and small molecules to catalysis of phosphodiester bond hydrolysis. Treble clef motifs are frequently incorporated in larger structures or occur in doublets. The present analysis suggests that the treble clef motif defines a distinct structural fold found in proteins with diverse functional properties and forms one of the major zinc finger groups.

20.
Knowledge of three-dimensional structure is essential to understand the function of a protein. Although the overall fold is determined by the full details of its sequence, a small group of residues, often called structural motifs, plays a crucial role in determining the protein fold and its stability. Identification of such structural motifs requires a sufficient number of sequence and structural homologs to define conservation and evolutionary information. Unfortunately, many structures in the protein structure databases have no homologous structures or sequences. In this work, we report an SVM method, SMpred, to identify structural motifs from a single protein structure without using sequence and structural homologs. The SMpred method was trained and tested using 132 protein domains containing 581 motifs. It achieved 78.79% accuracy with 79.06% sensitivity and 78.53% specificity. The performance of SMpred was evaluated with MegaMotifBase using 188 proteins containing 1161 motifs. Out of 1161 motifs, SMpred correctly identified 1503 structural motifs reported in MegaMotifBase. Further, we showed that SMpred is a useful approach for the length-deviant superfamilies and single-member superfamilies. This result suggests the usefulness of our approach for facilitating the identification of structural motifs in protein structures in the absence of sequence and structural homologs. The dataset and executable for the SMpred algorithm are available at http://www3.ntu.edu.sg/home/EPNSugan/index_files/SMpred.htm.
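
For readers unfamiliar with the three figures reported above, the short Python sketch below shows the standard definitions of accuracy, sensitivity and specificity for a binary classifier. The function name is hypothetical and the counts in the usage note are invented for illustration; they are not the SMpred confusion matrix, which the abstract does not give.

```python
def binary_metrics(tp, fp, tn, fn):
    """Return (accuracy, sensitivity, specificity) from confusion-matrix counts.

    tp/fn: true motif residues predicted as motif / missed.
    tn/fp: non-motif residues predicted as non-motif / wrongly flagged.
    """
    sensitivity = tp / (tp + fn)          # fraction of real positives recovered
    specificity = tn / (tn + fp)          # fraction of real negatives recovered
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return accuracy, sensitivity, specificity
```

With invented counts `tp=8, fp=1, tn=9, fn=2`, the sensitivity is 8/10, the specificity 9/10 and the accuracy 17/20.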
