Similar literature
 Found 20 similar articles (search time: 15 ms)
1.
Spinocerebellar ataxia types 2 (SCA2) and 3 (SCA3) are autosomal-dominantly inherited, neurodegenerative diseases caused by CAG repeat expansions in the coding regions of the genes encoding ataxin-2 and ataxin-3, respectively. To provide a rationale for further functional experiments, we explored the protein architectures of ataxin-2 and ataxin-3. Using structure-based multiple sequence alignments of homologous proteins, we investigated domains, sequence motifs, and interaction partners. Our analyses focused on presumably functional amino acids and the construction of tertiary structure models of the RNA-binding Lsm domain of ataxin-2 and the deubiquitinating Josephin domain of ataxin-3. We also speculate about distant evolutionary relationships of ubiquitin-binding UIM, GAT, UBA and CUE domains and helical ANTH and UBX domain extensions.

2.
The current pace of structural biology now means that protein three-dimensional structure can be known before protein function, making methods for assigning homology via structure comparison of growing importance. Previous research has suggested that sequence similarity after structure-based alignment is one of the best discriminators of homology and often functional similarity. Here, we exploit this observation, together with a merger of protein structure and sequence databases, to predict distant homologous relationships. We use the Structural Classification of Proteins (SCOP) database to link sequence alignments from the SMART and Pfam databases. We thus provide new alignments that could not be constructed easily in the absence of known three-dimensional structures. We then extend the method of Murzin (1993b) to assign statistical significance to sequence identities found after structural alignment and thus suggest the best link between diverse sequence families. We find that several distantly related protein sequence families can be linked with confidence, showing the approach to be a means for inferring homologous relationships and thus possible functions when proteins are of known structure but of unknown function. The analysis also finds several new potential superfamilies, where inspection of the associated alignments and superimpositions reveals conservation of unusual structural features or co-location of conserved amino acids and bound substrates. We discuss implications for Structural Genomics initiatives and for improvements to sequence comparison methods.
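To make the idea of "statistical significance of sequence identities after structural alignment" concrete, here is a minimal sketch. It is not the Murzin (1993b) procedure or the authors' extension of it, neither of which is specified in the abstract. It substitutes a plain binomial tail test, and both the function name `identity_p_value` and the background match probability `p_match` are assumptions made for illustration.

```python
from math import comb


def identity_p_value(n_aligned: int, n_identical: int, p_match: float = 0.05) -> float:
    """Probability of seeing at least `n_identical` identities by chance.

    Assumption: every structurally aligned position matches independently
    with probability `p_match` (0.05 corresponds to 20 equally likely
    amino acids). This is a simplified stand-in, not the published method.
    """
    if not 0 <= n_identical <= n_aligned:
        raise ValueError("n_identical must lie between 0 and n_aligned")
    # Upper tail of the binomial distribution: P(X >= n_identical).
    return sum(
        comb(n_aligned, k) * p_match**k * (1 - p_match) ** (n_aligned - k)
        for k in range(n_identical, n_aligned + 1)
    )
```

Under this toy model, a smaller value means the observed identities are less likely to be a chance effect. A realistic treatment would use residue-specific background frequencies rather than a single uniform probability.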

3.
Ataxin-3 belongs to the family of polyglutamine proteins, which are associated with nine different neurodegenerative disorders. Relatively little is known about the structural and functional properties of ataxin-3, and only recently have these aspects of the protein begun to be explored. We have performed a preliminary investigation into the conserved N-terminal domain of ataxin-3, termed Josephin. We show that Josephin is a monomeric domain which folds into a globular conformation and possesses ubiquitin protease activity. In addition, we demonstrate that the presence of the polyglutamine region of the protein does not alter the structure of the protein. However, its presence destabilizes the Josephin domain. The implications of these data in the pathogenesis of polyglutamine repeat proteins are discussed.

4.
Structural genomics is the idea of covering protein space so that every protein sequence comes within model building distance of a protein of known structure. Unfortunately, reproducing the structural alignment of distantly related proteins is a difficult challenge to existing sequence alignment and motif search software. We have developed a new transitive alignment algorithm (MaxFlow), which generates accurate alignments between proteins deep in the twilight zone of sequence similarity, below 20% sequence identity. In particular, MaxFlow reliably identifies conserved core motifs between proteins which are only indirect PSI-Blast neighbours. Based on MaxFlow alignments, useful 3D models can be generated for all members of a superfamily from as few as a single structural template – despite hundreds of representatives at 40% sequence identity level and patchy detection of homology by PSI-Blast. We propose novel strategies for target prioritization using MaxFlow scores to predict the optimal templates in a superfamily. Our results support an increase in the granularity of covering protein space that has potentially enormous economic implications for planning the transition to the full production phase of structural genomics.
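The notion of a transitive alignment can be illustrated with a very small sketch. It shows only how an alignment between two proteins may be inferred through an intermediate sequence. It is not the MaxFlow algorithm, whose scoring and path selection are not described in the abstract. The function name `compose_alignments` and the representation of an alignment as a dictionary of residue indices are assumptions made here.

```python
def compose_alignments(a_to_b: dict[int, int], b_to_c: dict[int, int]) -> dict[int, int]:
    """Infer an A-to-C residue mapping through an intermediate protein B.

    Each alignment maps a residue index in one protein to the index of the
    residue it is aligned with in the other. A residue of A is carried over
    only if its partner in B is itself aligned to a residue in C.
    (Hypothetical helper, for illustration only.)
    """
    return {i: b_to_c[j] for i, j in a_to_b.items() if j in b_to_c}


# Toy example: A and C are linked only through B.
a_to_b = {0: 2, 1: 3, 2: 4}
b_to_c = {3: 0, 4: 1, 5: 2}
inferred = compose_alignments(a_to_b, b_to_c)
```

In this toy case residue 0 of A is dropped because its partner in B has no counterpart in C, which is one reason a practical method would combine many intermediates rather than rely on a single path.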

5.
C Sander, R Schneider. Proteins, 1991, 9(1): 56-68
The database of known protein three-dimensional structures can be significantly increased by the use of sequence homology, based on the following observations. (1) The database of known sequences, currently at more than 12,000 proteins, is two orders of magnitude larger than the database of known structures. (2) The currently most powerful method of predicting protein structures is model building by homology. (3) Structural homology can be inferred from the level of sequence similarity. (4) The threshold of sequence similarity sufficient for structural homology depends strongly on the length of the alignment. Here, we first quantify the relation between sequence similarity, structure similarity, and alignment length by an exhaustive survey of alignments between proteins of known structure and report a homology threshold curve as a function of alignment length. We then produce a database of homology-derived secondary structure of proteins (HSSP) by aligning to each protein of known structure all sequences deemed homologous on the basis of the threshold curve. For each known protein structure, the derived database contains the aligned sequences, secondary structure, sequence variability, and sequence profile. Tertiary structures of the aligned sequences are implied, but not modeled explicitly. The database effectively increases the number of known protein structures by a factor of five to more than 1800. The results may be useful in assessing the structural significance of matches in sequence database searches, in deriving preferences and patterns for structure prediction, in elucidating the structural role of conserved residues, and in modeling three-dimensional detail by homology.
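A length-dependent homology threshold of the kind described above can be sketched as follows. The abstract does not give the curve itself. The power-law form and the default constants below are the values commonly quoted for the HSSP curve and should be treated as assumptions, as should the function names.

```python
def hssp_threshold(
    length: int,
    a: float = 290.15,        # assumed scale constant (not in the abstract)
    b: float = -0.562,        # assumed exponent (not in the abstract)
    plateau_length: int = 80, # assumed length beyond which the curve is flat
    plateau: float = 24.8,    # assumed percent identity on the plateau
) -> float:
    """Percent identity required to infer structural homology at a given
    alignment length, using an assumed power-law threshold curve."""
    if length < 10:
        raise ValueError("alignment too short for the threshold curve")
    if length > plateau_length:
        return plateau
    return a * length**b


def is_structurally_homologous(percent_identity: float, length: int) -> bool:
    """True when the alignment lies on or above the threshold curve."""
    return percent_identity >= hssp_threshold(length)
```

With these assumed constants, short alignments need a much higher identity than long ones, which is the qualitative behaviour stated in observation (4).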

6.
MOTIVATION: An estimated 25% of all eukaryotic proteins contain repeats, which underlines the importance of duplication for evolving new protein functions. Internal repeats often correspond to structural or functional units in proteins. Methods capable of identifying diverged repeated segments or domains at the sequence level can therefore assist in predicting domain structures, inferring hypotheses about function and mechanism, and investigating the evolution of proteins from smaller fragments. RESULTS: We present HHrepID, a method for the de novo identification of repeats in protein sequences. It is able to detect the sequence signature of structural repeats in many proteins that have not yet been known to possess internal sequence symmetry, such as outer membrane beta-barrels. HHrepID uses HMM-HMM comparison to exploit evolutionary information in the form of multiple sequence alignments of homologs. In contrast to a previous method, the new method (1) generates a multiple alignment of repeats; (2) utilizes the transitive nature of homology through a novel merging procedure with fully probabilistic treatment of alignments; (3) improves alignment quality through an algorithm that maximizes the expected accuracy; (4) is able to identify different kinds of repeats within complex architectures by a probabilistic domain boundary detection method and (5) improves sensitivity through a new approach to assess statistical significance. AVAILABILITY: Server: http://toolkit.tuebingen.mpg.de/hhrepid; Executables: ftp://ftp.tuebingen.mpg.de/pub/protevo/HHrepID

7.
8.
EF-hand calcium binding proteins (CaBPs) share strong sequence homology, but exhibit great diversity in structure and function. Thus although calmodulin (CaM) and calcineurin B (CNB) both consist of four EF hands, their domain arrangements are quite distinct. CaM and the CaM-like proteins are characterized by an extended architecture, whereas CNB and the CNB-like proteins have a more compact form. In this study, we performed structural alignments and molecular dynamics (MD) simulations on 3 CaM-like proteins and 6 CNB-like proteins, and quantified their distinct structural and dynamical features in an effort to establish how their sequences specify their structures and dynamics. Alignments of the EF2-EF3 region of these proteins revealed that several residues (not restricted to the linker between the EF2 and EF3 motifs) differed between the two groups of proteins. A customized inverse folding approach followed by structural assessments and MD simulations established the critical role of these residues in determining the structure of the proteins. Identification of the critical determinants of the two different EF-hand domain arrangements and the distinct dynamical features relevant to their respective functions provides insight into the relationships between sequence, structure, dynamics and function among these EF-hand CaBPs.

10.
Deubiquitinating enzymes (DUbs) play important roles in many ubiquitin-dependent pathways, yet how DUbs themselves are regulated is not well understood. Here, we provide insight into the mechanism by which ubiquitination directly enhances the activity of ataxin-3, a DUb implicated in protein quality control and the disease protein in the polyglutamine neurodegenerative disorder, Spinocerebellar Ataxia Type 3. We identify Lys-117, which resides near the catalytic triad, as the primary site of ubiquitination in wild type and pathogenic ataxin-3. Further studies indicate that ubiquitin-dependent activation of ataxin-3 at Lys-117 is important for its ability to reduce high molecular weight ubiquitinated species in cells. Ubiquitination at Lys-117 also facilitates the ability of ataxin-3 to induce aggresome formation in cells. Finally, structure-function studies support a model of activation whereby ubiquitination at Lys-117 enhances ataxin-3 activity independent of the known ubiquitin-binding sites in ataxin-3, most likely through a direct conformational change in or near the catalytic domain.

11.
12.

Background  

Accurate sequence alignments are essential for homology searches and for building three-dimensional structural models of proteins. Since structure is better conserved than sequence, structure alignments have been used to guide sequence alignments and are commonly used as the gold standard for sequence alignment evaluation. Nonetheless, as far as we know, there is no report of a systematic evaluation of pairwise structure alignment programs in terms of the sequence alignment accuracy.

13.
An open question in protein homology modeling is, how well do current modeling packages satisfy the dual criteria of quality of results and practical ease of use? To address this question objectively, we examined homology-built models of a variety of therapeutically relevant proteins. The sequence identities across these proteins range from 19% to 76%. A novel metric, the difference alignment index (DAI), is developed to aid in quantifying the quality of local sequence alignments. The DAI is also used to construct the relative sequence alignment (RSA), a new representation of global sequence alignment that facilitates comparison of sequence alignments from different methods. Comparisons of the sequence alignments in terms of the RSA and alignment methodologies are made to better understand the advantages and caveats of each method. All sequence alignments and corresponding 3D models are compared to their respective structure-based alignments and crystal structures. A variety of protein modeling software was used. We find that at sequence identities >40%, all packages give similar (and satisfactory) results; at lower sequence identities (<25%), the sequence alignments generated by Profit and Prime, which incorporate structural information in their sequence alignment, stand out from the rest. Moreover, the model generated by Prime in this low sequence identity region is noted to be superior to the rest. Additionally, we note that DSModeler and MOE, which generate reasonable models for sequence identities >25%, are significantly more functional and easier to use when compared with the other structure-building software.

14.
The Josephin domain is a conserved cysteine protease domain found in four human deubiquitinating enzymes: ataxin-3, the ataxin-3-like protein (ATXN3L), Josephin-1, and Josephin-2. Josephin domains from these four proteins were purified and assayed for their ability to cleave ubiquitin substrates. Reaction rates differed markedly both among the different proteins and for different substrates with a given protein. The ATXN3L Josephin domain is a significantly more efficient enzyme than the ataxin-3 domain despite their sharing 85% sequence identity. To understand the structural basis of this difference, the 2.6 Å x-ray crystal structure of the ATXN3L Josephin domain in complex with ubiquitin was determined. Although ataxin-3 and ATXN3L adopt similar folds, they bind ubiquitin in different, overlapping sites. Mutations were made in ataxin-3 at selected positions, introducing the corresponding ATXN3L residue. Only three such mutations are sufficient to increase the catalytic activity of the ataxin-3 domain to levels comparable with that of ATXN3L, suggesting that ataxin-3 has been subject to evolutionary restraints that keep its deubiquitinating activity in check.

15.
This paper evaluates the results of a protein structure prediction contest. The predictions were made using threading procedures, which employ techniques for aligning sequences with 3D structures to select the correct fold of a given sequence from a set of alternatives. Nine different teams submitted 86 predictions, on a total of 21 target proteins with little or no sequence homology to proteins of known structure. The 3D structures of these proteins were newly determined by experimental methods, but not yet published or otherwise available to the predictors. The predictions, made from the amino acid sequence alone, thus represent a genuine test of the current performance of threading methods. Only a subset of all the predictions is evaluated here. It corresponds to the 44 predictions submitted for the 11 target proteins seen to adopt known folds. The predictions for the remaining 10 proteins were not analyzed, although weak similarities with known folds may also exist in these proteins. We find that threading methods are capable of identifying the correct fold in many cases, but not reliably enough as yet. Every team predicts correctly a different set of targets, with virtually all targets predicted correctly by at least one team. Also, common folds such as TIM barrels are recognized more readily than folds with only a few known examples. However, quite surprisingly, the quality of the sequence-structure alignments, corresponding to correctly recognized folds, is generally very poor, as judged by comparison with the corresponding 3D structure alignments. Thus, threading can presently not be relied upon to derive a detailed 3D model from the amino acid sequence. This raises a very intriguing question: how is fold recognition achieved? Our analysis suggests that it may be achieved because threading procedures maximize hydrophobic interactions in the protein core, and are reasonably good at recognizing local secondary structure. © 1995 Wiley-Liss, Inc.

16.
MOTIVATION: In recent years, advances have been made in the ability of computational methods to discriminate between homologous and non-homologous proteins in the 'twilight zone' of sequence similarity, where the percent sequence identity is a poor indicator of homology. To make these predictions more valuable to the protein modeler, they must be accompanied by accurate alignments. Pairwise sequence alignments are inferences of orthologous relationships between sequence positions. Evolutionary distance is traditionally modeled using global amino acid substitution matrices. But real differences in the likelihood of substitutions may exist for different structural contexts within proteins, since structural context contributes to the selective pressure. RESULTS: HMMSUM (HMMSTR-based substitution matrices) is a new model for structural context-based amino acid substitution probabilities consisting of a set of 281 matrices, each for a different sequence-structure context. HMMSUM does not require the structure of the protein to be known. Instead, predictions of local structure are made using HMMSTR, a hidden Markov model for local structure. Alignments using the HMMSUM matrices compare favorably to alignments carried out using the BLOSUM matrices or structure-based substitution matrices SDM and HSDM when validated against remote homolog alignments from BAliBASE. HMMSUM has been implemented using local Dynamic Programming and with the Bayesian Adaptive alignment method.
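The combination of local dynamic programming with context-dependent substitution scores can be sketched as below. This is a generic Smith-Waterman-style recurrence with a linear gap penalty, not the HMMSUM implementation. The two toy scoring rules stand in for the 281 matrices, and the context labels, gap penalty and function names are all assumptions chosen for illustration.

```python
from typing import Callable, Sequence


def local_alignment_score(
    seq_a: str,
    seq_b: str,
    score: Callable[[int, int], float],
    gap: float = -4.0,  # assumed linear gap penalty
) -> float:
    """Best local alignment score, where score(i, j) may depend on position
    and therefore on the structural context predicted at that position."""
    prev = [0.0] * (len(seq_b) + 1)
    best = 0.0
    for i in range(1, len(seq_a) + 1):
        cur = [0.0] * (len(seq_b) + 1)
        for j in range(1, len(seq_b) + 1):
            cur[j] = max(
                0.0,                                 # start a new local alignment
                prev[j - 1] + score(i - 1, j - 1),   # align residue i with residue j
                prev[j] + gap,                       # gap in seq_b
                cur[j - 1] + gap,                    # gap in seq_a
            )
            best = max(best, cur[j])
        prev = cur
    return best


def make_context_score(
    seq_a: str,
    seq_b: str,
    contexts_a: Sequence[str],
    matrices: dict[str, Callable[[str, str], float]],
) -> Callable[[int, int], float]:
    """Select the substitution rule by the context label at position i of seq_a."""
    def score(i: int, j: int) -> float:
        return matrices[contexts_a[i]](seq_a[i], seq_b[j])
    return score


# Toy stand-ins for context-specific matrices (hypothetical values).
toy_matrices = {
    "H": lambda x, y: 2.0 if x == y else -1.0,
    "E": lambda x, y: 3.0 if x == y else -2.0,
}
```

A usage example under these assumptions: `local_alignment_score("ACD", "ACD", make_context_score("ACD", "ACD", "HHE", toy_matrices))` scores the three matches with the rule selected by each position's context label.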

17.
Cytoplasm-nucleus shuttling of phosphoinositol 3-kinase enhancer (PIKE) is known to correlate directly with its cellular functions. However, the molecular mechanism governing this shuttling is not known. In this work, we demonstrate that PIKE is a new member of split pleckstrin homology (PH) domain-containing proteins. The structure solved in this work reveals that the PIKE PH domain is split into halves by a positively charged nuclear localization sequence. The PIKE PH domain binds to the head groups of di- and triphosphoinositides with similar affinities. Lipid membrane binding of the PIKE PH domain is further enhanced by the positively charged nuclear localization sequence, which is juxtaposed to the phosphoinositide head group-binding pocket of the domain. We demonstrate that the cytoplasmic-nuclear shuttling of PIKE is dynamically regulated by the balancing actions of the lipid-binding property of both the split PH domain and the nuclear targeting function of its nuclear localization sequence.

18.
Kosloff M, Kolodny R. Proteins, 2008, 71(2): 891-902
It is often assumed that in the Protein Data Bank (PDB), two proteins with similar sequences will also have similar structures. Accordingly, it has proved useful to develop subsets of the PDB from which "redundant" structures have been removed, based on a sequence-based criterion for similarity. Similarly, when predicting protein structure using homology modeling, if a template structure for modeling a target sequence is selected by sequence alone, this implicitly assumes that all sequence-similar templates are equivalent. Here, we show that this assumption is often not correct and that standard approaches to create subsets of the PDB can lead to the loss of structurally and functionally important information. We have carried out sequence-based structural superpositions and geometry-based structural alignments of a large number of protein pairs to determine the extent to which sequence similarity ensures structural similarity. We find many examples where two proteins that are similar in sequence have structures that differ significantly from one another. The source of the structural differences usually has a functional basis. The number of such protein pairs that are identified and the magnitude of the dissimilarity depend on the approach that is used to calculate the differences; in particular, sequence-based structure superpositioning will identify a larger number of structurally dissimilar pairs than geometry-based structural alignments. When two sequences can be aligned in a statistically meaningful way, sequence-based structural superpositioning provides a meaningful measure of structural differences. This approach and geometry-based structure alignments reveal somewhat different information and one or the other might be preferable in a given application. Our results suggest that in some cases, notably homology modeling, the common use of nonredundant datasets, culled from the PDB based on sequence, may mask important structural and functional information. We have established a database of sequence-similar, structurally dissimilar protein pairs that will help address this problem (http://luna.bioc.columbia.edu/rachel/seqsimstrdiff.htm).

19.
Comparing and classifying the three-dimensional (3D) structures of proteins is of crucial importance to molecular biology, from helping to determine the function of a protein to determining its evolutionary relationships. Traditionally, 3D structures are classified into groups of families that closely resemble the grouping according to their primary sequence. However, significant structural similarities exist at multiple levels between proteins that belong to these different structural families. In this study, we propose a new algorithm, CLICK, to capture such similarities. The method optimally superimposes a pair of protein structures independent of topology. Amino acid residues are represented by the Cartesian coordinates of a representative point (usually the Cα atom), side chain solvent accessibility, and secondary structure. Structural comparison is effected by matching cliques of points. CLICK was extensively benchmarked for alignment accuracy on four different sets: (i) 9537 pair-wise alignments between two structures with the same topology; (ii) 64 alignments from set (i) that were considered to constitute difficult alignment cases; (iii) 199 pair-wise alignments between proteins with similar structure but different topology; and (iv) 1275 pair-wise alignments of RNA structures. The accuracy of CLICK alignments was measured by the average structure overlap score and compared with other alignment methods, including HOMSTRAD, MUSTANG, Geometric Hashing, SALIGN, DALI, GANGSTA+, FATCAT, ARTS and SARA. On average, CLICK produces pair-wise alignments that are either comparable or statistically significantly more accurate than all of these other methods. We have used CLICK to uncover relationships between (previously) unrelated proteins. These new biological insights include: (i) detecting hinge regions in proteins where domain or sub-domains show flexibility; (ii) discovering similar small molecule binding sites from proteins of different folds and (iii) discovering topological variants of known structural/sequence motifs. Our method can generally be applied to compare any pair of molecular structures represented in Cartesian coordinates as exemplified by the RNA structure superimposition benchmark.
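A structure overlap score of the kind used to judge the alignments can be sketched as follows. The abstract does not define the score. This sketch assumes it is the percentage of representative atoms that lie within a distance cutoff of their matched partners once the structures are already superimposed. The 3.5 Å cutoff, the normalisation by the smaller structure and the function name are assumptions, and the clique-matching step itself is not shown.

```python
from math import dist
from typing import Sequence

Point = tuple[float, float, float]


def structure_overlap(
    coords_a: Sequence[Point],
    coords_b: Sequence[Point],
    pairs: Sequence[tuple[int, int]],
    cutoff: float = 3.5,  # assumed distance cutoff in angstroms
) -> float:
    """Percentage of representative atoms whose matched partner lies within
    `cutoff`, assuming both coordinate sets are already superimposed."""
    smaller = min(len(coords_a), len(coords_b))
    if smaller == 0:
        raise ValueError("both structures need at least one point")
    # Count matched pairs (i in A, j in B) that end up close in space.
    close = sum(1 for i, j in pairs if dist(coords_a[i], coords_b[j]) <= cutoff)
    return 100.0 * close / smaller


# Toy example: three matched points, one of which is far from its partner.
a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
b = [(0.0, 0.0, 1.0), (1.0, 0.0, 10.0), (2.0, 0.0, 0.0)]
overlap = structure_overlap(a, b, [(0, 0), (1, 1), (2, 2)])
```

Because the score depends only on matched coordinates, the same function would apply to any pair of molecular structures given as Cartesian points, in line with the RNA benchmark mentioned above.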

20.
In prokaryotes, DNA replication is initiated by the binding of DnaA to the oriC region of the chromosome to load the primosome machinery and start a new replication round. Several proteins control these events in Escherichia coli to ensure that replication is precisely timed during the cell cycle. Here, we report the crystal structure of HobA (HP1230) at 1.7 Å, a recently discovered protein that specifically interacts with DnaA protein from Helicobacter pylori (HpDnaA). We found that the closest structural homologue of HobA is a sugar isomerase (SIS) domain containing protein, the phosphoheptose isomerase from Pseudomonas aeruginosa. Remarkably, SIS proteins share strong sequence homology with DiaA from E. coli; yet, HobA and DiaA share no sequence homology. Thus, by solving the structure of HobA, we unexpectedly discovered that HobA is a H. pylori structural homologue of DiaA. By comparing the structure of HobA to a homology model of DiaA, we identified conserved, surface-accessible residues that could be involved in protein-protein interaction. Finally, we show that HobA specifically interacts with the N-terminal part of HpDnaA. The structural homology between DiaA and HobA strongly supports their involvement in the replication process and these proteins could define a new structural family of replication regulators in bacteria.
