Similar literature
1.
Abhiman S  Sonnhammer EL 《Proteins》2005,60(4):758-768
Protein function shift can be predicted from sequence comparisons, either using positive selection signals or evolutionary rate estimation. None of the methods have been validated on large datasets, however. Here we investigate existing and novel methods for protein function shift prediction, and benchmark the accuracy against a large dataset of proteins with known enzymatic functions. Function change was predicted between subfamilies by identifying two kinds of sites in a multiple sequence alignment: Conservation-Shifting Sites (CSS), which are conserved in two subfamilies using two different amino acid types, and Rate-Shifting Sites (RSS), which have different evolutionary rates in two subfamilies. CSS were predicted by a new entropy-based method, and RSS using the Rate-Shift program. In principle, the more CSS and RSS between two subfamilies, the more likely a function shift between them. A test dataset was built by extracting subfamilies from Pfam with different EC numbers that belong to the same domain family. Subfamilies were generated automatically using a phylogenetic tree-based program, BETE. The dataset comprised 997 subfamily pairs with four or more members per subfamily. We observed a significant increase in CSS and RSS for subfamily comparisons with different EC numbers compared to cases with same EC numbers. The discrimination was better using RSS than CSS, and was more pronounced for larger families. Combining RSS and CSS by discriminant analysis improved classification accuracy to 71%. The method was applied to the Pfam database and the results are available at http://FunShift.cgb.ki.se. A closer examination of some superfamily comparisons showed that single EC numbers sometimes embody distinct functional classes. Hence, the measured accuracy of function shift is underestimated.
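The abstract above describes Conservation-Shifting Sites only in words, so a small sketch may help make the idea concrete. The code below is not the authors' entropy-based method; it is a minimal illustration that assumes a column counts as conserved when its Shannon entropy is below a hypothetical threshold (`max_entropy`), and that a CSS is a column conserved in both subfamilies with different consensus residues. The function names and the toy alignments are invented for illustration.

```python
import math
from collections import Counter


def column_entropy(residues):
    """Shannon entropy (in bits) of the residues in one alignment column."""
    counts = Counter(residues)
    total = len(residues)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def conservation_shifting_sites(sub_a, sub_b, max_entropy=0.5):
    """Return 0-based columns conserved in both subfamilies but with
    different consensus residues.

    sub_a, sub_b: lists of aligned sequences of equal length.
    max_entropy: hypothetical conservation threshold (an assumption,
    not a value taken from the paper).
    """
    sites = []
    for col in range(len(sub_a[0])):
        col_a = [seq[col] for seq in sub_a]
        col_b = [seq[col] for seq in sub_b]
        # Skip columns that are variable in either subfamily.
        if column_entropy(col_a) > max_entropy or column_entropy(col_b) > max_entropy:
            continue
        top_a = Counter(col_a).most_common(1)[0][0]
        top_b = Counter(col_b).most_common(1)[0][0]
        # Ignore gap-dominated columns; keep columns whose consensus differs.
        if top_a != top_b and "-" not in (top_a, top_b):
            sites.append(col)
    return sites


# Toy alignments (invented): only column 1 is conserved in both
# subfamilies with a different residue (K versus R).
sub_a = ["MKLVD", "MKLVE", "MKLVD", "MKLID"]
sub_b = ["MRLVD", "MRLAD", "MRLGD", "MRLVD"]
print(conservation_shifting_sites(sub_a, sub_b))
```

A real analysis would also need sequence weighting and a principled threshold, both of which are left out of this sketch.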

2.
The rapid increase in the amount of protein sequence data has created a need for automated identification of sites that determine functional specificity among related subfamilies of proteins. A significant fraction of subfamily specific sites are only marginally conserved, which makes it extremely challenging to detect those amino acid changes that lead to functional diversification. To address this critical problem we developed a method named SPEER (specificity prediction using amino acids' properties, entropy and evolution rate) to distinguish specificity determining sites from others. SPEER encodes the conservation patterns of amino acid types using their physico-chemical properties and the heterogeneity of evolutionary changes between and within the subfamilies. To test the method, we compiled a test set containing 13 protein families with known specificity determining sites. Extensive benchmarking by comparing the performance of SPEER with other specificity site prediction algorithms has shown that it performs better in predicting several categories of subfamily specific sites.

3.
Automatic methods for predicting functionally important residues   Total citations: 9; self-citations: 0; citations by others: 9
Sequence analysis is often the first guide for the prediction of residues in a protein family that may have functional significance. A few methods have been proposed which use the division of protein families into subfamilies in the search for those positions that could have some functional significance for the whole family, but at the same time which exhibit the specificity of each subfamily ("Tree-determinant residues"). However, there are still many unsolved questions, such as the best division of a protein family into subfamilies, or the accurate detection of sequence variation patterns characteristic of different subfamilies. Here we present a systematic study in a significant number of protein families, testing the statistical meaning of the Tree-determinant residues predicted by three different methods that represent the range of available approaches. The first method takes as a starting point a phylogenetic representation of a protein family and, following the principle of Relative Entropy from Information Theory, automatically searches for the optimal division of the family into subfamilies. The second method looks for positions whose mutational behavior is reminiscent of the mutational behavior of the full-length proteins, by directly comparing the corresponding distance matrices. The third method is an automation of the analysis of distribution of sequences and amino acid positions in the corresponding multidimensional spaces using a vector-based principal component analysis. These three methods have been tested on two non-redundant lists of protein families: one composed of proteins that bind a variety of ligand groups, and the other composed of proteins with annotated functionally relevant sites. In most cases, the residues predicted by the three methods show a clear tendency to be close to bound ligands of biological relevance and to those amino acids described as participants in key aspects of protein function.
These three automatic methods provide a wide range of possibilities for biologists to analyze their families of interest, in a similar way to the one presented here for the family of proteins related to ras-p21.

4.
Function prediction by homology is widely used to provide preliminary functional annotations for genes for which experimental evidence of function is unavailable or limited. This approach has been shown to be prone to systematic error, including percolation of annotation errors through sequence databases. Phylogenomic analysis avoids these errors in function prediction but has been difficult to automate for high-throughput application. To address this limitation, we present a computationally efficient pipeline for phylogenomic classification of proteins. This pipeline uses the SCI-PHY (Subfamily Classification in Phylogenomics) algorithm for automatic subfamily identification, followed by subfamily hidden Markov model (HMM) construction. A simple and computationally efficient scoring scheme using family and subfamily HMMs enables classification of novel sequences to protein families and subfamilies. Sequences representing entirely novel subfamilies are differentiated from those that can be classified to subfamilies in the input training set using logistic regression. Subfamily HMM parameters are estimated using an information-sharing protocol, enabling subfamilies containing even a single sequence to benefit from conservation patterns defining the family as a whole or in related subfamilies. SCI-PHY subfamilies correspond closely to functional subtypes defined by experts and to conserved clades found by phylogenetic analysis. Extensive comparisons of subfamily and family HMM performances show that subfamily HMMs dramatically improve the separation between homologous and non-homologous proteins in sequence database searches. Subfamily HMMs also provide extremely high specificity of classification and can be used to predict entirely novel subtypes. The SCI-PHY Web server at http://phylogenomics.berkeley.edu/SCI-PHY/ allows users to upload a multiple sequence alignment for subfamily identification and subfamily HMM construction. 
Biologists wishing to provide their own subfamily definitions can do so. Source code is available on the Web page. The Berkeley Phylogenomics Group PhyloFacts resource contains pre-calculated subfamily predictions and subfamily HMMs for more than 40,000 protein families and domains at http://phylogenomics.berkeley.edu/phylofacts/.

5.
Current hypotheses of gene duplicate divergence propose that surviving members of a gene duplicate pair may evolve, under conditions of purifying or nearly neutral selection, in one of two ways: with new function arising in one duplicate while the other retains original function (neofunctionalization [NF]) or partitioning of the original function between the 2 paralogs (subfunctionalization [SF]). More recent studies propose that SF followed by NF (subneofunctionalization [SNF]) explains the divergence of many duplicate genes. In this analysis, we evaluate these hypotheses in the context of the large monosaccharide transporter (MST) gene families in Arabidopsis and rice. MSTs have an ancient origin, predating plants, and have evolved in the seed plant lineage to comprise 7 subfamilies. In Arabidopsis, 53 putative MST genes have been identified, with one subfamily greatly expanded by tandem gene duplications. We searched the rice genome for members of the MST gene family and compared them with the MST gene family in Arabidopsis to determine subfamily expansion patterns and estimate gene duplicate divergence times. We tested hypotheses of gene duplicate divergence in 24 paralog pairs by comparing protein sequence divergence rates, estimating positive selection on codon sites, and analyzing tissue expression patterns. Results reveal the MST gene family to be significantly larger (65) in rice with 2 subfamilies greatly expanded by tandem duplications. Gene duplicate divergence time estimates indicate that early diversification of most subfamilies occurred in the Proterozoic (2500-540 Myr) and that expansion of large subfamilies continued through the Cenozoic (65-0 Myr). Two-thirds of paralog pairs show statistically symmetric rates of sequence evolution, most consistent with the SF model, with half of those showing evidence for positive selection in one or both genes. 
Among 8 paralog pairs showing asymmetric divergence rates, most consistent with the NF model, nearly half show evidence of positive selection. Positive selection does not appear in any duplicate pairs younger than approximately 34 Myr. Our data suggest that the NF, SF, and SNF models describe different outcomes along a continuum of divergence resulting from initial conditions of relaxed constraint after duplication.

6.
Peroxiredoxins (Prxs) are a widespread and highly expressed family of cysteine-based peroxidases that react very rapidly with H2O2, organic peroxides, and peroxynitrite. Correct subfamily classification has been problematic because Prx subfamilies are frequently not correlated with phylogenetic distribution and diverge in their preferred reductant, oligomerization state, and tendency toward overoxidation. We have developed a method that uses the Deacon Active Site Profiler (DASP) tool to extract functional-site profiles from structurally characterized proteins to computationally define subfamilies and to identify new Prx subfamily members from GenBank(nr). For the 58 literature-defined Prx test proteins, 57 were correctly assigned, and none were assigned to the incorrect subfamily. The >3500 putative Prx sequences identified were then used to analyze residue conservation in the active site of each Prx subfamily. Our results indicate that the existence and location of the resolving cysteine vary in some subfamilies (e.g., Prx5) to a greater degree than previously appreciated and that interactions at the A interface (common to Prx5, Tpx, and higher order AhpC/Prx1 structures) are important for stabilization of the correct active-site geometry. Interestingly, this method also allows us to further divide the AhpC/Prx1 into four groups that are correlated with functional characteristics. The DASP method provides more accurate subfamily classification than PSI-BLAST for members of the Prx family and can now readily be applied to other large protein families. Proteins 2011. © 2010 Wiley-Liss, Inc.

7.
With the rapid growth of protein sequence data, it is indispensable to develop automated and reliable predictive methods for protein function annotation. One approach for facilitating protein function prediction is to classify proteins into functional families from primary sequence. Because enzymes are the most important group of all proteins, accurate prediction of enzyme family classes and subfamily classes is closely related to understanding their biological functions. In this paper, for the prediction of enzyme subfamily classes, Chou's amphiphilic pseudo-amino acid composition [Chou, K.C., 2005. Using amphiphilic pseudo amino acid composition to predict enzyme subfamily classes. Bioinformatics 21, 10-19] has been adopted to represent the protein samples for training the 'one-versus-rest' support vector machine. As a demonstration, the jackknife test was performed on the dataset that contains 2640 oxidoreductase sequences classified into 16 subfamily classes [Chou, K.C., Elrod, D.W., 2003. Prediction of enzyme family classes. J. Proteome Res. 2, 183-190]. The overall accuracy thus obtained was 80.87%. The significant enhancement in the accuracy indicates that the current method might play a complementary role to the existing methods.
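The jackknife test mentioned in this abstract holds out each sequence once, trains on the rest, and counts how often the held-out sequence is classified correctly. The sketch below illustrates only that evaluation loop. It deliberately swaps in simpler components than the paper's: plain amino acid composition instead of amphiphilic pseudo-amino acid composition, and a nearest-centroid classifier instead of a one-versus-rest SVM. All names and the toy sequences are hypothetical.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(seq):
    """20-dimensional amino acid composition (fraction of each residue)."""
    return [seq.count(aa) / len(seq) for aa in AMINO_ACIDS]


def nearest_centroid(train, query):
    """Return the label whose class centroid is closest to the query vector.

    train: list of (feature_vector, label) pairs.
    """
    by_label = {}
    for vec, label in train:
        by_label.setdefault(label, []).append(vec)
    best_label, best_dist = None, float("inf")
    for label, vecs in by_label.items():
        centroid = [sum(col) / len(vecs) for col in zip(*vecs)]
        dist = sum((q - c) ** 2 for q, c in zip(query, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def jackknife_accuracy(samples):
    """Leave each sample out once, train on the rest, score the held-out one."""
    correct = 0
    for i, (vec, label) in enumerate(samples):
        rest = samples[:i] + samples[i + 1:]
        if nearest_centroid(rest, vec) == label:
            correct += 1
    return correct / len(samples)


# Toy data (invented): class "x" is Ala/Gly-rich, class "y" is Lys/Glu-rich.
toy = [("AAAAGGAA", "x"), ("AAGAAAGA", "x"), ("AGAAAAAA", "x"),
       ("KKKKEEKK", "y"), ("KEKKKKEK", "y"), ("KKKEKKKK", "y")]
samples = [(composition(seq), label) for seq, label in toy]
print(jackknife_accuracy(samples))
```

The jackknife loop is independent of the classifier, so the same `jackknife_accuracy` structure would apply if an SVM were substituted for `nearest_centroid`.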

8.
Voltage-gated ion channels (VGCs) mediate selective diffusion of ions across cell membranes to enable many vital cellular processes. Three-dimensional structure data are lacking for VGC proteins; hence, to better understand their function, there is a need to identify the conserved motifs using sequence analysis methods. In this study, we have used a profile-to-profile alignment method to identify several new conserved motifs specific to each transmembrane segment (TMS) of the voltage-sensing and the pore-forming modules of Ca2+, Na+, and K+ channel subfamilies. For Ca2+ and Na+ channels, the functional theme of motif conservation is similar in all segments, whereas it differs from that of the K+ channel proteins. Nevertheless, the conservation is strikingly similar in the S4 segment of the voltage-sensing module across all subfamilies. In each subfamily and for each TMS, we have identified conserved motifs/residues and correlated their functional significance and disease associations in human, using mutational data from the literature.

9.
It is a central assumption of evolution that gene duplications provide the genetic raw material from which to create proteins with new functions. The increasing availability of multigene family sequences that has resulted from genome projects has inspired the creation of novel in silico approaches to predict details of protein function. The underlying principle of all such approaches is to compare the evolutionary properties of homologous sequence positions in paralogous proteins. It has been proposed that the positions that show switches in substitution rate over time, i.e., "heterotachous sites," are good indicators of functional divergence. Here, we analyzed the alpha and beta paralogous subunits of hemoglobin in search of such signatures. We found as many heterotachous sites in comparisons between groups of paralogous subunits (alpha/beta) as between orthologous ones (alpha/alpha, beta/beta). Thus, the importance of substitution rate shifts as predictors of specialization between protein subfamilies might be reconsidered. Instead, such shifts may reflect a more general process of protein evolution, consistent with the fact that they can be compatible with function conservation. As an alternative, we focused on those residues showing highly constrained states in two sequence groups, but different in each group, and we named them CBD (for "constant but different"). Unlike heterotachous positions, CBD sites were markedly overrepresented in paralogous (alpha/beta) comparisons relative to orthologous ones (alpha/alpha, beta/beta), identifying them as likely signatures of functional specialization between the two subunits. When superimposed onto the three-dimensional structure of hemoglobin, CBD positions consistently appeared to cluster preferentially on inter-subunit surfaces, two contact areas crucial to function in vertebrate tetrameric hemoglobin.
The identification and analysis of CBD sites by complementing structural information with evolutionary data may represent a promising direction for future studies dealing with the functional characterization of a growing number of multigene families identified by complete genome analyses.

10.
By HPLC, a taurine-conjugated bile acid with a retention time different from that of taurocholate was found to be present in the bile of the black-necked swan, Cygnus melanocoryphus. The bile acid was isolated and its structure, established by ¹H and ¹³C NMR and mass spectrometry, was that of the taurine N-acyl amidate of 3α,7α,15α-trihydroxy-5β-cholan-24-oic acid. The compound was shown to have chromatographic and spectroscopic properties that were identical to those of the taurine conjugate of authentic 3α,7α,15α-trihydroxy-5β-cholan-24-oic acid, previously synthesized by us from ursodeoxycholic acid. By HPLC, the taurine conjugate of 3α,7α,15α-trihydroxy-5β-cholan-24-oic acid was found to be present in 6 of 6 species in the subfamily Dendrocygninae (tree ducks) and in 10 of 13 species in the subfamily Anserinae (swans and geese) but not in other subfamilies in the Anatidae family. It was also not present in species from the other two families of the order Anseriformes. 3α,7α,15α-Trihydroxy-5β-cholan-24-oic acid is a new primary bile acid that is present in the biliary bile acids of swans, tree ducks, and geese and may be termed 15α-hydroxy-chenodeoxycholic acid.

11.
Protein phosphorylation is a ubiquitous protein post-translational modification, which plays an important role in cellular signaling systems underlying various physiological and pathological processes. Current in silico methods focus mainly on the prediction of phosphorylation sites, but few consider whether a phosphorylation site is functional or not. Since functional phosphorylation sites are more valuable for further experimental research and a proportion of phosphorylation sites have no direct functional effects, the prediction of functional phosphorylation sites is necessary for this research area. Previous studies have shown that functional phosphorylation sites are more conserved in evolution than non-functional phosphorylation sites. Thus, we developed a web server that integrates existing phosphorylation site prediction methods with both absolute and relative evolutionary conservation scores to predict the most likely functional phosphorylation sites. Using our method, we predicted the most likely functional sites of the human, rat and mouse proteomes and built a database for the predicted sites. By the analysis of overall prediction results, we demonstrated that protein phosphorylation plays an important role in all the enriched KEGG pathways. By the analysis of protein-specific prediction results, we demonstrated the usefulness of our method for individual protein studies. Our method would help to characterize the most likely functional phosphorylation sites for further studies in this research area.

12.
Margus T  Remm M  Tenson T 《PloS one》2011,6(8):e22789

Background

Elongation factor G (EFG) is a core translational protein that catalyzes the elongation and recycling phases of translation. A more complex picture of EFG's evolution and function than previously accepted is emerging from analyses of heterogeneous EFG family members. Whereas gene duplication is postulated to be a prominent factor creating functional novelty, the striking divergence between EFG paralogs can be interpreted in terms of innovation in gene function.

Methodology/Principal Findings

We present a computational study of the EFG protein family to cover the role of gene duplication in the evolution of protein function. Using phylogenetic methods, genome context conservation and insertion/deletion (indel) analysis we demonstrate that the EFG gene copies form four subfamilies: EFG I, spdEFG1, spdEFG2, and EFG II. These ancient gene families differ by their indispensability, degree of divergence and number of indels. We show the distribution of EFG subfamilies and describe evidence for lateral gene transfer and recent duplications. Extended studies of the EFG II subfamily concern its diverged nature. Remarkably, EFG II appears to be a widely distributed and much-diversified subfamily whose subdivisions correlate with phylum or class borders. The EFG II subfamily-specific characteristics are low conservation of the GTPase domain and of domains II and III; absence of the trGTPase-specific G2 consensus motif “RGITI”; and twelve conserved positions common to the whole subfamily. The EFG II specific functional changes could be related to changes in the properties of nucleotide binding and hydrolysis and strengthened ionic interactions between EFG II and the ribosome, particularly between parts of the decoding site and loop I of domain IV.

Conclusions/Significance

Our work, for the first time, comprehensively identifies and describes EFG subfamilies and improves our understanding of the function and evolution of EFG duplicated genes.

13.
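Structure and function of subgroup 22 of the Arabidopsis R2R3-MYB family   Total citations: 2; self-citations: 0; citations by others: 2
樊锦涛  蒋琛茜  邢继红  董金皋 《遗传》2014,36(10):985-994
Arabidopsis R2R3-MYB transcription factors play important roles in the regulatory networks that govern Arabidopsis growth and development, metabolism, and responses to biotic and abiotic stress. Based on conserved amino acid sequences, R2R3-MYB transcription factors are divided into 25 subgroups. Subgroup 22 comprises four genes, AtMYB44, AtMYB77, AtMYB73, and AtMYB70, which mainly respond to biotic and abiotic stress. This article reviews the four genes of subgroup 22 from three perspectives (similarity of gene function, consistency of gene expression, and conservation of gene structure) and discusses their redundancy and diversity in structure and function.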

14.
Phylogenetic relationships among the NBS-LRR (nucleotide binding site–leucine-rich repeat) resistance gene homologues (RGHs) from 30 genera and nine families were evaluated relative to phylogenies for these taxa. More than 800 NBS-LRR RGHs were analyzed, primarily from Fabaceae, Brassicaceae, Poaceae, and Solanaceae species, but also from representatives of other angiosperm and gymnosperm families. Parsimony, maximum likelihood, and distance methods were used to classify these RGHs relative to previously observed gene subfamilies as well as within more closely related sequence clades. Grouping sequences using a distance cutoff of 250 PAM units (point accepted mutations per 100 residues) identified at least five ancient sequence clades with representatives from several plant families: the previously observed TIR gene subfamily and a minimum of four deep splits within the non-TIR gene subfamily. The deep splits in the non-TIR subfamily are also reflected in comparisons of amino acid substitution rates in various species and in ratios of nonsynonymous-to-synonymous nucleotide substitution rates (KA/KS values) in Arabidopsis thaliana. Lower KA/KS values in the TIR than the non-TIR sequences suggest greater functional constraints in the TIR subfamily. At least three of the five identified ancient clades appear to predate the angiosperm–gymnosperm radiation. Monocot sequences are absent from the TIR subfamily, as observed in previous studies. In both subfamilies, clades with sequences separated by approximately 150 PAM units are family but not genus specific, providing a rough measure of minimum dates for the first diversification event within these clades. Within any one clade, particular taxa may be dramatically over- or underrepresented, suggesting preferential expansions or losses of certain RGH types within particular taxa and suggesting that no one species will provide models for all major sequence types in other taxa.
Received: 13 June 2001 / Accepted: 22 October 2001
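Grouping sequences by a distance cutoff, as this abstract describes for the 250 PAM threshold, can be illustrated with a short sketch. The abstract does not state the linkage rule used, so the code below assumes single linkage (two sequences share a group if a chain of pairwise distances at or below the cutoff connects them). The function name, sequence names, and distances are all hypothetical.

```python
def group_by_cutoff(names, distances, cutoff=250.0):
    """Single-linkage grouping of sequences by pairwise distance.

    names: list of sequence identifiers.
    distances: dict mapping frozenset({name1, name2}) -> distance
               (here imagined to be in PAM units).
    Pairs missing from `distances` are simply never joined directly.
    """
    parent = {name: name for name in names}

    def find(name):
        # Follow parent links to the group representative (path halving).
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for pair, dist in distances.items():
        if dist <= cutoff:
            a, b = pair
            parent[find(a)] = find(b)

    groups = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(group) for group in groups.values())


# Toy distances (invented). non1 and non3 are 300 apart but are still
# grouped together because both are within 250 of non2.
names = ["tir1", "tir2", "non1", "non2", "non3"]
distances = {
    frozenset({"tir1", "tir2"}): 120.0,
    frozenset({"non1", "non2"}): 180.0,
    frozenset({"non2", "non3"}): 240.0,
    frozenset({"non1", "non3"}): 300.0,
    frozenset({"tir1", "non1"}): 400.0,
}
print(group_by_cutoff(names, distances))
```

Under complete linkage the toy `non` sequences would split differently, which is why the linkage rule is flagged here as an assumption.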

15.
Bacterial 3-deoxy-D-arabino-heptulosonate 7-phosphate synthases (DAHPSs) have been divided into either of two classes (Class I/Class II) or subfamilies (AroAIα/AroAIβ). Our investigation into the biochemical properties of the unique bifunctional DAHPS from Bacillus subtilis provides new insight into the evolutionary link among DAHPS subfamilies. In the present study, the DAHPS (aroA) and chorismate mutase (aroQ) activities of B. subtilis DAHPS are separated by domain truncation. Detailed enzymatic studies with the full-length wild-type protein and the truncated domains led to our hypothesis that the aroQ domain was fused to the N terminus of aroA in B. subtilis during evolution for the purpose of feedback regulation and not for the creation of a bona fide bifunctional enzyme. In addition, examination of aroA and aroQ fusion proteins from Porphyromonas gingivalis, in which the aroQ domain is fused to the C terminus of aroA, further supports the hypothesis. These results, along with sequence structure analysis of the DAHPS families, suggest that "feedback regulation" may indeed be the evolutionary link between the two classes/subfamilies. It is likely that DAHPSs evolved from a primitive unregulated member of the AroAIβ subfamily. During evolution, some members of the AroAIβ subfamily remained unregulated, whereas other members acquired an extra domain for feedback regulation. The AroAIα subfamilies, however, evolved in a more complex manner to acquire insertions/extensions in the (β/α)8 barrel to function as regulatory elements.

16.
Cai CZ  Han LY  Ji ZL  Chen YZ 《Proteins》2004,55(1):66-76
One approach for facilitating protein function prediction is to classify proteins into functional families. Recent studies on the classification of G-protein coupled receptors and other proteins suggest that a statistical learning method, support vector machines (SVM), may be useful for protein classification into functional families. In this work, SVM is applied and tested on the classification of enzymes into functional families defined by the Enzyme Nomenclature Committee of IUBMB. The SVM classification system for each family is trained from representative enzymes of that family and seed proteins of Pfam curated protein families. The classification accuracy for enzymes from 46 families and for non-enzymes is in the range of 50.0% to 95.7% and 79.0% to 100% respectively. The corresponding Matthews correlation coefficient is in the range of 54.1% to 96.1%. Moreover, 80.3% of the 8,291 correctly classified enzymes are uniquely classified into a specific enzyme family by using a scoring function, indicating that SVM may have a certain level of unique prediction capability. Testing results also suggest that SVM in some cases is capable of classification of distantly related enzymes and homologous enzymes of different functions. Effort is being made to use a more comprehensive set of enzymes as training sets and to incorporate multi-class SVM classification systems to further enhance the unique prediction accuracy. Our results suggest the potential of SVM for enzyme family classification and for facilitating protein function prediction. Our software is accessible at http://jing.cz3.nus.edu.sg/cgi-bin/svmprot.cgi.
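The Matthews correlation coefficient reported in this abstract summarizes a binary confusion matrix in a single value between -1 and 1 (the abstract expresses it as a percentage). A minimal sketch of the standard formula follows; the confusion-matrix counts in the example are invented and are not taken from the paper.

```python
import math


def matthews_cc(tp, tn, fp, fn):
    """Matthews correlation coefficient from binary confusion-matrix counts.

    Returns 0.0 when any marginal total is zero, a common convention
    for the otherwise undefined case.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts for one enzyme family versus all other proteins.
mcc = matthews_cc(tp=90, tn=85, fp=15, fn=10)
print(round(mcc, 3))
```

Unlike plain accuracy, the coefficient stays informative when the positive and negative classes differ greatly in size, which is typical when one family is tested against all other proteins.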

17.
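To identify plant protein sources suitable for feeds for juvenile Chinese mitten crab, and to examine how diets based on different plant protein sources affect growth performance, amino acid deposition rate, and antioxidant capacity, a basal diet containing 50% fish meal (FM) was formulated. Fermented soybean meal (30.5%), soybean meal (32.5%), cottonseed meal (28%), and rapeseed meal (39%) were each used to replace 50% of the fish meal in the basal diet, giving four isonitrogenous and isoenergetic diets (FSBM, SBM, CSM, and RSM, respectively). The diets were fed for 8 weeks to juvenile crabs with an initial body weight of (0.249±0.003) g. The results were as follows. (1) Compared with the FM group, the FSBM, SBM, and CSM groups showed no significant differences in weight gain rate, specific growth rate, feed conversion ratio, protein efficiency ratio, or protein deposition rate. The weight gain rate of the RSM group did not differ significantly from that of the FM group (P > 0.05) but was significantly lower than that of the SBM group (P < 0.05); its feed conversion ratio was significantly higher than those of the FM, FSBM, and SBM groups (P < 0.05); its protein efficiency ratio was significantly lower than those of all other groups (P < 0.05); and its protein deposition rate was significantly lower than those of the SBM and CSM groups (P < 0.05). (2) Total essential amino acid deposition rates in the plant protein groups did not differ significantly from that of the FM group (P > 0.05), whereas the rate in the RSM group was significantly lower than those in the FSBM and CSM groups (P < 0.05). (3) Compared with the FM group, the plant protein diets had no significant effect on superoxide dismutase (SOD) or glutathione peroxidase (GSH-PX) activity in serum and hepatopancreas, or on hepatopancreatic malondialdehyde (MDA) content, whereas serum MDA content in the RSM group was significantly higher than in all other groups (P < 0.05). These results indicate that replacing 50% of the fish meal in the basal formulation with soybean meal, fermented soybean meal, or cottonseed meal had no negative effect on the growth performance, amino acid deposition rate, or antioxidant capacity of juvenile crabs. Fermented soybean meal, soybean meal, and cottonseed meal are therefore suitable protein sources for replacing fish meal, at inclusion levels of about 30%. Replacement with rapeseed meal reduced feed utilization and amino acid deposition efficiency, possibly because of the low protein digestibility of rapeseed meal, its antinutritional factors, and an excessively high inclusion level; appropriate detoxification before use, and use in combination with other plant proteins, is recommended.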
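The growth indices named in this abstract are not defined there. The sketch below uses definitions that are common in aquaculture nutrition studies, on the assumption that the paper follows them: weight gain rate as percentage gain over initial weight, specific growth rate as the daily log-weight increase in percent, and feed conversion ratio as feed consumed per unit weight gained. Only the initial weight (0.249 g) and the 8-week duration come from the abstract; the final weight and feed intake are hypothetical.

```python
import math


def weight_gain_rate(w0, wt):
    """Weight gain rate (%) = (final - initial) / initial * 100."""
    return (wt - w0) / w0 * 100.0


def specific_growth_rate(w0, wt, days):
    """Specific growth rate (%/day) = (ln final - ln initial) / days * 100."""
    return (math.log(wt) - math.log(w0)) / days * 100.0


def feed_conversion_ratio(feed_intake, w0, wt):
    """Feed conversion ratio = feed consumed / weight gained (same units)."""
    return feed_intake / (wt - w0)


initial_weight = 0.249   # g, from the abstract
final_weight = 1.245     # g, hypothetical
feed_intake = 1.992      # g per crab, hypothetical
days = 8 * 7             # 8-week trial, from the abstract

print(weight_gain_rate(initial_weight, final_weight))
print(specific_growth_rate(initial_weight, final_weight, days))
print(feed_conversion_ratio(feed_intake, initial_weight, final_weight))
```

With these definitions a higher feed conversion ratio means poorer feed utilization, which matches the abstract's reading of the RSM group.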

18.
Cai XH  Jaroszewski L  Wooley J  Godzik A 《Proteins》2011,79(8):2389-2402
The protein universe can be organized in families that group proteins sharing common ancestry. Such families display variable levels of structural and functional divergence, from homogeneous families, where all members have the same function and very similar structure, to very divergent families, where large variations in function and structure are observed. For practical purposes of structure and function prediction, it would be beneficial to identify sub-groups of proteins with highly similar structures (iso-structural) and/or functions (iso-functional) within divergent protein families. We compared three algorithms in their ability to cluster large protein families and discuss whether any of these methods could reliably identify such iso-structural or iso-functional groups. We show that clustering using profile-sequence and profile-profile comparison methods closely reproduces clusters based on similarities between 3D structures or clusters of proteins with similar biological functions. In contrast, the still commonly used sequence-based methods with fixed thresholds result in vast overestimates of structural and functional diversity in protein families. As a result, these methods also overestimate the number of protein structures that have to be determined to fully characterize the structural space of such families. The fact that one can build reliable models based on apparently distantly related templates is crucial for extracting the maximal amount of information from new sequencing projects.

19.
Civera C  Simon B  Stier G  Sattler M  Macias MJ 《Proteins》2005,58(2):354-366
Pleckstrin1 is a major substrate for protein kinase C in platelets and leukocytes, and comprises a central DEP (disheveled, Egl-10, pleckstrin) domain, which is flanked by two PH (pleckstrin homology) domains. DEP domains display a unique α/β fold and have been implicated in membrane binding utilizing different mechanisms. Using multiple sequence alignments and phylogenetic tree reconstructions, we find that 6 subfamilies of the DEP domain exist, of which pleckstrin represents a novel and distinct subfamily. To clarify structural determinants of the DEP fold and to gain further insight into the role of the DEP domain, we determined the three-dimensional structure of the pleckstrin DEP domain using heteronuclear NMR spectroscopy. Pleckstrin DEP shares main structural features with the DEP domains of disheveled and Epac, which belong to different DEP subfamilies. However, the pleckstrin DEP fold is distinct from these structures and contains an additional, short helix α4 inserted in the β4-β5 loop that exhibits increased backbone mobility as judged by NMR relaxation measurements. Based on sequence conservation, the helix α4 may also be present in the DEP domains of regulator of G-protein signaling (RGS) proteins, which are members of the same DEP subfamily. In pleckstrin, the DEP domain is surrounded by two PH domains. Structural analysis and charge complementarity suggest that the DEP domain may interact with the N-terminal PH domain in pleckstrin. Phosphorylation of the PH-DEP linker, which is required for pleckstrin function, could regulate such an intramolecular interaction. This suggests a role of the pleckstrin DEP domain in intramolecular domain interactions, which is distinct from the functions of other DEP domain subfamilies found so far.

20.
Physicochemical properties are potentially useful in predicting functional differences between aligned protein subfamilies. We present a method that considers physicochemical properties from ancestral sequences predicted to have given rise to the subfamilies of interest by gene duplication. Comparison between two MAP kinase subfamilies, p38 and ERK, revealed a region that had an excess of change in properties after gene duplication followed by conservation within the two subfamilies. This region corresponded to that experimentally defined as important for substrate and pathway specificity. The derived scores for the region of interest were found to differ significantly in their distribution compared to the rest of the protein when the Kolmogorov-Smirnov test was applied (p = 0.005). Thus, the incorporation of ancestral physicochemical properties is useful in predicting functional differences between protein subfamilies. In addition, the method was applied to the MKK and MAPK components of the p38 and JNK pathways. These proteins showed a similar pattern in their evolution and regions predicted to confer functional differences are discussed.
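The Kolmogorov-Smirnov comparison mentioned in this abstract asks whether scores inside a region follow the same distribution as scores elsewhere in the protein. The sketch below computes only the two-sample test statistic (the largest gap between the two empirical cumulative distributions) in plain Python; it does not compute a p-value, for which a statistics library such as SciPy (`scipy.stats.ks_2samp`) would normally be used. The function name and the score values are hypothetical, not data from the paper.

```python
def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic.

    Returns the maximum absolute difference between the empirical
    cumulative distribution functions of the two samples.
    """
    a, b = sorted(sample_a), sorted(sample_b)
    largest_gap = 0.0
    # The gap can only change at observed values, so check each one.
    for x in sorted(set(a) | set(b)):
        cdf_a = sum(v <= x for v in a) / len(a)
        cdf_b = sum(v <= x for v in b) / len(b)
        largest_gap = max(largest_gap, abs(cdf_a - cdf_b))
    return largest_gap


# Invented per-site scores: the region of interest scores uniformly
# higher than the rest of the protein, so the two samples do not overlap.
region_scores = [0.8, 0.9, 0.7, 0.95]
rest_scores = [0.1, 0.2, 0.3, 0.15, 0.4, 0.25]
print(ks_statistic(region_scores, rest_scores))
```

A statistic near 1 means the two score distributions barely overlap, while a value near 0 means they are nearly indistinguishable.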
