Similar articles
 20 similar articles found (search time: 15 ms)
1.
Hughes AL 《Immunogenetics》2012,64(7):549-558
The βGRP/GNBP/β-1,3-glucanase protein family of insects includes several proteins involved in innate immune recognition, such as the β-glucan recognition proteins of Lepidoptera and the Gram-negative bacteria-binding proteins of Drosophila. A phylogenetic analysis supported the existence of two distinct subfamilies, designated the pattern recognition receptor (PRR) and glucanase subfamilies, which originated by gene duplication prior to the origin of the Holometabola. In the C-terminal region (CTR) shared by both subfamilies, the PRR subfamily has evolved significantly more rapidly at the amino acid sequence level than has the glucanase subfamily, implying a relative lack of constraint on the amino acid sequence of this region in the PRR subfamily. PRR subfamily members also include an N-terminal region (NTR), involved in carbohydrate recognition, which is not shared by glucanase subfamily members. In comparisons between paralogous PRR subfamily members, there were no conserved amino acid residues in the NTR. However, when pairs of putatively orthologous PRR subfamily members were compared, the NTR was most often as conserved as the CTR or more so. This pattern suggests that the NTR may be important in functions specific to the different paralogs, while amino acid sequence changes in the NTR may have been important in functional differentiation among paralogs, specifically with regard to the types of carbohydrates that they recognize.

2.
3.
Zn finger proteins (ZFPs) of the C2/H2 type in Xenopus laevis are encoded by a multigene family comprising several hundred members. Based upon conserved sequence features outside the Zn finger region, ZFPs can be subdivided into distinct subfamilies. Two such subfamilies are characterized by conserved, N-terminal amino acid sequences termed the FAX and the FAR Domain. Here we present data suggesting that the zinc finger proteins of the FAR-ZFP subfamily are targets for CK II-mediated phosphorylation. Expression of these proteins during oogenesis coincides with CK II activity in unfertilized eggs. Additionally, we have found that XIcOF 7.1, a member of the FAX-ZFP subfamily, is also phosphorylated by CK II. The target sites for in vitro phosphorylation are localized within the conserved N-terminal domains but not within the Zn finger regions. However, amino acid sequence comparison revealed that individual phosphoacceptor sites are not generally conserved among all members of the respective ZFP subfamilies. The relevance of a potential CK II phosphorylation for the regulation of ZFP activity in vivo is discussed.

4.
The rapid increase in the amount of protein sequence data has created a need for automated identification of sites that determine functional specificity among related subfamilies of proteins. A significant fraction of subfamily specific sites are only marginally conserved, which makes it extremely challenging to detect those amino acid changes that lead to functional diversification. To address this critical problem we developed a method named SPEER (specificity prediction using amino acids' properties, entropy and evolution rate) to distinguish specificity determining sites from others. SPEER encodes the conservation patterns of amino acid types using their physico-chemical properties and the heterogeneity of evolutionary changes between and within the subfamilies. To test the method, we compiled a test set containing 13 protein families with known specificity determining sites. Extensive benchmarking by comparing the performance of SPEER with other specificity site prediction algorithms has shown that it performs better in predicting several categories of subfamily specific sites.

5.
The classical human interferon-alpha (HuIFN-alpha) gene family is estimated to consist of 15 or more nonallelic members which encode proteins sharing greater than 77% amino acid sequence homology. Low-stringency hybridization with a HuIFN-alpha cDNA probe permitted the isolation of two distinct classes of bovine IFN-alpha genes. The first subfamily (class I) is more closely related to the known HuIFN-alpha genes than to the second subfamily (class II) of bovine IFN-alpha genes. Extensive analysis of the human genome has revealed a HuIFN-alpha gene subfamily corresponding to the class II bovine IFN-alpha genes. The class I human and bovine IFN-alpha genes encode mature IFN polypeptides of 165 to 166 amino acids, whereas the class II IFN-alpha genes encode 172 amino acid proteins. Expression in Escherichia coli of members of both gene subfamilies results in polypeptides having potent antiviral activity. In contrast to previous studies which found no evidence of class II IFN-alpha protein or mRNA expression, we demonstrate that the class I and class II IFN-alpha genes are coordinately induced in response to viral infection.

6.
7.
Acyl-coenzyme A synthetases (ACSs) catalyze the fundamental, initial reaction in fatty acid metabolism. "Activation" of fatty acids by thioesterification to CoA allows their participation in both anabolic and catabolic pathways. The availability of the sequenced human genome has facilitated the investigation of the number of ACS genes present. Using two conserved amino acid sequence motifs to probe human DNA databases, 26 ACS family genes/proteins were identified. ACS activity in either humans or rodents was demonstrated previously for 20 proteins, but 6 remain candidate ACSs. For two candidates, cDNA was cloned, protein was expressed in COS-1 cells, and ACS activity was detected. Amino acid sequence similarities were used to assign enzymes into subfamilies, and subfamily assignments were consistent with acyl chain length preference. Four of the 26 proteins did not fit into a subfamily, and bootstrap analysis of phylograms was consistent with evolutionary divergence. Three additional conserved amino acid sequence motifs were identified that likely have functional or structural roles. The existence of many ACSs suggests that each plays a unique role, directing the acyl-CoA product to a specific metabolic fate. Knowing the full complement of ACS genes in the human genome will facilitate future studies to characterize their specific biological functions.

8.
9.
Summary: We have sequenced cDNA clones representing each of the three distinct groups of storage proteins of the cotton seed. Characteristics of their mRNAs and derived proteins are given. Dot matrix analysis of the nucleotide and amino acid sequences shows that 2 of these groups of proteins have a great deal of vestigial homology at low stringency and should be considered subfamilies of a single storage protein gene family. The remaining group is quite distinct and should be considered a separate multigene family. It can also be divided into 2 subfamilies based on the presence or absence of glycosyl residues and other sequence differences. These proteins are processed to smaller species during embryogenesis, and all of the mature storage proteins of cotton can be traced back to these 2 gene families. In view of these relationships we propose that these 2 families be called the and globulins of cotton storage proteins, each comprised of an A and B subfamily.
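
The dot matrix analysis mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the authors' procedure: the function name `dot_matrix`, the window size, the match threshold, and the toy sequences are all assumptions chosen for illustration. A dot is recorded wherever two windows share at least `min_matches` identical positions, so lowering the threshold corresponds to lower stringency.

```python
def dot_matrix(seq_a, seq_b, window=3, min_matches=2):
    """Return the set of (i, j) window start positions that count as a dot.

    A dot is placed at (i, j) when the window starting at i in seq_a and the
    window starting at j in seq_b are identical in at least `min_matches`
    positions. All parameter defaults are illustrative assumptions.
    """
    dots = set()
    for i in range(len(seq_a) - window + 1):
        for j in range(len(seq_b) - window + 1):
            matches = sum(seq_a[i + k] == seq_b[j + k] for k in range(window))
            if matches >= min_matches:
                dots.add((i, j))
    return dots


# Toy sequences (hypothetical): one substitution near the end of seq_b.
seq_a = "ACDEF"
seq_b = "ACDKF"

high_stringency = dot_matrix(seq_a, seq_b, window=3, min_matches=3)
low_stringency = dot_matrix(seq_a, seq_b, window=3, min_matches=2)
print(sorted(high_stringency))  # only the perfectly matching window
print(sorted(low_stringency))   # the full diagonal becomes visible
```

In this toy case the diagonal is interrupted at high stringency and continuous at low stringency, which is the kind of signal the abstract describes as homology visible "at low stringency".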

10.
Sequence analysis of chloroplast and mitochondrial large subunit rRNA genes from over 75 green algae disclosed 28 new group I intron-encoded proteins carrying a single LAGLIDADG motif. These putative homing endonucleases form four subfamilies of homologous enzymes, with the members of each subfamily being encoded by introns sharing the same insertion site. We showed that four divergent endonucleases from the I-CreI subfamily cleave the same DNA substrates. Mapping of the 66 amino acids that are conserved among the members of this subfamily on the 3-dimensional structure of I-CreI bound to its recognition sequence revealed that these residues participate in protein folding, homodimerization, DNA recognition and catalysis. Surprisingly, only seven of the 21 I-CreI amino acids interacting with DNA are conserved, suggesting that I-CreI and its homologs use different subsets of residues to recognize the same DNA sequence. Our sequence comparison of all 45 single-LAGLIDADG proteins identified so far suggests that these proteins share related structures and that there is a weak pressure in each subfamily to maintain identical protein–DNA contacts. The high sequence variability we observed in the DNA-binding site of homologous LAGLIDADG endonucleases provides insight into how these proteins evolve new DNA specificity.

11.
The hok killer gene family in gram-negative bacteria   Cited by: 23 (self-citations: 0, citations by others: 23)

12.
Abhiman S  Sonnhammer EL 《Proteins》2005,60(4):758-768
Protein function shift can be predicted from sequence comparisons, either using positive selection signals or evolutionary rate estimation. None of the methods have been validated on large datasets, however. Here we investigate existing and novel methods for protein function shift prediction, and benchmark the accuracy against a large dataset of proteins with known enzymatic functions. Function change was predicted between subfamilies by identifying two kinds of sites in a multiple sequence alignment: Conservation-Shifting Sites (CSS), which are conserved in two subfamilies using two different amino acid types, and Rate-Shifting Sites (RSS), which have different evolutionary rates in two subfamilies. CSS were predicted by a new entropy-based method, and RSS using the Rate-Shift program. In principle, the more CSS and RSS between two subfamilies, the more likely a function shift between them. A test dataset was built by extracting subfamilies from Pfam with different EC numbers that belong to the same domain family. Subfamilies were generated automatically using a phylogenetic tree-based program, BETE. The dataset comprised 997 subfamily pairs with four or more members per subfamily. We observed a significant increase in CSS and RSS for subfamily comparisons with different EC numbers compared to cases with same EC numbers. The discrimination was better using RSS than CSS, and was more pronounced for larger families. Combining RSS and CSS by discriminant analysis improved classification accuracy to 71%. The method was applied to the Pfam database and the results are available at http://FunShift.cgb.ki.se. A closer examination of some superfamily comparisons showed that single EC numbers sometimes embody distinct functional classes. Hence, the measured accuracy of function shift is underestimated.
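
The definition of a Conservation-Shifting Site given above (conserved within each of two subfamilies, but with a different amino acid type in each) can be sketched in a few lines of Python. This is a simplified illustration, not the entropy-based method of the paper: the function names, the per-column Shannon entropy cutoff `max_entropy`, the gap handling, and the toy alignments are all assumptions.

```python
import math
from collections import Counter


def column_entropy(residues):
    """Shannon entropy (in bits) of the residues in one alignment column."""
    counts = Counter(residues)
    total = len(residues)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def conservation_shifting_sites(subfam_a, subfam_b, max_entropy=0.5):
    """Return 0-based column indices that look like conservation-shifting sites.

    A column qualifies when it is gap-free, its entropy is at most
    `max_entropy` in both subfamilies (an assumed cutoff), and the most
    common residue differs between the two subfamilies.
    """
    sites = []
    columns = zip(zip(*subfam_a), zip(*subfam_b))
    for index, (col_a, col_b) in enumerate(columns):
        if "-" in col_a or "-" in col_b:
            continue  # skip gapped columns in this simplified sketch
        if column_entropy(col_a) > max_entropy or column_entropy(col_b) > max_entropy:
            continue  # not conserved within at least one subfamily
        top_a = Counter(col_a).most_common(1)[0][0]
        top_b = Counter(col_b).most_common(1)[0][0]
        if top_a != top_b:
            sites.append(index)
    return sites


# Toy aligned subfamilies (hypothetical sequences of equal length).
subfam_a = ["MKDLS", "MKDLT", "MKDLS"]
subfam_b = ["MRDLS", "MRDLA", "MRDLS"]
print(conservation_shifting_sites(subfam_a, subfam_b))
```

Here only the second column (index 1) qualifies: it is invariant K in one subfamily and invariant R in the other, whereas the last column is variable within each subfamily and is therefore rejected.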

13.
Serological analyses of soluble seed proteins of 12 representative taxa of the family Oleaceae by the techniques of Ouchterlony, presaturation, and immunoelectrophoresis (IEP) yielded complementary taxonomic information. Ouchterlony reactions differentiated among protein extracts of three species, and the combined serological techniques permitted the detection of protein differences of respective taxa of the two subfamilies Jasminoideae and Oleoideae. IEP enabled the separation of the 12 taxa investigated and the distribution into the two subfamilies, by the differential electrophoretic positions of precipitin arcs, which were consistent with members of each of the two subfamilies. Presaturation data, when analyzed by two cluster analysis computer programs, were taxonomically significant. One program, which calculated amalgamation distances from the presaturation data, clustered the 12 taxa into subgroups which corresponded to tribes, and within two groups, which corresponded to two subfamilies. A monothetic clustering program provided theoretical information on evolution of taxa within the family Oleaceae based on serological correspondences obtained from the presaturation data; the subfamily Jasminoideae was found phylogenetically primitive and the subfamily Oleoideae was advanced. Additionally, IEP data supported those theories suggesting that taxa of the Oleoideae evolved from taxa of the Jasminoideae. The groupings of taxa and different information obtained from the cluster analyses of data and all serological techniques reinforced each other, as well as contemporary taxonomic and phylogenetic treatments of taxa of the family Oleaceae. This research demonstrated the taxonomic value of protein-serological data, particularly as applied to the taxonomy of the Oleaceae.

14.

Background

The major birch pollen allergen, Bet v 1, is a member of the ubiquitous PR-10 family of plant pathogenesis-related proteins. In recent years, a number of diverse plant proteins with low sequence similarity to Bet v 1 were identified. In addition, determination of the Bet v 1 structure revealed the existence of a large superfamily of structurally related proteins. In this study, we aimed to identify and classify all Bet v 1-related structures from the Protein Data Bank and all Bet v 1-related sequences from the Uniprot database.

Results

Structural comparisons of representative members of already known protein families structurally related to Bet v 1 with all entries of the Protein Data Bank yielded 47 structures with non-identical sequences. They were classified into eleven families, five of which were newly identified and not included in the Structural Classification of Proteins database release 1.71. The taxonomic distribution of these families extracted from the Pfam protein family database showed that members of the polyketide cyclase family and the activator of Hsp90 ATPase homologue 1 family were distributed among all three superkingdoms, while members of some bacterial families were confined to a small number of species. Comparison of ligand binding activities of Bet v 1-like superfamily members revealed that their functions were related to binding and metabolism of large, hydrophobic compounds such as lipids, hormones, and antibiotics. Phylogenetic relationships within the Bet v 1 family, defined as the group of proteins with significant sequence similarity to Bet v 1, were determined by aligning 264 Bet v 1-related sequences. A distance-based phylogenetic tree yielded a classification into 11 subfamilies, nine exclusively containing plant sequences and two subfamilies of bacterial proteins. Plant sequences included the pathogenesis-related proteins 10, the major latex proteins/ripening-related proteins subfamily, and polyketide cyclase-like sequences.

Conclusion

The ubiquitous distribution of Bet v 1-related proteins among all superkingdoms suggests that a Bet v 1-like protein was already present in the last universal common ancestor. During evolution, this protein diversified into numerous families with low sequence similarity but with a common fold that succeeded as a versatile scaffold for binding of bulky ligands.

15.
Structures of homologous proteins are usually conserved during evolution, as are critical active site residues. This is the case for actin and tubulin, the two most important cytoskeleton proteins in eukaryotes. Actins and their related proteins (Arps) constitute a large superfamily whereas the tubulin family has fewer members. Unaligned sequences of these two protein families were analysed by searching for short groups of family-specific amino acid residues, that we call motifs, and by counting the number of residues from one motif to the next. For each sequence, the set of motif-to-motif residue counts forms a subfamily-specific pattern (landmark pattern) allowing actin and tubulin superfamily members to be identified and sorted into subfamilies. The differences between patterns of individual subfamilies are due to inserts and deletions (indels). Inserts appear to have arisen at an early stage in eukaryote evolution as suggested by the small but consistent kingdom-dependent differences found within many Arp subfamilies and in γ-tubulins. Inserts tend to be in surface loops where they can influence subfamily-specific function without disturbing the core structure of the protein. The relatively few indels found for tubulins have similar positions to established results, whereas we find many previously unreported indel positions and lengths for the metazoan Arps.
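
The landmark-pattern idea described above (locate short motifs in an unaligned sequence, then count residues from one motif to the next) can be sketched as follows. This is a hedged illustration only: the function name `landmark_pattern`, the exact-substring matching, and the motifs and sequences below are invented for the example and are not taken from the paper.

```python
def landmark_pattern(sequence, motifs):
    """Return the residue counts between the starts of consecutive motifs.

    Motifs are searched left to right as exact substrings (a simplifying
    assumption). Returns None if any motif is missing or out of order.
    """
    positions = []
    search_from = 0
    for motif in motifs:
        pos = sequence.find(motif, search_from)
        if pos == -1:
            return None
        positions.append(pos)
        search_from = pos + len(motif)
    return [later - earlier for earlier, later in zip(positions, positions[1:])]


# Hypothetical motifs and toy sequences; the second carries a 3-residue insert.
motifs = ["DNGS", "GDGV", "LTEA"]
reference = "AADNGSKKKKGDGVTTLTEA"
with_insert = "AADNGSKKKKPPPGDGVTTLTEA"

print(landmark_pattern(reference, motifs))
print(landmark_pattern(with_insert, motifs))
```

Comparing the two patterns position by position localizes the indel: only the first motif-to-motif count changes, and it changes by the length of the insert, without any alignment being computed.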

16.
Purine nucleotide-binding proteins build the large family of P-loop GTPases and related ATPases, which perform essential functions in all kingdoms of life. The Obg family comprises a group of ancient GTPases belonging to the TRAFAC (for translation factors) class and can be subdivided into several distinct protein subfamilies. The founding member of one of these subfamilies is the bacterial P-loop NTPase YchF, which had so far been assumed to act as GTPase. We have biochemically characterized the human homologue of YchF and found that it binds and hydrolyzes ATP more efficiently than GTP. For this reason, we have termed the protein hOLA1, for human Obg-like ATPase 1. Further biochemical characterization of YchF proteins from different species revealed that ATPase activity is a general but previously missed feature of the YchF subfamily of Obg-like GTPases. To explain ATP specificity of hOLA1, we have solved the x-ray structure of hOLA1 bound to the nonhydrolyzable ATP analogue AMPPCP. Our structural data help to explain the altered nucleotide specificity of YchF homologues and identify the Ola1/YchF subfamily of the Obg-related NTPases as an exceptional example of a single protein subfamily, which has evolved altered nucleotide specificity within a distinct protein family of GTPases.

17.
18.
Classification and evolution of EF-hand proteins   Cited by: 14 (self-citations: 0, citations by others: 14)
Forty-five distinct subfamilies of EF-hand proteins have been identified. They contain from two to eight EF-hands that are recognizable by amino acid sequence as being statistically similar to other EF-hand domains. All proteins within one subfamily are congruent to one another, i.e. the dendrogram computed from one of the EF-hand domains is similar, within statistical error, to the dendrogram computed from another(s) domain. Thirteen subfamilies - including Calmodulin, Troponin C, Essential light chain, Regulatory light chain - referred to collectively as CTER, are congruent with one another. They appear to have evolved from a single ur-domain by two cycles of gene duplication and fusion. The subfamilies of CTER subsequently evolved by gene duplications and speciations. The remaining 32 subfamilies do not show such general patterns of congruence; however, some - such as S100, intestinal calcium binding protein (calbindin 9kd), and trichohylin - do form congruent clusters of subfamilies. Nearly all of the domains 1, 3, 5, and 7 are most similar to other ODD domains. Correspondingly the EVEN numbered domains of all 45 subfamilies most closely resemble EVEN domains of other subfamilies. Many sequence and chemical characteristics do not show systemic trends by subfamily or species of host organisms; such homoplasy is widespread. Eighteen of the subfamilies are heterochimeric; in addition to multiple EF-hands they contain domains of other evolutionary origins. © Kluwer Academic Publishers

19.
The acyl-CoA dehydrogenases (ACADs) are enzymes that catalyze the α,β-dehydrogenation of acyl-CoA esters in fatty acid and amino acid catabolism. Eleven ACADs are now recognized in the sequenced human genome, and several homologs have been reported from bacteria, fungi, plants, and nematodes. We performed a systematic comparative genomic study, integrating homology searches with methods of phylogenetic reconstruction, to investigate the evolutionary history of this family. Sequence analyses indicate origin of the family in the common ancestor of Archaea, Bacteria, and Eukaryota, illustrating its essential role in the metabolism of early life. At least three ACADs were already present at that time: ancestral glutaryl-CoA dehydrogenase (GCD), isovaleryl-CoA dehydrogenase (IVD), and ACAD10/11. Two gene duplications were unique to the eukaryotic domain: one resulted in the VLCAD and ACAD9 paralogs and another in the ACAD10 and ACAD11 paralogs. The overall patchy distribution of specific ACADs across the tree of life is the result of dynamic evolution that includes numerous rounds of gene duplication and secondary losses, interdomain lateral gene transfer events, alteration of cellular localization, and evolution of novel proteins by domain acquisition. Our finding that eukaryotic ACAD species are more closely related to bacterial ACADs is consistent with endosymbiotic origin of ACADs in eukaryotes and further supported by the localization of all nine previously studied ACADs in mitochondria.

20.
Function prediction by homology is widely used to provide preliminary functional annotations for genes for which experimental evidence of function is unavailable or limited. This approach has been shown to be prone to systematic error, including percolation of annotation errors through sequence databases. Phylogenomic analysis avoids these errors in function prediction but has been difficult to automate for high-throughput application. To address this limitation, we present a computationally efficient pipeline for phylogenomic classification of proteins. This pipeline uses the SCI-PHY (Subfamily Classification in Phylogenomics) algorithm for automatic subfamily identification, followed by subfamily hidden Markov model (HMM) construction. A simple and computationally efficient scoring scheme using family and subfamily HMMs enables classification of novel sequences to protein families and subfamilies. Sequences representing entirely novel subfamilies are differentiated from those that can be classified to subfamilies in the input training set using logistic regression. Subfamily HMM parameters are estimated using an information-sharing protocol, enabling subfamilies containing even a single sequence to benefit from conservation patterns defining the family as a whole or in related subfamilies. SCI-PHY subfamilies correspond closely to functional subtypes defined by experts and to conserved clades found by phylogenetic analysis. Extensive comparisons of subfamily and family HMM performances show that subfamily HMMs dramatically improve the separation between homologous and non-homologous proteins in sequence database searches. Subfamily HMMs also provide extremely high specificity of classification and can be used to predict entirely novel subtypes. The SCI-PHY Web server at http://phylogenomics.berkeley.edu/SCI-PHY/ allows users to upload a multiple sequence alignment for subfamily identification and subfamily HMM construction. 
Biologists wishing to provide their own subfamily definitions can do so. Source code is available on the Web page. The Berkeley Phylogenomics Group PhyloFacts resource contains pre-calculated subfamily predictions and subfamily HMMs for more than 40,000 protein families and domains at http://phylogenomics.berkeley.edu/phylofacts/.
