Similar literature
 20 similar records found
1.
In order to investigate the relationship between glycosyltransferase families and their motifs, we classified 47 glycosyltransferase families in the CAZy database into four superfamilies, GTS-A, -B, -C, and -D, using a profile Hidden Markov Model method. On the basis of the classification and the similarity between GTS-A and the nucleotidylyltransferase family catalyzing the synthesis of nucleotide-sugar, we proposed that ancient oligosaccharides might have been synthesized by the origin of GTS-B, whereas the origin of GTS-A might be the gene encoding the synthesis of nucleotide-sugar as the donor, which then evolved into glycosyltransferases that catalyze the synthesis of divergent carbohydrates. We also suggested that the divergent evolution of each superfamily in the corresponding subcellular component has increased the complexity of eukaryotic carbohydrate structure.

2.
Members of a superfamily of proteins could result from divergent evolution of homologues with insignificant similarity in the amino acid sequences. A superfamily relationship is detected commonly after the three-dimensional structures of the proteins are determined using X-ray analysis or NMR. The SUPFAM database described here relates two homologous protein families in a multiple sequence alignment database of either known or unknown structure. The present release (1.1), which is the first version of the SUPFAM database, has been derived by analysing Pfam, which is one of the commonly used databases of multiple sequence alignments of homologous proteins. The first step in establishing SUPFAM is to relate Pfam families with the families in PALI, which is an alignment database of homologous proteins of known structure that is derived largely from SCOP. The second step involves relating Pfam families which could not be associated reliably with a protein superfamily of known structure. The profile matching procedure, IMPALA, has been used in these steps. The first step resulted in identification of 1280 Pfam families (out of 2697, i.e. 47%) which are related, either by close homologous connection to a SCOP family or by distant relationship to a SCOP family, potentially forming new superfamily connections. Using the profiles of 1417 Pfam families with apparently no structural information, an all-against-all comparison involving a sequence-profile match using IMPALA resulted in clustering of 67 homologous protein families of Pfam into 28 potential new superfamilies. Expansion of groups of related proteins of yet unknown structural information, as proposed in SUPFAM, should help in identifying ‘priority proteins’ for structure determination in structural genomics initiatives to expand the coverage of structural information in the protein sequence space. 
For example, we could assign 858 distinct Pfam domains in 2203 of the gene products in the genome of Mycobacterium tuberculosis. Fifty-one of these Pfam families of unknown structure could be clustered into 17 potentially new superfamilies forming good targets for structural genomics. The SUPFAM database can be accessed at http://pauling.mbu.iisc.ernet.in/~supfam.

3.
Bioinformatics is a very powerful tool in the field of glycoproteomics as well as genomics and proteomics. As a part of the Glycogene Project (GG project), we have developed a novel bioinformatics system for the comprehensive identification and in silico cloning of human glycogenes. Using our system, a total of 105 candidate human glycogenes were identified and then engineered for heterologous expression. Of these candidates, 38 recombinant proteins were successfully identified for their enzyme activity and substrate specificity. We also classified 47 out of 60 carbohydrate-active enzyme glycosyltransferase families into 4 superfamilies using the profile Hidden Markov Model method. On the basis of our classification and the relationship between glycosylation pathways and superfamilies, we propose the evolution of glycosyltransferases.

4.
Franco OL, Rigden DJ. Glycobiology, 2003, 13(10): 707-712
Glycosyltransferases (GTs) are diverse enzymes organized into 65 families. X-ray crystallography and in silico studies have shown many of these to belong to two structural superfamilies: GT-A and GT-B. Through application of fold recognition and iterated sequence searches, we demonstrate that families 60, 62, and 64 may also be grouped into the GT-A fold superfamily. Analysis of conserved acidic residues suggests that catalytic sites are better conserved in superfamily GT-B than in GT-A. Although 26% and 29% of GT families may now be confidently placed in superfamilies GT-A and GT-B, respectively, the remaining 45% of families bear no discernible resemblance to either superfamily, which, given the sensitivity of modern fold recognition methods, suggests the existence of novel structural scaffolds associated with GT activity. Furthermore, bioinformatics studies indicate the apparent ease with which mechanism (inverting or retaining) may change during evolution.

5.
6.
7.
The amino acid-polyamine-organocation (APC) superfamily has been shown to include five recognized families, four of which are specific for amino acids and their derivatives. Recent high-resolution X-ray crystallographic data have shown that four additional transporter families (BCCT, TC No. 2.A.15; SSS, 2.A.21; NSS, 2.A.22; and NCS1, 2.A.39), transporting a wide range of solutes, exhibit sufficiently similar folds to suggest a common evolutionary origin. We have used established statistical methods, based on sequence similarity, to show that these families are, in fact, members of the APC superfamily. We also identify two additional families (NCS2, 2.A.40; SulP, 2.A.53) as being members of this superfamily. Repeat sequences, each having five transmembrane α-helical segments and arising via ancient intragenic duplications, are demonstrated for all of these families, further strengthening the conclusion of homology. The APC superfamily appears to be the second largest superfamily of secondary carriers, the largest being the major facilitator superfamily (MFS). Although the topology of the members of the APC superfamily differs from that of the MFS, both families appear to have arisen from a common ancestral 2 TMS hairpin structure that underwent intragenic triplication followed by loss of a TMS in the APC family, to give the repeat units that are characteristic of these two superfamilies.

8.
Transport proteins function in the translocation of ions, solutes and macromolecules across cellular and organellar membranes. These integral membrane proteins fall into >600 families as tabulated in the Transporter Classification Database (www.tcdb.org). Recent studies, some of which are reported here, define distant phylogenetic relationships between families with the creation of superfamilies. Several of these are analyzed using a novel set of programs designed to allow reliable prediction of phylogenetic trees when sequence divergence is too great to allow the use of multiple alignments. These new programs, called SuperfamilyTree1 and 2 (SFT1 and 2), allow display of protein and family relationships, respectively, based on thousands of comparative BLAST scores rather than multiple alignments. Superfamilies analyzed include: (1) Aerolysins, (2) RTX Toxins, (3) Defensins, (4) Ion Transporters, (5) Bile/Arsenite/Riboflavin Transporters, (6) Cation:Proton Antiporters, and (7) the Glucose/Fructose/Lactose superfamily within the prokaryotic phosphoenolpyruvate-dependent Phosphotransferase System. In addition to defining the phylogenetic relationships of the proteins and families within these seven superfamilies, evidence is provided showing that the SFT programs outperform programs that are based on multiple alignments whenever sequence divergence of superfamily members is extensive. The SFT programs should be applicable to virtually any superfamily of proteins or nucleic acids.

9.
Shotgun: getting more from sequence similarity searches.   Total citations: 1, self-citations: 0, citations by others: 1
MOTIVATION: As genomic sequencing reveals the range of structural classes generated through the evolution of proteins, analysis of the superfamilies to which they belong can contribute important insights for understanding their structure-function relationships. Current database search techniques fall short of identifying the majority of distant sequence relationships at statistically significant levels. We developed the Shotgun program in an effort to enhance the sensitivity and utility of current database search output. RESULTS: We have developed and used the Shotgun program both to identify new superfamily members and to reconstruct several known enzyme superfamilies using BLAST database searches. An analysis of the false-positive rates generated in the analysis and other control experiments provides evidence that high Shotgun scores indicate real evolutionary relationships. Shotgun is also a useful tool for identifying subgroup relationships within superfamilies and for testing hypotheses about related protein families. AVAILABILITY: By request from the Babbitt lab homepage: http://mako.cgl.ucsf.edu/babbittlab/ CONTACT: babbitt@cgl.ucsf.edu

10.
The Short-chain Dehydrogenases/Reductases Engineering Database (SDRED) covers one of the largest known protein families (168 150 proteins). Assignment to the superfamilies of Classical and Extended SDRs was achieved by global sequence similarity and by identification of family-specific sequence motifs. Two standard numbering schemes were established for Classical and Extended SDRs that allow for the determination of conserved amino acid residues, such as cofactor specificity determining positions or superfamily specific sequence motifs. The comprehensive sequence dataset of the SDRED facilitates the refinement of family-specific sequence motifs. The glycine-rich motifs for Classical and Extended SDRs were refined to improve the precision of superfamily classification. In each superfamily, the majority of sequences formed a tightly connected sequence network and belonged to a large homologous family. Despite their different sequence motifs and their different sequence length, the two sequence networks of Classical and Extended SDRs are not separate, but connected by edges at a threshold of 40% sequence similarity, indicating that all SDRs belong to a large, connected network. The SDRED is accessible at https://sdred.biocatnet.de/.
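The idea of two sequence networks merging once edges are admitted at a similarity threshold can be illustrated with a small sketch. This is not the SDRED pipeline: the node names, the similarity values, and the `connected_components` helper are all hypothetical, and union-find is simply one convenient way to group nodes joined by edges at or above a cutoff.

```python
from itertools import combinations


def connected_components(nodes, similarity, threshold):
    """Group nodes joined by edges whose similarity is >= threshold (union-find)."""
    parent = {n: n for n in nodes}

    def find(n):
        # Walk up to the root, shortening the path as we go.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for a, b in combinations(nodes, 2):
        # Missing pairs are treated as having no detectable similarity.
        score = similarity.get((a, b), similarity.get((b, a), 0.0))
        if score >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for n in nodes:
        groups.setdefault(find(n), set()).add(n)
    return list(groups.values())


# Hypothetical toy data: two "Classical" and two "Extended" sequences.
nodes = ["classical_1", "classical_2", "extended_1", "extended_2"]
similarity = {
    ("classical_1", "classical_2"): 0.62,
    ("extended_1", "extended_2"): 0.58,
    ("classical_2", "extended_1"): 0.41,  # the only cross-group edge
}

components_at_40 = connected_components(nodes, similarity, 0.40)
components_at_50 = connected_components(nodes, similarity, 0.50)
print(len(components_at_40), len(components_at_50))
```

In this toy data the single cross-group edge at 0.41 is what joins the two groups at the 40% cutoff; raising the cutoff to 50% separates them again.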

11.
During the past decade, ancient gene duplications were recognized as one of the main forces in the generation of diverse gene families and the creation of new functional capabilities. New tools developed to search data banks for homologous sequences, and an increased availability of reliable three-dimensional structural information led to the recognition that proteins with diverse functions can belong to the same superfamily. Analyses of the evolution of these superfamilies promise to provide insights into early evolution but are complicated by several important evolutionary processes. Horizontal transfer of genes can lead to a vertical spread of innovations among organisms; therefore, finding a certain property in some descendants of an ancestor does not guarantee that it was present in that ancestor. Complete or partial gene conversion between duplicated genes can yield phylogenetic trees with several, apparently independent gene duplications, suggesting an often surprising parallelism in the evolution of independent lineages. Additionally, the breakup of domains within a protein and the fusion of domains into multifunctional proteins make the delineation of superfamilies a task that remains difficult to automate.

12.

Background  

Inferences about protein function are often made based on sequence homology to other gene products of known activities. This approach is valuable for small families of conserved proteins but can be difficult to apply to large superfamilies of proteins with diverse function. In this study we looked at sequence homology between members of the DJ-1/ThiJ/PfpI superfamily, which includes a human protein of unclear function, DJ-1, associated with inherited Parkinson's disease.

13.
Dehalogenases are environmentally important enzymes that detoxify organohalogens by cleaving their carbon-halogen bonds. Many microbial genomes harbour enzyme families containing dehalogenases, but a sequence-based identification of genuine dehalogenases with high confidence is challenging because of the low sequence conservation among these enzymes. Furthermore, these protein families harbour a rich diversity of other enzymes including esterases and phosphatases. Reliable sequence determinants are necessary to harness genome sequencing efforts for accelerating the discovery of novel dehalogenases with improved or modified activities. In an attempt to extract dehalogenase sequence fingerprints, 103 uncharacterized potential dehalogenase candidates belonging to the α/β hydrolase (ABH) and haloacid dehalogenase-like hydrolase (HAD) superfamilies were screened for dehalogenase, esterase and phosphatase activity. In this first biochemical screen, 1 haloalkane dehalogenase, 1 fluoroacetate dehalogenase and 5 L-2-haloacid dehalogenases were found (success rate 7%), as well as 19 esterases and 31 phosphatases. Using this functional data, we refined the sequence-based dehalogenase selection criteria and applied them to a second functional screen, which identified novel dehalogenase activity in 13 out of only 24 proteins (54%), increasing the success rate eightfold. Four new L-2-haloacid dehalogenases from the HAD superfamily were found to hydrolyse fluoroacetate, an activity never previously ascribed to enzymes in this superfamily.
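The success rates quoted in this abstract can be rechecked directly from its counts. The short sketch below uses only numbers stated in the abstract; the variable names are illustrative.

```python
# Counts taken from the abstract above.
first_hits = 1 + 1 + 5   # haloalkane + fluoroacetate + L-2-haloacid dehalogenases
first_screened = 103
second_hits = 13
second_screened = 24

first_rate = first_hits / first_screened
second_rate = second_hits / second_screened
fold_increase = second_rate / first_rate

print(f"first screen:  {first_rate:.1%}")    # about 7%
print(f"second screen: {second_rate:.1%}")   # about 54%
print(f"fold increase: {fold_increase:.1f}") # about eightfold
```

The unrounded ratio is slightly under 8, which is consistent with the "eightfold" figure reported.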

14.

Background  

The SUPFAM database is a compilation of superfamily relationships between protein domain families of either known or unknown 3-D structure. In SUPFAM, sequence families from Pfam and structural families from SCOP are associated, using profile matching, to result in sequence superfamilies of known structure. Subsequently, all-against-all family profile matches are made to deduce a list of new potential superfamilies of yet unknown structure.

15.
Mammalian glycosyltransferases: genomic organization and protein structure.   Total citations: 5, self-citations: 0, citations by others: 5
Joziasse DH. Glycobiology, 1992, 2(4): 271-277
In recent years, several glycosyltransferase genes and cDNAs have been cloned and characterized. Although the glycosyltransferases seem to share the same general architecture, there is little sequence similarity between the various enzymes. Moreover, a comparison of the organization of the genes shows that there is no common pattern of intron-exon structure. In addition, there seems to be little or no correlation between glycosyltransferase exons and protein domains. Taken together, these observations suggest that many of the glycosyltransferase genes evolved independently. So far, only two glycosyltransferase gene families have been described. These families may have evolved by exon-shuffling, or by gene duplication and subsequent divergence. For specific glycosyltransferases, mechanisms such as alternative splicing and alternative promoter usage play a role in the production of multiple protein isoforms from a single gene. These isoenzymes may differ in their enzymatic properties or cellular localization.

16.
Superfamily and family analyses provide an effective tool for the functional classification of proteins, but must be automated for use on large datasets. We describe a 'gold standard' set of enzyme superfamilies, clustered according to specific sequence, structure, and functional criteria, for use in the validation of family and superfamily clustering methods. The gold standard set represents four fold classes and differing clustering difficulties, and includes five superfamilies, 91 families, 4,887 sequences and 282 structures.

17.
Meng EC, Polacco BJ, Babbitt PC. Proteins, 2004, 55(4): 962-976
We show that three-dimensional signatures consisting of only a few functionally important residues can be diagnostic of membership in superfamilies of enzymes. Using the enolase superfamily as a model system, we demonstrate that such a signature, or template, can identify superfamily members in structural databases with high sensitivity and specificity. This is remarkable because superfamilies can be highly diverse, with members catalyzing many different overall reactions; the unifying principle can be a conserved partial reaction or chemical capability. Our definition of a superfamily thus hinges on the disposition of residues involved in a conserved function, rather than on fold similarity alone. A clear advantage of basing structure searches on such active site templates rather than on fold similarity is the specificity with which superfamilies with distinct functional characteristics can be identified within a large set of proteins with the same fold, such as the (beta/alpha)8 barrels. Preliminary results are presented for an additional group of enzymes with a different fold, the haloacid dehalogenase superfamily, suggesting that this approach may be generally useful for assigning reading frames of unknown function to specific superfamilies and thereby allowing inference of some of their functional properties.

18.
Many protein classification systems capture homologous relationships by grouping domains into families and superfamilies on the basis of sequence similarity. Superfamilies with similar 3D structures are further grouped into folds. In the absence of discernable sequence similarity, these structural similarities were long thought to have originated independently, by convergent evolution. However, the growth of databases and advances in sequence comparison methods have led to the discovery of many distant evolutionary relationships that transcend the boundaries of superfamilies and folds. To investigate the contributions of convergent versus divergent evolution in the origin of protein folds, we clustered representative domains of known structure by their sequence similarity, treating them as point masses in a virtual 2D space which attract or repel each other depending on their pairwise sequence similarities. As expected, families in the same superfamily form tight clusters. But often, superfamilies of the same fold are linked with each other, suggesting that the entire fold evolved from an ancient prototype. Strikingly, some links connect superfamilies with different folds. They arise from modular peptide fragments of between 20 and 40 residues that co-occur in the connected folds in disparate structural contexts. These may be descendants of an ancestral pool of peptide modules that evolved as cofactors in the RNA world and from which the first folded proteins arose by amplification and recombination. Our galaxy of folds summarizes, in a single image, most known and many yet undescribed homologous relationships between protein superfamilies, providing new insights into the evolution of protein domains.

19.
The Structural Motifs of Superfamilies (SMoS) database provides information about the structural motifs of aligned protein domain superfamilies. Such motifs among structurally aligned multiple members of protein superfamilies are recognized by the conservation of amino acid preference and solvent inaccessibility and are examined for the conservation of other features like secondary structural content, hydrogen bonding, non-polar interaction and residue packing. These motifs, along with their sequence and spatial orientation, represent the conserved core structure of each superfamily and also provide the minimal requirement of sequence and structural information to retain each superfamily fold.

20.
BACKGROUND: Signalling via the Notch receptor is a key regulator of many developmental processes. The differential responsiveness of Notch-expressing cells to the ligands Delta and Serrate is controlled by Fringe, itself essential for normal patterning in Drosophila and vertebrates. The mechanism of Fringe action, however, is not known. The protein has an amino-terminal hydrophobic stretch resembling a cleaved signal peptide, which has led to the widespread assumption that it is a secreted signalling molecule. It also has distant homology to bacterial glycosyltransferases, although it is not clear if this reflects a shared enzymatic activity, or merely a related structure. RESULTS: We report that a functional epitope-tagged form of Drosophila Fringe was localised in the Golgi apparatus. When the putative signal peptide was replaced by a confirmed one, Fringe no longer accumulated in the Golgi, but was instead efficiently secreted. This change in localisation dramatically reduced its biological activity, implying that the wild-type protein normally acts inside the cell. We show that Fringe specifically binds the nucleoside diphosphate UDP, a feature of many glycosyltransferases. Furthermore, specific mutation of a DxD motif (in the single-letter amino acid code where x is any amino acid), a hallmark of most glycosyltransferases that use nucleoside diphosphate sugars, did not affect the Golgi localisation of the protein but completely eliminated in vivo activity. CONCLUSIONS: These results indicate that Fringe does not exert its effects outside of the cell, but rather acts in the Golgi apparatus, apparently as a glycosyltransferase. They suggest that alteration in receptor glycosylation can regulate the relative efficiency of different ligands.
