Similar literature
 20 similar articles found (search time: 0 ms)
1.
Protein structural class prediction is one of the challenging problems in bioinformatics. Previous methods based directly on amino acid (AA) sequence similarity have been shown to be insufficient for low-similarity protein data sets. To improve prediction accuracy for such low-similarity proteins, several methods have recently been proposed that explore novel feature sets based on predicted secondary structure propensities. In this paper, we focus on protein structural class prediction using combinations of novel features, including secondary structure propensities as well as functional domain (FD) features extracted from the InterPro signature database. Our comprehensive experimental results on several benchmark data sets show that integrating the new FD features substantially improves the accuracy of structural class prediction for low-similarity proteins, because these features capture meaningful relationships among AA residues that are far apart in the protein sequence. The proposed method has also been tested on partially disordered proteins, with reasonable prediction accuracy; this is a more difficult problem than structural class prediction for commonly used benchmark data sets and, to the best of our knowledge, has not been attempted before. In addition, to avoid overfitting with a large number of features, feature selection is applied to select discriminating features that contribute to high prediction accuracy. The selected features have been shown to achieve stable prediction performance across different benchmark data sets.

3.
Genetic and structural analysis of the alpha chain polypeptides of heterotrimeric G proteins defines functional domains for GTP/GDP binding, GTPase activity, effector activation, receptor contact and beta gamma subunit complex regulation. The sequence conservation of the GDP/GTP binding and GTPase domains among G protein alpha subunits readily allows common mutations to be made for the design of mutant polypeptides that function as constitutively active or dominant negative alpha chains when expressed in different cell types. The organization of the effector activation, receptor and beta gamma contact domains, relative to the GTP/GDP binding domain sequences, is similar in the primary sequence of the different alpha subunit polypeptides. Mutations within common motifs of the different G protein alpha chain polypeptides have similar functional consequences. Thus, what has been learned with the Gs and Gi proteins and the regulation of adenylyl cyclase can be directly applied to the analysis of newly identified G proteins and their coupling to receptors and regulation of putative effector enzymes.

4.
The primary sequence of human DNA polymerase alpha deduced from the full-length cDNA contains regions of striking similarity to sequences in replicative DNA polymerases from Escherichia coli phages PRD1 and T4, Bacillus phage phi 19, yeast DNA polymerase I, yeast linear plasmid pGKL1, maize S1 mitochondrial DNA, herpes family viruses, vaccinia virus, and adenovirus. The conservation of these homologous regions across this vast phylogenetic expanse indicates that these prokaryotic and eukaryotic DNA polymerases may all have evolved from a common primordial gene. Based on the sequence analysis and genetic results from yeast and herpes simplex virus studies, these consensus sequences are suggested to define potential sites that subserve essential roles in the DNA polymerase reaction. Two of these conserved regions appear to participate directly in the active site required for substrate deoxynucleotide interaction. One region toward the carboxyl-terminus has the potential to be the DNA interacting domain, whereas a potential DNA primase interaction domain is predicted toward the amino-terminus. The provisional assignment of these domains can be used to identify unique or dissimilar features of functionally homologous catalytic sites in viral DNA polymerases of pathogenetic significance and thereby serve to guide more rational antiviral drug design.

5.
GeMMA (Genome Modelling and Model Annotation) is a new approach to automatic functional subfamily classification within families and superfamilies of protein sequences. A major advantage of GeMMA is its ability to subclassify very large and diverse superfamilies with tens of thousands of members, without the need for an initial multiple sequence alignment. Its performance is shown to be comparable to the established high-performance method SCI-PHY. GeMMA follows an agglomerative clustering protocol that uses existing software for sensitive and accurate multiple sequence alignment and profile–profile comparison. The produced subfamilies are shown to be equivalent in quality whether whole protein sequences are used or just the sequences of component predicted structural domains. A faster, heuristic version of GeMMA that also uses distributed computing is shown to maintain the performance levels of the original implementation. The use of GeMMA to increase the functional annotation coverage of functionally diverse Pfam families is demonstrated. It is further shown how GeMMA clusters can help to predict the impact of experimentally determining a protein domain structure on comparative protein modelling coverage, in the context of structural genomics.
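
As a rough illustration of what an agglomerative clustering protocol looks like, here is a minimal Python sketch. It is not GeMMA's implementation: the abstract says GeMMA relies on existing alignment and profile–profile comparison software, whereas this toy version swaps in a Jaccard similarity over shared k-mers as a stand-in score. The function names, the `threshold` value, and the k-mer size are all assumptions made for illustration.

```python
from itertools import combinations


def kmers(seq, k=3):
    """Return the set of overlapping k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two k-mer sets (toy stand-in for profile-profile scoring)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def agglomerate(seqs, threshold=0.3, k=3):
    """Greedily merge the most similar pair of clusters until no pair
    scores at or above `threshold` (hypothetical stopping rule)."""
    # Each cluster is (member names, pooled k-mer set).
    clusters = [([name], kmers(seq, k)) for name, seq in seqs.items()]
    while len(clusters) > 1:
        # Find the best-scoring pair of clusters.
        (i, j), score = max(
            (((i, j), jaccard(clusters[i][1], clusters[j][1]))
             for i, j in combinations(range(len(clusters)), 2)),
            key=lambda item: item[1],
        )
        if score < threshold:
            break
        merged = (clusters[i][0] + clusters[j][0],
                  clusters[i][1] | clusters[j][1])
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
        clusters.append(merged)
    return [sorted(names) for names, _ in clusters]
```

Calling `agglomerate` on a dictionary that maps sequence names to sequences returns a list of clusters, each a sorted list of names. The all-pairs search on every iteration is quadratic, so a sketch like this would not scale to the tens of thousands of members the abstract mentions.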

6.
The amphipathic alpha helix is an often-encountered secondary structural motif in biologically active peptides and proteins. An amphipathic helix is defined as an alpha helix with opposing polar and nonpolar faces oriented along the long axis of the helix. In a recent review article we grouped amphipathic helixes into seven distinct classes (A, H, L, G, K, C, and M) based upon a detailed analysis of their physical-chemical and structural properties (Segrest, J. P., et al. Amphipathic helix motif: classes and properties. Proteins. 1990. 8: 103-117). We have developed five computer programs that automate analysis and classification of potential amphipathic helical domains from primary amino acid sequence data. Here we describe these five programs and illustrate their usefulness by comparing two data sets of sequences representing different amphipathic alpha helical motifs from the exchangeable apolipoproteins. In a companion review article (Segrest, J. P., et al. The amphipathic helix in the exchangeable apolipoproteins: a review of secondary structure and function. J. Lipid Res. 1992. 33: 000-000) these five programs are used to localize and characterize the putative amphipathic helixes in the exchangeable apolipoproteins.

7.
At a nonpermissive temperature, the group D temperature-sensitive mutants of Newcastle disease virus strain Australia-Victoria (AV) are defective in plaque formation, in inducing infected cells to fuse, and in incorporating the cleaved fusion glycoprotein, F1 + F2, into virus particles. In this study, the F protein of AV, expressed in chicken embryo cells, was able to complement these mutants in a plaque assay, identifying the F gene as the gene containing the group D temperature-sensitive lesions. The F genes of mutants D1, D2, and D3 were found to contain single mutations relative to the AV sequence, clustered within a predicted amphipathic alpha helix (AAH) adjacent to the hydrophobic amino terminus of F1. These mutant F proteins were inefficiently processed at the permissive temperature, a problem that was exacerbated at the nonpermissive temperature. Surprisingly, the AV F protein was also found to be partially temperature sensitive in processing. Its AAH is predicted to contain a break in the helix close to the D mutation sites, which are themselves predicted to further weaken the helix at this point. Interestingly, six revertants of the group D mutants were found to have an additional lesion in the AAH, repairing both the AV and mutant helices, resulting in a predicted perfect helix. The F protein of these revertants had overcome both the processing defects of the mutants and the temperature sensitivity of AV, indicating that the AAH region is critical for F protein processing. The lesions of a second group of revertants were localized within F2, suggesting an interaction with the F1 AAH region containing the original lesion.

8.
The three-dimensional structure of a proteolytically modified protein C inhibitor, a member of the serine protease inhibitor superfamily, was constructed with computer graphics based on its amino acid sequence homology with that of the modified alpha 1-antitrypsin whose structure had been elucidated by X-ray crystallography. The intact form of protein C inhibitor was predicted with an alpha-carbon model based on its hydrophilicity and hydrogen bond pattern. Furthermore, a model of its interaction with activated protein C was constructed based on the structure of the complex between trypsin and its inhibitor, which had been determined by X-ray crystallography.

9.
Gs and Gi2 are G proteins whose alpha subunits are 65% homologous. Within the 355 amino acid alpha i2 polypeptide, substitution of residues Ile213-Lys319 with the corresponding alpha s region (Ile235-Arg356) generated a chimera that activated adenylyl cyclase, indicating that the alpha s activation domain resides within this 122 amino acid alpha s sequence. Mutation within alpha s residues Glu15-Pro144 resulted in an alpha s polypeptide having an enhanced rate of GDP dissociation. Mutation within two regions of the N-terminus influenced the ability of pertussis toxin to ADP-ribosylate the alpha subunit polypeptide, a reaction controlled by the beta gamma subunit complex. The findings define the G protein alpha subunit N-terminus as a regulatory region controlling beta gamma subunit interactions and GDP dissociation independent of the GTPase and effector activation domains.
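
The residue counts in this abstract follow from inclusive numbering: a span from residue i to residue j contains j - i + 1 residues. A short Python check, using a helper name (`span_length`) introduced here purely for illustration, confirms that Ile235-Arg356 is the 122-residue alpha s segment quoted above.

```python
def span_length(first, last):
    """Number of residues in an inclusive residue range first..last."""
    if last < first:
        raise ValueError("last residue must not precede first residue")
    return last - first + 1


# Residue ranges quoted in the abstract.
alpha_s_region = span_length(235, 356)   # Ile235-Arg356 in alpha s
alpha_i2_region = span_length(213, 319)  # Ile213-Lys319 in alpha i2
```

By the same arithmetic, the replaced alpha i2 segment (Ile213-Lys319) is shorter than the alpha s segment that takes its place, so the chimera is longer than the 355-residue alpha i2 polypeptide.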

10.
Advances in proteomics technology have enabled new proteins to be discovered at an unprecedented speed, and high-throughput experimental methods have been developed to detect protein interactions and complexes en masse. Such a bottom-up, data-driven approach has resulted in data that may be uninformative or potentially erroneous, requiring further validation and annotation. The InterDom database focuses on providing supporting evidence for the detected protein interactions based on putative protein domain interactions. Using an integrative approach, InterDom derives potential domain interactions by combining data from multiple sources, ranging from domain fusions, protein interactions and complexes, to scientific literature. The InterDom database is available at http://InterDom.lit.org.sg.

11.
The molybdenum cofactor (Moco) consists of a unique and conserved pterin derivative, usually referred to as molybdopterin (MPT), which coordinates the essential transition metal molybdenum (Mo). Moco is required for the enzymatic activities of all Mo-enzymes, with the exception of nitrogenase, and is synthesized by an evolutionarily old multi-step pathway that is dependent on the activities of at least six gene products. In eukaryotes, the final step of Moco biosynthesis, i.e. transfer and insertion of Mo into MPT, is catalyzed by the two-domain proteins Cnx1 in plants and gephyrin in mammals. Gephyrin is ubiquitously expressed, and was initially found in the central nervous system, where it is essential for clustering of inhibitory neuroreceptors in the postsynaptic membrane. Gephyrin and Cnx1 contain at least two functional domains (E and G) that are homologous to the Escherichia coli proteins MoeA and MogA, the atomic structures of which have been solved recently. Here, we present the crystal structures of the N-terminal human gephyrin G domain (Geph-G) and the C-terminal Arabidopsis thaliana Cnx1 G domain (Cnx1-G) at 1.7 and 2.6 Å resolution, respectively. These structures are highly similar and, compared to MogA, reveal four major differences in their three-dimensional structures: (1) In Geph-G and Cnx1-G an additional alpha-helix is present between the first beta-strand and alpha-helix of MogA. (2) The loop between alpha 2 and beta 2 undergoes conformational changes in all three structures. (3) A beta-hairpin loop found in MogA is absent from Geph-G and Cnx1-G. (4) The C terminus of Geph-G follows a different path from that in MogA. Based on the structures of the eukaryotic proteins and their comparisons with E. coli MogA, the predicted binding site for MPT has been further refined. In addition, the characterized alternative splice variants of gephyrin are analyzed in the context of the three-dimensional structure of Geph-G.

12.
13.

Background  

Proteins that are similar in sequence or structure may perform different functions in nature. In such cases, function cannot be inferred from sequence or structural similarity.

14.
The C-terminal G domain of the mouse laminin alpha2 chain consists of five laminin-type G domain (LG) modules (alpha2LG1 to alpha2LG5) and was obtained as several recombinant fragments, corresponding to either individual modules or the tandem arrays alpha2LG1-3 and alpha2LG4-5. These fragments were compared with similar modules from the laminin alpha1 chain and from the C-terminal region of perlecan (PGV) in several binding studies. Major heparin-binding sites were located on the two tandem fragments and the individual alpha2LG1, alpha2LG3 and alpha2LG5 modules. The binding epitope on alpha2LG5 could be localized to a cluster of lysines by site-directed mutagenesis. In the alpha1 chain, however, strong heparin binding was found on alpha1LG4 and not on alpha1LG5. Binding to sulfatides correlated with heparin binding in most but not all cases. Fragments alpha2LG1-3 and alpha2LG4-5 also bound to fibulin-1, fibulin-2 and nidogen-2 with Kd = 13-150 nM. Both tandem fragments, but not the individual modules, bound strongly to alpha-dystroglycan, and this interaction was abolished by EDTA but not by high concentrations of heparin and NaCl. The binding of perlecan fragment PGV to alpha-dystroglycan was even stronger and was also not sensitive to heparin. This demonstrated similar binding repertoires for the LG modules of three basement membrane proteins involved in cell-matrix interactions and supramolecular assembly.

15.
We have determined the complete nucleotide sequence for TEF-1, one of three genes coding for elongation factor (EF)-1 alpha in Mucor racemosus. The deduced EF-1 alpha protein contains 458 amino acids encoded by two exons. The presence of an intervening sequence located near the 3' end of the gene was predicted by the nucleotide sequence data and confirmed by alkaline S1 nuclease mapping. The amino acid sequence of EF-1 alpha was compared to the published amino acid sequences of EF-1 alpha proteins from Saccharomyces cerevisiae and Artemia salina. These proteins shared nearly 85% homology. A similar comparison to the functionally analogous EF-Tu from Escherichia coli revealed several regions of amino acid homology suggesting that the functional domains are conserved in elongation factors from these diverse organisms. Secondary structure predictions indicated that alpha helix and beta sheet conformations associated with the functional domains in EF-Tu are present in the same relative location in EF-1 alpha from M. racemosus. Through this comparative structural analysis we have predicted the general location of functional domains in EF-1 alpha which interact with GTP and tRNA.

16.
LARK is an essential Drosophila RNA-binding protein of the RNA recognition motif (RRM) class that functions during embryonic development and for the circadian regulation of adult eclosion. LARK protein contains three consensus RNA-binding domains: two RRM domains and a retroviral-type zinc finger (RTZF). To show that these three structural domains are required for function, we performed a site-directed mutagenesis of the protein. The analysis of various mutations, in vivo, indicates that the RRM domains and the RTZF are required for wild-type LARK functions. RRM1 and RRM2 are essential for viability, although interestingly either domain can suffice for this function. Remarkably, mutation of either RRM2 or the RTZF results in the same spectrum of phenotypes: mutants exhibit reduced viability, abnormal wing and mechanosensory bristle morphology, female sterility, and flightlessness. The severity of these phenotypes is similar in single mutants and double RRM2; RTZF mutants, indicating a lack of additivity for the mutations and suggesting that RRM2 and the RTZF act together, in vivo, to determine LARK function. Finally, we show that mutations in RRM1, RRM2, or the RTZF do not affect the circadian regulation of eclosion, and we discuss possible interpretations of these results. This genetic analysis demonstrates that each of the LARK structural domains functions in vivo and indicates a pleiotropic requirement for both the LARK RRM2 and RTZF domains.

17.
18.
Stimulation of receptors coupled to G(q)/G(11) protein may induce phosphorylation on a tyrosine residue of the alpha subunit of this G protein, which is an essential event for G(q)/G(11) activation. Here we observed that, in HEK293 cells stably expressing high levels of thyrotropin-releasing hormone (TRH) receptors and G(11)alpha protein, the maximal tyrosine phosphorylation of G(q)/G(11)alpha was reached within 10 min of TRH stimulation and then faded away at longer periods of agonist exposure. The G(q)/G(11)alpha protein levels did not change during this treatment. Incubation of intact cells with beta-cyclodextrin (beta CD) for 40 min prior to hormone exposure significantly decreased the rapid transient tyrosine phosphorylation. Subsequent replenishment of cholesterol levels reversed this negative effect of beta CD. Isolation of caveolin-enriched, detergent-resistant membrane domains indicated destruction of these structures in beta CD-treated cells. These data indicate that the preserved integrity of plasma membrane domains/caveolae is required for complete agonist-induced phosphorylation of G(q)/G(11)alpha.

19.
Identification of functional domains of Clostridium septicum alpha toxin
Melton-Witt JA, Bentsen LM, Tweten RK. Biochemistry, 2006, 45(48): 14347-14354
Alpha toxin (AT), the major virulence factor of Clostridium septicum, is a proteolytically activated pore-forming toxin that belongs to the aerolysin-like family of toxins. AT is predicted to be a three-domain molecule on the basis of its functional and sequence similarity with aerolysin, for which the crystal structure has been determined. In this study, we have substituted the entire primary structure of AT with alanine or cysteine to identify those amino acids that comprise functional domains involved in receptor binding, oligomerization, and pore formation. These studies revealed that receptor binding is restricted to domain 1 of the AT structure, whereas domains 1 and 3 are involved in oligomerization. These studies also revealed the presence of a putative functional region of AT, proximal to the receptor-binding domain but distal from the pore-forming domain, that is proposed to regulate the insertion of the transmembrane beta-hairpin of the prepore oligomer.

20.
Monoclonal antibodies to alpha 4, the major regulatory protein of herpes simplex virus 1, have been shown to differ in their effects on the binding of the protein to its DNA-binding site in the promoter-regulatory domain of an alpha gene. To map the epitopes, we expressed truncated genes in transient expression systems. All 10 monoclonal antibodies tested reacted with the N-terminal 288-amino-acid polypeptide. To map the epitopes more precisely, 29 15-mer oligopeptides, overlapping by five amino acids at each end, were synthesized and reacted with the monoclonal antibodies. The nine reactive monoclonal antibodies were mapped to seven sites. Of the two monoclonal antibodies which blocked the binding of alpha 4 to DNA, one (H950) reacted with oligopeptide no. 3 near the N terminus of the protein, whereas the second (H942) reacted with oligopeptide no. 23 near the C terminus of the 288-amino-acid polypeptide. In further tests, oligopeptide no. 19 was found to compete with two host proteins, designated as alpha H1 and alpha H2-alpha H3, for binding to DNA as well as to retard DNA in a band shift assay, whereas oligopeptides no. 26, 27, and 28 enhanced the binding of alpha 4 to DNA. Moreover, oligopeptide no. 27 was also found to retard DNA in a band shift assay. Polypeptide no. 19 competed with alpha 4 for binding to DNA, whereas no. 27 neither enhanced nor competed with the binding of the host polypeptide alpha H1 to its binding site in the promoter-regulatory domain of an alpha gene, but did enhance the binding of the alpha H2-alpha H3 protein to its binding site. In contrast to these results, the truncated alpha 4 polypeptide, 825 amino acids long, bound to the viral DNA, whereas a shorter, 519-amino-acid-long, truncated polypeptide did not. The 825-amino-acid polypeptide was previously shown to induce a late (gamma 2) viral gene in transient expression systems.
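
The peptide scan described above (15-mers overlapping by five residues at each end) corresponds to sliding a 15-residue window in steps of 10. The Python sketch below shows one way to generate such a tiling. The function name and the handling of the final window are assumptions: the abstract does not say how the last peptide was chosen, so this sketch simply shifts it back to keep it full length. With that assumption, a 288-residue sequence yields 29 peptides, matching the number reported.

```python
def tile_peptides(sequence, length=15, overlap=5):
    """Tile a sequence with fixed-length peptides that overlap their
    neighbours by `overlap` residues. Returns (1-based start, peptide) pairs."""
    step = length - overlap
    peptides = []
    start = 0
    while start < len(sequence):
        end = start + length
        if end > len(sequence):
            # Assumption: shift the final window back so it stays full length
            # (the source does not state how the last peptide was designed).
            start = max(0, len(sequence) - length)
            end = len(sequence)
        peptides.append((start + 1, sequence[start:end]))
        if end >= len(sequence):
            break
        start += step
    return peptides


# Placeholder 288-residue sequence; the real alpha 4 sequence is not given here.
placeholder = "A" * 288
scan = tile_peptides(placeholder)
```

Each tuple pairs a 1-based start position with the peptide itself, so an antibody that reacts with peptide number n can be traced back to a residue range in the parent polypeptide.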
