Similar articles
 Found 20 similar articles (search time: 31 ms)
1.
Since our characterization of the slit cDNA sequence, encoding a protein secreted by glial cells and involved in the formation of axonal pathways in Drosophila, we have discovered that the protein contains two additional sequence motifs that are highly conserved in a variety of proteins. A search of the GenPept database with the 73 amino acids at the carboxy terminus of slit revealed that this region contains significant similarity to a carboxy-terminal domain found in six other exported proteins. This observation has allowed us to define a new carboxy-terminal protein motif. In addition, comparison with a 202-amino-acid domain residing between epidermal growth factor (EGF) repeats in slit shows this region to be conserved in laminin, agrin and perlecan and, strikingly, also to lie between EGF repeats in both agrin and perlecan. Our analysis suggests this motif is involved in mediating interactions among extracellular proteins. Consistent with our previous characterization of the slit protein, both new motifs are found only in extracellular proteins. The identification of these two conserved motifs in slit reveals that the entire 1469 amino acids of the protein are made up of modular regions similar to those conserved in other extracellular proteins.

2.
Computational methods such as sequence alignment and motif construction are useful in grouping related proteins into families, as well as helping to annotate new proteins of unknown function. These methods identify conserved amino acids in protein sequences, but cannot determine the specific functional or structural roles of conserved amino acids without additional study. In this work, we present 3MATRIX (http://3matrix.stanford.edu) and 3MOTIF (http://3motif.stanford.edu), a web-based sequence motif visualization system that displays sequence motif information in its appropriate three-dimensional (3D) context. This system is flexible in that users can enter sequences, keywords, structures or sequence motifs to generate visualizations. In 3MOTIF, users can search using discrete sequence motifs such as PROSITE patterns, eMOTIFs, or any other regular expression-like motif. Similarly, 3MATRIX accepts an eMATRIX position-specific scoring matrix, or will convert a multiple sequence alignment block into an eMATRIX for visualization. Each query motif is used to search the protein structure database for matches, in which the motif is then visually highlighted in three dimensions. Important properties of motifs such as sequence conservation and solvent accessible surface area are also displayed in the visualizations, using carefully chosen color shading schemes.
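To make the idea of a "regular expression-like motif" concrete, the sketch below translates a simplified subset of PROSITE pattern syntax into a Python regular expression and scans a sequence with it. It is only an illustration and not the 3MOTIF implementation: the function names `prosite_to_regex` and `find_motif` are hypothetical, anchors (`<`, `>`) are not handled, and the test sequence is a made-up string. The pattern N-{P}-[ST]-{P} is the PROSITE N-glycosylation site pattern.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a simplified PROSITE pattern into a Python regex string.

    Supported elements (an assumed subset of the full syntax):
    single residues, x (any residue), [ABC] (allowed set),
    {ABC} (forbidden set), and repeat counts such as x(2) or x(2,4).
    """
    tokens = []
    for element in pattern.rstrip(".").split("-"):
        # Separate the residue specification from an optional repeat count.
        match = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        core, low, high = match.group(1), match.group(2), match.group(3)
        if core == "x":
            token = "."                       # any residue
        elif core.startswith("{"):
            token = "[^" + core[1:-1] + "]"   # forbidden residues
        else:
            token = core                      # literal residue or [set]
        if low and high:
            token += "{%s,%s}" % (low, high)
        elif low:
            token += "{%s}" % low
        tokens.append(token)
    return "".join(tokens)


def find_motif(pattern: str, sequence: str):
    """Return (1-based start, matched text) for non-overlapping matches."""
    regex = re.compile(prosite_to_regex(pattern))
    return [(m.start() + 1, m.group()) for m in regex.finditer(sequence)]


if __name__ == "__main__":
    print(prosite_to_regex("N-{P}-[ST]-{P}"))
    print(find_motif("N-{P}-[ST]-{P}", "AANGSAA"))
```

Because `finditer` reports non-overlapping matches only, a tool that needs every overlapping occurrence would have to use a lookahead or restart the search after each hit.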

3.
Correlations of amino acids in proteins   Cited by: 2 (self-citations: 0; citations by others: 2)
Du Q, Wei D, Chou KC. Peptides, 2003, 24(12): 1863-1869
A correlation analysis among the 20 amino acids is performed for four protein structural classes (α, β, α/β, and α+β) in a total of 204 proteins. The correlation relationships among amino acids can be classified into the following four types: (1) strong positive correlation, (2) strong negative correlation, (3) weak correlation, and (4) no correlation. The correlation relationships are different for different proteins and are correlated with the features of their structural classes. The amino acids with weak correlation relationships can be treated as independent basis functions for the space in which proteins are defined. The amino acids with large correlation coefficients are linearly correlated with each other and are not independent. The strong correlation among amino acids reflects their mutually constrained relationship, as exhibited by their relevant structural features. The information obtained through the correlation analysis is used for predicting protein structural classes, and a better prediction quality is obtained than with simple geometric distance methods that do not take the correlation effects into account.
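A correlation analysis of this kind can be sketched as follows. The abstract does not say which quantity is correlated, so this example assumes that each protein is reduced to its 20 amino acid composition fractions and that Pearson coefficients are computed between amino acids across the proteins of one structural class. The function names and the toy sequences are hypothetical and are not taken from the paper.

```python
from math import sqrt

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(sequence: str):
    """Fraction of each of the 20 amino acids in one sequence."""
    n = len(sequence)
    return [sequence.count(aa) / n for aa in AMINO_ACIDS]


def pearson(x, y) -> float:
    """Pearson correlation coefficient; 0.0 if either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def correlation_matrix(sequences):
    """Map each amino acid pair to its correlation across the sequences."""
    columns = list(zip(*(composition(s) for s in sequences)))
    return {
        (a, b): pearson(columns[i], columns[j])
        for i, a in enumerate(AMINO_ACIDS)
        for j, b in enumerate(AMINO_ACIDS)
    }


if __name__ == "__main__":
    # Toy input: as the alanine fraction falls, the leucine fraction rises.
    matrix = correlation_matrix(["AAAL", "AALL", "ALLL"])
    print(round(matrix[("A", "L")], 6))
```

Pairs with coefficients near +1 or -1 correspond to the strong positive and strong negative categories in the abstract, and pairs near 0 to the weak or no correlation categories; the cut-offs between categories are not given in the abstract.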

4.
Helicases are motor proteins of biological systems, which catalyze the opening of energetically stable duplex nucleic acids in an ATP-dependent manner and thereby are involved in almost all aspects of nucleic acid metabolism, including cell cycle progression. They contain several conserved domains, including the DEAD-box, and also several unique domains associated with these. The Pfam database (http://pfam.janelia.org/) is a large collection of protein families, each represented by multiple sequence alignments and hidden Markov models (HMMs). A diverse range of proteins are found in nature, and the functional specificity of each protein is, to a great extent, imparted by its domain architecture. To this end, a DEAD-box ATP-dependent RNA helicase (LOC_Os01g36890; genomic sequence length: 6284 nucleotides; CDS length: 1299 nucleotides; protein length: 432 amino acids) was studied. The protein sequence was imported for a domain search on Pfam. This Pfam entry, which covers a large proportion of the sequences in the underlying database, provides comprehensive coverage, across a wide range of phyla, of the known domains associated with the typical DEAD-box helicase motif. A total of 362 domain architectures were retrieved from the Pfam database for the family DEAD (PF00270). We have therefore systematically analyzed the domains closely associated with the DEAD motif, which occur in a variety of proteins and can provide insights into their function.

5.
Pectate lyases are plant virulence factors that degrade the pectate component of the plant cell wall. The enzymes share considerable sequence homology with plant pollen and style proteins, suggesting a shared structural topology and possibly functional relationships as well. The three-dimensional structures of two Erwinia chrysanthemi pectate lyases, C and E, have been superimposed and the structurally conserved amino acids have been identified. There are 232 amino acids that superimpose with a root-mean-square deviation of 3 Å or less. These amino acids have been used to correct the primary sequence alignment derived from evolution-based techniques. Subsequently, multiple alignment techniques have allowed the realignment of other extracellular pectate lyases as well as all sequence homologs, including pectin lyases and the plant pollen and style proteins. The new multiple sequence alignment reveals amino acids likely to participate in the parallel beta helix motif, those involved in binding Ca2+, and those invariant amino acids with potential catalytic properties. The latter amino acids cluster in two well-separated regions on the pectate lyase structures, suggesting two distinct enzymatic functions for extracellular pectate lyases and their sequence homologs.

6.
The precursors of most surface proteins on Gram-positive bacteria have a C-terminal hydrophobic domain and charged tail, preceded by a conserved LPXTG motif that signals the anchoring process. This motif is the substrate for an enzyme, termed sortase, which has transpeptidation activity resulting in the cleavage of the LPXTG sequence and ultimate attachment of the protein to the peptidoglycan. While screening a group A streptococcal membrane extract for cleavage activity of the LPXTG motif, we identified an enzyme (which we term "LPXTGase") that differs significantly from sortase but also cleaves this motif. The enzyme is heavily glycosylated, which is required for its activity. Amino acid composition and sequence analysis revealed that LPXTGase differs from other enzymes, in that the molecule, which is about 14 kDa in size, has no aromatic amino acids, is rich in alanine, and is 30% composed of uncommon amino acids, suggesting a nonribosomal construction. A similar enzyme found in the membrane extract of Staphylococcus aureus indicates that this unusual molecule may be common among Gram-positive bacteria. Whereas peptide antibiotics have been reported from Bacillus species that also contain unusual amino acids and are synthesized non-ribosomally on amino acid-activating polyenzyme templates, this would be the first reported enzyme that may be similarly synthesized.

7.
Druker BJ, Sibert L, Roberts TM. Journal of Virology, 1992, 66(10): 5770-5776
A polyomavirus middle T-antigen (MTAg) mutant containing a substitution of Leu for Pro at amino acid 248 has previously been described as completely transformation defective (B. J. Druker, L. Ling, B. Cohen, T. M. Roberts, and B. S. Schaffhausen, J. Virol. 64:4454-4461, 1990). This mutant had no alterations in associated proteins or associated kinase activities compared with wild-type MTAg. Pro-248 lies in a tetrameric sequence, NPTY, which is reminiscent of the so-called NPXY sequence in the low-density-lipoprotein receptor. In the low-density-lipoprotein receptor, mutations in the NPXY motif but not in the surrounding amino acids abolish receptor function, apparently by decreasing receptor internalization (W. Chen, J. L. Goldstein, and M. S. Brown, J. Biol. Chem. 265:3116-3123, 1990). To determine whether this sequence represents a functional motif in MTAg as well, a series of single amino acid substitutions was constructed in this region of MTAg. All of the mutations of N, P, T, or Y, including the relatively conservative substitution of Ser for Thr at amino acid 249, resulted in a transformation-defective MTAg, whereas mutations outside of this sequence allowed mutants to retain near-wild-type transformation capabilities. Transformation-defective mutants with mutations in the NPTY region behaved similarly to the mutant with the original Pro-248-to-Leu-248 mutation when assayed for associated proteins and activities in vitro; that is, they retained a full complement of wild-type activities and associated proteins. Further, insertion of the tetrameric sequence NPTY downstream of the mutated motif restored transforming abilities to these mutants. Thus, the tetrameric sequence NPTY in MTAg appears to represent a well-defined functional motif of MTAg.

8.
Sweadner KJ, Rael E. Genomics, 2000, 68(1): 41-56
A gene family of small membrane proteins, represented by phospholemman and the gamma subunit of Na,K-ATPase, was defined and characterized by the analysis of more than 1000 related ESTs (expressed sequence tags). In addition to new and more complete cDNA sequence for known family members (including MAT-8, CHIF, and RIC), the findings included two new family members and new splicing variants. A large number of EST replicates made it possible to derive curated DNA sequence with higher confidence and accuracy than from the sequencing of individual clones. The family has a core motif of 35 invariant and conserved amino acids centered on a single transmembrane span. Features of each predicted protein product were compared, and tissue distributions were determined. The gene family was named FXYD (pronounced fix-id) in recognition of invariant amino acids in its signature motif. The abundant proteins are involved in the control of ion transport.

9.
Amino acids fulfil a diverse range of roles in proteins, each utilising its chemical properties in different ways in different contexts to create required functions. For example, cysteines form disulphide or hydrogen bonds in different circumstances and charged amino acids do not always make use of their charge. The repertoire of amino acid functions and the frequency at which they occur in proteins remains understudied. Measuring large numbers of mutational consequences, which can elucidate the role an amino acid plays, was prohibitively time-consuming until recent developments in deep mutational scanning. In this study, we gathered data from 28 deep mutational scanning studies, covering 6,291 positions in 30 proteins, and used the consequences of mutation at each position to define a mutational landscape. We demonstrated rich relationships between this landscape and biophysical or evolutionary properties. Finally, we identified 100 functional amino acid subtypes with a data-driven clustering analysis and studied their features, including their frequencies and chemical properties such as tolerating polarity, hydrophobicity or being intolerant of charge or specific amino acids. The mutational landscape and amino acid subtypes provide a foundational catalogue of amino acid functional diversity, which will be refined as the number of studied protein positions increases.

10.
Systematic analysis of soluble proteins in developing rat cerebellum by an automated two-dimensional liquid-chromatography system detected a number of proteins which increased transiently during the initial stage of postnatal development. One of the proteins, V-1, was isolated using a liquid-chromatography system, and its amino acid sequence was determined by analysis of the purified protein. The sequence showed that the V-1 protein consists of 117 amino acids with an acetylated N-terminus, and has 2.5 internal sequence repeats of 33 amino acids. Computer retrieval of the sequence indicated that the repeated sequences have the structural characteristics of the cdc10/SWI6 motif, which is found in a series of proteins, including those involved in cell-cycle control and cell-fate determination in yeast, Drosophila melanogaster and Caenorhabditis elegans. The structure of V-1, coupled with its controlled expression in early postnatal development, implies a potential role for V-1 in cerebellar morphogenesis.

11.
THR1, the gene from Saccharomyces cerevisiae encoding homoserine kinase, one of the threonine biosynthetic enzymes, has been cloned by complementation. The nucleotide sequence of a 3.1-kb region carrying this gene reveals an open reading frame of 356 codons, corresponding to about 40 kDa for the encoded protein. The presence of three canonical GCN4 regulatory sequences in the upstream flanking region suggests that the expression of THR1 is under the general amino acid control. In parallel, the enzyme was purified by four consecutive column chromatographies, monitoring homoserine kinase activity. In SDS gel electrophoresis, homoserine kinase migrates like a 40-kDa protein; the native enzyme appears to be a homodimer. The sequence of the first 15 NH2-terminal amino acids, as determined by automated Edman degradation, is in accordance with the amino acid sequence deduced from the nucleotide sequence. Computer-assisted comparison of the yeast enzyme with the corresponding activities from bacterial sources showed that several segments among these proteins are highly conserved. Furthermore, the observed homology patterns suggest that the ancestral sequences might have been composed from separate (functional) domains. A block of very similar amino acids is found in the homoserine kinases towards the carboxy terminus that is also present in many other proteins involved in threonine (or serine) metabolism; this motif, therefore, may represent the binding site for the hydroxyamino acids. Limited similarity was detected between a motif conserved among the homoserine kinases and consensus sequences found in other mono- or dinucleotide-binding proteins.

12.
The basic-helix-loop-helix-zipper (bHLH-Zip) motif is a conserved region of approximately 70 amino acids that mediates both sequence-specific DNA binding and protein dimerization. This motif is found in protein sequences from many eukaryotic organisms and is contained in the protein sequence of the oncogene myc and its partner max, and a shortened version of the motif (bHLH) is found in the muscle determination factor myoD and its partner E12. An evaluation of the conserved amino acids that define the motif coupled with the published mutagenic studies of this region has led to our formulation of a molecular model for the binding of this motif as a dimer to specific sequences of DNA. This model has the dimeric protein interacting with an abutted, dyad-symmetric DNA sequence. Helix 2 of each monomer is modeled as a coiled-coil extension of the C-terminal "leucine zipper." Helix 1 does not interact with helix 1 from its partner in the dimer but with the hydrophobic surface created when the helix 2 regions of the dimer interact with each other as a coiled-coil. Sequence-specific interactions are proposed between the basic region and the invariant cis elements that all bHLH-Zip proteins bind.

13.
The influenza virus polymerase complex is a heterotrimer consisting of polymerase basic protein 1 (PB1), polymerase basic protein 2 (PB2), and polymerase acidic protein (PA). Of these, only PB1, which has been implicated in RNA chain elongation, possesses the four conserved motifs (motifs I, II, III, and IV) and the four invariant amino acids (one in each motif) found among all viral RNA-dependent RNA or RNA-dependent DNA polymerases. We have modified an assay system developed by Huang et al. (T.-J. Huang, P. Palese, and M. Krystal, J. Virol. 64:5669-5673, 1990) to reconstitute the functional polymerase activity in vivo. Using this assay, we have examined the requirement of each of these motifs of PB1 in polymerase activity. We find that each of these invariant amino acids is critical for PB1 activity and that mutation in any one of these residues renders the protein nonfunctional. We also find that in motif III, which contains the SSDD sequence, the signature sequence of influenza virus RNA polymerase, SDD is essentially invariant and cannot accommodate sequences found in other RNA viral polymerases. However, conserved changes in the flanking sequences of SDD can be partially tolerated. These results provide the experimental evidence that influenza virus PB1 possesses a polymerase module similar to that proposed for other RNA viruses and that the core SDD sequence of influenza virus PB1 represents a sequence variant of the GDN in negative-stranded nonsegmented RNA viruses, GDD in positive-stranded RNA viruses and double-stranded RNA viruses, or MDD in retroviruses.

14.
SpsA, a pneumococcal surface protein belonging to the family of choline-binding proteins, interacts specifically with secretory immunoglobulin A (SIgA) via the secretory component (SC). SIgA and free SC from mouse, rat, rabbit and guinea-pig failed to interact with SpsA, indicating species-specific binding to human SIgA and SC. SpsA is the only pneumococcal receptor molecule for SIgA and SC, as confirmed by complete loss of SIgA and SC binding to a spsA mutant. Analysis of recombinant SpsA fusion proteins showed that the binding domain is located in the N-terminal region of SpsA. By the use of different truncated N-terminal SpsA fusion proteins, the minimum binding domain was shown to be composed of 112 amino acids (residues 172-283). The sequence of this 112-amino-acid domain was used to spot synthesize 34 overlapping peptides, consisting of 15 amino acids each, with an offset of three amino acids, on a cellulose membrane. One of the peptides reacted specifically with both SIgA and SC. By using a second membrane with immobilized synthetic peptides of decreasing length containing parts of the identified 15-amino-acid motif, a hexapeptide, YRNYPT, was identified as the binding motif for SC and SIgA. SpsA proteins with a size smaller than the assay-positive domain of 112 amino acids were able to inhibit the interaction of SIgA and pneumococci provided they contained the binding motif. The results indicated that the hexapeptide YRNYPT located in SpsA of pneumococcal strain type 1 (ATCC 33400) between amino acids 198 and 203 is involved in SIgA and SC binding. Because synthetic peptides containing only parts of the hexapeptide also assayed positive, these results further suggest that at least the amino acids YPT of the identified hexapeptide are critical for binding to SC and SIgA. Amino acid substitutions in the identified putative binding motif abolished SC-/SIgA-binding activity of the mutated SpsA protein, confirming the functional activity of this hexapeptide and the critical role of the amino acids YPT in SC and SIgA binding. Identification of this motif, which is highly conserved in SpsA protein among different serotypes, might contribute towards a new peptide-based vaccine strategy.
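The peptide-scanning arithmetic in this abstract can be checked with a short sketch. Tiling a 112-residue domain with 15-residue windows at an offset of three gives 33 windows starting at residues 1, 4, ..., 97 of the domain, which leaves the final residue uncovered; adding one last peptide anchored at the C-terminal end gives 34. That final peptide is an assumption made here to reproduce the reported count, since the abstract does not describe how the authors handled the end of the domain. The function name `tile_peptides` is hypothetical and the input below is a placeholder string, not the SpsA sequence.

```python
def tile_peptides(domain: str, length: int = 15, offset: int = 3):
    """Return (1-based start, peptide) tuples covering the whole domain.

    Windows of `length` residues are taken every `offset` residues.
    If the regular tiling stops short of the C-terminus, one extra
    window anchored at the end is appended (an assumption, see lead-in).
    """
    if len(domain) < length:
        raise ValueError("domain is shorter than one peptide")
    last_start = len(domain) - length
    starts = list(range(0, last_start + 1, offset))
    if starts[-1] != last_start:
        starts.append(last_start)
    return [(s + 1, domain[s:s + length]) for s in starts]


if __name__ == "__main__":
    placeholder = "A" * 112  # stand-in for the 112-residue binding domain
    peptides = tile_peptides(placeholder)
    print(len(peptides), peptides[0][0], peptides[-1][0])
```

To report positions in full-length SpsA numbering, 171 would be added to each start, because the domain begins at residue 172.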

15.
16.
17.
Discovering structural correlations in alpha-helices.   Cited by: 5 (self-citations: 2; citations by others: 3)
We have developed a new representation for structural and functional motifs in protein sequences based on correlations between pairs of amino acids and applied it to alpha-helical and beta-sheet sequences. Existing probabilistic methods for representing and analyzing protein sequences have traditionally assumed conditional independence of evidence. In other words, amino acids are assumed to have no effect on each other. However, analyses of protein structures have repeatedly demonstrated the importance of interactions between amino acids in conferring both structure and function. Using Bayesian networks, we are able to model the relationships between amino acids at distinct positions in a protein sequence in addition to the amino acid distributions at each position. We have also developed an automated program for discovering sequence correlations using standard statistical tests and validation techniques. In this paper, we test this program on sequences from secondary structure motifs, namely alpha-helices and beta-sheets. In each case, the correlations our program discovers correspond well with known physical and chemical interactions between amino acids in structures. Furthermore, we show that, using different chemical alphabets for the amino acids, we discover structural relationships based on the same chemical principle used in constructing the alphabet. This new representation of 3-dimensional features in protein motifs, such as those arising from structural or functional constraints on the sequence, can be used to improve sequence analysis tools including pattern analysis and database search.
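As an illustration of what testing for a correlation between two sequence positions can look like, the sketch below computes a Pearson chi-square statistic for two columns of aligned sequences. The abstract speaks only of "standard statistical tests", so the choice of the chi-square test is an assumption, the function name `chi_square` is hypothetical, and the columns are toy data. This is not the authors' program and it does not build a Bayesian network.

```python
from collections import Counter


def chi_square(column_i: str, column_j: str):
    """Chi-square statistic and degrees of freedom for two columns.

    Each column holds the residue observed at one position in every
    aligned sequence, in the same sequence order.
    """
    if len(column_i) != len(column_j):
        raise ValueError("columns must have the same number of sequences")
    n = len(column_i)
    pair_counts = Counter(zip(column_i, column_j))
    counts_i, counts_j = Counter(column_i), Counter(column_j)
    statistic = 0.0
    for a, count_a in counts_i.items():
        for b, count_b in counts_j.items():
            # Expected pair count if the two positions were independent.
            expected = count_a * count_b / n
            observed = pair_counts[(a, b)]
            statistic += (observed - expected) ** 2 / expected
    dof = (len(counts_i) - 1) * (len(counts_j) - 1)
    return statistic, dof


if __name__ == "__main__":
    # Perfectly coupled positions: A always pairs with E, L with K.
    print(chi_square("AAAALLLL", "EEEEKKKK"))
    # Independent positions: every pair occurs equally often.
    print(chi_square("AALL", "EKEK"))
```

With 20 residue types the contingency table is sparse for small samples, which is one reason a reduced chemical alphabet, as mentioned in the abstract, can make such tests more reliable.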

18.
Xiao C, Xin H, Dong A, Sun C, Cao K. DNA Research, 1999, 6(3): 179-181
A rice cDNA encoding a novel calmodulin-like protein was identified. It has 38 additional amino acids at the C-terminus of a complete, typical calmodulin (CaM) sequence of 149 amino acids. The four C-terminal amino acid residues form a CAAL motif which could be a site for protein prenylation and may subsequently cause the protein to become membrane associated. RT-PCR analysis confirmed that such a combined protein gene truly exists in rice. Sequence analysis of its genomic counterpart showed that there is an intron located at the junction of the normal CaM sequence and the 38 C-terminal amino acids. This introduces a potential stop codon for normal CaM if an alternative splicing mechanism is involved. Southern blot analysis of rice genomic DNA revealed that there is only one locus for this gene. Northern blot analysis showed that this gene is highly expressed in rice roots, shoots and flowers. The distribution of this protein demonstrates the functional importance of this novel CaM-like protein in rice.

19.
Sequence analysis of chloroplast and mitochondrial large subunit rRNA genes from over 75 green algae disclosed 28 new group I intron-encoded proteins carrying a single LAGLIDADG motif. These putative homing endonucleases form four subfamilies of homologous enzymes, with the members of each subfamily being encoded by introns sharing the same insertion site. We showed that four divergent endonucleases from the I-CreI subfamily cleave the same DNA substrates. Mapping of the 66 amino acids that are conserved among the members of this subfamily on the 3-dimensional structure of I-CreI bound to its recognition sequence revealed that these residues participate in protein folding, homodimerization, DNA recognition and catalysis. Surprisingly, only seven of the 21 I-CreI amino acids interacting with DNA are conserved, suggesting that I-CreI and its homologs use different subsets of residues to recognize the same DNA sequence. Our sequence comparison of all 45 single-LAGLIDADG proteins identified so far suggests that these proteins share related structures and that there is a weak pressure in each subfamily to maintain identical protein–DNA contacts. The high sequence variability we observed in the DNA-binding site of homologous LAGLIDADG endonucleases provides insight into how these proteins evolve new DNA specificity.

20.
The short cytoplasmic tail of mouse CD1d (mCD1d) is required for its endosomal localization, for the presentation of some glycolipid Ags, and for the development of Valpha14i NKT cells. This tail has a four-amino acid Tyr-containing motif, Tyr-Gln-Asp-Ile (YQDI), similar to those sequences known to be important for the interaction with adaptor protein complexes (AP) that mediate the endosomal localization of many different proteins. In fact, mCD1d has been shown previously to interact with the AP-3 adaptor complex. In the present study, we mutated each amino acid in the YQDI motif to determine the importance of the entire motif sequence in influencing mCD1d trafficking, its interaction with adaptors, and its intracellular localization. The results indicate that the Y, D, and I amino acids are significant functionally because mutations at each of these positions altered the intracellular distribution of mCD1d and reduced its ability to present glycosphingolipids to NKT cells. However, the three amino acids are not all acting in the same way because they differ with regard to how they influence the intracellular distribution of CD1d, its rate of internalization, and its ability to interact with the mu subunit of AP-3. Our results emphasize that multiple steps, including interactions with the adaptors AP-2 and AP-3, are required for normal trafficking of mCD1d and that these different steps are mediated by only a few cytoplasmic amino acids.
