Similar literature
1.
Discovering structural correlations in alpha-helices. (cited 5 times: 2 self-citations, 3 by others)
We have developed a new representation for structural and functional motifs in protein sequences based on correlations between pairs of amino acids and applied it to alpha-helical and beta-sheet sequences. Existing probabilistic methods for representing and analyzing protein sequences have traditionally assumed conditional independence of evidence. In other words, amino acids are assumed to have no effect on each other. However, analyses of protein structures have repeatedly demonstrated the importance of interactions between amino acids in conferring both structure and function. Using Bayesian networks, we are able to model the relationships between amino acids at distinct positions in a protein sequence in addition to the amino acid distributions at each position. We have also developed an automated program for discovering sequence correlations using standard statistical tests and validation techniques. In this paper, we test this program on sequences from secondary structure motifs, namely alpha-helices and beta-sheets. In each case, the correlations our program discovers correspond well with known physical and chemical interactions between amino acids in structures. Furthermore, we show that, using different chemical alphabets for the amino acids, we discover structural relationships based on the same chemical principle used in constructing the alphabet. This new representation of 3-dimensional features in protein motifs, such as those arising from structural or functional constraints on the sequence, can be used to improve sequence analysis tools including pattern analysis and database search.
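
The abstract above mentions discovering correlations between pairs of positions with standard statistical tests, but does not say which tests. As a minimal sketch, assuming a Pearson chi-square test of independence stands in for those tests, the following scores every pair of columns in a set of equal-length sequences. The function names and the toy sequences are hypothetical and are not taken from the paper.

```python
from collections import Counter
from itertools import combinations


def chi_square(col_a, col_b):
    """Pearson chi-square statistic for independence of two alignment columns."""
    n = len(col_a)
    count_a = Counter(col_a)
    count_b = Counter(col_b)
    count_ab = Counter(zip(col_a, col_b))
    stat = 0.0
    for a, na in count_a.items():
        for b, nb in count_b.items():
            expected = na * nb / n
            observed = count_ab.get((a, b), 0)
            stat += (observed - expected) ** 2 / expected
    return stat


def pairwise_scores(sequences):
    """Score every pair of positions (0-based) in equal-length sequences."""
    columns = list(zip(*sequences))
    return {
        (i, j): chi_square(columns[i], columns[j])
        for i, j in combinations(range(len(columns)), 2)
    }


# Toy fragments made up for illustration; not real helix data.
toy = ["EAAK", "EAAK", "KAAE", "KAAE"]
scores = pairwise_scores(toy)
```

In this toy set, positions 0 and 3 always change together and receive the highest score, while pairs involving the invariant positions score zero. A real analysis would also need a significance threshold and a correction for multiple comparisons.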

2.
The amino acid sequence that forms the alpha-helical coiled coil structure has a representative heptad repeat denoted by defgabc, according to their positions. Although the a and d positions are usually occupied by hydrophobic residues, hydrophilic residues at these positions sometimes play important roles in natural proteins. We have manipulated a few amino acids at the a and d positions of a de novo designed trimeric coiled coil to confer new functions to the peptides. The IZ peptide, which has four heptad repeats and forms a parallel triple-stranded coiled coil, has Ile at all of the a and d positions. We show three examples: (1) the substitution of one Ile at either the a or d position with Glu caused the peptide to become pH sensitive; (2) metal ion-induced alpha-helical bundles were formed by substitutions with two His residues at the d and a positions for a medium metal ion, and with one Cys residue at the a position for a soft metal ion; and (3) the AAB-type heterotrimeric alpha-helical bundle formation was accomplished by a combination of Ala and Trp residues at the a positions of different peptide chains. Furthermore, we applied these procedures to prepare an ABC-type heterotrimeric alpha-helical bundle and a metal ion-induced heterotrimeric alpha-helical bundle.

3.
Maquettes are de novo designed mimicries of nature used to test the construction and engineering criteria of oxidoreductases. One type of scaffold used in maquette construction is a four-alpha-helical bundle. The sequence of the four-alpha-helix bundle maquettes follows a heptad repeat pattern typical of left-handed coiled-coils. Initial designs were molten globular due partly to the minimalist approach taken by the designers. Subsequent iterative redesign generated several structured scaffolds with similar heme binding properties. Variant [I(6)F(13)](2), a structured scaffold, was partially resolved with NMR spectroscopy and found to have a set of mobile inter-helical packing interfaces. Here, the X-ray structure of a similar peptide ([I(6)F(13)M(31)](2) i.e. ([CGGG EIWKL HEEFLKK FEELLKL HEERLKKM](2))(2) which we call L31M), has been solved using MAD phasing and refined to 2.8 Å resolution. The structure shows that the maquette scaffold is an anti-parallel four-helix bundle with "up-up-down-down" topology. No pre-formed heme-binding pocket exists in the protein scaffold. We report unexpected inter-helical crossing angles, residue positions and translations between the helices. The crossing angles between the parallel helices are -5 degrees rather than the expected +20 degrees for typical left-handed coiled-coils. Deviation of the scaffold from the design is likely due to the distribution and size of hydrophobic residues. The structure of L31M points out that four identical helices may interact differently in a bundle and heptad repeats with an alternating [HPPHHPP]/[HPPHHPH] (H: hydrophobic, P: polar) pattern are not a sufficient design criterion to generate left-handed coiled-coils.

4.
Structural genomics projects are producing many three-dimensional structures of proteins that have been identified only from their gene sequences. It is therefore important to develop computational methods that will predict sites involved in productive intermolecular interactions that might give clues about functions. Techniques based on evolutionary conservation of amino acids have the advantage over physicochemical methods in that they are more general. However, the majority of techniques neither use all available structural and sequence information, nor are able to distinguish between evolutionary restraints that arise from the need to maintain structure and those that arise from function. Three methods to identify evolutionary restraints on protein sequence and structure are described here. The first identifies those residues that have a higher degree of conservation than expected: this is achieved by comparing for each amino acid position the sequence conservation observed in the homologous family of proteins with the degree of conservation predicted on the basis of amino acid type and local environment. The second uses information theory to identify those positions where environment-specific substitution tables make poor predictions of the overall amino acid substitution pattern. The third method identifies those residues that have highly conserved positions when three-dimensional structures of proteins in a homologous family are superposed. The scores derived from these methods are mapped onto the protein three-dimensional structures and contoured, allowing identification of clusters of residues with strong evolutionary restraints that are sites of interaction in proteins involved in a variety of functions. Our method differs from other published techniques by making use of structural information to identify restraints that arise from the structure of the protein and differentiating these restraints from others that derive from intermolecular interactions that mediate functions in the whole organism.

5.
It is well established that the vast majority of proteins of all taxonomical groups and species are initiated by an AUG codon, translated into the amino acid methionine (Met). Many attempts were made to evaluate the importance of the sequences surrounding the initiation codon, mostly focusing on the RNA sequence. However, the role and importance of the amino acids following the initiating Met residue were rarely investigated, mostly in bacteria and fungi. Herein, we computationally examined the protein sequences of all major taxonomical groups represented in the Swiss-Prot database, and evaluated the preference of each group to specific amino acids at the positions directly following the initial Met. The results indicate that there is a species-specific preference for the second amino acid of the majority of protein sequences. Interestingly, the preference for a certain amino acid at the second position changes throughout evolution from lysine in prokaryotes, through serine in lower eukaryotes, to alanine in higher plants and animals.
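
The tally described in the abstract above can be sketched in a few lines. This is only an illustration, assuming sequences are already grouped by taxon in a plain dictionary; the function name, the group labels and the sequences are hypothetical, and no Swiss-Prot parsing is attempted.

```python
from collections import Counter


def second_residue_counts(proteins):
    """Count the residue directly after the initiator Met, per group.

    `proteins` maps a group label to a list of sequences. Sequences that are
    shorter than two residues or do not start with 'M' are skipped.
    """
    counts = {}
    for group, seqs in proteins.items():
        counts[group] = Counter(
            s[1] for s in seqs if len(s) > 1 and s[0] == "M"
        )
    return counts


# Made-up sequences, for illustration only.
toy = {
    "group_1": ["MKTA", "MKLV", "MSQE"],
    "group_2": ["MAGT", "MASP", "MSDD", "LAAA"],
}
counts = second_residue_counts(toy)
preferred = {g: c.most_common(1)[0][0] for g, c in counts.items()}
```

Here `preferred` holds the most frequent second residue for each group. Comparing raw counts across groups would additionally require normalising by each group's background amino acid composition.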

6.
The 44 amino acid E5 transmembrane protein is the primary oncogene product of bovine papillomavirus. Homodimers of the E5 protein activate the cellular PDGF beta receptor tyrosine kinase by binding to its transmembrane domain and inducing receptor dimerization, resulting in cellular transformation. To investigate the role of transmembrane hydrophilic amino acids in receptor activation, we constructed a library of dimeric small transmembrane proteins in which 16 transmembrane amino acids of the E5 protein were replaced with random, predominantly hydrophobic amino acids. A low level of hydrophilic amino acids was encoded at each of the randomized positions, including position 17, which is an essential glutamine in the wild-type E5 protein. Library proteins that induced transformation in mouse C127 cells stably bound and activated the PDGF beta receptor. Strikingly, 35% of the transforming clones had a hydrophilic amino acid at position 17, highlighting the importance of this position in activation of the PDGF beta receptor. Hydrophilic amino acids in other transforming proteins were found adjacent to position 17 or at position 14 or 21, which are in the E5 homodimer interface. Approximately 22% of the transforming proteins lacked hydrophilic amino acids. The hydrophilic amino acids in the transforming clones appear to be important for driving homodimerization, binding to the PDGF beta receptor, or both. Interestingly, several of the library proteins bound and activated PDGF beta receptor transmembrane mutants that were not activated by the wild-type E5 protein. These experiments identified transmembrane proteins that activate the PDGF beta receptor and revealed the importance of hydrophilic amino acids at specific positions in the transmembrane sequence. Our identification of transformation-competent transmembrane proteins with altered specificity suggests that this approach may allow the creation and identification of transmembrane proteins that modulate the activity of a variety of receptor tyrosine kinases.

7.
Coordinated amino acid changes in homologous protein families (cited 4 times: 0 self-citations, 4 by others)
In the tobamovirus coat protein family, amino acid residues at some spatially close positions are found to be substituted in a coordinated manner [Altschuh et al. (1987) J. Mol. Biol., 193, 693]. Therefore, these positions show an identical pattern of amino acid substitutions when amino acid sequences of these homologous proteins are aligned. Based on this principle, coordinated substitutions have been searched for in three additional protein families: serine proteases, cysteine proteases and the haemoglobins. Coordinated changes have been found in all three protein families mostly within structurally constrained regions. This method works with a varying degree of success depending on the function of the proteins, the range of sequence similarities and the number of sequences considered. By relaxing the criteria for residue selection, the method was adapted to cover a broader range of protein families and to study regions of the proteins having weaker structural constraints. The information derived by these methods provides a general guide for engineering of a large variety of proteins to analyse structure-function relationships.
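
The principle stated in the abstract above (coordinated positions show an identical pattern of substitutions across an alignment) can be illustrated with a short sketch. It assumes the strictest reading, in which two columns match only if they split the sequences into exactly the same groups; the relaxed selection criteria mentioned in the abstract are not modelled. Function names and the toy alignment are hypothetical.

```python
from collections import defaultdict


def substitution_pattern(column):
    """Canonical labelling: residues are numbered in order of first appearance."""
    labels = {}
    return tuple(labels.setdefault(res, len(labels)) for res in column)


def coordinated_positions(alignment):
    """Group variable alignment columns (0-based) that share one pattern."""
    groups = defaultdict(list)
    for pos, column in enumerate(zip(*alignment)):
        pattern = substitution_pattern(column)
        if len(set(pattern)) > 1:  # skip fully conserved columns
            groups[pattern].append(pos)
    return [cols for cols in groups.values() if len(cols) > 1]


# Made-up gap-free alignment, for illustration only.
toy = ["GALKD", "GALRD", "GSIKD", "GSIRD"]
print(coordinated_positions(toy))
```

In the toy alignment, columns 1 and 2 switch residues in the same sequences (A/L versus S/I) and are reported together, while column 3 varies independently and the conserved columns are ignored.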

8.
A set of combinatorial amphipathic helical peptides referred to as the KIA series has been screened to identify native-like helical bundles. The series contains the following consensus sequence: AKAxAAxxKAxAAxxKAGGY, where "x" positions are occupied by either Ala or Ile. The peptide sequences in the series comprise all possible combinations of four Ile residues occupying the six x positions. In each case, Ala occupied the two x positions not occupied by Ile. There are a total of 15 peptides in the KIA series; all of the peptides differ in the number of ridges and grooves formed by the Ile side chains, and two of the KIA peptides possess a canonical knobs-into-holes heptad repeat. The structure and stability of these 15 peptides and their pairwise mixtures were evaluated. One peptide in the series formed a stable four-helix bundle that folded with cooperativity similar to native proteins. Ten peptides assembled into molten globular helical assemblies, two peptides were unstructured, and two peptides assembled into helical filaments that were several micrometers long. One of the helical filament forming peptides could be diverted from forming filaments by the addition of another KIA peptide, and resulted in the formation of a heteromeric six-helix bundle. This study demonstrates that combinatorial peptides composed of only three types of amino acids can form a diverse array of structures, some of which are native-like.
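
The count of 15 peptides follows from choosing 4 of the 6 x positions for Ile, C(6,4) = 15. A small sketch that enumerates the series from the consensus sequence quoted above is given here; the function name is hypothetical and the ordering of the output is arbitrary, not the authors' peptide numbering.

```python
from itertools import combinations

CONSENSUS = "AKAxAAxxKAxAAxxKAGGY"


def kia_series(consensus=CONSENSUS, n_ile=4):
    """Enumerate peptides with n_ile Ile among the x positions, Ala elsewhere."""
    x_sites = [i for i, ch in enumerate(consensus) if ch == "x"]
    peptides = []
    for ile_sites in combinations(x_sites, n_ile):
        residues = list(consensus)
        for site in x_sites:
            residues[site] = "I" if site in ile_sites else "A"
        peptides.append("".join(residues))
    return peptides


peptides = kia_series()
print(len(peptides))  # 15
```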

9.
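Ai Liang, Feng Jie. 《生物信息学》, 2023, 21(3): 179-186
This paper proposes a new fast, alignment-free method for protein sequence similarity and evolutionary analysis. To characterize a protein sequence, the 10 physicochemical properties of the amino acids are first condensed into 6 principal components by principal component analysis, and the principal component scores are averaged using the number of each amino acid in the sequence as the weight. Positional information about the amino acids is then incorporated to form a 26-dimensional protein sequence feature vector. Finally, the Euclidean distance is used to measure similarity and evolutionary relationships between protein sequences. Tests on 3 protein sequence datasets show that the proposed method clusters every protein sequence accurately and is simple and fast, which demonstrates its effectiveness.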

10.
White- and Yolk-riboflavin binding proteins were isolated from hen eggs, and characterized as to their chemical properties. White- and Yolk-RBPs had almost the same amino acid compositions except for glutamic acid, but their carbohydrate compositions were different from each other. The complete amino acid sequence of White-RBP was determined by conventional methods. White-RBP comprised 219 amino acid residues, and the amino-terminus was pyroglutamic acid (pyrrolidonecarboxylic acid). Two amino acids, lysine and asparagine, were found at the fourteenth residue from the amino-terminus. Carbohydrate chains were linked to asparagine residues at positions 36 and 147. Both White- and Yolk-RBPs were phosphorylated. In White-RBP either six or seven of nine serine residues between Ser(185) and Ser(197) were phosphorylated. The amino acid sequences around phosphoserines showed that phosphorylation might occur at a serine residue in one of the following sequences: Ser-X-Glu or Ser-X-Ser(P).

11.
Amino acid sequence data have been collected for the coiled-coil rod domains of three-stranded alpha-fibrous proteins: fibrinogen, laminin, tenascin, macrophage scavenger receptor protein and the leg fibre protein from bacteriophage. Such domains are characterized by a heptad substructure in which apolar residues occur alternately three and four residues apart. The distribution of residues in each position of the heptad has been analysed, and the results compared with those obtained for the two-stranded alpha-fibrous proteins, which include the intermediate filament and myosin families. Distinctions can be drawn between the sequences in two- and three-stranded coiled-coil structures and these provide criteria that will prove useful in predicting secondary and tertiary structure purely from sequence data.
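
Analysing the distribution of residues at each heptad position, as described above, amounts to assigning every residue a register (a to g) and tallying per register. The following is a minimal sketch, assuming an uninterrupted heptad repeat whose starting register is known; real coiled-coil segments contain stutters and breaks that this does not handle. The function name and the toy sequence are hypothetical.

```python
from collections import Counter

HEPTAD = "abcdefg"


def heptad_composition(sequence, first_position="a"):
    """Tally residues at each heptad position, given the register of residue 1."""
    offset = HEPTAD.index(first_position)
    tallies = {pos: Counter() for pos in HEPTAD}
    for i, residue in enumerate(sequence):
        tallies[HEPTAD[(i + offset) % 7]][residue] += 1
    return tallies


# Idealised two-heptad toy sequence; not taken from any of the proteins above.
toy = "LKELEEKLKELEEK"
tallies = heptad_composition(toy)
```

In the toy sequence the apolar Leu residues fall at positions a and d, which are alternately three and four residues apart, matching the heptad substructure the abstract describes.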

12.
The envelope glycoprotein of human immunodeficiency virus type 1 (HIV-1) interacts with receptors on the target cell and mediates virus entry by fusing the viral and cell membranes. To maintain the viral infectivity, amino acids that interact with receptors are expected to be more conserved than the other sites on the protein surface. In contrast to the functional constraint of amino acids for the receptor binding, some amino acid changes in this protein may produce antigenic variations that enable the virus to escape from recognition of the host immune system. Therefore, both positive selection (higher fitness) and negative selection (lower fitness) against amino acid changes are taking place during evolution of surface proteins of parasites. To elucidate the evolutionary mechanisms of the whole HIV-1 gp120 envelope glycoprotein at the single site level, we collected and analyzed all available sequence data for the protein. By analyzing 186 sequences of the HIV-1 gp120 (subtype B), we reevaluated amino acid variability at the single site level, and estimated the numbers of synonymous and nonsynonymous substitutions at each codon position to detect positive and negative selection. We identified 33 amino acid positions which may be under positive selection. Some of these positions may form discontinuous epitopes. We also analyzed amino acid sequences to find amino acid positions responsible for usage of the second receptor. We found that, in addition to the V3 loop, amino acid variation at residue 440 in the C4 region is clearly linked with the usage of CXCR4.

13.
The group A rotaviruses are composed of at least seven serotypes. Serotype specificity is defined mainly by an outer capsid protein, VP7. In contrast, the other surface protein, VP3 (775 amino acids), appears to be associated with both serotype-specific and heterotypic immunity. To identify the cross-reactive and serotype-specific neutralization epitopes on VP3 of human rotavirus, we sequenced the VP3 gene of antigenic mutants resistant to each of seven anti-VP3 neutralizing monoclonal antibodies (N-MAbs) which exhibited heterotypic or serotype 2-specific reactivity, and we defined three distinct neutralization epitopes on VP3. The mutants sustained single amino acid substitutions at position 305, 392, 433, or 439. Amino acid position 305 was critical to epitope I, whereas amino acid position 433 was critical to epitope III. In contrast, epitope II appeared to be more dependent upon conformation and protein folding because both amino acid positions 392 and 439 appeared to be critical. These four positions clustered in a relatively limited area of VP5, the larger of the two cleavage products of VP3. At the positions where amino acid substitutions occurred, there was a correlation between amino acid sequence homology among different serotypes and the reactivity patterns of various viruses with the N-MAbs used for selection of mutants. A synthetic peptide (amino acids 296 to 313) which included the sequence of epitope I reacted with its corresponding N-MAb, suggesting that the region contains a sequential antigenic determinant. These data may prove useful in current efforts to develop vaccines against human rotavirus infection.

14.
We have determined the complete nucleotide sequence of the VP4 gene of porcine rotavirus YM. It is 2,362 nucleotides long, with a single open reading frame coding for a protein of 776 amino acids. A phylogenetic tree was derived from the deduced YM VP4 amino acid sequence and 18 other available VP4 sequences of rotavirus strains belonging to different serotypes and isolated from different animal species. In this tree, VP4 proteins were grouped by the hosts that the corresponding viruses infect rather than by the serotypes they belong to, suggesting that this protein is involved in the host specificity of the viruses. In an attempt to predict the secondary structure of the VP4 protein, we selected the more divergent VP4 sequences and made a secondary structure analysis of each protein. In spite of variations within the individual structures predicted, there was a general structural pattern which suggested the existence of at least two different domains. One, comprising the amino-terminal 63% of the protein, is predicted to be a possible globular domain rich in beta-strands alternated with turns and coils. The second domain, represented by the remaining, carboxy-terminal part of VP4, is rich in long stretches of alpha-helix, one of which, 63 amino acids long, has heptad repeats resembling those found in proteins known to form alpha-helical coiled-coils. The predicted secondary structure correlates well with the available data on the protein accessibility delineated by immunological and biochemical findings and with the spike structure of the protein, which has been determined by cryoelectron microscopy.

15.
Specific interactions of transmembrane helices play a pivotal role in the folding and oligomerization of integral membrane proteins. The helix-helix interfaces frequently depend on specific amino acid patterns. In this study, a heptad repeat pattern was randomized with all naturally occurring amino acids to uncover novel sequence motifs promoting transmembrane domain interactions. Self-interacting transmembrane domains were selected from the resulting combinatorial library by means of the ToxR/POSSYCCAT system. A comparison of the amino acid composition of high- and low-affinity sequences revealed that high-affinity transmembrane domains exhibit position-specific enrichment of histidine. Further, sequences containing His preferentially display Gly, Ser, and/or Thr residues at flanking positions and frequently contain a C-terminal GxxxG motif. Mutational analysis of selected sequences confirmed the importance of these residues in homotypic interaction. Probing heterotypic interaction indicated that His interacts in trans with hydroxylated residues. Reconstruction of minimal interaction motifs within the context of an oligo-Leu sequence confirmed that His is part of a hydrogen bonded cluster that is brought into register by the GxxxG motif. Notably, a similar motif contributes to self-interaction of the BNIP3 transmembrane domain.
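
Locating GxxxG motifs such as the one mentioned above is a simple pattern search. This sketch assumes one-letter uppercase sequences and treats x as any residue; the function name and example sequences are hypothetical and unrelated to the library in the study.

```python
import re

# A lookahead is used so that overlapping motifs (e.g. GxxxGxxxG) are all reported.
GXXXG = re.compile(r"(?=(G[A-Z]{3}G))")


def find_gxxxg(sequence):
    """Return (1-based start, matched segment) for each GxxxG occurrence."""
    return [(m.start() + 1, m.group(1)) for m in GXXXG.finditer(sequence)]


# Made-up transmembrane-like sequence, for illustration only.
hits = find_gxxxg("LLGAVLGLLIGLL")
```

Without the lookahead, a plain `finditer` on `G[A-Z]{3}G` would consume the shared glycine and miss the second of two overlapping motifs.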

16.
The amino-terminal amino acid sequences of the pili proteins from four antigenically dissimilar strains of Neisseria gonorrhoeae, from Neisseria meningitidis, and from Escherichia coli were determined. Although antibodies raised to the pili protein from a given strain of gonococcus cross-reacted poorly or not at all with each of the other strains tested, the amino-terminal sequences were all identical. The meningococcal protein sequence was also identical with the gonococcal sequence through 29 residues, and this sequence was highly homologous to the sequence of the pili protein of Moraxella nonliquefaciens determined by other workers. However, the sequence of the pili protein from E. coli showed no similarity to the other sequences. The gonococcal and meningococcal proteins have an unusual amino acid at the amino termini, N-methylphenylalanine. In addition, the first 24 residues of these proteins have only two hydrophilic residues (at positions 2 and 5) with the rest being predominantly aliphatic hydrophobic amino acids. The preservation of this highly unusual sequence among five antigenically dissimilar Neisseria pili proteins implies a role for the amino-terminal structure in pilus function. The amino terminus may be directly or indirectly (through preservation of tertiary structure) important for the pilus function of facilitating attachment of bacteria to human cells.

17.
Considerable sequence data have been collected from the intermediate filament proteins and other alpha-fibrous proteins including myosin, tropomyosin, paramyosin, desmoplakin and M-protein. The data show that there is a clear preference for some amino acids to occur in specific positions within the heptad substructure that characterizes the sequences which form the coiled-coil rod domain in this class of proteins. The results also indicate that although there are major similarities between the various proteins there are also key differences. In all cases, however, significant regularities in the linear disposition of the acidic and the basic residues in the coiled-coil segments can be related to modes of chain and molecular aggregation. In particular a clear trend has been observed which relates the mode of molecular aggregation to the number of interchain ionic interactions per heptad pair.

18.
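The hepatitis B virus X protein (HBx) is 154 amino acids long and is closely associated with the development of liver cancer. To determine the dominant amino acid sequence and the hotspot mutation sites of HBx, all 13,950 HBx amino acid sequences were downloaded from GenBank. Sequences with insertion mutations, deletion mutations, or a start codon that does not encode methionine were removed, leaving 7,126 sequences. By analyzing these 7,126 sequences, the amino acid distribution at each HBx position was calculated; the most frequent amino acid at a position was taken as the dominant amino acid for that position, and the other amino acids as mutant amino acids. The dominant amino acids at the 154 positions make up the HBx dominant amino acid sequence. There were 32 hotspot mutation sites with a mutation rate >10.0%. Among them, amino acid positions 36, 42, 44, 87, 88, and 127 each had four or more mutant forms (mutation rate >1.0%) and were highly polymorphic. The K130M/V131I double mutation, which is closely associated with liver cancer, occurred at a rate of 34.7%. Based on a homology comparison of the 7,126 HBx sequences with the dominant sequence, 50 sequences were randomly selected (2 with <75% homology to the dominant sequence and 48 with 76% to 99% homology) and used together with 23 reference sequences and the dominant sequence to construct a phylogenetic tree. The results show that the HBx dominant amino acid sequence belongs to genotype C, consistent with genotype C being the main globally prevalent genotype. This study is the first systematic analysis of the dominant HBx sequence in GenBank; it identified 32 HBx hotspot mutation sites and 6 highly polymorphic sites, laying a foundation for basic and applied research based on HBx mutations.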
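
The per-position procedure described above (take the most frequent residue as dominant, count everything else as mutant, flag positions above a mutation-rate threshold) can be sketched as follows. The sketch assumes equal-length, gap-free sequences, as would remain after the insertion and deletion filtering step; the function name and the toy sequences are hypothetical and are not HBx data.

```python
from collections import Counter


def consensus_and_hotspots(sequences, threshold=0.10):
    """Return the dominant-residue sequence and the positions (1-based) whose
    mutation rate, the fraction of non-dominant residues, exceeds `threshold`."""
    n = len(sequences)
    consensus = []
    hotspots = {}
    for pos, column in enumerate(zip(*sequences), start=1):
        residue, count = Counter(column).most_common(1)[0]
        consensus.append(residue)
        rate = 1 - count / n
        if rate > threshold:
            hotspots[pos] = rate
    return "".join(consensus), hotspots


# Made-up equal-length sequences for illustration; not HBx data.
toy = ["MAKV", "MAKV", "MAMI", "MSKV"]
consensus, hotspots = consensus_and_hotspots(toy)
```

For the toy input the dominant sequence is `MAKV`, and positions 2, 3 and 4 each show a 25% mutation rate. How ties between equally frequent residues were resolved in the study is not stated; this sketch simply keeps whichever residue `Counter.most_common` returns first.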

19.
A number of operator-binding proteins contain similar sequence features to Cro and cI repressors of bacteriophage and CAP protein of Escherichia coli, such as conserved amino acids at constant positions. However, these sequence patterns also occur in proteins that are not operator-binding. We use sequence analogy information in conjunction with a pattern recognition algorithm. The functional and structural properties of the proteins, e.g., distributions of hydrophobicity, hydrophilicity, charged amino acids, electrostatic free energy, and helical structures, are also considered. Within the framework of discriminant analysis, we calculate the above variables and search for a better combination of variables. To assess the discriminatory power of these variables, we allocated additional sequences and predicted DNA-binding regions of regulatory proteins not included in the training set.

20.
Amino acids fulfil a diverse range of roles in proteins, each utilising its chemical properties in different ways in different contexts to create required functions. For example, cysteines form disulphide or hydrogen bonds in different circumstances and charged amino acids do not always make use of their charge. The repertoire of amino acid functions and the frequency at which they occur in proteins remains understudied. Measuring large numbers of mutational consequences, which can elucidate the role an amino acid plays, was prohibitively time-consuming until recent developments in deep mutational scanning. In this study, we gathered data from 28 deep mutational scanning studies, covering 6,291 positions in 30 proteins, and used the consequences of mutation at each position to define a mutational landscape. We demonstrated rich relationships between this landscape and biophysical or evolutionary properties. Finally, we identified 100 functional amino acid subtypes with a data-driven clustering analysis and studied their features, including their frequencies and chemical properties such as tolerating polarity, hydrophobicity or being intolerant of charge or specific amino acids. The mutational landscape and amino acid subtypes provide a foundational catalogue of amino acid functional diversity, which will be refined as the number of studied protein positions increases.
