Similar literature
 20 similar articles found (search time: 15 ms)
1.
Identifying common local segments, also called motifs, in multiple protein sequences plays an important role in establishing homology between proteins. Homology is easy to establish when sequences are similar (sharing more than 25% identity). However, for distant proteins, it is much more difficult to align motifs that are not similar in sequence but still share common structures or functions. This paper is a first attempt to align multiple protein sequences using both primary and secondary structure information. A new sequence model is proposed that assigns high probabilities not only to motifs that contain conserved amino acids but also to motifs that present common secondary structures. The proposed method is tested on the structural alignment database BAliBASE. We show that information brought by the predicted secondary structures greatly improves motif identification. A website for this program is available at www.stat.purdue.edu/~junxie/2ndmodel/sov.html.

2.
We have recently reported the first complete amino acid sequence of an iron-containing superoxide dismutase. The iron enzyme is thought to be closely homologous to the manganese-containing superoxide dismutases. The availability of complete amino acid sequence information for four manganese superoxide dismutases and the crystal structures for two iron and two manganese superoxide dismutases prompted us to investigate the degree of homology between the two proteins at various levels. We report that it is not possible to clearly distinguish the two proteins on the basis of their secondary or tertiary structures. It would appear that a small number of single-site substitutions are responsible for conferring distinguishing properties between the two proteins. Substitution of glycine 77 and glutamine 154 by a glutamine and an alanine, respectively, in Photobacterium leiognathi iron superoxide dismutase may distinguish the kinetic and other particular properties of this protein from the manganese protein (and other iron superoxide dismutases). Furthermore, the primary structure of both the iron and manganese proteins does not appear to have any homology with any other known amino acid sequence.

3.
According to the hypothesis explored in this paper, native aggregation is genetically controlled (programmed) reversible aggregation that occurs when interacting proteins form new temporary structures through highly specific interactions. It is assumed that Anfinsen's dogma may be extended to protein aggregation: composition and amino acid sequence determine not only the secondary and tertiary structure of a single protein, but also the structure of protein aggregates (associates). Cell function is considered as a transition between two states (two-state model), the resting state and the state of activity (this applies to the cell as a whole and to its individual structures). In the resting state, the key proteins are found in the following inactive forms: natively unfolded and globular. When the cell is activated, secondary structures appear in natively unfolded proteins (including unfolded regions in other proteins), and globular proteins begin to melt and their secondary structures become available for interaction with the secondary structures of other proteins. These temporary secondary structures provide a means for highly specific interactions between proteins. As a result, native aggregation creates temporary structures necessary for cell activity.

4.
A method for comparison of protein sequences based on their primary and secondary structure is described. Protein sequences are annotated with predicted secondary structures (using a modified Chou and Fasman method). Two-lettered code sequences are generated (Xx, where X is the amino acid and x is its annotated secondary structure). Sequences are compared with a dynamic programming method (STRALIGN) that includes a similarity matrix for both the amino acids and secondary structures. The similarity value for each paired two-lettered code is a linear combination of similarity values for the paired amino acids and their annotated secondary structures. The method has been applied to eight globin proteins (28 pairs) for which the X-ray structure is known. For protein pairs with high primary sequence similarity (greater than 45%), STRALIGN alignment is identical to that obtained by a dynamic programming method using only primary sequence information. However, alignment of protein pairs with lower primary sequence similarity improves significantly with the addition of secondary structure annotation. Alignment of the pair with the least primary sequence similarity (16%) was improved from 0 to 37% 'correct' alignment using this method. In addition, STRALIGN was successfully applied to seven pairs of distantly related cytochrome c proteins, and three pairs of distantly related picornavirus proteins.
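The scoring idea in this abstract can be sketched roughly as follows. This is not the STRALIGN implementation: the identity-based similarity values, the `weight` parameter, the linear gap penalty, and all function names are assumptions made for illustration. It only shows how a linear combination of amino-acid and secondary-structure similarity can drive a standard global dynamic programming alignment.

```python
def annotate(sequence, structure):
    """Pair each residue with its secondary-structure label (the 'Xx' code)."""
    if len(sequence) != len(structure):
        raise ValueError("sequence and structure must have equal length")
    return list(zip(sequence, structure))


def combined_score(aa1, ss1, aa2, ss2, weight=0.5):
    """Linear combination of amino-acid and secondary-structure similarity.

    Assumption: simple identity scoring stands in for the real similarity
    matrices, which the abstract does not specify.
    """
    aa_sim = 1.0 if aa1 == aa2 else 0.0
    ss_sim = 1.0 if ss1 == ss2 else 0.0
    return (1.0 - weight) * aa_sim + weight * ss_sim


def align_score(seq_a, seq_b, weight=0.5, gap=-1.0):
    """Optimal global alignment score (Needleman-Wunsch, linear gap cost)."""
    n, m = len(seq_a), len(seq_b)
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * gap
    for j in range(1, m + 1):
        dp[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            (a1, s1), (a2, s2) = seq_a[i - 1], seq_b[j - 1]
            dp[i][j] = max(
                dp[i - 1][j - 1] + combined_score(a1, s1, a2, s2, weight),
                dp[i - 1][j] + gap,   # gap in seq_b
                dp[i][j - 1] + gap,   # gap in seq_a
            )
    return dp[n][m]
```

With `weight=0.0` the score reduces to a sequence-only alignment, and with `weight=1.0` only the secondary-structure labels matter, which mirrors the comparison the abstract draws between the two kinds of alignment.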

5.
Ribosomes are the only cell organelles occurring in all organisms. E. coli ribosomes, which are the best characterized particles, consist of three RNAs and 53 proteins. All components have been isolated and characterized by chemical, physical and immunological methods. The primary structures of the RNAs and of all the proteins are known. Information about the secondary structure of the proteins derives from circular dichroism measurements and from secondary structure prediction methods. The tertiary structure is being studied by limited proteolysis, proton magnetic resonance and crystallization followed by X-ray analysis. Various methods are being used to elucidate the architecture of the ribosomal particle: three-dimensional image reconstruction of crystals of bacterial ribosomes and/or their subunits; immune electron microscopy; neutron scattering; protein-protein, protein-RNA and RNA-RNA crosslinking; total reconstitution of ribosomal subunits. The results from these studies yield valuable information on the architecture of the ribosomal particle. Many mutants have been isolated in which one or a few ribosomal proteins are altered or even deleted. The genetic and biochemical characterization of these mutants allows conclusions about the importance of these proteins for the function of the ribosome. Ribosomal proteins from various prokaryotic and eukaryotic species have been compared by two-dimensional gel electrophoresis, immunological methods, reconstitution and amino acid sequence analysis. These studies show a strong homology among prokaryotic ribosomal proteins but only a weak homology between proteins from prokaryotic and eukaryotic ribosomes. Comparison of the primary and secondary structures of the ribosomal RNAs from various organisms shows that the secondary structure of the RNA molecules has been strongly conserved throughout evolution.

6.
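Research progress in protein secondary structure prediction   Cited by: 1 (self-citations: 0, citations by others: 1)
唐媛, 李春花, 张瑗, 尚进, 邹凌云, 李立奇. 《生物磁学》 2013, (26): 5180-5182
Knowledge of protein secondary structure is the basis for understanding protein folding patterns and tertiary structure. It provides a structural foundation for studying protein function and the modes of interaction between proteins, and it can also support new drug development. Research on protein secondary structure is therefore of great significance. With the arrival of the post-genomic era, more and more protein sequences are being discovered, which brings both great challenges and research opportunities for the study of protein secondary structure. Traditional experimental methods can hardly deliver secondary structure information for proteins on a large scale. At present, bioinformatics approaches remain the route to the secondary structure of most proteins. In recent years the field has developed rapidly: many researchers have constructed protein datasets for secondary structure prediction, computed and extracted various kinds of protein feature information, and applied different prediction algorithms to predict secondary structure. This paper reviews the extraction and selection of protein feature information, prediction algorithms, and methods for evaluating prediction performance, and introduces research progress in the field of protein secondary structure prediction. We believe that, as genomics, proteomics and bioinformatics continue to develop, protein secondary structure prediction will keep achieving new breakthroughs.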

7.
While studies of secondary structure interactions have focused on local interacting features, there is a need for a more global characterization of the packing-induced alignment of secondary structures. This study presents an analysis of the distribution of globally sampled secondary structures within selected subunits of a selected set of multimeric proteins. Comparisons are made between the distribution of the cosines of angles between triplets of linear segments associated with secondary structures and a theoretically obtained distribution for triplets of random uniformly distributed unit vectors. We show that, among all pairs of helix or strand segments, planar configurations appear more frequently than expected for uniformly distributed vectors, and alignment is strongly preferred compared to that expected for uniformly distributed vector triplets. Among all secondary structure triplets, pairs of angle cosines between helix or strand segments deviate from uniformity, corresponding to alignment and anti-alignment. Furthermore, among all helix or strand segments, including non-interacting secondary structures, the distribution of a single angle cosine indicates a strong preference for alignment and anti-alignment. Lastly, angle pairs are not statistically independent, indicating that alignment between two helix or strand segments is more likely if another helix or strand is aligned with either of the first two helices or strands. Selection for interactive segment triplets shows results consistent with prior studies.
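The random baseline mentioned in this abstract can be illustrated with a small simulation. The sketch below is not the authors' procedure; the function names, the sampling method, and the sample size are assumptions. It relies on one well-known fact: for two independent unit vectors drawn uniformly on the sphere in three dimensions, the cosine of the angle between them is uniformly distributed on [-1, 1].

```python
import math
import random


def random_unit_vector(rng):
    """Draw a direction uniformly on the unit sphere by normalizing Gaussians."""
    while True:
        x, y, z = (rng.gauss(0.0, 1.0) for _ in range(3))
        norm = math.sqrt(x * x + y * y + z * z)
        if norm > 1e-12:          # guard against a (vanishingly rare) zero vector
            return (x / norm, y / norm, z / norm)


def cosine(u, v):
    """Cosine of the angle between two unit vectors (their dot product)."""
    return sum(a * b for a, b in zip(u, v))


def triplet_cosines(u, v, w):
    """The three pairwise angle cosines of a triplet of unit vectors."""
    return (cosine(u, v), cosine(u, w), cosine(v, w))


def sample_triplet_cosines(n, seed=0):
    """Sample n random triplets and return their pairwise cosines."""
    rng = random.Random(seed)
    return [
        triplet_cosines(*(random_unit_vector(rng) for _ in range(3)))
        for _ in range(n)
    ]
```

An observed set of segment directions could then be compared against such a sample, for example by checking whether cosines near +1 or -1 (alignment and anti-alignment) occur more often than the uniform baseline predicts.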

8.
The complete primary structures of two variant specific glycoproteins (VSGs) of the nannomonad Trypanosoma (N.) congolense are presented. These coat proteins subserve the function of antigenic variation. The secondary structure potentials of both VSGs have been calculated. The amino acid sequences and secondary structure potentials of these VSGs have been compared with the primary structures and secondary structure potentials of several Trypanosoma brucei complex VSGs. In homologous regions, the T. brucei complex VSGs show a pattern of sharply contrasting secondary structure potentials. It has been suggested previously that this pattern gives rise to different folding structures in different members of this polygene protein family. Thus, different short regions of the polypeptide sequence are exposed as antigenic "caps" on the solvent-exposed surface of intact trypanosomes. A sharply contrasting secondary structure potential pattern is also found in regions of the two T. congolense VSGs. However, there is little homology of primary structure between each of the two T. congolense VSGs and any member of the T. brucei complex VSG polygene family whose primary structure has been determined.

9.
Fan H, Mark AE. 《Proteins》 2003, 53(1): 111-120
The relative stability of protein structures determined by either X-ray crystallography or nuclear magnetic resonance (NMR) spectroscopy has been investigated by using molecular dynamics simulation techniques. Published structures of 34 proteins containing between 50 and 100 residues have been evaluated. The proteins selected represent a mixture of secondary structure types including all alpha, all beta, and alpha/beta. The proteins selected do not contain cysteine-cysteine bridges. In addition, any crystallographic waters, metal ions, cofactors, or bound ligands were removed before the systems were simulated. The stability of the structures was evaluated by simulating, under identical conditions, each of the proteins for at least 5 ns in explicit solvent. It is found that not only do NMR-derived structures have, on average, higher internal strain than structures determined by X-ray crystallography but that a significant proportion of the structures are unstable and rapidly diverge in simulations.

10.
Intrinsically disordered proteins (IDPs)/protein regions (IDPRs) lack unique three-dimensional structure at the level of secondary and/or tertiary structure and are represented as an ensemble of interchanging conformations. To investigate the role of the presence or absence of secondary structures in promoting intrinsic disorder in proteins, a comparative sequence analysis of IDPs, IDPRs and proteins with minimal secondary structures (less than 5%) is required. A sequence analysis reveals that proteins with minimal secondary structure content have high mean net positive charge, low mean net hydrophobicity and low sequence complexity. Interestingly, analysis of the relative local electrostatic interactions reveals that an increase in the relative repulsive interactions between amino acids separated by three or four residues leads to either loss of secondary structure or intrinsic disorder. IDPRs show an increase in both local negative-negative and positive-positive repulsive interactions. While IDPs show a marked increase in the local negative-negative interactions, proteins with minimal secondary structure depict an increase in the local positive-positive interactions. IDPs and IDPRs are enriched in D, E and Q residues, while proteins with minimal secondary structure are depleted of these residues. Proteins with minimal secondary structures have higher content of G and C, while IDPs and IDPRs are depleted of these residues. These results confirm that proteins with minimal secondary structure have a distinctly different propensity for charge, hydrophobicity, specific amino acids and local electrostatic interactions as compared to IDPs/IDPRs. Thus we conclude that lack of secondary structure may be a necessary but not a sufficient condition for intrinsic disorder in proteins.
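The kinds of sequence features discussed in this abstract can be computed along the following lines. The abstract does not give its exact definitions, so everything here is an assumption made for illustration: net charge counts K and R as +1 and D and E as -1, sequence complexity is taken as the Shannon entropy of the residue composition, and "local" like-charge pairs are counted at separations of exactly three or four positions.

```python
import math
from collections import Counter

POSITIVE = set("KR")   # assumed positively charged residues
NEGATIVE = set("DE")   # assumed negatively charged residues


def mean_net_charge(seq):
    """Net charge per residue: (#K + #R - #D - #E) / length."""
    pos = sum(1 for r in seq if r in POSITIVE)
    neg = sum(1 for r in seq if r in NEGATIVE)
    return (pos - neg) / len(seq)


def shannon_complexity(seq):
    """Shannon entropy (bits) of the residue composition."""
    n = len(seq)
    return -sum((c / n) * math.log2(c / n) for c in Counter(seq).values())


def local_like_charge_pairs(seq, separations=(3, 4)):
    """Count (positive-positive, negative-negative) pairs at the given separations."""
    pos_pos = neg_neg = 0
    for sep in separations:
        for i in range(len(seq) - sep):
            a, b = seq[i], seq[i + sep]
            if a in POSITIVE and b in POSITIVE:
                pos_pos += 1
            elif a in NEGATIVE and b in NEGATIVE:
                neg_neg += 1
    return pos_pos, neg_neg
```

For example, `local_like_charge_pairs("KAAAK")` counts one positive-positive pair (the two lysines, four positions apart) and no negative-negative pairs.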

11.
Molecular modeling of proteins is confronted with the problem of finding homologous proteins, especially when few identities remain after the process of molecular evolution. Using even the most recent methods based on sequence identity detection, structural relationships are still difficult to establish with high reliability. As protein structures are more conserved than sequences, we investigated the possibility of using protein secondary structure comparison (observed or predicted structures) to discriminate between related and unrelated protein sequences in the range of 10%-30% sequence identity. Pairwise comparison of secondary structures has been measured using the structural overlap (Sov) parameter. In this article, we show that if the secondary structure likeness is >50%, most of the pairs are structurally related. Taking into account the secondary structures of proteins that have been detected by BLAST, FASTA, or SSEARCH in the noisy region (with high E-value), we show that distantly related protein sequences (even with <20% identity) can still be identified. This strategy can be used to identify three-dimensional templates in homology modeling by finding unexpected related proteins and to select proteins for experimental investigation in a structural genomic approach, as well as for genome annotation.

12.
The use of proton-proton nuclear Overhauser enhancement (NOE) distance information for identification of polypeptide secondary structures in non-crystalline proteins was investigated by stereochemical studies of standard secondary structures and by statistical analyses of the secondary structures in the crystal conformations of a group of globular proteins. Both regular helix and beta-sheet secondary structures were found to contain a dense network of short 1H-1H distances. The results obtained imply that the combined information on all these distances obtained from visual inspection of the two-dimensional NOE (NOESY) spectra is sufficient for determination of the helical and beta-sheet secondary structures in small globular proteins. Furthermore, cis peptide bonds can be identified from unique, short sequential proton-proton distances. Limitations of this empirical approach are that the exact start or end of a helix may be difficult to define when the adjoining residues form a tight turn, and that unambiguous identification of tight turns can usually be obtained only in the hairpins of antiparallel beta-structures. The short distances between protons in pentapeptide segments of the different secondary structures have been tabulated to provide a generally applicable guide for the analysis of NOESY spectra of proteins.

13.
Circular dichroism (CD) spectroscopy is a valuable method for defining canonical secondary structure contents of proteins based on empirically-defined spectroscopic signatures derived from proteins with known three-dimensional structures. Many proteins identified as being “Intrinsically Disordered Proteins” have a significant amount of their structure that is neither sheet, helix, nor turn; this type of structure is often classified by CD as “other”, “random coil”, “unordered”, or “disordered”. However, the “other” category can also include polyproline II (PPII)-type structures, whose spectral properties have not been well-distinguished from those of unordered structures. In this study, synchrotron radiation circular dichroism spectroscopy was used to investigate the spectral properties of collagen and polyproline, which both contain PPII-type structures. Their native spectra were compared as representatives of PPII structures. In addition, their spectra before and after treatment with various conditions to produce unfolded or denatured structures were also compared, with the aim of defining the differences between CD spectra of PPII and disordered structures. We conclude that the spectral features of collagen are more appropriate than those of polyproline for use as the representative spectrum for PPII structures present in typical amino acid-containing proteins, and that the single most characteristic spectroscopic feature distinguishing a PPII structure from a disordered structure is the presence of a positive peak around 220 nm in the former but not in the latter. These spectra are now available for inclusion in new reference data sets used for CD analyses of the secondary structures of soluble proteins.

14.
ANTHEPROT is a fully interactive program devoted to the analysis of protein structures using a graphics workstation. It presents four options. The first option can predict secondary structures using five methods, and hydrophobicity, solvent accessibility, flexibility and antigenicity profiles using eighteen scales. Users may introduce their own scales. The results displayed on the screen can be easily analyzed. The second option is for representing results concerning up to eight proteins by one method. To compare these proteins, it is possible to align the profiles or the predicted secondary structure according to various motifs. The secondary structure deduced from crystallographic data may also be introduced. The third option is designed to compare the primary structure of two proteins and to visualize on the screen regions that exhibit similarity. Six different comparison matrices may be used, but users can also introduce their own matrices. The last option is for studying the proteolytic peptides resulting from a chemical or enzymatic digestion of a given protein. It is possible to analyze the protein cleavage using eleven chemical reagents or enzymes. The results are displayed on the screen as an RP-HPLC chromatogram.

15.
The amino acid sequences of ribosomal proteins L1, L14, L15, L23, L24 and L29 from Bacillus stearothermophilus have been completely determined. This has been achieved by sequence analyses of peptides derived from enzymatic digestions of the proteins with trypsin, chymotrypsin, pepsin, Staphylococcus aureus protease, and Armillaria mellea protease as well as by chemical cleavage with hydroxylamine and cyanogen bromide. Based on the primary structures of the six proteins, their secondary structures were predicted using four different computer prediction programs. A comparison of the amino acid sequences of the studied proteins from B. stearothermophilus with the homologous proteins from Escherichia coli revealed that in four proteins (L1, L15, L24 and L29) 40-50% of the residues in the sequences are identical, whereas this value is significantly higher (69%) for L14 and lower (28%) for L23. The distribution of those amino acid residues which are identical in the corresponding proteins from the two bacteria is not random along the protein chain: some regions are highly conserved whereas others are not. This finding indicates that the regions which are conserved during evolution are important for the spatial structure and/or function of the protein.

16.
The nucleotide frequencies in the second codon positions of genes are remarkably different for the coding regions that correspond to different secondary structures in the encoded proteins, namely, helix, beta-strand and aperiodic structures. Indeed, hydrophobic and hydrophilic amino acids are encoded by codons having U or A, respectively, in their second position. Moreover, the beta-strand structure is strongly hydrophobic, while aperiodic structures contain more hydrophilic amino acids. The relationship between nucleotide frequencies and protein secondary structures is associated not only with the physico-chemical properties of these structures but also with the organisation of the genetic code. In fact, this organisation seems to have evolved so as to preserve the secondary structures of proteins by preventing deleterious amino acid substitutions that could modify the physico-chemical properties required for an optimal structure.
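The quantity this abstract is built on, the nucleotide frequency at the second codon position, is simple to compute. The sketch below is only an illustration under stated assumptions: the function name is hypothetical, the input is assumed to be an in-frame coding sequence, DNA input is converted to RNA notation, and any trailing partial codon is ignored.

```python
def second_position_frequencies(coding_sequence):
    """Frequency of A, C, G and U at the second position of each codon.

    Assumes the sequence starts in frame; T is treated as U, and an
    incomplete codon at the end is ignored.
    """
    rna = coding_sequence.upper().replace("T", "U")
    codons = [rna[i:i + 3] for i in range(0, len(rna) - len(rna) % 3, 3)]
    if not codons:
        raise ValueError("sequence contains no complete codon")
    counts = {base: 0 for base in "ACGU"}
    for codon in codons:
        middle = codon[1]
        if middle in counts:          # skip ambiguous symbols such as N
            counts[middle] += 1
    return {base: count / len(codons) for base, count in counts.items()}
```

Applied separately to the coding regions of helices, beta-strands and aperiodic segments, such frequencies could be compared across structure classes. In the standard genetic code, every codon with U in the second position encodes Phe, Leu, Ile, Met or Val, which is consistent with the link to hydrophobic residues described in the abstract.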

17.
Supersecondary structures of proteins have been systematically searched and classified, but not enough attention has been devoted to such large edifices beyond the basic identification of secondary structures. The objective of the present study is to show that the association of secondary structures that share some of their backbone residues is commonplace in globular proteins, and that such deeper fusion of secondary structures, namely extended secondary structures (ESSs), helps stabilize the original secondary structures and the resulting tertiary structures. For statistical purposes, a set of 163 proteins from the Protein Data Bank was randomly selected and a few specific cases are structurally analyzed and characterized in more detail. The results indicate that about 30% of the residues from each protein, on average, participate in ESSs. Alternatively, for the specific cases considered, our results were based on the secondary structures produced after extensive Molecular Dynamics simulation of a protein–aqueous solvent system. Based on the very small width of the time distribution of the root mean squared deviations, between the ESS taken along the simulation and the ESS from the mean structure of the protein, for each ESS, we conclude that the ESSs significantly increase the conformational stability by forming very stable aggregates. The ubiquity and specificity of the ESSs suggest that the role they play in the structure of proteins, including domain formation, deserves to be thoroughly investigated.

18.
A procedure for classifying proteins of known sequence into structurally similar groups was developed on the basis of the Argos parametric approach. It is shown that stefins and cystatins constitute two structurally well resolved, but homologous groups of proteins. Furthermore, it is very probable that segments of secondary structures within each family are conserved, although significant differences between stefins and cystatins are indicated at the level of secondary structure. Next, secondary structures of all sequenced stefins and cystatins were predicted and used in the construction of secondary structures of the "typical stefin" and the "typical cystatin". Results were interpreted in the light of evolution and inhibition mechanism: Alignment of the "typical stefin" versus the "typical cystatin" secondary structure segments suggests that the divergence of stefin and cystatin families did not occur by a gene fusion event, but only by a mechanism of substitution, insertion and/or deletion. The central region of low-molecular mass cystatins, which is assumed to interact with cysteine proteinases, is predicted to be in a beta-sheet conformation. This resembles the beta-sheet in the active site of "standard mechanism" serine proteinase inhibitors.

19.
The most popular algorithms employed in the pairwise alignment of protein primary structures (Smith-Waterman (SW) algorithm, FASTA, BLAST, etc.) only analyze the amino acid sequence. The SW algorithm is the most accurate, yielding alignments that agree best with superimpositions of the corresponding spatial structures of proteins. However, even the SW algorithm fails to reproduce the spatial structure alignment when the sequence identity is lower than 30%. The objective of this work was to develop a new and more accurate algorithm taking the secondary structure of proteins into account. The alignments generated by this algorithm and having the maximal weight with the secondary structure considered proved to be more accurate than SW alignments. With sequences having less than 30% identity, the accuracy (i.e., the portion of reproduced positions of a reference alignment obtained by superimposing the protein spatial structures) of the new algorithm is 58%, compared with 35% for the SW algorithm. The accuracy of the new algorithm is much the same with secondary structures established experimentally or predicted theoretically. Hence, the algorithm is applicable to proteins with unknown spatial structures. The program is available at ftp://194.149.64.196/STRUSWER/.
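The accuracy measure defined in this abstract, the portion of reference-alignment positions that a test alignment reproduces, can be sketched as below. This is one plausible reading rather than the authors' code: alignments are assumed to be given as two equal-length gapped strings with `-` as the gap symbol, a "position" is taken to be a pair of residues aligned to each other, and the function names are hypothetical.

```python
def aligned_pairs(row_a, row_b, gap="-"):
    """Set of (index_in_a, index_in_b) residue pairs placed in the same column."""
    if len(row_a) != len(row_b):
        raise ValueError("alignment rows must have equal length")
    pairs = set()
    i = j = 0                      # running residue indices, gaps excluded
    for a, b in zip(row_a, row_b):
        if a != gap and b != gap:
            pairs.add((i, j))
        if a != gap:
            i += 1
        if b != gap:
            j += 1
    return pairs


def alignment_accuracy(test, reference):
    """Fraction of reference residue pairs that the test alignment reproduces.

    Both arguments are (row_a, row_b) tuples of gapped strings.
    """
    ref_pairs = aligned_pairs(*reference)
    if not ref_pairs:
        raise ValueError("reference alignment has no aligned residue pairs")
    return len(aligned_pairs(*test) & ref_pairs) / len(ref_pairs)
```

For instance, with the reference `("ACD-E", "A-DFE")` and the ungapped test alignment `("ACDE", "ADFE")`, two of the three reference pairs are reproduced, giving an accuracy of about 0.67.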

20.