Similar literature
 Found 20 similar articles (search time: 0 ms)
1.
2.
    
Does the amino acid use at the terminal positions of an α‐helix become altered depending on the context, more specifically when there is an adjoining 310‐helix, and can a single helical cylinder encompass the resultant composite helix? An analysis of 138 and 107 cases of 310–α and α–310 composite helices, respectively, found in known protein structures indicates that the secondary structural element occurring first imposes its characteristics on the sequence of the structural element coming next. Thus, when preceded by a 310‐helix, the preference of proline to occur at the N1 position of an α‐helix is shifted to the N2 position, a typical characteristic of the C‐terminal capping of the 310‐helix. When an α‐ or a 310‐helix leads into a helix of the other type, there is a bend at the junction, especially for the 310–α composite, with the two junction residues facing inward and buried within the structure. Thus a single helical cylinder may not properly represent a composite helix, the bend providing a means for the tertiary structure to assume a globular shape, very much akin to what a proline‐induced kink does to an α‐helix. The tertiary structural context in which β–310 and 310–β composites occur can be different, causing the angle between the secondary structural elements in the two cases to be different. Composites of 310‐helices and β‐strands are much more conserved among members in families of homologous structures than those between two types of helices; in many of the former instances, the 310‐helix constitutes the loops in β‐hairpin or β–β‐corner motifs. The overall fold of the chain may be more conserved than the actual identity of the secondary structure elements in a composite. © 2005 Wiley Periodicals, Inc. Biopolymers 78: 147–162, 2005 This article was originally published online as an accepted preprint. The “Published Online” date corresponds to the preprint version. 
You can request a copy of the preprint by emailing the Biopolymers editorial office at biopolymers@wiley.com

3.
    
Chameleon sequences (ChSeqs) refer to sequence strings of identical amino acids that can adopt different conformations in protein structures. Researchers have detected and studied ChSeqs to understand the interplay between local and global interactions in protein structure formation. The different secondary structures adopted by one ChSeq challenge sequence‐based secondary structure predictors. With increasing numbers of available Protein Data Bank structures, we here identify a large set of ChSeqs ranging from 6 to 10 residues in length. The homologous ChSeqs discovered highlight the structural plasticity involved in biological function. When compared with previous studies, the set of unrelated ChSeqs found represents an approximately 20‐fold increase in the number of detected sequences, as well as an increase in the longest ChSeq length from 8 to 10 residues. We applied secondary structure predictors to our ChSeqs and found that methods based on a sequence profile outperformed methods based on a single sequence. For the unrelated ChSeqs, the evolutionary information provided by the sequence profile typically allows successful prediction of the prevailing secondary structure adopted in each protein family. Our dataset will facilitate future studies of ChSeqs, as well as interpretations of the interplay between local and nonlocal interactions. A user‐friendly web interface for this ChSeq database is available at prodata.swmed.edu/chseq .
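
The idea of a chameleon sequence can be illustrated with a small sketch. It is not the authors' pipeline: the function name `find_chameleons`, the DSSP-style letters (`H` for helix, `E` for strand), the restriction to helix-versus-strand windows, and the two toy chains are all assumptions made for illustration.

```python
from collections import defaultdict


def find_chameleons(chains, k=6):
    """Return k-mers seen as all-helix in one place and all-strand in another.

    chains: iterable of (sequence, secondary_structure) string pairs of equal
    length, using 'H' for helix and 'E' for strand (assumed DSSP-like codes).
    """
    states = defaultdict(set)
    for seq, ss in chains:
        for i in range(len(seq) - k + 1):
            window = ss[i:i + k]
            # Only count windows that are uniformly helix or uniformly strand.
            if window == "H" * k or window == "E" * k:
                states[seq[i:i + k]].add(window[0])
    return sorted(kmer for kmer, seen in states.items() if seen == {"H", "E"})


# Hypothetical toy data, not taken from the Protein Data Bank.
chains = [
    ("MKVLAAGIVKE", "CHHHHHHHHCC"),
    ("TVLAAGIVTY", "CEEEEEEECC"),
]
print(find_chameleons(chains))
```

Restricting the search to uniform windows keeps the sketch short; a real survey would also have to deal with redundancy between homologous chains, which the abstract treats separately.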

4.
By sequencing the entire ribosomal RNA (rRNA) gene region of Nosema heliothidis isolated from cotton bollworm (Helicoverpa armigera), we showed that its gene organization is similar to that of the type species, Nosema bombycis: the 5'-large subunit rRNA (2,490 bp)-internal transcribed spacer (192 bp)-small subunit rRNA (1,232 bp)-intergenic spacer (274 bp)-5S rRNA (115 bp)-3'. We constructed two phylogenetic trees, analyzed phylogenetic relationships, examined rRNA organization of microsporidia, and compared the secondary structure of small subunit rRNA with closely related microsporidia. The latter two features may provide important information for the classification and phylogenetic analysis of microsporidia.

5.
C Sander, R Schneider. Proteins, 1991, 9(1): 56-68
The database of known protein three-dimensional structures can be significantly increased by the use of sequence homology, based on the following observations. (1) The database of known sequences, currently at more than 12,000 proteins, is two orders of magnitude larger than the database of known structures. (2) The currently most powerful method of predicting protein structures is model building by homology. (3) Structural homology can be inferred from the level of sequence similarity. (4) The threshold of sequence similarity sufficient for structural homology depends strongly on the length of the alignment. Here, we first quantify the relation between sequence similarity, structure similarity, and alignment length by an exhaustive survey of alignments between proteins of known structure and report a homology threshold curve as a function of alignment length. We then produce a database of homology-derived secondary structure of proteins (HSSP) by aligning to each protein of known structure all sequences deemed homologous on the basis of the threshold curve. For each known protein structure, the derived database contains the aligned sequences, secondary structure, sequence variability, and sequence profile. Tertiary structures of the aligned sequences are implied, but not modeled explicitly. The database effectively increases the number of known protein structures by a factor of five to more than 1800. The results may be useful in assessing the structural significance of matches in sequence database searches, in deriving preferences and patterns for structure prediction, in elucidating the structural role of conserved residues, and in modeling three-dimensional detail by homology.

6.
The most popular algorithms employed in the pairwise alignment of protein primary structures (Smith-Waterman (SW) algorithm, FASTA, BLAST, etc.) only analyze the amino acid sequence. The SW algorithm is the most accurate, yielding alignments that agree best with superimpositions of the corresponding spatial structures of proteins. However, even the SW algorithm fails to reproduce the spatial structure alignment when the sequence identity is lower than 30%. The objective of this work was to develop a new and more accurate algorithm taking the secondary structure of proteins into account. The alignments generated by this algorithm and having the maximal weight with the secondary structure considered proved to be more accurate than SW alignments. With sequences having less than 30% identity, the accuracy (i.e., the portion of reproduced positions of a reference alignment obtained by superimposing the protein spatial structures) of the new algorithm is 58% vs. 35% for the SW algorithm. The accuracy of the new algorithm is much the same with secondary structures established experimentally or predicted theoretically. Hence, the algorithm is applicable to proteins with unknown spatial structures. The program is available at ftp://194.149.64.196/STRUSWER/.
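
For reference, the baseline Smith-Waterman local alignment mentioned above can be sketched as follows. This is a minimal illustration of the classic sequence-only algorithm, not the secondary-structure-aware method the abstract describes. The scoring values (`match=2`, `mismatch=-1`, linear `gap=-1`) are assumptions chosen for readability; protein aligners normally use a substitution matrix and affine gap penalties.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Return (best local score, aligned a, aligned b) using a linear gap penalty."""
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the score matrix; a cell never drops below zero (local alignment).
    for i in range(1, rows):
        for j in range(1, cols):
            s = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(0, H[i - 1][j - 1] + s, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the best cell until a zero cell is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        s = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + s:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


print(smith_waterman("XXACGTXX", "YYACGTYY"))
```

A structure-aware variant would presumably add a term to the per-position score `s` when the secondary structure states of the two residues agree, but the exact weighting used in the paper is not given in the abstract.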

7.
The profile method, for detecting distantly related proteins by sequence comparison, has been extended to incorporate secondary structure information from known X-ray structures. The sequence of a known structure is aligned to sequences of other members of a given folding class. From the known structure, the secondary structure (alpha-helix, beta-strand or "other") is assigned to each position of the aligned sequences. As in the standard profile method, a position-dependent scoring table, termed a profile, is calculated from the aligned sequences. However, rather than using the standard Dayhoff mutation table in calculating the profile, we use distinct amino acid mutation tables for residues in alpha-helices, beta-strands or other secondary structures to calculate the profile. In addition, we also distinguish between internal and external residues. With this new secondary structure-based profile method, we created a profile for eight-stranded, antiparallel beta barrels of the insecticyanin folding class. It is based on the sequences of retinol-binding protein, insecticyanin and beta-lactoglobulin. Scanning the sequence database with this profile, it was possible to detect the sequence of avidin. The structure of streptavidin is known, and it appears to be distantly related to the antiparallel beta barrels. Also detected is the sequence of complement component C8, which we therefore predict to be a member of this folding class.

8.
Analysis of sequence requirements for protein tyrosine sulfation.   Total citations: 5, self-citations: 0, citations by others: 5
We analyzed sequences surrounding known tyrosine sulfation sites to determine the characteristics that distinguish these sites from those that do not undergo sulfation. Tests evaluated the number and position of acidic, basic, hydrophobic, and small amino acids, as well as disulfide and N-glycosylation (sugar) sites. We determined that composition-based tests that select close to 100% of known tyrosine sulfation sites reject 97% of the non-sulfated tyrosines. The acidic test, by far the most selective, eliminated 95% of the non-sulfated tyrosine residues and none of the sulfated tyrosines. Including the basic, hydrophobic, and disulfide tests increased the elimination rate to 97%. Whereas no position flanking the tyrosine residues had the same amino acid always present, imperfectly conserved amino acids found in some positions will improve the specificity of the tests.

9.
Following the original idea of Maynard Smith on evolution of the protein sequence space, a novel tool is developed that allows a "space walk" from one sequence to its likely evolutionary relative and further on. At a given threshold of identity between consecutive steps, walks of many steps are possible. The sequences at the ends of the walks may substantially differ from one another. In a sequence space of randomized (shuffled) sequences the walks are very short. The approach opens new perspectives for protein evolutionary studies and sequence annotation.
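
A toy version of such a walk can be written as a breadth-first search in which two sequences are neighbors when their identity meets a threshold. The abstract does not describe how the actual tool works, so everything here is an assumption: the names `identity` and `space_walk`, the restriction to equal-length ungapped sequences, and the made-up sequence pool.

```python
from collections import deque


def identity(a, b):
    """Fraction of identical positions between two equal-length sequences."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def space_walk(start, pool, threshold):
    """Return sequences reachable from start, in breadth-first order.

    Each step must connect two equal-length sequences whose identity is at
    least `threshold`.
    """
    seen = {start}
    order = [start]
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for cand in pool:
            if cand in seen or len(cand) != len(current):
                continue
            if identity(current, cand) >= threshold:
                seen.add(cand)
                order.append(cand)
                queue.append(cand)
    return order


# Hypothetical toy pool: each consecutive pair shares 4 of 6 positions.
pool = ["AAAAAA", "AAAATT", "AATTTT", "TTTTTT", "GGGGGG"]
walk = space_walk("AAAAAA", pool, threshold=0.6)
print(walk, identity(walk[0], walk[-1]))
```

In this toy pool the walk ends at a sequence that shares no positions with the start, which mirrors the observation that the ends of a walk may differ substantially even though every single step stays above the threshold.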

10.
Although it is known that three-dimensional structure is well conserved during the evolutionary development of proteins, there have been few studies that consider other parameters apart from divergence of the main-chain coordinates. In this study, we align the structures of 90 pairs of homologous proteins having sequence identities ranging from 5 to 100%. Their structures are compared as a function of sequence identity, including not only consideration of C alpha coordinates but also accessibility, Ooi numbers, secondary structure, and side-chain angles. We discuss how these properties change as the sequences become less similar. This will be of practical use in homology modeling, especially for modeling very distantly related or analogous proteins. We also consider how the average size and number of insertions and deletions vary as sequences diverge. This study presents further quantitative evidence that structure is remarkably well conserved in detail, as well as at the topological level, even when the sequences do not show similarity that is significant statistically.

11.
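Anomalous protein secondary structure preferences of synonymous codons   Total citations: 1, self-citations: 0, citations by others: 1
We statistically analyzed the relationship between mRNA sequence and protein secondary structure for 119 human proteins and 92 Escherichia coli proteins. Starting from dipeptide frequencies, we studied the effect of synonymous codon usage on protein secondary structure and showed that the effect is on the order of 10% to 20%. Of the 400 dipeptides, 79 in human and 60 in E. coli at the 90% confidence level, and 45 and 36, respectively, at the 95% confidence level, have corresponding codon pairs whose secondary structure preference is anomalous, that is, different from that of the amino acids themselves. This anomaly cannot be attributed to random fluctuation.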

12.
13.
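Clustal W: software for protein and nucleic acid sequence analysis   Total citations: 2, self-citations: 1, citations by others: 2
Sequence analysis of proteins and nucleic acids plays an important role in modern biology and bioinformatics, and new algorithms and software appear constantly. This paper introduces ClustalW, a completely free multiple sequence alignment program that runs on a PC. It can perform multiple sequence alignment of proteins and nucleic acids, analyze the similarity relationships among sequences, and draw phylogenetic trees. Thanks to its flexible input and output formats, convenient parameter settings and options, detailed online help, and good portability, ClustalW is widely used in protein and nucleic acid sequence analysis.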

14.
M protein is considered a virulence determinant on the streptococcal cell wall by virtue of its ability to allow the organism to resist attack by human neutrophils. The complete DNA sequence of the M6 gene from streptococcal strain D471 has allowed, for the first time, the study of the structural characteristics of the amino acid sequence of an entire M protein molecule. Predictive secondary structural analysis revealed that the majority of this fibrillar molecule exhibits strong alpha-helical potential and that, except for the ends, nonpolar residues in the central region of the molecule exhibit the 7-residue periodicity typical for coiled-coil proteins. Differences in this heptad pattern of nonpolar residues allow this central rod region to be divided into three subdomains which correlate essentially with the repeat regions A, B, and C/D in the M6 protein sequence. Alignment of the N-terminal half of the M6 sequence with PepM5, the N-terminal half of the M5 protein, revealed that 42% of the amino acids were identical. The majority of the identities were "core" nonpolar residues of the heptad periodicity which are necessary for the maintenance of the coiled coil. Thus, conservation of structure in a sequence-variable region of these molecules may be biologically significant. Results suggest that serologically different M proteins may be built according to a basic scheme: an extended central coiled-coil rod domain (which may vary in size among strains) flanked by functional end domains.

15.
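Encoding of protein sequences is one of the key techniques in predicting subcellular localization. This paper describes the existing protein sequence encoding algorithms in some detail and points out several problems in sequence encoding, along with possible directions for future development.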

16.
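Definition and recognition of protein structural classes   Total citations: 4, self-citations: 1, citations by others: 4
We propose the concept of a compact domain: the largest folding unit, tightly packed in space, formed by one or several consecutive segments of α-helices and β-strands in the secondary structure sequence. Three kinds of compact domain (α domain, β domain, and α/β domain) are used to define five structural classes of globular proteins: α proteins, β proteins, α/β proteins, multi-domain proteins, and ζ proteins. We classified 1,261 representative proteins (1,022 families) and compared the result with the classification in the SCOP database. An analysis with sequence redundancy removed was also carried out. On this basis we propose a prediction scheme for structural classes, with a success rate of 82% to 85%.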

17.
    
Wang J, Feng JA. Proteins, 2005, 58(3): 628-637
Sequence alignment has become one of the essential bioinformatics tools in biomedical research. Existing sequence alignment methods can produce reliable alignments for homologous proteins sharing a high percentage of sequence identity. The performance of these methods deteriorates sharply for the sequence pairs sharing less than 25% sequence identity. We report here a new method, NdPASA, for pairwise sequence alignment. This method employs neighbor-dependent propensities of amino acids as a unique parameter for alignment. The values of neighbor-dependent propensity measure the preference of an amino acid pair adopting a particular secondary structure conformation. NdPASA optimizes alignment by evaluating the likelihood of a residue pair in the query sequence matching against a corresponding residue pair adopting a particular secondary structure in the template sequence. Using superpositions of homologous proteins derived from the PSI-BLAST analysis and the Structural Classification of Proteins (SCOP) classification of a nonredundant Protein Data Bank (PDB) database as a gold standard, we show that NdPASA has improved pairwise alignment. Statistical analyses of the performance of NdPASA indicate that the introduction of sequence patterns of secondary structure derived from neighbor-dependent sequence analysis clearly improves alignment performance for sequence pairs sharing less than 20% sequence identity. For sequence pairs sharing 13-21% sequence identity, NdPASA improves the accuracy of alignment over the conventional global alignment (GA) algorithm using the BLOSUM62 by an average of 8.6%. NdPASA is most effective for aligning query sequences with template sequences whose structure is known. NdPASA can be accessed online at http://astro.temple.edu/feng/Servers/BioinformaticServers.htm.

18.
19.
Complementary (c)DNA coding for an insect yolk protein, the egg-specific protein of the silkworm Bombyx mori, was cloned and the nucleotide sequence determined. The sequence covers the entire coding region of 1,677 base pairs with 5′ and 3′ noncoding regions (21 and 115 base pairs, respectively). The deduced amino acid sequence of the egg-specific protein consists of 559 amino acid residues. The NH2-terminal 18 amino acid sequence is enriched in hydrophobic amino acids and assumed to be a signal peptide. A sequence, Asn-X-Thr, a potential N-linked glycosylation site, is found at positions 191 to 193. A serine-rich domain, in which phosphorylation takes place, is localized in the region from 63 to 90. A Cys-His motif at positions 405 to 415 is analogous to a proposed metal binding sequence. Lys132-Asn133 and Arg228-Asp229 are probably the sites cleaved by the egg-specific protein protease that appears during embryogenesis. The derived amino acid sequence has no appreciable homology to other sequenced proteins.

20.
Ion pairs contribute to several functions including the activity of catalytic triads, fusion of viral membranes, stability in thermophilic proteins and solvent-protein interactions. Furthermore, they have the ability to affect the stability of protein structures and are also a part of the forces that act to hold monomers together. This paper deals with the possible ion pair combinations and networks in 25% and 90% non-redundant protein chains. Different types of ion pairs present in various secondary structural elements are analysed. The ion pairs existing between different subunits of multisubunit protein structures are also computed and the results of various analyses are presented in detail. The protein structures used in the analysis were solved by X-ray crystallography at a resolution of 1.5 Å or better and with an R-factor of 20% or better. This study can, therefore, be useful for analyses of many protein functions. It also provides insight into the architecture of protein structures.
