Secondary structure prediction for aligned RNA sequences   Total citations: 19 (self-citations: 0; citations by others: 19)
Most functional RNA molecules have characteristic secondary structures that are highly conserved in evolution. Here we present a method for computing the consensus structure of a set of aligned RNA sequences, taking into account both thermodynamic stability and sequence covariation. Comparison with phylogenetic structures of rRNAs shows that a reliability of prediction of more than 80% is achieved for only five related sequences. As an application we show that the Early Noduline mRNA contains significant secondary structure that is supported by sequence covariation.
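
As a rough illustration of the sequence-covariation idea in this abstract, the sketch below scores one pair of alignment columns by how many sequences can form a base pair there and how many distinct pair types occur. It is a minimal sketch under stated assumptions, not the method of the paper: the function name `column_pair_support`, the toy alignment, and the choice to count G-U wobble pairs as compatible are all illustrative, and no thermodynamic term is included.

```python
# Pairs treated as compatible in this sketch (Watson-Crick plus G-U wobble).
# This set is an assumption for illustration.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def column_pair_support(alignment, i, j):
    """Score alignment columns i and j (0-based).

    Returns a tuple:
      - fraction of sequences whose residues at i and j can pair
      - number of distinct pair types observed (more than one type
        means compensatory changes, i.e. covariation)
    """
    observed = [(seq[i], seq[j]) for seq in alignment]
    compatible = [pair for pair in observed if pair in PAIRS]
    return len(compatible) / len(alignment), len(set(compatible))


# Hypothetical three-sequence alignment of a small hairpin.
alignment = [
    "GGCAAAGCC",
    "GACAAAGUC",
    "GGUAAAACC",
]

print(column_pair_support(alignment, 1, 7))  # G-C, A-U, G-C
print(column_pair_support(alignment, 0, 8))  # G-C in every sequence
```

Columns that pair in every sequence while changing pair type (as columns 1 and 7 do here) are the kind of evidence that covariation-based methods reward; columns that pair without any variation are consistent with a stem but carry no covariation signal.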

We present heuristic-based predictions of the secondary and tertiary structures of the cyclins A, B, and D, representatives of the cyclin superfamily. The list of suggested constraints for tertiary structure assembly was left unrefined in order to submit this report before an announced crystal structure for cyclin A becomes available. To predict these constraints, a master sequence alignment over 270 positions of cyclin types A, B, and D was adjusted based on individual secondary structure predictions for each type. We used new heuristics for predicting aromatic residues at protein-protein interfaces and to identify sequentially distinct regions in the protein chain that cluster in the folded structure. The boundaries of two conjectured domains in the cyclin fold were predicted based on experimental data in the literature. The domain that is important for interaction of the cyclins with cyclin-dependent kinases (CDKs) is predicted to contain six helices; the second domain in the consensus model contains both helices and a β-sheet that is formed by sequentially distant regions in the protein chain. A plausible phosphorylation site is identified. This work represents a blinded test of the method for prediction of secondary and, to a lesser extent, tertiary structure from a set of homologous protein sequences. Evaluation of our predictions will become possible with the publication of the announced crystal structure.

Computational tools for prediction of the secondary structure of two or more interacting nucleic acid molecules are useful for understanding mechanisms for ribozyme function, determining the affinity of an oligonucleotide primer to its target, and designing good antisense oligonucleotides, novel ribozymes, DNA code words, or nanostructures. Here, we introduce new algorithms for prediction of the minimum free energy pseudoknot-free secondary structure of two or more nucleic acid molecules, and for prediction of alternative low-energy (sub-optimal) secondary structures for two nucleic acid molecules. We provide a comprehensive analysis of our predictions against secondary structures of interacting RNA molecules drawn from the literature. Analysis of our tools on 17 sequences of up to 200 nucleotides that do not form pseudoknots shows that they have 79% accuracy, on average, for the minimum free energy predictions. When the best of 100 sub-optimal foldings is taken, the average accuracy increases to 91%. The accuracy decreases as the sequences increase in length and as the number of pseudoknots and tertiary interactions increases. Our algorithms extend the free energy minimization algorithm of Zuker and Stiegler for secondary structure prediction, and the sub-optimal folding algorithm by Wuchty et al. Implementations of our algorithms are freely available in the package MultiRNAFold.
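
The algorithms in this abstract extend the free energy minimization of Zuker and Stiegler. As a much simpler relative of that dynamic program, the sketch below uses Nussinov-style base-pair maximization on a single strand. This is a deliberately swapped-in technique: it counts nested pairs instead of minimizing free energy, and it handles neither two interacting molecules nor sub-optimal structures. The function name `max_base_pairs` and the `min_loop` parameter are illustrative assumptions and are not part of MultiRNAFold.

```python
# Pairs allowed in this sketch (Watson-Crick plus G-U wobble); an assumption.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def max_base_pairs(seq, min_loop=3):
    """Maximum number of nested (pseudoknot-free) base pairs in seq.

    best[i][j] holds the optimum for the subsequence seq[i..j].
    A pair (k, j) is allowed only if at least min_loop unpaired
    positions lie between k and j.
    """
    n = len(seq)
    if n == 0:
        return 0
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]  # case 1: position j stays unpaired
            for k in range(i, j - min_loop):  # case 2: j pairs with k
                if (seq[k], seq[j]) in PAIRS:
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    return best[0][n - 1]


print(max_base_pairs("GGGAAACCC"))  # hairpin with a three-base loop
```

The recursion has the same shape as energy-based folding (either the last base is unpaired, or it pairs and splits the problem in two); energy-based methods replace the "+1 per pair" score with loop and stacking free energies.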

The binding of the intermediate proteins φ1 and φ3 from the mussel Mytilus edulis to DNA was studied in comparison with the typical protamine from the squid Loligo vulgaris using precipitation curves, thermal denaturation and X-ray diffraction techniques. The properties of protein φ1 appear to be very close to those of typical protamines while the properties of protein φ3 are notably different. The method of reconstitution influences the structural properties of the complexes. This effect is most pronounced in the case of protein φ3. The structural heterogeneity of the protein component in the complexes is discussed in the light of these observations.

Chromatin structure and dynamics: functional implications   Total citations: 4 (self-citations: 0; citations by others: 4)

The physicochemical mechanism of protein folding has been elucidated by the island model, describing a growth type of folding. The folding pathway is closely related to nucleation on the polypeptide chain, and thus the formation of small local structures or secondary structures at the earliest stage of folding is essential to all following steps. The island model is applicable to any protein, but a high precision of secondary structure prediction is indispensable to folding simulation. The secondary structures formed at the earliest stage of folding are supposed to be of standard form, but they are usually deformed during the folding process, especially at the last stage, although the degree of deformation is different for each protein. Ferredoxin is an example of a protein having this property. According to X-ray investigation (1FDX), ferredoxin is not supposed to have secondary structures. However, if we assumed that in ferredoxin all the residues are in a coil state, we could not attain the correct structure similar to the native one. Further, we found that some parts of the chain are not flexible, suggesting the presence of secondary structures, in agreement with the recent PDB data (1DUR). Assuming standard secondary structures (α-helices and β-strands) at the nonflexible parts at the early stage of folding, and deforming these at the final stage, a structure similar to the native one was obtained. Another peculiarity of ferredoxin is the absence of disulfide bonds, in spite of its having eight cysteines. The reason cysteines do not form disulfide bonds became clear by applying the lampshade criterion, but more importantly, the two groups of cysteines are each ready to form iron complexes at a rather late stage of folding. The reason for poor prediction accuracy of secondary structure with conventional methods is discussed.

The Chou-Fasman predictive algorithm for determining the secondary structure of proteins from the primary sequence is reviewed. Many examples of its use are presented which illustrate its wide applicability, such as predicting (a) regions with the potential for conformational change, (b) sequences which are capable of assuming several conformations in different environments, (c) effects of single amino acid mutations, (d) amino acid replacements in synthesis of peptides to bring about a change in conformation, (e) guide to the synthesis of polypeptides with definitive secondary structure, e.g. signal sequences, (f) conformational homologues from varying sequences and (g) the amino acid requirements for amphiphilic α-helical peptides.

Tom Defay  Fred E. Cohen 《Proteins》1995,23(3):431-445
The results of a protein structure prediction contest are reviewed. Twelve different groups entered predictions on 14 proteins of known sequence whose structures had been determined but not yet disseminated to the scientific community. Thus, these represent true tests of the current state of structure prediction methodologies. From this work, it is clear that accurate tertiary structure prediction is not yet possible. However, protein fold and motif prediction are possible when the motif is recognizably similar to another known structure. Internal symmetry and the information inherent in an aligned family of homologous sequences facilitate predictive efforts. Novel folds remain a major challenge for prediction efforts. © 1995 Wiley-Liss, Inc.

Fang X  Luo Z  Yuan B  Wang J 《Bioinformation》2007,2(5):222-229
The prediction of RNA secondary structure can be facilitated by incorporating comparative analysis of homologous sequences. However, most existing comparative methods are vulnerable to alignment errors and thus have low accuracy in practical applications. Here we improve the prediction of RNA secondary structure by detecting and assessing conserved stems shared by all sequences in the alignment. Our method can be summarized as follows: 1) we detect possible stems in a single RNA sequence using the so-called position matrix, with which some possibly paired positions can be uncovered; 2) we detect conserved stems across multiple RNA sequences by multiplying the position matrices; 3) we assess the conserved stems using a signal-to-noise measure; 4) we compute the optimized secondary structure by combining the so-called reliable conserved stems with predictions from the RNAalifold program. We tested our method on data sets of RNA alignments with known secondary structures. The accuracy of our method, measured as sensitivity and specificity, is greater than that of predictions by RNAalifold.
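
Steps 1 and 2 of this abstract can be sketched as follows. This is a minimal sketch under loud assumptions: the abstract does not define the position matrix, so here it is taken to be a binary matrix marking which positions of one aligned sequence could pair, and "multiplying" is taken to mean an element-wise product across sequences. The names `position_matrix`, `conserved_matrix`, and `stems` are hypothetical, and the signal-to-noise assessment and the RNAalifold step are not shown.

```python
# Pairs allowed in this sketch (Watson-Crick plus G-U wobble); an assumption.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def position_matrix(seq):
    """Binary matrix: entry [i][j] is 1 if i < j and seq[i] can pair with seq[j]."""
    n = len(seq)
    return [[1 if i < j and (seq[i], seq[j]) in PAIRS else 0
             for j in range(n)] for i in range(n)]


def conserved_matrix(alignment):
    """Element-wise product of the position matrices of all aligned sequences."""
    n = len(alignment[0])
    product = [[1] * n for _ in range(n)]
    for seq in alignment:
        matrix = position_matrix(seq)
        for i in range(n):
            for j in range(n):
                product[i][j] *= matrix[i][j]
    return product


def stems(matrix, min_len=3):
    """List (i, j, length) for runs of stacked pairs (i, j), (i+1, j-1), ..."""
    n = len(matrix)
    found = []
    for i in range(n):
        for j in range(n):
            if not matrix[i][j]:
                continue
            # Skip cells that continue a run started further out.
            if i > 0 and j + 1 < n and matrix[i - 1][j + 1]:
                continue
            length = 0
            while i + length < j - length and matrix[i + length][j - length]:
                length += 1
            if length >= min_len:
                found.append((i, j, length))
    return found


# Hypothetical three-sequence alignment of a small hairpin.
alignment = ["GGCAAAGCC", "GACAAAGUC", "GGUAAAACC"]
print(stems(conserved_matrix(alignment)))
```

Because the product is 1 only where every sequence can pair, isolated chance pairings in a single sequence are filtered out, and only stems supported by the whole alignment survive.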

Using a test set of 13 small, compact proteins, we demonstrate that a remarkably simple protocol can capture native topology from secondary structure information alone, in the absence of long-range interactions. It has been a long-standing open question whether such information is sufficient to determine a protein's fold. Indeed, even the far simpler problem of reconstructing the three-dimensional structure of a protein from its exact backbone torsion angles has remained a difficult challenge owing to the small, but cumulative, deviations from ideality in backbone planarity, which, if ignored, cause large errors in structure. As a familiar example, a small change in an elbow angle causes a large displacement at the end of your arm; the longer the arm, the larger the displacement. Here, correct secondary structure assignments (alpha-helix, beta-strand, beta-turn, polyproline II, coil) were used to constrain polypeptide backbone chains devoid of side chains, and the most stable folded conformations were determined, using Monte Carlo simulation. Just three terms were used to assess stability: molecular compaction, steric exclusion, and hydrogen bonding. For nine of the 13 proteins, this protocol restricts the main chain to a surprisingly small number of energetically favorable topologies, with the native one prominent among them.

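Application of neural networks to protein secondary structure prediction   Total citations: 3 (self-citations: 0; citations by others: 3)
This paper introduces the significance of research on protein secondary structure prediction, discusses neural network design issues in protein secondary structure prediction, reviews in some detail the main progress made in recent years in applying neural network methods to protein secondary structure prediction, and looks ahead to the prospects for protein structure prediction.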

Consideration has been given to possible sequences of nucleosomes which can produce a ‘thick fibre’-like structure. Only a few basic requirements were imposed: (i) the thick fibre is a regular single helix with about 7 nucleosomes per turn; (ii) the nucleosomes are equidistant along the polynucleosome chain; (iii) the helix is flexible, having variable pitch. It was found that in addition to the straightforward sequential arrangement there is only one other nonsequential arrangement which satisfies these requirements. This is a helix with around 8 nucleosomes per turn in which all nucleosomes are identically placed. It is possible for DNA repeat lengths in the region of 200 to 218 ± 10 base pairs (b.p.). The linker DNA is straight or almost straight and crosses the internal ‘hollow’ cylinder which is not occupied by nucleosomes. This structure satisfies the experimental data for the distance distribution function, and the observed mass per unit length and changes noted in the mass per unit length. Further, if it is assumed that the core particle axis of symmetry is in the plane of the two linkers and bisects them, then this makes the core particles oblique to the thick fibre radii with alternate angles of ± 20 to 30°. This orientation of the nucleosomes can explain the DNA digestion patterns obtained with DNase II and with DNase I.

Protein S (PS) and growth arrest specific factor 6 (GAS6) are vitamin K-dependent proteins with similar structures. They are mosaic proteins possessing a carboxyl-terminal region presenting sequence similarity with plasma sex hormone binding globulin (plasma SHBG), although apparently not involved in steroid binding. The SHBG-like modules have sequence similarity with the G repeats of the chain A of laminin. Laminin G repeats have been reported to contain mainly β-strands (about 40–50%) but no or little α structure by circular dichroism (CD) spectroscopy. Secondary structure predictions carried out in the present work unexpectedly showed a 20 to 27% helix content in the SHBG region of PS/GAS6 (about 100 residues), while plasma SHBG and laminin G repeats had around 10% helices. CD measurements for human PS indicated also that its SHBG region had about 100 residues in α-helical structure. These data suggest that the SHBG region of PS/GAS6 on the one hand, and the laminin G repeats and possibly plasma SHBG on the other hand, could present important structural differences. Previously reported polymorphisms and point mutations leading to PS deficiency and thrombophilia have been analyzed with our structural predictions. We found a good agreement between these structural predictions, CD measurements, experimental and clinical data. This information allows us to gain insights into the three-dimensional structure of PS that will be helpful for the design of new experiments and future clinical investigations. Proteins 29:478–491, 1997. © 1997 Wiley-Liss, Inc.

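The accuracy of protein secondary structure prediction currently hovers around 75% and has proved difficult to improve further. This paper analyzes a redundant protein database using statistical methods and shows that the real factor now limiting further gains in prediction accuracy is systematic error in the protein database itself, which amounts to roughly 25%. This error arises from objective constraints of the experimental conditions.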

A novel method for predicting the secondary structures of proteins from amino acid sequence has been presented. The protein secondary structure seqlets that are analogous to the words in natural language have been extracted. These seqlets will capture the relationship between amino acid sequence and the secondary structures of proteins and further form the protein secondary structure dictionary. More specifically, the dictionary is organism-specific. Protein secondary structure prediction is formulated as an integrated word segmentation and part of speech tagging problem. The word-lattice is used to represent the results of the word segmentation and the maximum entropy model is used to calculate the probability of a seqlet tagged as a certain secondary structure type. The method is Markovian in the seqlets, permitting efficient exact calculation of the posterior probability distribution over all possible word segmentations and their tags by the Viterbi algorithm. The optimal segmentations and their tags are computed as the results of protein secondary structure prediction. The method is applied to predict the secondary structures of proteins of four organisms respectively and compared with the PHD method. The results show that the performance of this method is higher than that of PHD by about 3.9% Q3 accuracy and 4.6% SOV accuracy. Combining the method with locally similar protein sequences obtained by BLAST can give better prediction. The method is also tested on the 50 CASP5 target proteins with Q3 accuracy 78.9% and SOV accuracy 77.1%. A web server for protein secondary structure prediction has been constructed which is available at http://www.insun.hit.edu.cn:81/demos/biology/index.html.

1 Introduction The prediction of protein structure and function from amino acid sequences is one of the most important problems in molecular biology. This problem is becoming more pressing as the number of known protein sequences is explored as a result of genome and other sequencing projects, and the protein sequence-structure gap is widening rapidly[1]. Therefore, computational tools to predict protein structures are needed to narrow the widening gap. Although the prediction of three dim…

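Protein secondary structure prediction is an important part of protein structure research. As large numbers of new prediction methods are proposed, new protein secondary structure prediction servers also continue to appear. This study selected seven currently popular protein secondary structure prediction servers (PSRSM, SPOT-1D, MUFOLD, Spider3, RaptorX, Psipred, and Jpred4), describes how to use them, and evaluates their prediction performance. A test set of 180 proteins released by the PDB between August and November 2018 was chosen at random, and seven measures were used for evaluation: Q3, Sov, boundary recognition rate, internal recognition rate, turn/coil (C) recognition rate, strand (E) recognition rate, and helix (H) recognition rate. The Q3 results of these servers on the 180 test proteins were 89.96%, 88.18%, 86.74%, 85.77%, 83.61%, 79.72%, and 78.29%, respectively. The results show that PSRSM gives the best predictions. When the 180-protein test set was classified by sequence identity of 30%, 40%, and 70%, the Q3 results of PSRSM were 89.49%, 90.53%, and 89.87%, respectively, better than those of the other servers in every case. The results suggest that protein secondary structure prediction can be advanced further by combining multiple deep learning methods and by training models on large data sets.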
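
For readers unfamiliar with the measures named in this abstract, the sketch below computes Q3 (the percentage of residues whose three-state label, H, E, or C, is predicted correctly) and a per-state recognition rate. It is a minimal sketch: the function names `q3` and `state_recognition_rate` and the example strings are illustrative assumptions, and the Sov and boundary/internal measures used in the study are not shown.

```python
def q3(predicted, observed):
    """Percentage of residues whose three-state label is predicted correctly."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("sequences must not be empty")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)


def state_recognition_rate(predicted, observed, state):
    """Percentage of residues observed in `state` that are also predicted as `state`.

    Returns None when the state does not occur in the observed string.
    """
    total = observed.count(state)
    if total == 0:
        return None
    hits = sum(1 for p, o in zip(predicted, observed) if o == state and p == state)
    return 100.0 * hits / total


# Hypothetical ten-residue example with two mispredicted residues.
observed = "HHHHEEEECC"
predicted = "HHHCEEECCC"
print(q3(predicted, observed))
print(state_recognition_rate(predicted, observed, "H"))
```

Q3 weights every residue equally, so a predictor can score well overall while doing poorly on a rare state; reporting the per-state rates alongside Q3, as the study does, exposes that.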

We describe a new method for polyproline II-type (PPII) secondary structure prediction based on tetrapeptide conformation properties using data obtained from all globular proteins in the Protein Data Bank (PDB). This is the first method for PPII prediction with a relatively high level of accuracy (approximately 60%). Our method uses only frequencies of different conformations among oligopeptides without any additional parameters. We also attempted to predict alpha-helices and beta-strands using the same approach. We find that the application of our method reveals interrelation between sequence and structure even for very short oligopeptides (tetrapeptides).

Electrophoretic analyses of acid extracts from mature sperm of newt, Cynops pyrrhogaster, on acid/urea/Triton X-100 polyacrylamide gel showed the exclusive occurrence of sperm-specific nuclear basic proteins (SBPs), which moved faster than somatic histones on the gel. These SBPs were eluted separately by reversed-phase high-performance liquid chromatography as two large peaks and a few small peaks. Of these, only the small peaks disappeared with treatment of the acid extracts with alkaline phosphatase before they were injected into the column, so that there were only two distinct components: NP1 and NP2. Determination of amino acid sequences by the Edman method as well as by sequencing of cDNA for both components indicated that each protein consisted of 43 (NP1) or 48 (NP2) amino acid residues, rich in arginine residues (53.5% in NP1; 47.9% in NP2), forming the clusters. They had molecular masses of 5,386 Da (NP1) and 5,748 Da (NP2), respectively. Northern blot analysis using cDNAs as probes indicated that mRNAs for both NP1 and NP2 occurred not in primary spermatocytes but in round spermatids. In situ hybridization analyses using antisense RNA for NP1 as a probe clearly showed the first appearance of NP1 mRNA at the late stage of round spermatid. Mol. Reprod. Dev. 46:243–251, 1997. © 1997 Wiley-Liss, Inc.
