The prediction of RNA structure is useful for understanding evolution in both in silico and in vitro studies. Physical methods such as NMR for determining RNA secondary structure are expensive and difficult, whereas computational prediction is easier. Comparative sequence analysis provides the best solution, but predicting the secondary structure of a single RNA sequence remains challenging. RNA-SSPT is a tool that computationally predicts the secondary structure of a single RNA sequence. Most RNA secondary structure prediction tools either do not allow pseudoknots in the structure or are unable to locate them. RNA-SSPT implements the Nussinov dynamic programming algorithm. Current studies show that only the energetically most favorable secondary structure is required, so a modification of the algorithm is also available that produces base pairs to lower the total free energy of the secondary structure. For visualization of RNA secondary structure, NAVIEW, originally written in C, was ported to C# and modified to meet the tool's requirements. RNA-SSPT is built in C# using .NET 2.0 in Microsoft Visual Studio 2005 Professional Edition. The accuracy of RNA-SSPT is tested in terms of sensitivity and positive predictive value. The tool thus serves both secondary structure prediction and secondary structure visualization.
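
As an illustration of the Nussinov dynamic programming idea named in the abstract above, the following is a minimal Python sketch of base-pair maximization with traceback. It is not the RNA-SSPT implementation (which is in C#) and does not include the free-energy modification; the function name `nussinov`, the `min_loop` parameter, and the decision to allow G-U wobble pairs are assumptions made for this example.

```python
# Hypothetical sketch of Nussinov base-pair maximization (not RNA-SSPT's code).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def nussinov(seq, min_loop=3):
    """Return (max_pairs, dot_bracket) for one RNA sequence.

    min_loop is the minimum number of unpaired bases inside a hairpin
    (an assumed default of 3).
    """
    n = len(seq)
    if n == 0:
        return 0, ""
    # dp[i][j] = maximum number of nested base pairs in seq[i..j]
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # case 1: position j is unpaired
            for k in range(i, j - min_loop):  # case 2: j pairs with k
                if (seq[k], seq[j]) in PAIRS:
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best

    # Traceback: recover one optimal structure in dot-bracket notation.
    structure = ["."] * n
    stack = [(0, n - 1)]
    while stack:
        i, j = stack.pop()
        if j - i <= min_loop:
            continue
        if dp[i][j] == dp[i][j - 1]:
            stack.append((i, j - 1))
            continue
        for k in range(i, j - min_loop):
            if (seq[k], seq[j]) in PAIRS:
                left = dp[i][k - 1] if k > i else 0
                if left + 1 + dp[k + 1][j - 1] == dp[i][j]:
                    structure[k], structure[j] = "(", ")"
                    if k > i:
                        stack.append((i, k - 1))
                    stack.append((k + 1, j - 1))
                    break
    return dp[0][n - 1], "".join(structure)


if __name__ == "__main__":
    print(nussinov("GGGAAAUCC"))
```

The recurrence only produces nested pairs, so a sketch like this cannot represent pseudoknots; it runs in cubic time and quadratic memory in the sequence length.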

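Application of stochastic grammar models to RNA secondary structure prediction   Cited by: 1 (self-citations: 0; by others: 1)
Research on RNA secondary structure is an important topic in computational molecular biology. Stochastic grammar models based on comparative sequence analysis predict RNA secondary structure with high accuracy and can model pseudoknots, but they are difficult to implement. By analyzing how stochastic grammars model RNA secondary structure, this paper proposes a scheme that combines the comparative sequence method, the stochastic grammar method, and the term (词条) method to predict RNA secondary structure.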

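Construction of an RNA secondary structure prediction system   Cited by: 9 (self-citations: 0; by others: 9)
An RNA secondary structure prediction system, Rnafold, was built from the following prediction algorithms: maximum base pairing, Zuker's free energy minimization, optimal stacking of helical regions, random stacking of helical regions, and the all-possible-combinations method, together with an RNA secondary structure drawing technique based on first-order helical regions. In addition, 20 randomly selected tRNA sequences were used to compare the first four prediction algorithms in terms of free energy and cloverleaf structure, and the t-test was used to analyze the statistical differences in free energy. Judged by cloverleaf structure, the random stacking method performed best, followed by optimal stacking of helical regions and the Zuker algorithm, with maximum base pairing performing worst. Finally, the differences between the two free energy minimization methods were analyzed.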

Chemical and enzymatic footprinting experiments, such as SHAPE (selective 2′-hydroxyl acylation analyzed by primer extension), yield important information about RNA secondary structure. Indeed, since the 2′-hydroxyl is reactive at flexible (loop) regions but unreactive at base-paired regions, SHAPE yields quantitative data about which RNA nucleotides are base-paired. Recently, low error rates in secondary structure prediction have been reported for three RNAs of moderate size, by including base stacking pseudo-energy terms derived from SHAPE data in the computation of the minimum free energy secondary structure. Here, we describe a novel method, RNAsc (RNA soft constraints), which includes pseudo-energy terms for each nucleotide position, rather than only for base stacking positions. We prove that RNAsc is self-consistent, in the sense that the nucleotide-specific probabilities of being unpaired in the low energy Boltzmann ensemble always become more closely correlated with the input SHAPE data after application of RNAsc. From this mathematical perspective, the secondary structure predicted by RNAsc should be ‘correct’, inasmuch as the SHAPE data are ‘correct’. We benchmark RNAsc against the previously mentioned method for eight RNAs, for which both SHAPE data and native structures are known, and find the same accuracy in 7 out of 8 cases and an improvement of 25% in one case. Furthermore, we present what appears to be the first direct comparison of SHAPE data and in-line probing data, by comparing yeast asp-tRNA SHAPE data from the literature with data from in-line probing experiments we have recently performed. With respect to several criteria, we find that SHAPE data appear to be more robust than in-line probing data, at least in the case of asp-tRNA.

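RNA molecules are numerous, structurally complex, and functionally important, and they have become one of the major current research hotspots. RNA function is closely tied to structure. As RNA molecules and their functions have been discovered, databases of RNA secondary structure have been established, which help both in understanding the structural basis of RNA function and in developing predictive models of RNA structure. This paper gives an overview of the RNA secondary structure databases in common use in recent years, in the hope of helping researchers better understand and apply the relevant data.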

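RNA secondary structure prediction is a classic problem in bioinformatics with a history of more than 30 years, and optimization algorithms based on the minimum free energy (MFE) model are the most widely used approach. However, the presence of pseudoknots in RNA structures makes the MFE problem NP-hard in theory, so even optimization algorithms such as dynamic programming face high time complexity. Studies have also found that, owing to the kinetics of RNA folding and to environmental factors, real RNA secondary structures are often not in the minimum free energy state. Based on the characteristics of RNA folding, a heuristic search algorithm is proposed for predicting RNA secondary structures with pseudoknots. Taking RNA stems as the basic unit, the algorithm uses a heuristic search strategy to search the combinatorial space of stems for the secondary structure with the lowest free energy and the highest frequency of occurrence. The algorithm not only significantly reduces the time complexity of searching for RNA secondary structures but also helps compensate for the shortcomings of predicting RNA secondary structure from energy alone. Tests on standard datasets covering several types of RNA show that the algorithm is more accurate than several well-known RNA secondary structure prediction algorithms and runs efficiently.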

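Comparative sequence analysis is the most reliable approach to RNA secondary structure prediction and has given rise to many algorithms. Here, structure prediction based on this approach is treated as a binary classification problem: given the information available from a sequence alignment, decide whether any two columns of the alignment can form a base pair. The classifier is a support vector machine, and the feature vector comprises covariation information, thermodynamic information, and the fraction of complementary bases. Because covariation information depends on sequence similarity, a sequence-similarity weighting factor is introduced to adjust the influence of covariation and thermodynamic information on the prediction at different levels of sequence similarity, which improves prediction accuracy. Validation on 49 Rfam-seed alignments demonstrates the effectiveness of the method: its prediction accuracy exceeds that of most comparable algorithms, and it can predict simple pseudoknots.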

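A minimum free energy algorithm for RNA secondary structure   Cited by: 1 (self-citations: 0; by others: 1)
RNA (i.e., tRNA, rRNA, mRNA, and snRNA) has two main functions: it is the genetic material of some viruses, and it participates in protein synthesis. These functions bear importantly on cell differentiation, metabolism, memory storage, and so on, and they are closely related to the stability (free energy) of the RNA secondary structure. Common methods for computing free energy include thermodynamic perturbation and thermodynamic integration. With the aim of finding the minimum free energy secondary structure, this paper presents a minimum free energy algorithm for RNA secondary structure whose time complexity does not exceed O(n^4).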

Previously proposed methods for protein secondary structure prediction from multiple sequence alignments do not efficiently extract the evolutionary information that these alignments contain. The predictions of these methods are less accurate than they could be, because of their failure to consider explicitly the phylogenetic tree that relates aligned protein sequences. As an alternative, we present a hidden Markov model approach to secondary structure prediction that more fully uses the evolutionary information contained in protein sequence alignments. A representative example is presented, and three experiments are performed that illustrate how the appropriate representation of evolutionary relatedness can improve inferences. We explain why similar improvement can be expected in other secondary structure prediction methods and indeed any comparative sequence analysis method.

Sequence conservation and co-variation of base pairs are hallmarks of structured RNAs. For certain RNAs (e.g. riboswitches), a single sequence must adopt at least two alternative secondary structures to effectively regulate the message. If alternative secondary structures are important to the function of an RNA, we expect to observe evolutionary co-variation supporting multiple conformations. We set out to characterize the evolutionary co-variation supporting alternative conformations in riboswitches to determine the extent to which alternative secondary structures are conserved. We found strong co-variation support for the terminator, P1, and anti-terminator stems in the purine riboswitch by extending alignments to include terminator sequences. When we performed Boltzmann suboptimal sampling on purine riboswitch sequences with terminators we found that these sequences appear to have evolved to favor specific alternative conformations. We extended our analysis of co-variation to classic alignments of group I/II introns, tRNA, and other classes of riboswitches. In a majority of these RNAs, we found evolutionary evidence for alternative conformations that are compatible with the Boltzmann suboptimal ensemble. Our analyses suggest that alternative conformations are selected for and thus likely play functional roles in even the most structured of RNAs.
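
One common way to quantify co-variation between two alignment columns is their mutual information. The abstract above does not say which co-variation measure its authors used, so the sketch below is only an illustration under that assumption; the function name `mutual_information` and the toy alignment are hypothetical, and gap characters are treated as ordinary symbols for simplicity.

```python
from collections import Counter
from math import log2


def mutual_information(alignment, i, j):
    """Mutual information (in bits) between columns i and j of an alignment.

    alignment is a list of equal-length aligned sequences (hypothetical input).
    """
    col_i = [seq[i] for seq in alignment]
    col_j = [seq[j] for seq in alignment]
    n = len(alignment)
    freq_i = Counter(col_i)               # single-column counts
    freq_j = Counter(col_j)
    freq_ij = Counter(zip(col_i, col_j))  # joint counts of column pairs
    mi = 0.0
    for (a, b), count in freq_ij.items():
        p_ab = count / n
        mi += p_ab * log2(p_ab / ((freq_i[a] / n) * (freq_j[b] / n)))
    return mi


if __name__ == "__main__":
    # Toy alignment: columns 0 and 4 always form a complementary pair.
    toy = ["GAAAC", "CAAAG", "AAAAU", "UAAAA"]
    print(mutual_information(toy, 0, 4))  # co-varying columns
    print(mutual_information(toy, 1, 2))  # fully conserved columns
```

Perfectly conserved columns score zero under this measure even when they are paired, which is one reason co-variation is usually combined with other evidence such as thermodynamic stability.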

Genetically diverse pathogens (such as Human Immunodeficiency virus type 1, HIV-1) are frequently stratified into phylogenetically or immunologically defined subtypes for classification purposes. Computational identification of such subtypes is helpful in surveillance, epidemiological analysis and detection of novel variants, e.g., circulating recombinant forms in HIV-1. A number of conceptually and technically different techniques have been proposed for determining the subtype of a query sequence, but there is not a universally optimal approach. We present a model-based phylogenetic method for automatically subtyping an HIV-1 (or other viral or bacterial) sequence, mapping the location of breakpoints and assigning parental sequences in recombinant strains as well as computing confidence levels for the inferred quantities. Our Subtype Classification Using Evolutionary ALgorithms (SCUEAL) procedure is shown to perform very well in a variety of simulation scenarios, runs in parallel when multiple sequences are being screened, and matches or exceeds the performance of existing approaches on typical empirical cases. We applied SCUEAL to all available polymerase (pol) sequences from two large databases, the Stanford Drug Resistance database and the UK HIV Drug Resistance Database. Comparing with subtypes which had previously been assigned revealed that a minor but substantial (≈5%) fraction of pure subtype sequences may in fact be within- or inter-subtype recombinants. A free implementation of SCUEAL is provided as a module for the HyPhy package and the Datamonkey web server. Our method is especially useful when an accurate automatic classification of an unknown strain is desired, and is positioned to complement and extend faster but less accurate methods. 
Given the increasingly frequent use of HIV subtype information in studies focusing on the effect of subtype on treatment, clinical outcome, pathogenicity and vaccine design, the importance of accurate, robust and extensible subtyping procedures is clear.

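石鸥燕, 杨晶, 杨惠云, 田心. 《现代生物医学进展》 (Progress in Modern Biomedicine), 2007, 7(11): 1723-1724, 1706
Protein secondary structure prediction is a crucial step toward understanding the spatial structure of proteins. This article proposes a simple secondary structure prediction method that uses majority voting to combine the results of three good existing secondary structure prediction methods into a consensus prediction. The method was tested on 57 proteins with less than 30% similarity, randomly selected from PDB structures newly determined in the preceding two years. Its Q3 accuracy was 1.12% to 2.29% higher than that of the three individual methods, and the correlation coefficients and SOV accuracy improved correspondingly. Every accuracy measure was also higher than that of Jpred, a secondary structure prediction program that likewise uses a consensus approach. Although simple in principle, the method needs no additional parameters, is computationally cheap, and is easy to implement. Its most important prerequisite is that the methods being combined are among the most accurate protein secondary structure prediction methods currently available.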
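
The majority-voting consensus described in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' program: the three-state alphabet (H, E, C), the function names `consensus` and `q3`, and the rule of falling back to coil when all three methods disagree are assumptions, since the abstract does not say how ties are resolved.

```python
from collections import Counter


def consensus(predictions, tie_break="C"):
    """Per-residue majority vote over equal-length H/E/C prediction strings.

    tie_break is the state used when no state has a strict majority
    (an assumed rule; the abstract does not specify one).
    """
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all predictions must have the same length")
    result = []
    for states in zip(*predictions):
        state, count = Counter(states).most_common(1)[0]
        result.append(state if count > len(predictions) / 2 else tie_break)
    return "".join(result)


def q3(predicted, observed):
    """Fraction of residues whose three-state label is predicted correctly."""
    if len(predicted) != len(observed):
        raise ValueError("sequences must have the same length")
    return sum(p == o for p, o in zip(predicted, observed)) / len(observed)


if __name__ == "__main__":
    # Made-up outputs of three hypothetical predictors for a 5-residue protein.
    methods = ["HHHEC", "HHCEC", "HECCE"]
    combined = consensus(methods)
    print(combined, q3(combined, "HHCCC"))
```

With three voters and three states, the only case without a strict majority is a three-way split, so the tie-break rule affects few positions.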


A new approach to the prediction of RNA secondary structures, based on analysis of the kinetics of molecular self-organisation, is proposed herein. A Markov process is used to describe structural rearrangements during secondary structure formation, and this process is modelled by a Monte Carlo method. Examples are given of calculating the kinetic ensemble of secondary structures by this method, and the distribution of time-dependent probabilities within the ensembles is obtained.

An effective method of searching for the equilibrium ensemble is also suggested. This method is based on the construction of a tree of all possible secondary structures of the RNA. By assigning a probability to each structure (according to its free energy), the Boltzmann equilibrium ensemble can be obtained.
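
Assigning each structure a probability according to its free energy amounts to applying Boltzmann weights. The sketch below illustrates this under stated assumptions: energies are in kcal/mol, the default temperature is 37 °C, and the function name `boltzmann_probabilities` is hypothetical. It does not reproduce the tree construction described in the abstract.

```python
from math import exp

GAS_CONSTANT = 0.0019872  # kcal / (mol * K)


def boltzmann_probabilities(energies, temperature=310.15):
    """Equilibrium probability of each structure from its free energy.

    energies: free energies in kcal/mol (assumed units).
    temperature: absolute temperature in kelvin (assumed default, 37 degrees C).
    """
    rt = GAS_CONSTANT * temperature
    e_min = min(energies)  # shift by the minimum for numerical stability
    weights = [exp(-(e - e_min) / rt) for e in energies]
    partition = sum(weights)  # partition function of the shifted energies
    return [w / partition for w in weights]


if __name__ == "__main__":
    # Made-up free energies for three candidate structures.
    for energy, p in zip([-6.2, -5.8, -3.1],
                         boltzmann_probabilities([-6.2, -5.8, -3.1])):
        print(f"{energy:6.1f} kcal/mol  p = {p:.3f}")
```

Subtracting the minimum energy before exponentiating leaves the probabilities unchanged but avoids overflow for very negative free energies.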

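RNA sequence-structure alignment based on a quantum evolutionary algorithm   Cited by: 1 (self-citations: 0; by others: 1)
Multiple sequence alignment is a classic problem in computational molecular biology and an important basic step in much biological research. RNA differs from proteins and DNA in that its secondary structure is more conserved during evolution than its primary sequence, so RNA sequence alignment must take into account not only sequence information but, more importantly, secondary structure information. A program for multiple RNA sequence-structure alignment based on a quantum evolutionary algorithm is proposed. RNA sequences are quantum-encoded, a full crossover operator that takes structural information into account is designed, and a fitness function suited to RNA sequence-structure alignment is proposed, overcoming the slow convergence and premature convergence of traditional evolutionary algorithms. Tests on a standard database confirm the effectiveness of the method.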

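An evolutionary imprint method for genome function prediction   Cited by: 6 (self-citations: 1; by others: 6)
Improving genome function prediction schemes is a pressing problem in functional genomics. The course of biological evolution leaves corresponding evolutionary imprints on molecular sequences, namely motifs specific to clusters of orthologs. On the basis of this biological fact, a new genome function prediction method is proposed. Clusters of orthologs are first constructed by evolutionary analysis, and the functional motifs of each cluster are then identified, yielding a library of specific functional motifs. The function of an unknown gene can then be predicted efficiently and accurately by searching this library. Tests on five families provide preliminary evidence that the scheme is feasible.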

A new method for simulating the folding pathway of RNA secondary structure using a modified ant colony algorithm is proposed. For a given RNA sequence, the set of all possible stems is obtained, and the energy of each stem is calculated and stored at the initial stage. Furthermore, a more realistic formula is used to compute the energy of multi-branch loops in the subsequent iterations. A folding pathway is then simulated, including such processes as construction of the heuristic information, the rule for initializing the pheromone, the mechanism for choosing the initial and next stems, and the strategy for updating the pheromone between two different stems. Finally, by testing RNA sequences with known secondary structures from public databases, we analyze the experimental data to select appropriate parameter values. The performance measures show that our procedure is sometimes more consistent with phylogenetically proven structures than the RNAstructure software and is more effective than the standard genetic algorithm.
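
The first step described above, obtaining the set of all possible stems for a sequence, can be sketched as a simple enumeration of maximal runs of stacked complementary pairs. This is an illustrative assumption rather than the authors' procedure: the function name `find_stems`, the minimum stem length and hairpin loop size, and the inclusion of G-U wobble pairs are all hypothetical choices, and no stem energies or pheromone updates are computed.

```python
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def find_stems(seq, min_len=3, min_loop=3):
    """List maximal stems as (i, j, length) tuples, using 0-based positions.

    A stem pairs seq[i + k] with seq[j - k] for k = 0 .. length - 1.
    min_len and min_loop are assumed thresholds for this sketch.
    """
    n = len(seq)
    stems = []
    for i in range(n):
        for j in range(i + min_loop + 1, n):
            if (seq[i], seq[j]) not in PAIRS:
                continue
            # Skip pairs that merely extend a stem starting further out.
            if i > 0 and j < n - 1 and (seq[i - 1], seq[j + 1]) in PAIRS:
                continue
            length = 0
            # Grow inward while bases pair and the enclosed loop stays large enough.
            while ((j - length) - (i + length) > min_loop
                   and (seq[i + length], seq[j - length]) in PAIRS):
                length += 1
            if length >= min_len:
                stems.append((i, j, length))
    return stems


if __name__ == "__main__":
    print(find_stems("GGGAAAUCC"))
```

A search procedure such as the one in the abstract would then score these candidate stems and choose a compatible subset, which this sketch leaves out.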

