Similar articles
1.
2.
Given an RNA sequence and two designated secondary structures A, B, we describe a new algorithm that computes a nearly optimal folding pathway from A to B. The algorithm, RNAtabupath, employs a tabu semi-greedy heuristic, known to be an effective search strategy in combinatorial optimization. Folding pathways, sometimes called routes or trajectories, are computed by RNAtabupath in a fraction of the time required by the barriers program of the Vienna RNA Package. We benchmark RNAtabupath against other algorithms that compute low energy folding pathways between experimentally known structures of several conformational switches. The RNApathfinder web server, source code for algorithms to compute and analyze pathways, and supplementary data are available at http://bioinformatics.bc.edu/clotelab/RNApathfinder.

3.
4.
Algorithms for prediction of RNA secondary structure (the set of base pairs that form when an RNA molecule folds) are valuable to biologists who aim to understand RNA structure and function. Improving the accuracy and efficiency of prediction methods is an ongoing challenge, particularly for pseudoknotted secondary structures, in which base pairs overlap. This challenge is biologically important, since pseudoknotted structures play essential roles in functions of many RNA molecules, such as splicing and ribosomal frameshifting. State-of-the-art methods, which are based on free energy minimization, have high run-time complexity (typically Θ(n^5) or worse), and can handle (minimize over) only limited types of pseudoknotted structures. We propose a new approach for prediction of pseudoknotted structures, motivated by the hypothesis that RNA structures fold hierarchically, with pseudoknot-free (non-overlapping) base pairs forming first, and pseudoknots forming later so as to minimize energy relative to the folded pseudoknot-free structure. Our HFold algorithm uses two-phase energy minimization to predict hierarchically formed secondary structures in O(n^3) time, matching the complexity of the best algorithms for pseudoknot-free secondary structure prediction via energy minimization. Our algorithm can handle a wide range of biological structures, including kissing hairpins and nested kissing hairpins, which have previously required Θ(n^6) time.

5.
Existing computational methods for RNA secondary-structure prediction tacitly assume RNA to only encode functional RNA structures. However, experimental studies have revealed that some RNA sequences, e.g. compact viral genomes, can simultaneously encode functional RNA structures as well as proteins, and evidence is accumulating that this phenomenon may also be found in Eukaryotes. We here present the first comparative method, called RNA-DECODER, which explicitly takes the known protein-coding context of an RNA-sequence alignment into account in order to predict evolutionarily conserved secondary-structure elements, which may span both coding and non-coding regions. RNA-DECODER employs a stochastic context-free grammar together with a set of carefully devised phylogenetic substitution-models, which can disentangle and evaluate the different kinds of overlapping evolutionary constraints which arise. We show that RNA-DECODER's parameters can be automatically trained to successfully fold known secondary structures within the HCV genome. We scan the genomes of HCV and polio virus for conserved secondary-structure elements, and analyze performance as a function of available evolutionary information. On known secondary structures, RNA-DECODER shows a sensitivity similar to the programs MFOLD, PFOLD and RNAALIFOLD. When scanning the entire genomes of HCV and polio virus for structure elements, RNA-DECODER's results indicate a markedly higher specificity than MFOLD, PFOLD and RNAALIFOLD.

6.
An algorithm is presented for generating rigorously all suboptimal secondary structures between the minimum free energy and an arbitrary upper limit. The algorithm is particularly fast in the vicinity of the minimum free energy. This enables the efficient approximation of statistical quantities, such as the partition function or measures for structural diversity. The density of states at low energies and its associated structures are crucial in assessing from a thermodynamic point of view how well-defined the ground state is. We demonstrate this by exploring the role of base modification in tRNA secondary structures, both at the level of individual sequences from Escherichia coli and by comparing artificially generated ensembles of modified and unmodified sequences with the same tRNA structure. The two major conclusions are that (1) base modification considerably sharpens the definition of the ground state structure by constraining energetically adjacent structures to be similar to the ground state, and (2) sequences whose ground state structure is thermodynamically well defined show a significant tendency to buffer single point mutations. This can have evolutionary implications, since selection pressure to improve the definition of ground states with biological function may result in increased neutrality.
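Editorial illustration for the abstract above: a minimal sketch of how a partition function might be approximated from the free energies of enumerated suboptimal structures, using the standard Boltzmann weighting Z = Σ exp(-E/RT). The function name `boltzmann_summary`, the default temperature, and the sample energies are assumptions for illustration and are not taken from the paper.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal/(mol*K)


def boltzmann_summary(energies, temperature_k=310.15):
    """Approximate the partition function from a list of free energies (kcal/mol).

    Returns (Z, p_min), where p_min is the Boltzmann probability of a single
    minimum-energy structure within the enumerated set. Hypothetical helper.
    """
    rt = R_KCAL * temperature_k
    e_min = min(energies)
    # Shift all energies by e_min so the exponentials stay numerically stable.
    weights = [math.exp(-(e - e_min) / rt) for e in energies]
    z_shifted = sum(weights)
    partition = z_shifted * math.exp(-e_min / rt)
    # The minimum-energy structure has shifted weight exp(0) = 1.
    p_min = 1.0 / z_shifted
    return partition, p_min


if __name__ == "__main__":
    # Made-up energies of three structures near the minimum free energy.
    z, p = boltzmann_summary([-10.0, -9.5, -8.0])
    print(f"Z = {z:.3e}, P(ground state) = {p:.3f}")
```

A ground-state probability close to 1 corresponds to what the abstract calls a thermodynamically well-defined ground state; truncating the enumeration at an upper energy limit makes Z an underestimate.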

7.

Background  

RNA exhibits a variety of structural configurations. Here we consider a structure to be tantamount to the noncrossing Watson-Crick and G-U-base pairings (secondary structure) and additional cross-serial base pairs. These interactions are called pseudoknots and are observed across the whole spectrum of RNA functionalities. In the context of studying natural RNA structures, searching for new ribozymes and designing artificial RNA, it is of interest to find RNA sequences folding into a specific structure and to analyze their induced neutral networks. Since the established inverse folding algorithms, RNAinverse, RNA-SSD as well as INFO-RNA are limited to RNA secondary structures, we present in this paper the inverse folding algorithm Inv which can deal with 3-noncrossing, canonical pseudoknot structures.

8.
9.
Functionally homologous RNA sequences can diverge substantially in their primary sequences, but it can reasonably be assumed that they are related in their higher-order structures. The problem of finding such structures while satisfying the free-energy-minimization criterion as far as possible is considered here in two aspects. First, a quantitative measure of the folding consensus among secondary structures is defined by translating each structure into a linear representation and using the correlation theorem to compare them. Second, an algorithm is presented for the parallel search for secondary structures according to the free-energy-minimization criterion, but with a filtering action based on the folding consensus measure. The method is tested on groups of RNA sequences that differ in origin and function, for which proposals of homologous secondary structures based on experimental data exist. In every case the results are compared with a blank consisting of a search based on free energy minimization alone. In these tests the method is able to obtain, from different sequences, secondary structures characterized by a high folding consensus measure, even when structures of lower free energy that are not homologous are possible. Two applications are also shown. The first demonstrates the transfer of experimental data available for one sequence to a functionally related, and therefore homologous, one. The second is the possibility of using a topological probe in the search for precise structural motifs.

10.
A mathematical model for analyzing the secondary structures of RNA is developed that is based on the connection matrix associated with the planar p-h graph. The classification of the elementary structures allows the introduction of the basis of structural space from which to build the global secondary structure. All admissible solutions belong to the configuration space and can be obtained directly from its basis.

11.
We suggest a new algorithm to search a given set of RNA sequences for conserved secondary structures. The algorithm is based on alignment of the sequences for potential helical strands. This procedure can be used to search for new structured RNAs and new regulatory elements. It is efficient for genome-scale analysis. The results of various tests run with this algorithm are shown.

12.
A statistical reference for RNA secondary structures with minimum free energies is computed by folding large ensembles of random RNA sequences. Four nucleotide alphabets are used: two binary alphabets, AU and GC, the biophysical AUGC and the synthetic GCXK alphabet. RNA secondary structures are made of structural elements, such as stacks, loops, joints, and free ends. Statistical properties of these elements are computed for small RNA molecules of chain lengths up to 100. The results of RNA structure statistics depend strongly on the particular alphabet chosen. The statistical reference is compared with the data derived from natural RNA molecules with similar base frequencies. Secondary structures are represented as trees. Tree editing provides a quantitative measure for the distance d_t between two structures. We compute a structure density surface as the conditional probability of two structures having distance t given that their sequences have distance h. This surface indicates that the vast majority of possible minimum free energy secondary structures occur within a fairly small neighborhood of any typical (random) sequence. Correlation lengths for secondary structures in their tree representations are computed from probability densities. They are appropriate measures for the complexity of the sequence-structure relation. The correlation length also provides a quantitative estimate for the mean sensitivity of structures to point mutations. © 1993 John Wiley & Sons, Inc.
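Editorial illustration for the abstract above: a rough sketch of the two kinds of distance it relates, a sequence distance h and a structure distance t. The sequence distance is taken here to be the Hamming distance. The structure distance shown is the simpler base-pair distance, swapped in as a stand-in for the tree edit distance the abstract actually uses. Dot-bracket notation and all function names are assumptions for illustration.

```python
def hamming(seq_a, seq_b):
    """Number of positions at which two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have equal length")
    return sum(a != b for a, b in zip(seq_a, seq_b))


def base_pairs(dot_bracket):
    """Parse a pseudoknot-free dot-bracket string into a set of (i, j) pairs."""
    stack, pairs = [], set()
    for i, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced structure")
            pairs.add((stack.pop(), i))
    if stack:
        raise ValueError("unbalanced structure")
    return pairs


def base_pair_distance(struct_a, struct_b):
    """Count base pairs present in exactly one of the two structures.

    This is NOT the tree edit distance of the abstract, only a simpler proxy.
    """
    return len(base_pairs(struct_a) ^ base_pairs(struct_b))


if __name__ == "__main__":
    print(hamming("GGCAU", "GGCUU"))               # sequence distance h
    print(base_pair_distance("((..))", "(....)"))  # structure distance proxy
```

Collecting such (h, t) pairs over many random sequences and their predicted structures would give an empirical counterpart of the structure density surface described in the abstract.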

13.
An algorithm for comparing multiple RNA secondary structures   Total citations: 1, self-citations: 0, citations by others: 1
A new distributed computational procedure is presented for rapidly determining the similarity of multiple conformations of RNA secondary structures. A data abstraction scheme is utilized to reduce the quantity of data that must be handled to determine the degree of similarity among multiple structures. The method has been used to compare 200 structures with easy visualization of both those structures and substructures that are similar and those that are vastly different. It has the capability of processing many more conformations as a function of research requirements. The algorithm is described as well as some suggestions for future uses and extensions. Received on October 29, 1987; accepted on May 4, 1988

14.
A program for predicting significant RNA secondary structures   Total citations: 1, self-citations: 0, citations by others: 1
We describe a program for the analysis of RNA secondary structure. There are two new features in this program. (i) To get vector speeds on a vector pipeline machine (such as Cray X-MP/24) we have vectorized the secondary structure dynamic algorithm. (ii) The statistical significance of a locally 'optimal' secondary structure is assessed by a Monte Carlo method. The results can be depicted graphically including profiles of the stability of local secondary structures and the distribution of the potentially significant secondary structures in the RNA molecules. Interesting regions where both the potentially significant secondary structures and 'open' structures (single-stranded coils) occur can be identified by the plots mentioned above. Furthermore, the speed of the vectorized code allows repeated Monte Carlo simulations with different overlapping window sizes. Thus, the optimal size of the significant secondary structure occurring in the interesting region can be assessed by repeating the Monte Carlo simulation. The power of the program is demonstrated in the analysis of local secondary structures of human T-cell lymphotropic virus type III (HIV). Received on August 17, 1987; accepted on January 5, 1988

15.
I investigate the competition between two quasispecies residing on two disparate neutral networks. Under the assumption that the two neutral networks have different topologies and fitness levels, it is the mutation rate that determines which quasispecies will eventually be driven to extinction. For small mutation rates, I find that the quasispecies residing on the neutral network with the lower replication rate will disappear. For higher mutation rates, however, the faster replicating sequences may be outcompeted by the slower replicating ones if the connection density on the second neutral network is sufficiently high. The analytical results are in excellent agreement with flow-reactor simulations of replicating RNA sequences.

16.
W C Johnson. Proteins, 1999, 35(3): 307-312
We have developed an algorithm to analyze the circular dichroism of proteins for secondary structure. Its hallmark is tremendous flexibility in creating the basis set, and it also combines the ideas of many previous workers. We also present a new basis set containing the CD spectra of 22 proteins with secondary structures from high quality X-ray diffraction data. High flexibility is obtained by doing the analysis with a variable selection basis set of only eight proteins. Many variable selection basis sets fail to give a good analysis, but good analyses can be selected without any a priori knowledge by using the following criteria: (1) the sum of secondary structures should be close to 1.0, (2) no fraction of secondary structure should be less than -0.03, (3) the reconstructed CD spectrum should fit the original CD spectrum with only a small error, and (4) the fraction of alpha-helix should be similar to that obtained using all the proteins in the basis set. This algorithm gives a root mean square error for the predicted secondary structure for the proteins in the basis set of 3.3% for alpha-helix, 2.6% for 3(10)-helix, 4.2% for beta-strand, 4.2% for beta-turn, 2.7% for poly(L-proline) II type 3(1)-helix, and 5.1% for other structures when compared with the X-ray structure.
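Editorial illustration for the abstract above: the four selection criteria can be sketched as a simple filter over candidate analyses. Only the -0.03 lower bound comes from the abstract. The tolerances `sum_tol`, `max_fit_rmsd` and `helix_tol`, the dictionary keys, and the function name are hypothetical placeholders, since the abstract does not say how "close", "small" or "similar" are quantified.

```python
def is_acceptable(fractions, fit_rmsd, helix_all_proteins,
                  sum_tol=0.05, max_fit_rmsd=0.25, helix_tol=0.03):
    """Apply the four selection criteria to one candidate analysis.

    fractions: dict mapping structure class to predicted fraction; must
        contain the key "alpha_helix" (key names are assumptions).
    fit_rmsd: error between reconstructed and original CD spectrum.
    helix_all_proteins: alpha-helix fraction obtained with the full basis set.
    All tolerances are illustrative guesses, not values from the paper.
    """
    values = list(fractions.values())
    total_ok = abs(sum(values) - 1.0) <= sum_tol        # criterion (1)
    floor_ok = min(values) >= -0.03                     # criterion (2), from the abstract
    fit_ok = fit_rmsd <= max_fit_rmsd                   # criterion (3)
    helix_ok = abs(fractions["alpha_helix"] - helix_all_proteins) <= helix_tol  # criterion (4)
    return total_ok and floor_ok and fit_ok and helix_ok


if __name__ == "__main__":
    good = {"alpha_helix": 0.40, "beta_strand": 0.20, "beta_turn": 0.15, "other": 0.25}
    bad = {"alpha_helix": 0.45, "beta_strand": -0.05, "beta_turn": 0.20, "other": 0.40}
    print(is_acceptable(good, fit_rmsd=0.10, helix_all_proteins=0.41))
    print(is_acceptable(bad, fit_rmsd=0.10, helix_all_proteins=0.44))
```

In a variable-selection workflow of the kind described, such a filter would be applied to the analysis from each eight-protein basis set, keeping only those that pass.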

17.
18.
Ribonucleic acid (RNA) secondary structure prediction continues to be a significant challenge, in particular when attempting to model sequences with less rigidly defined structures, such as messenger and non-coding RNAs. Crucial to interpreting RNA structures as they pertain to individual phenotypes is the ability to detect RNAs with large structural disparities caused by a single nucleotide variant (SNV) or riboSNitches. A recently published human genome-wide parallel analysis of RNA structure (PARS) study identified a large number of riboSNitches as well as non-riboSNitches, providing an unprecedented set of RNA sequences against which to benchmark structure prediction algorithms. Here we evaluate 11 different RNA folding algorithms’ riboSNitch prediction performance on these data. We find that recent algorithms designed specifically to predict the effects of SNVs on RNA structure, in particular remuRNA, RNAsnp and SNPfold, perform best on the most rigorously validated subsets of the benchmark data. In addition, our benchmark indicates that general structure prediction algorithms (e.g. RNAfold and RNAstructure) have overall better performance if base pairing probabilities are considered rather than minimum free energy calculations. Although overall aggregate algorithmic performance on the full set of riboSNitches is relatively low, significant improvement is possible if the highest confidence predictions are evaluated independently.

19.
Automatic display of RNA secondary structures   Total citations: 1, self-citations: 1, citations by others: 0

20.
RNA secondary structures and their prediction   Total citations: 1, self-citations: 0, citations by others: 1
This is a review of past and present attempts to predict the secondary structure of ribonucleic acids (RNAs) through mathematical and computer methods. Related areas covering classification, enumeration and graphical representations of structures are also covered. Various general prediction techniques are discussed, especially the use of thermodynamic criteria to construct an optimal structure. The emphasis in this approach is on the use of dynamic programming algorithms to minimize free energy. One such algorithm is introduced which comprises existing ones as special cases. Issued as NRCC No. 23684.
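Editorial illustration for the abstract above: a minimal dynamic programming sketch in the style of the Nussinov algorithm. It maximizes the number of base pairs instead of minimizing free energy, so it is a deliberately simplified stand-in and not the algorithm introduced in the review. The function name, the allowed pairs, and the `min_loop` default are assumptions for illustration.

```python
def max_base_pairs(seq, min_loop=3):
    """Maximum number of nested base pairs in seq (Nussinov-style recursion).

    A simplified proxy for free energy minimization: every allowed pair
    scores 1, and a hairpin loop must contain at least min_loop bases.
    """
    allowed = {("A", "U"), ("U", "A"), ("G", "C"),
               ("C", "G"), ("G", "U"), ("U", "G")}
    n = len(seq)
    if n == 0:
        return 0
    # dp[i][j] = best pair count for the subsequence seq[i..j]; 0 elsewhere.
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # case: position j stays unpaired
            # case: j pairs with some k, leaving at least min_loop bases between
            for k in range(i, j - min_loop):
                if (seq[k], seq[j]) in allowed:
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best
    return dp[0][n - 1]


if __name__ == "__main__":
    print(max_base_pairs("GGGAAAUCC"))
```

The table fill takes O(n^3) time and O(n^2) space. Energy-based methods follow the same recursive decomposition but score loops and stacks with thermodynamic parameters instead of counting pairs.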
