Similar articles
 20 similar articles found (search time: 46 ms)
1.
This paper studies local connectivity of neutral networks of RNA secondary and pseudoknot structures. A neutral network denotes the set of RNA sequences that fold into a particular structure. It is called locally connected if, in the limit of long sequences, the distance of any two of its sequences scales with their distance in the n-cube. One main result of this paper determines the threshold probability for local connectivity of neutral networks, considered as random subgraphs of n-cubes. Furthermore, we analyze local connectivity for finite sequence length and different alphabets. We show that it is closely related to the existence of specific paths within the neutral network. We put our theoretical results into context with algorithms that fold sequences into minimum free energy RNA secondary and pseudoknot structures. Finally, we relate our structural findings to dynamics by discussing the role of local connectivity in the context of neutral evolution.

2.
Random graph theory is used to model and analyse the relationship between sequences and secondary structures of RNA molecules, which are understood as mappings from sequence space into shape space. These maps are non-invertible since there are always many orders of magnitude more sequences than structures. Sequences folding into identical structures form neutral networks. A neutral network is embedded in the set of sequences that are compatible with the given structure. Networks are modeled as graphs and constructed by random choice of vertices from the space of compatible sequences. The theory characterizes neutral networks by the mean fraction of neutral neighbors (λ). The networks are connected and percolate sequence space if the fraction of neutral nearest neighbors exceeds a threshold value (λ>λ*). Below threshold (λ<λ*), the networks are partitioned into a largest “giant” component and several smaller components. Structures are classified as “common” or “rare” according to the sizes of their pre-images, i.e. according to the fractions of sequences folding into them. The neutral networks of any pair of two different common structures almost touch each other, and, as expressed by the conjecture of shape space covering, sequences folding into almost all common structures can be found in a small ball at an arbitrary location in sequence space. The results from random graph theory are compared to data obtained by folding large samples of RNA sequences. Differences are explained in terms of specific features of RNA molecular structures. Dedicated to Professor Manfred Eigen.

3.
We study the secondary structure of RNA determined by Watson–Crick pairing without pseudo-knots using Milnor invariants of links. We focus on the first non-trivial invariant, which we call the Heisenberg invariant. The Heisenberg invariant, which is an integer, can be interpreted in terms of the Heisenberg group as well as in terms of lattice paths. We show that the Heisenberg invariant gives a lower bound on the number of unpaired bases in an RNA secondary structure. We also show that the Heisenberg invariant can predict allosteric structures for RNA. Namely, if the Heisenberg invariant is large, then there are widely separated local maxima (i.e., allosteric structures) for the number of Watson–Crick pairs found. Partially supported by DST (under grant DSTO773) and UGC (under SAP-DSA Phase IV).

4.
Robustness and evolvability are highly intertwined properties of biological systems. The relationship between these properties determines how biological systems are able to withstand mutations and show variation in response to them. Computational studies have explored the relationship between these two properties using neutral networks of RNA sequences (genotype) and their secondary structures (phenotype) as a model system. However, these studies have assumed every mutation to a sequence to be equally likely; the differences in the likelihood of the occurrence of various mutations, and the consequences of the probabilistic nature of the mutations in such a system, have previously been ignored. Associating probabilities with mutations essentially results in a weighting of genotype space. We here perform a comparative analysis of weighted and unweighted neutral networks of RNA sequences, and subsequently explore the relationship between robustness and evolvability. We show that assuming an equal likelihood for all mutations (as in an unweighted network) underestimates robustness and overestimates evolvability of a system. In spite of discarding this assumption, we observe that a negative correlation between sequence (genotype) robustness and sequence evolvability persists, and also that structure (phenotype) robustness promotes structure evolvability, as observed in earlier studies using unweighted networks. We also study the effects of base composition bias on robustness and evolvability. In particular, we explore the association between robustness and evolvability in a sequence space that is AU-rich (sequences with an AU content of 80% or higher), compared to a normal (unbiased) sequence space. We find that the evolvability of both sequences and structures in an AU-rich space is lower than in the normal space, and robustness is higher. We also observe that AU-rich populations evolving on neutral networks of phenotypes can access less phenotypic variation than normal populations evolving on neutral networks.

5.
The evolution and adaptation of molecular populations is constrained by the diversity accessible through mutational processes. RNA is a paradigmatic example of a biopolymer where genotype (sequence) and phenotype (approximated by the secondary structure fold) are identified in a single molecule. The extreme redundancy of the genotype-phenotype map leads to large ensembles of RNA sequences that fold into the same secondary structure and can be connected through single-point mutations. These ensembles define neutral networks of phenotypes in sequence space. Here we analyze the topological properties of neutral networks formed by 12-nucleotide RNA sequences, obtained through the exhaustive folding of sequence space. A total of $4^{12}$ sequences fragments into 645 subnetworks that correspond to 57 different secondary structures. The topological analysis reveals that each subnetwork is far from being random: it has a degree distribution with a well-defined average and a small dispersion, a high clustering coefficient, and an average shortest path between nodes close to its minimum possible value, i.e. the Hamming distance between sequences. RNA neutral networks are assortative due to the correlation in the composition of neighboring sequences, a feature that together with the symmetries inherent to the folding process explains the existence of communities. Several topological relationships can be analytically derived from structural restrictions and generic properties of the folding process. The average degree of these phenotypic networks grows logarithmically with their size, such that abundant phenotypes have the additional advantage of being more robust to mutations. This property prevents fragmentation of neutral networks and thus enhances the navigability of sequence space. In summary, RNA neutral networks show unique topological properties, unknown in other networks previously described.
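
The notions used in this abstract (Hamming distance, single-point mutations, and subnetworks of a neutral set) can be illustrated with a minimal Python sketch. It assumes the set of sequences sharing a phenotype is already known; no folding is performed, and the function names and the toy input are hypothetical rather than taken from the paper.

```python
from collections import deque

ALPHABET = "ACGU"  # assumption: the natural four-letter RNA alphabet


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def one_mutant_neighbors(seq):
    """Yield every sequence that differs from seq by one point mutation."""
    for i, old in enumerate(seq):
        for new in ALPHABET:
            if new != old:
                yield seq[:i] + new + seq[i + 1:]


def neutral_components(sequences):
    """Split a set of same-phenotype sequences into subnetworks.

    Two sequences belong to the same subnetwork when a chain of point
    mutations connects them without leaving the set (breadth-first search).
    """
    remaining = set(sequences)
    components = []
    while remaining:
        start = remaining.pop()
        component = {start}
        queue = deque([start])
        while queue:
            current = queue.popleft()
            for neighbor in one_mutant_neighbors(current):
                if neighbor in remaining:
                    remaining.remove(neighbor)
                    component.add(neighbor)
                    queue.append(neighbor)
        components.append(component)
    return components


if __name__ == "__main__":
    toy_neutral_set = {"AAAA", "AAAC", "AACC", "GGGG"}  # made-up example data
    sizes = sorted(len(c) for c in neutral_components(toy_neutral_set))
    print(sizes)
    print(len(ALPHABET) ** 12)  # size of the 12-nucleotide sequence space
```

In an exhaustive study like the one described, the input sets would come from folding every sequence with a structure prediction program; that step is deliberately left out here.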

6.
A detailed knowledge of the mapping between sequence and structure spaces in populations of RNA molecules is essential to better understand their present-day functional properties, to envisage a plausible early evolution of RNA in a prebiotic chemical environment and to improve the design of in vitro evolution experiments, among others. Analysis of natural RNAs, as well as in vitro and computational studies, shows that certain RNA structural motifs are much more abundant than others, pointing to a complex relation between sequence and structure. Within this framework, we have investigated computationally the structural properties of a large pool ($10^8$ molecules) of single-stranded, 35 nt-long, random RNA sequences. The secondary structures obtained are ranked and classified into structure families. The number of structures in main families is analytically calculated and compared with the numerical results. This permits a quantification of the fraction of structure space covered by a large pool of sequences. We further show that the number of structural motifs and their frequency is highly unbalanced with respect to the nucleotide composition: simple structures such as stem-loops and hairpins arise from sequences depleted in G, while more complex structures require an enrichment of G. In general, we observe a strong correlation between subfamilies (characterized by a fixed number of paired nucleotides) and nucleotide composition. Our results are compared to the structural repertoire obtained in a second pool where isolated base pairs are prohibited.

7.
RNA secondary-structure folding algorithms predict the existence of connected networks of RNA sequences with identical secondary structures. Fitness landscapes that are based on the mapping between RNA sequence and RNA secondary structure hence have many neutral paths. A neutral walk on these fitness landscapes gives access to a virtually unlimited number of secondary structures that are a single point mutation from the neutral path. This shows that neutral evolution explores phenotype space and can play a role in adaptation. Received: 23 December 1995 / Accepted: 17 March 1996

8.
Combinatorics of RNA Structures with Pseudoknots   Cited by: 1 (self-citations: 0, citations by others: 1)
In this paper, we derive the generating function of RNA structures with pseudoknots. We enumerate all k-noncrossing RNA pseudoknot structures categorized by their maximal sets of mutually intersecting arcs. In addition, we enumerate pseudoknot structures over circular RNA. For 3-noncrossing RNA structures and RNA secondary structures we present a novel 4-term recursion formula and a 2-term recursion, respectively. Furthermore, we enumerate for arbitrary k all k-noncrossing, restricted RNA structures, i.e. k-noncrossing RNA structures without 2-arcs, which are arcs of the form (i,i+2) for 1≤i≤n−2.
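
As a point of reference for the enumeration problem, the following Python sketch counts ordinary (noncrossing) RNA secondary structures with the classical convolution recursion usually attributed to Waterman. This is not the 2-term or 4-term recursion derived in the paper; the function name and the `min_arc` parameter are illustrative assumptions. Setting `min_arc=3` excludes arcs of the form (i,i+2) as well as (i,i+1).

```python
def count_secondary_structures(n_max, min_arc=2):
    """Count noncrossing secondary structures on 0..n_max nucleotides.

    An arc (j, n) is allowed only if n - j >= min_arc. With min_arc=2 only
    arcs between adjacent positions are forbidden; min_arc=3 additionally
    forbids 2-arcs (i, i+2).

    Recursion: the last nucleotide n is either unpaired (counts[n-1]) or
    paired with some j, which splits the rest into j-1 nucleotides to the
    left of the arc and n-j-1 nucleotides enclosed by it.
    """
    counts = [1] * (n_max + 1)  # the empty structure is always counted
    for n in range(1, n_max + 1):
        paired = sum(
            counts[j - 1] * counts[n - j - 1]
            for j in range(1, n - min_arc + 1)
        )
        counts[n] = counts[n - 1] + paired
    return counts


if __name__ == "__main__":
    print(count_secondary_structures(8))
    print(count_secondary_structures(7, min_arc=3))
```

The quadratic-time convolution is chosen for readability; a recursion with a fixed number of terms, such as the ones the paper announces, would compute the same numbers in linear time.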

9.
In this paper, we study irreducibility in RNA structures. By RNA structure, we mean RNA secondary as well as RNA pseudoknot structures as abstract contact structures. We give an analysis contrasting random and minimum free energy (mfe) configurations and secondary versus pseudoknot structures. In the process, we compute various distributions: the numbers of irreducible substructures and their locations and sizes, parameterized in terms of the maximal number of mutually crossing arcs, k−1, and the minimal size of stacks σ. In particular, we analyze the size of the largest irreducible substructure for random and mfe structures, which is the key factor for the folding time of mfe configurations. We show that the largest irreducible substructure is typically unique and contains almost all nucleotides.

10.
A statistical reference for RNA secondary structures with minimum free energies is computed by folding large ensembles of random RNA sequences. Four nucleotide alphabets are used: two binary alphabets, AU and GC, the biophysical AUGC and the synthetic GCXK alphabet. RNA secondary structures are made of structural elements, such as stacks, loops, joints, and free ends. Statistical properties of these elements are computed for small RNA molecules of chain lengths up to 100. The results of RNA structure statistics depend strongly on the particular alphabet chosen. The statistical reference is compared with the data derived from natural RNA molecules with similar base frequencies. Secondary structures are represented as trees. Tree editing provides a quantitative measure for the distance $d_t$ between two structures. We compute a structure density surface as the conditional probability of two structures having distance t given that their sequences have distance h. This surface indicates that the vast majority of possible minimum free energy secondary structures occur within a fairly small neighborhood of any typical (random) sequence. Correlation lengths for secondary structures in their tree representations are computed from probability densities. They are appropriate measures for the complexity of the sequence-structure relation. The correlation length also provides a quantitative estimate for the mean sensitivity of structures to point mutations. © 1993 John Wiley & Sons, Inc.

11.

Background  

RNA exhibits a variety of structural configurations. Here we consider a structure to be tantamount to the noncrossing Watson-Crick and G-U base pairings (secondary structure) and additional cross-serial base pairs. These interactions are called pseudoknots and are observed across the whole spectrum of RNA functionalities. In the context of studying natural RNA structures, searching for new ribozymes and designing artificial RNA, it is of interest to find RNA sequences folding into a specific structure and to analyze their induced neutral networks. Since the established inverse folding algorithms RNAinverse, RNA-SSD and INFO-RNA are limited to RNA secondary structures, we present in this paper the inverse folding algorithm Inv, which can deal with 3-noncrossing, canonical pseudoknot structures.

12.

Background

The neutral theory of Motoo Kimura stipulates that evolution is mostly driven by neutral mutations. However, adaptive pressure eventually leads to changes in phenotype that involve non-neutral mutations. The relation between neutrality and adaptation has been studied in the context of RNA before, and here we further study transitional mutations in the context of degenerate (plastic) RNA sequences and genetic assimilation. We propose quasineutral mutations, i.e. mutations which preserve an element of the phenotype set, as minimal mutations and study their properties. We also propose a general probabilistic interpretation of genetic assimilation and specialize it to the Boltzmann ensemble of RNA sequences.

Results

We show that degenerate sequences, i.e. sequences with more than one structure at the MFE level, have the highest evolvability among all sequences and are central to evolutionary innovation. Degenerate sequences also tend to cluster together in the sequence space. The selective pressure in an evolutionary simulation causes the population to move towards regions with more degenerate sequences, i.e. regions at the intersection of different neutral networks, and this causes the number of such sequences to increase well beyond the average percentage of degenerate sequences in the sequence space. We also observe that evolution by quasineutral mutations tends to conserve the number of base pairs in structures and thereby maintains structural integrity even in the presence of pressure to the contrary.

Conclusions

We conclude that degenerate RNA sequences play a major role in evolutionary adaptation.

13.
Algorithms predicting RNA secondary structures based on different folding criteria – minimum free energies (mfe), kinetic folding (kin), maximum matching (mm) – and different parameter sets are studied systematically. Two base pairing alphabets were used: the binary GC and the natural four-letter AUGC alphabet. Computed structures and free energies depend strongly on both the algorithm and the parameter set. Statistical properties, such as mean number of base pairs, mean numbers of stacks, mean loop sizes, etc., are much less sensitive to the choice of parameter set and even of algorithm. Some features of RNA secondary structures, such as structure correlation functions, shape space covering and neutral networks, seem to depend only on the base pairing logic (GC or AUGC alphabet). Received: 16 May 1996 / Accepted: 10 July 1996

14.
Evolution is a highly complex multilevel process, and mathematical modeling of evolutionary phenomena requires proper abstraction and radical reduction to essential features. Examples are natural selection, Mendel’s laws of inheritance, optimization by mutation and selection, and neutral evolution. An attempt is made to describe the roots of evolutionary theory in mathematical terms. Evolution can be studied in vitro outside cells with polynucleotide molecules. Replication and mutation are visualized as chemical reactions that can be resolved, analyzed, and modeled at the molecular level, and straightforward extension eventually results in a theory of evolution based upon biochemical kinetics. Error propagation in replication commonly results in an error threshold that provides an upper bound for mutation rates. Appearance and sharpness of the error threshold depend on the fitness landscape, that is, the distribution of fitness values in genotype or sequence space. In molecular terms, fitness landscapes are the results of two consecutive mappings, from sequences into structures and from structures into the (nonnegative) real numbers. Some properties of genotype–phenotype maps are illustrated well by means of sequence–structure relations of RNA molecules. Neutrality, in the sense that many RNA sequences form the same (coarse-grained) structure, is one of these properties and is characteristic of such mappings. Evolution cannot be fully understood without considering fluctuations: each mutant originates from a single copy, after all. The existence of neutral sets of genotypes called neutral networks, in particular, necessitates stochastic modeling, which is introduced here by simulation of molecular evolution in a kind of flow reactor.

15.
Over the last few decades, much effort has been made to develop approaches for identifying good predictions of RNA secondary structure. This is due to the fact that most computational prediction methods based on free energy minimization compute a number of suboptimal foldings, and we have to identify the native folding among all these possible secondary structures. Using the abstract shapes approach as introduced by Giegerich et al. (Nucleic Acids Res 32(16):4843–4851, 2004), each class of similar secondary structures is represented by one shape and the native structures can be found among the top shape representatives. In this article, we derive some interesting results answering enumeration problems for abstract shapes and secondary structures of RNA. We compute precise asymptotics for the number of different shape representations of size n and for the number of different shapes showing up when abstracting from secondary structures of size n under a combinatorial point of view. A more realistic model taking primary structures into account remains an open challenge. We give some arguments why the present techniques cannot be applied in this case.

16.
In the absence of chaperone molecules, RNA folding is believed to depend on the distribution of kinetic traps in the energy landscape of all secondary structures. Kinetic traps in the Nussinov energy model are precisely those secondary structures that are saturated, meaning that no base pair can be added without introducing either a pseudoknot or base triple. In this paper, we compute the asymptotic expected number of hairpins in saturated structures. For instance, if every hairpin is required to contain at least θ=3 unpaired bases and the probability that any two positions can base-pair is p=3/8, then the asymptotic number of saturated structures is $1.34685 \cdot n^{-3/2} \cdot 1.62178^{n}$, and the asymptotic expected number of hairpins follows a normal distribution with mean $0.06695640 \cdot n + 0.01909350 \cdot \sqrt{n} \cdot \mathcal{N}$. Similar results are given for values θ=1,3, and p=1,1/2,3/8; for instance, when θ=1 and p=1, the asymptotic expected number of hairpins in saturated secondary structures is $0.123194 \cdot n$, a value greater than the asymptotic expected number $0.105573 \cdot n$ of hairpins over all secondary structures. Since RNA binding targets are often found in hairpin regions, it follows that saturated structures present potentially more binding targets than nonsaturated structures, on average. Next, we describe a novel algorithm to compute the hairpin profile of a given RNA sequence: given RNA sequence $a_1,\ldots,a_n$, for each integer k, we compute that secondary structure $S_k$ having minimum energy in the Nussinov energy model, taken over all secondary structures having k hairpins. We expect that an extension of our algorithm to the Turner energy model may provide more accurate structure prediction for particular RNAs, such as tRNAs and purine riboswitches, known to have a particular number of hairpins. Mathematica computations, C and Python source code, and additional supplementary information are available at the website http://bioinformatics.bc.edu/clotelab/RNAhairpinProfile/.
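
The definition of a saturated structure given in this abstract can be made concrete with a short Python sketch. It is a brute-force check written for illustration only, not the authors' code from the website above; the dot-bracket input format, the set of allowed base pairs (Watson-Crick plus G-U) and all function names are assumptions.

```python
# Assumption: Watson-Crick and G-U wobble pairs are the allowed base pairs.
ALLOWED_PAIRS = {("A", "U"), ("U", "A"), ("G", "C"),
                 ("C", "G"), ("G", "U"), ("U", "G")}


def parse_dot_bracket(structure):
    """Return the sorted list of base pairs (i, j) of a dot-bracket string."""
    stack, pairs = [], []
    for pos, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(pos)
        elif symbol == ")":
            if not stack:
                raise ValueError("unbalanced structure")
            pairs.append((stack.pop(), pos))
        elif symbol != ".":
            raise ValueError("unexpected symbol: " + symbol)
    if stack:
        raise ValueError("unbalanced structure")
    return sorted(pairs)


def crosses(i, j, pairs):
    """True if the arc (i, j) would cross an existing arc (a pseudoknot)."""
    return any(a < i < b < j or i < a < j < b for a, b in pairs)


def is_saturated(sequence, structure, theta=3):
    """True if no base pair can be added to the structure.

    A candidate pair must join two unpaired positions (no base triples),
    leave more than theta - 1 bases between them (j - i > theta), be an
    allowed base pair, and not cross any existing pair.
    """
    pairs = parse_dot_bracket(structure)
    paired = {pos for pair in pairs for pos in pair}
    free = [pos for pos in range(len(sequence)) if pos not in paired]
    for index, i in enumerate(free):
        for j in free[index + 1:]:
            if (j - i > theta
                    and (sequence[i], sequence[j]) in ALLOWED_PAIRS
                    and not crosses(i, j, pairs)):
                return False
    return True


def count_hairpins(structure):
    """Count base pairs that enclose only unpaired positions."""
    pairs = parse_dot_bracket(structure)
    return sum(1 for i, j in pairs if set(structure[i + 1:j]) <= {"."})


if __name__ == "__main__":
    print(is_saturated("GGGAAAACCC", "(((....)))"))
    print(is_saturated("GGGAAAACCC", "((......))"))
    print(count_hairpins("((..))..((...))"))
```

The check is quadratic in the number of unpaired positions, which is adequate for short examples; the hairpin profile algorithm described in the abstract is a different, dynamic-programming computation and is not reproduced here.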

17.
This paper introduces a new type of Cayley graph for building large-scale interconnection networks, namely $\mathit{WG}_{n}^{2m}$, whose vertex degree is m+3 when n≥3 and m+2 when n=2. A routing algorithm for the proposed graph is also presented, and the upper bound of the diameter is deduced as ⌊5n/2⌋. Moreover, the embedding properties and maximal fault tolerance are also analyzed. Finally, we compare the proposed networks with some other similar network topologies. It is found that $\mathit{WG}_{n}^{2m}$ is superior to other interconnection networks because it helps to construct large-scale networks with lower cost.

18.
Sumedha, Martin OC, Wagner A. Bio Systems, 2007, 90(2): 475-485
RNA secondary structure is an important computational model to understand how genetic variation maps into phenotypic (structural) variation. Evolutionary innovation in RNA structures is facilitated by neutral networks, large connected sets of RNA sequences that fold into the same structure. Our work extends and deepens previous studies on neutral networks. First, we show that even the 1-mutant neighborhood of a given sequence (genotype) G0 with structure (phenotype) P contains many structural variants that are not close to P. This holds for biological and generic RNA sequences alike. Second, we analyze the relation between new structures in the 1-neighborhoods of genotypes Gk that are only a moderate Hamming distance k away from G0, and the structure of G0 itself, both for biological and for generic RNA structures. Third, we analyze the relation between mutational robustness of a sequence and the distances of structural variants near this sequence. Our findings underscore the role of neutral networks in evolutionary innovation, and the role that high robustness can play in diminishing the potential for such innovation.

19.
The trnT-trnF region is located in the large single-copy region of the chloroplast genome. It consists of the trnL intron, a group I intron, and the trnT-trnL and trnL-trnF intergenic spacers. We analyzed the evolution of the region in the three genera of the gymnosperm lineage Gnetales (Gnetum, Welwitschia, and Ephedra), with especially dense sampling in Gnetum, for which we sequenced 41 accessions, representing most of the 25–35 species. The trnL intron has a conserved secondary structure and contains elements that are homologous across land plants, while the spacers are so variable in length and composition that homology cannot be found even among the three genera. Palindromic sequences that form hairpin structures were detected in the trnL-trnF spacer, but neither spacer contained promoter elements for the tRNA genes. The absence of promoters, presence of hairpin structures in the trnL-trnF spacer, and high sequence variation in both spacers together suggest that trnT and trnF are independently transcribed. Our model for the expression and processing of the genes tRNA-Thr(UGU), tRNA-Leu(UAA), and tRNA-Phe(GAA) therefore attributes the seemingly neutral evolution of the two spacers to their escape from functional constraints. [Reviewing Editor: Debashish Bhattacharya]

20.

Background  

The general problem of RNA secondary structure prediction under the widely used thermodynamic model is known to be NP-complete when the structures considered include arbitrary pseudoknots. For restricted classes of pseudoknots, several polynomial time algorithms have been designed, where the $O(n^6)$ time and $O(n^4)$ space algorithm by Rivas and Eddy is currently the best available program.
