Similar literature
Found 20 similar documents; search took 31 ms.
1.
In this paper, we present the asymptotic enumeration of RNA structures with pseudoknots. We develop a general framework for computing the exponential growth rate and the asymptotic expansion for the numbers of k-noncrossing RNA structures. Our results are based on the generating function for the number of k-noncrossing RNA pseudoknot structures derived in Bull. Math. Biol. (2008), where k−1 denotes the maximal size of sets of mutually intersecting bonds. We prove a functional equation for the generating function and obtain, for k=2 and k=3, the analytic continuation and singular expansions, respectively. It is implicit in our results that singular expansions exist for arbitrary k, and via transfer theorems of analytic combinatorics we obtain asymptotic expressions for the coefficients. We explicitly derive the asymptotic expressions for 2- and 3-noncrossing RNA structures. Our main result is the derivation of an explicit asymptotic formula.

2.
To an RNA pseudoknot structure is naturally associated a topological surface, which has its associated genus, and structures can thus be classified by the genus. Based on earlier work of Harer–Zagier, we compute the generating function $\mathbf{D}_{g,\sigma}(z)=\sum_{n}\mathbf{d}_{g,\sigma}(n)z^n$ for the number $\mathbf{d}_{g,\sigma}(n)$ of those structures of fixed genus $g$ and minimum stack size $\sigma$ with $n$ nucleotides so that no two consecutive nucleotides are basepaired and show that $\mathbf{D}_{g,\sigma}(z)$ is algebraic. In particular, we prove that $\mathbf{d}_{g,2}(n)\sim k_g\,n^{3(g-\frac{1}{2})} \gamma_2^n$, where $\gamma_2\approx 1.9685$. Thus, for stack size at least two, the genus only enters through the sub-exponential factor, and the slow growth rate compared to the number of RNA molecules implies the existence of neutral networks of distinct molecules with the same structure of any genus. Certain RNA structures called shapes are shown to be in natural one-to-one correspondence with the cells in the Penner–Strebel decomposition of Riemann’s moduli space of a surface of genus $g$ with one boundary component, thus providing a link between RNA enumerative problems and the geometry of Riemann’s moduli space.

3.
We adapt here a surprising technique, the boustrophedon method, to speed up the sampling of RNA secondary structures from the Boltzmann low-energy ensemble. This technique is simple and its implementation straightforward, as it only requires permuting the order of some operations already performed in the stochastic traceback stage of these algorithms. It nevertheless greatly improves their worst-case complexity as a function of n, the size of the original sequence. Moreover, the average-case complexity of the generation is shown to be improved in a Boltzmann-weighted homopolymer model based on the Nussinov–Jacobson free-energy model. These results are extended to the more realistic Turner free-energy model through experiments performed on both structured (Drosophila melanogaster mRNA 5S) and hybrid (Staphylococcus aureus RNAIII) RNA sequences, using a boustrophedon-modified version of the popular software UnaFold. This improvement allows larger and more significant sets of structures to be sampled in a given time.

4.
In this paper we study canonical RNA pseudoknot structures. We prove central limit theorems for the distributions of the arc-numbers of k-noncrossing RNA structures with given minimum stack-size τ over n nucleotides. Furthermore, we compare the space of all canonical structures with canonical minimum free energy pseudoknot structures. Our results generalize the analysis of Schuster et al. obtained for RNA secondary structures [Hofacker, I.L., Schuster, P., Stadler, P.F., 1998. Combinatorics of RNA secondary structures. Discrete Appl. Math. 88, 207–237; Jin, E.Y., Reidys, C.M., 2007b. Central and local limit theorems for RNA structures. J. Theor. Biol. 250 (2008), 547–559; 2007a. Asymptotic enumeration of RNA structures with pseudoknots. Bull. Math. Biol., 70 (4), 951–970] to k-noncrossing RNA structures. Here k ≥ 2 and τ are arbitrary natural numbers. We compare canonical pseudoknot structures to arbitrary structures and show that canonical pseudoknot structures exhibit significantly smaller exponential growth rates. We then compute the asymptotic distribution of their arc-numbers. Finally, we analyze how the minimum stack-size and crossing number factor into the distributions.

5.
Much structural information is encoded in the internal distances; a distance matrix-based approach can be used to predict protein structure and dynamics, and for structural refinement. Our approach is based on the square distance matrix $\mathbf{D} = [r_{ij}^{2}]$ containing all square distances between residues in proteins. This distance matrix contains more information than the contact matrix C, which has elements of either 0 or 1 depending on whether the distance $r_{ij}$ is greater or less than a cutoff value $r_{\text{cutoff}}$. We have performed spectral decomposition of the distance matrices, $\mathbf{D} = \sum_{k} \lambda_{k} \mathbf{v}_{k} \mathbf{v}_{k}^{T}$, in terms of eigenvalues and the corresponding eigenvectors and found that it contains at most five nonzero terms. A dominant eigenvector is proportional to $r^{2}$, the square distance of points from the center of mass, with the next three being the principal components of the system of points. By predicting $r^{2}$ from the sequence we can approximate a distance matrix of a protein with an expected RMSD value of about 7.3 Å, and by combining it with the prediction of the first principal component we can improve this approximation to 4.0 Å. We can also explain the role of hydrophobic interactions for the protein structure, because r is highly correlated with the hydrophobic profile of the sequence. Moreover, r is highly correlated with several sequence profiles which are useful in protein structure prediction, such as contact number, the residue-wise contact order (RWCO) or mean square fluctuations (i.e. crystallographic temperature factors). We have also shown that the next three components are related to spatial directionality of the secondary structure elements, and they may also be predicted from the sequence, improving overall structure prediction. We have also shown that the large number of available HIV-1 protease structures provides a remarkable sampling of conformations, which can be viewed as direct structural information about the dynamics. After structure matching, we apply principal component analysis (PCA) to obtain the important apparent motions for both bound and unbound structures. There are significant similarities between the first few key motions and the first few low-frequency normal modes calculated from a static representative structure with an elastic network model (ENM) that is based on the contact matrix C (related to D), strongly suggesting that the variations among the observed structures and the corresponding conformational changes are facilitated by the low-frequency, global motions intrinsic to the structure. Similarities are also found when the approach is applied to an NMR ensemble, as well as to atomic molecular dynamics (MD) trajectories. Thus, a sufficiently large number of experimental structures can directly provide important information about protein dynamics, but ENM can also provide a similar sampling of conformations. Finally, we use distance constraints from databases of known protein structures for structure refinement. We use the distributions of distances of various types in known protein structures to obtain the most probable ranges or the mean-force potentials for the distances. We then impose these constraints on structures to be refined or include the mean-force potentials directly in the energy minimization so that more plausible structural models can be built. This approach was successfully used by us in 2006 in the CASPR structure refinement.
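
The claim that the spectral decomposition of the square distance matrix has at most five nonzero terms can be checked on a toy example. The sketch below is not the authors' code: it uses hypothetical integer coordinates in place of residue positions and computes the exact rank of D with rational arithmetic. For a symmetric matrix the rank equals the number of nonzero eigenvalues, so a rank of at most 5 is what the abstract's statement predicts for points in three dimensions.

```python
from fractions import Fraction


def squared_distance_matrix(points):
    """Return D with D[i][j] = squared Euclidean distance between points i and j."""
    return [[sum((a - b) ** 2 for a, b in zip(p, q)) for q in points] for p in points]


def exact_rank(matrix):
    """Rank via Gaussian elimination over exact fractions (no rounding error)."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    n_cols = len(rows[0]) if rows else 0
    rank = 0
    for col in range(n_cols):
        # Find a row at or below the current pivot row with a nonzero entry.
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for r in range(rank + 1, len(rows)):
            factor = rows[r][col] / rows[rank][col]
            if factor != 0:
                rows[r] = [x - factor * y for x, y in zip(rows[r], rows[rank])]
        rank += 1
    return rank


# Hypothetical coordinates (assumption): eight points in 3D standing in for residues.
points = [(0, 0, 0), (1, 0, 0), (0, 2, 0), (0, 0, 3),
          (1, 1, 1), (2, -1, 4), (-3, 2, 1), (5, 5, -2)]
D = squared_distance_matrix(points)
rank_D = exact_rank(D)
print("rank of D:", rank_D)
```

The bound follows from writing D as the sum of two rank-one terms built from the squared norms and a rank-three Gram matrix of the coordinates, which is why the number of points does not matter.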

6.

Background  

RNA exhibits a variety of structural configurations. Here we consider a structure to be tantamount to the noncrossing Watson-Crick and G-U-base pairings (secondary structure) and additional cross-serial base pairs. These interactions are called pseudoknots and are observed across the whole spectrum of RNA functionalities. In the context of studying natural RNA structures, searching for new ribozymes and designing artificial RNA, it is of interest to find RNA sequences folding into a specific structure and to analyze their induced neutral networks. Since the established inverse folding algorithms, RNAinverse, RNA-SSD as well as INFO-RNA are limited to RNA secondary structures, we present in this paper the inverse folding algorithm Inv which can deal with 3-noncrossing, canonical pseudoknot structures.

7.
A k-noncrossing RNA pseudoknot structure is a graph over {1,…,n} without 1-arcs, i.e. arcs of the form (i,i+1) and in which there exists no k-set of mutually intersecting arcs. In particular, RNA secondary structures are 2-noncrossing RNA structures. In this paper we prove a central and a local limit theorem for the distribution of the number of 3-noncrossing RNA structures over n nucleotides with exactly h bonds. Our analysis employs the generating function of k-noncrossing RNA pseudoknot structures and the asymptotics for the coefficients. The results of this paper explain the findings on the number of arcs of RNA secondary structures obtained by molecular folding algorithms and are of relevance for prediction algorithms of k-noncrossing RNA structures.
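
The definition above can be made concrete for the simplest case, k=2 (secondary structures). The following sketch counts structures over n nucleotides with no 1-arcs and no crossing arcs using a standard decomposition on the last nucleotide; it only illustrates the definition and is not the generating-function machinery the paper uses for k=3. The function name is an assumption made for this example.

```python
def count_secondary_structures(n_max):
    """Count 2-noncrossing structures without 1-arcs over 0..n_max nucleotides.

    s[n] = s[n-1]                      (nucleotide n is unpaired)
         + sum over partners i of n    (arc (i, n) with n - i >= 2),
           of s[i-1] * s[n-1-i]        (structures left of i and strictly inside the arc).
    """
    s = [0] * (n_max + 1)
    s[0] = 1  # the empty structure
    for n in range(1, n_max + 1):
        total = s[n - 1]
        for i in range(1, n - 1):  # i = 1 .. n-2, so the arc (i, n) is never a 1-arc
            total += s[i - 1] * s[n - 1 - i]
        s[n] = total
    return s


counts = count_secondary_structures(7)
print(counts)  # [1, 1, 1, 2, 4, 8, 17, 37]
```

Because arcs inside and outside (i, n) cannot interact without crossing, the two factors multiply independently; this independence is exactly what fails for k ≥ 3 and motivates the more elaborate analysis described in the abstract.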

8.
Let ${\mathcal {S}}$ denote the set of (possibly noncanonical) base pairs {i, j} of an RNA tertiary structure; i.e. ${\{i, j\} \in \mathcal {S}}$ if there is a hydrogen bond between the ith and jth nucleotide. The page number of ${\mathcal {S}}$, denoted ${\pi(\mathcal {S})}$, is the minimum number k such that ${\mathcal {S}}$ can be decomposed into a disjoint union of k secondary structures. Here, we show that computing the page number is NP-complete; we describe an exact computation of page number, using constraint programming, and determine the page number of a collection of RNA tertiary structures, for which the topological genus is known. We describe an approximation algorithm from which it follows that ${\omega(\mathcal {S}) \leq \pi(\mathcal {S}) \leq \omega(\mathcal {S}) \cdot \log n}$, where the clique number of ${\mathcal {S}}$, ${\omega(\mathcal {S})}$, denotes the maximum number of base pairs that pairwise cross each other.
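
A small sketch may help fix the two quantities in the inequality. It computes the clique number ω by brute force and an upper bound on the page number by a simple first-fit assignment. Both are illustrative only: the greedy routine is not the approximation algorithm from the paper, brute force is exponential, and treating a page as any set of pairwise noncrossing base pairs is a simplifying assumption. All function names are hypothetical.

```python
from itertools import combinations


def crosses(p, q):
    """True if base pairs p and q cross, i.e. i < k < j < l after ordering."""
    (i, j), (k, l) = sorted((tuple(sorted(p)), tuple(sorted(q))))
    return i < k < j < l


def clique_number(pairs):
    """Largest set of base pairs that pairwise cross (brute force, small inputs only)."""
    pairs = list(pairs)
    for size in range(len(pairs), 0, -1):
        for subset in combinations(pairs, size):
            if all(crosses(p, q) for p, q in combinations(subset, 2)):
                return size
    return 0


def greedy_pages(pairs):
    """First-fit assignment of pairs to pages; its length is an upper bound on the page number."""
    pages = []
    for p in pairs:
        for page in pages:
            if not any(crosses(p, q) for q in page):
                page.append(p)
                break
        else:
            pages.append([p])
    return pages


# Hypothetical toy structure: three mutually crossing pairs plus one nested pair.
pairs = [(1, 6), (2, 7), (3, 8), (4, 5)]
omega = clique_number(pairs)
pages = greedy_pages(pairs)
print(omega, len(pages))  # 3 3
```

In this toy case the lower bound ω and the greedy upper bound coincide, so the page number is 3; in general the two can differ, which is where the log n factor in the stated inequality comes in.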

9.
In the absence of chaperone molecules, RNA folding is believed to depend on the distribution of kinetic traps in the energy landscape of all secondary structures. Kinetic traps in the Nussinov energy model are precisely those secondary structures that are saturated, meaning that no base pair can be added without introducing either a pseudoknot or base triple. In this paper, we compute the asymptotic expected number of hairpins in saturated structures. For instance, if every hairpin is required to contain at least θ=3 unpaired bases and the probability that any two positions can base-pair is p=3/8, then the asymptotic number of saturated structures is $1.34685 \cdot n^{-3/2} \cdot 1.62178^{n}$, and the asymptotic expected number of hairpins follows a normal distribution with mean $0.06695640 \cdot n + 0.01909350 \cdot\sqrt{n} \cdot\mathcal{N}$. Similar results are given for values θ=1,3, and p=1,1/2,3/8; for instance, when θ=1 and p=1, the asymptotic expected number of hairpins in saturated secondary structures is $0.123194 \cdot n$, a value greater than the asymptotic expected number $0.105573 \cdot n$ of hairpins over all secondary structures. Since RNA binding targets are often found in hairpin regions, it follows that saturated structures present potentially more binding targets than nonsaturated structures, on average. Next, we describe a novel algorithm to compute the hairpin profile of a given RNA sequence: given RNA sequence $a_1,\ldots,a_n$, for each integer k, we compute that secondary structure $S_k$ having minimum energy in the Nussinov energy model, taken over all secondary structures having k hairpins. We expect that an extension of our algorithm to the Turner energy model may provide more accurate structure prediction for particular RNAs, such as tRNAs and purine riboswitches, known to have a particular number of hairpins. Mathematica computations, C and Python source code, and additional supplementary information are available at the website http://bioinformatics.bc.edu/clotelab/RNAhairpinProfile/.
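
For readers unfamiliar with the Nussinov energy model mentioned above, here is a minimal sketch of the classical base-pair maximization it is built on, with the hairpin threshold θ as a parameter. This is the textbook dynamic program, not the hairpin-profile algorithm of the paper (which additionally tracks the number of hairpins), and it is unrelated to the code on the authors' website. The function name and the choice to allow G-U pairs are assumptions.

```python
def nussinov_max_pairs(seq, theta=3):
    """Maximum number of noncrossing base pairs, each hairpin having >= theta unpaired bases."""
    allowed = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}
    n = len(seq)
    if n == 0:
        return 0
    # best[i][j] = optimum for the subsequence seq[i..j]; entries with j - i <= theta stay 0.
    best = [[0] * n for _ in range(n)]
    for span in range(theta + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]  # case 1: position j is unpaired
            for k in range(i, j - theta):  # case 2: j pairs with k, with j - k > theta
                if (seq[k], seq[j]) in allowed:
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    return best[0][n - 1]


print(nussinov_max_pairs("GGGAAACCC", theta=3))  # 3
print(nussinov_max_pairs("GGGAAACCC", theta=4))  # 2
```

Raising θ from 3 to 4 forbids the innermost G-C pair around the AAA loop, so one pair is lost; this is the same parameter θ that enters the asymptotic counts quoted in the abstract.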

10.
11.
12.
13.
Within this paper we investigate the Bernoulli model for random secondary structures of ribonucleic acid (RNA) molecules. Assuming that two random bases can form a hydrogen bond with probability p, we prove asymptotic equivalents for the averaged number of hairpins and bulges, the averaged loop length, the expected order, the expected number of secondary structures of size n and order k, and further parameters, all depending on p. In this way we gain insight into how the shape of a random structure changes as p varies. Afterwards we compare the computed parameters for random structures in the Bernoulli model to the corresponding quantities for actual secondary structures of large subunit rRNA molecules found in the database of Wuyts et al. This makes it possible to identify those parameters which behave (almost) randomly and those which do not and thus should be considered interesting, e.g., with respect to the biological functions or the algorithmic prediction of RNA secondary structures.

14.
In response to decreasing atmospheric emissions of sulfur (S) since the 1970s there has been a concomitant decrease in S deposition to watersheds in the Northeastern U.S. Previous study at the Hubbard Brook Experimental Forest, NH (USA) using chemical and isotopic analyses ($\delta^{34}\text{S}_{\text{SO}_4}$) combined with modeling has suggested that there is an internal source of S within these watersheds that results in a net loss of S via sulfate in drainage waters. The current study expands these previous investigations by using δ18O analyses of precipitation sulfate and streamwater sulfate. Archived stream and bulk precipitation samples at the Hubbard Brook Experimental Forest from 1968–2004 were analyzed for stable oxygen isotope ratios of sulfate ($\delta^{18}\text{O}_{\text{SO}_4}$). Overall decreasing temporal trends and seasonally low winter values of $\delta^{18}\text{O}_{\text{SO}_4}$ in bulk precipitation are most likely attributed to similar trends in precipitation $\delta^{18}\text{O}_{\text{H}_2\text{O}}$ values. Regional climate trends and changes in temperature control precipitation $\delta^{18}\text{O}_{\text{H}_2\text{O}}$ values that are reflected in the $\delta^{18}\text{O}_{\text{SO}_4}$ values of precipitation. The significant relationship between ambient temperature and the $\delta^{18}\text{O}_{\text{H}_2\text{O}}$ values of precipitation is shown from a nearby site in Ottawa, Ontario (Canada). Although streamwater $\delta^{18}\text{O}_{\text{SO}_4}$ values did not reveal temporal trends, a large difference between precipitation and streamwater $\delta^{18}\text{O}_{\text{SO}_4}$ values suggests the importance of internal cycling of S, especially through the large organic S pool, and the concomitant effect on the $\delta^{18}\text{O}_{\text{SO}_4}$ values in drainage waters.

15.
The secondary structures of nucleic acids form a particularly important class of contact structures. Many important RNA molecules, however, contain pseudo-knots, a structural feature that is excluded explicitly from the conventional definition of secondary structures. We propose here a generalization of secondary structures incorporating ‘non-nested’ pseudo-knots, which we call bi-secondary structures, and discuss measures for the complexity of more general contact structures based on their graph-theoretical properties. Bi-secondary structures are planar trivalent graphs that are characterized by special embedding properties. We derive exact upper bounds on their number (as a function of the chain length n) implying that there are fewer different structures than sequences. Computational results show that the number of bi-secondary structures grows approximately like 2.35^n. Numerical studies based on kinetic folding and a simple extension of the standard energy model show that the global features of the sequence-structure map of RNA do not change when pseudo-knots are introduced into the secondary structure picture. We find a large fraction of neutral mutations and, in particular, networks of sequences that fold into the same shape. These neutral networks percolate through the entire sequence space.

16.
Summary: The energy requirements of Adélie penguin (Pygoscelis adeliae) chicks were analysed with respect to body mass (W, 0.145–3.35 kg, n=36) and various forms of activity (lying, standing, minor activity, locomotion, walking on a treadmill). Direct respirometry was used to measure O2 consumption and CO2 production. Heart rate (HR, bpm) was recorded from the ECG obtained by both externally attached electrodes and implantable HR-transmitters. The parameters measured were not affected by hand-rearing of the chicks or by implanting transmitters. HR measured in the laboratory and in the field were comparable. Oxygen uptake was lowest in lying chicks and highest at maximal activity (RQ=0.76). Metabolic rate in small wild chicks (0.14–0.38 kg) was not affected by time of day, nor was their feeding frequency in the colony (Dec 20–21). Regressions of HR on O2 consumption were highly significant (p<0.0001) in transmitter-implanted chicks (n=4), and two relationships are proposed for the pooled data, one for minor activities and one for walking. Oxygen consumption was related to the mass of the chick (2–3 kg) and the duration of walking (T, s), whereas mass-specific O2 consumption was related to walking speed (S, m·s⁻¹).
Abbreviations: bpm, beats per minute; D, distance walked (m); ECG, electrocardiogram; HR, heart rate (bpm); ns, number of steps; RQ, respiratory quotient; S, walking speed (m·s⁻¹); T, time walked (s); W, body mass (kg).

17.
Random graph theory is used to model and analyse the relationship between sequences and secondary structures of RNA molecules, which are understood as mappings from sequence space into shape space. These maps are non-invertible since there are always many orders of magnitude more sequences than structures. Sequences folding into identical structures form neutral networks. A neutral network is embedded in the set of sequences that are compatible with the given structure. Networks are modeled as graphs and constructed by random choice of vertices from the space of compatible sequences. The theory characterizes neutral networks by the mean fraction of neutral neighbors (λ). The networks are connected and percolate sequence space if the fraction of neutral nearest neighbors exceeds a threshold value (λ>λ*). Below threshold (λ<λ*), the networks are partitioned into a largest “giant” component and several smaller components. Structures are classified as “common” or “rare” according to the sizes of their pre-images, i.e. according to the fractions of sequences folding into them. The neutral networks of any pair of two different common structures almost touch each other, and, as expressed by the conjecture of shape space covering, sequences folding into almost all common structures can be found in a small ball of an arbitrary location in sequence space. The results from random graph theory are compared to data obtained by folding large samples of RNA sequences. Differences are explained in terms of specific features of RNA molecular structures. Dedicated to Professor Manfred Eigen.

18.
L. Yuan, S. S. Stivala. Biopolymers, 1972, 11(10): 2079–2089
The effect of dielectric constant (D) of the solvent on the viscosity of heparin was examined using the relation $\eta_{\mathrm{sp}}/c = [\eta]_\infty (1 + k/\sqrt{c})$, where $[\eta]_\infty$ is the shielded intrinsic viscosity obtained by extrapolating $\eta_{\mathrm{sp}}/c$ vs. $1/\sqrt{c}$ to infinite concentration, and k is an interaction parameter independent of the dielectric constant of the solvent. This equation was previously reported by the authors (ref. 9) for describing the reduced viscosities of strong polyelectrolytes in salt-free polar solvents. It was found that the $[\eta]_\infty$ of heparin increases linearly with increasing dielectric constant of the solvent whereas the k values were, within experimental error, independent of D in the range 54.7 < D < 93.2 examined. Graded hydrolysis of heparin from its acid form (heparinic acid) at 57°C resulted in samples of varying degree of desulfation with corresponding decrease in biological activity. It was found that both $[\eta]_\infty$ and k decrease with increasing desulfation.
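
The extrapolation described above amounts to a straight-line fit: plotting $\eta_{\mathrm{sp}}/c$ against $1/\sqrt{c}$ gives intercept $[\eta]_\infty$ and slope $[\eta]_\infty k$. The sketch below shows that fit with ordinary least squares. The concentrations and parameter values are synthetic and purely hypothetical (they are not data from the paper), and the function name is an assumption.

```python
import math


def fit_shielded_viscosity(concentrations, reduced_viscosities):
    """Fit eta_sp/c = eta_inf * (1 + k / sqrt(c)) and return (eta_inf, k).

    With x = 1/sqrt(c) the model is linear: y = eta_inf + (eta_inf * k) * x.
    """
    xs = [1.0 / math.sqrt(c) for c in concentrations]
    ys = list(reduced_viscosities)
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x  # value at 1/sqrt(c) -> 0, i.e. infinite concentration
    return intercept, slope / intercept


# Synthetic, noise-free example (assumed values, arbitrary units).
true_eta_inf, true_k = 0.2, 0.5
concs = [0.1, 0.2, 0.4, 0.8]
reduced = [true_eta_inf * (1 + true_k / math.sqrt(c)) for c in concs]
eta_inf, k = fit_shielded_viscosity(concs, reduced)
print(round(eta_inf, 6), round(k, 6))
```

With real measurements the points scatter around the line, and the uncertainty in k is amplified because it is obtained as a ratio of slope to intercept.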

19.
The effect of exercise training on heart rate variability (HRV) and improvements in peak oxygen consumption ($\dot{V}\text{O}_{2}$ peak) was examined in sedentary middle-aged men. The HRV and absolute and relative $\dot{V}\text{O}_{2}$ peak of training (n = 19) and control (n = 15) subjects were assessed before and after a 24-session moderate intensity exercise training programme. Results indicated that with exercise training there was a significantly increased absolute and relative $\dot{V}\text{O}_{2}$ peak (P < 0.005) for the training group (12% and 11% respectively) with no increase for the control group. The training group also displayed a significant reduction in resting heart rate; however, HRV remained unchanged. The trained subjects were further categorized into high (n = 5) and low (n = 5) HRV groups and changes in $\dot{V}\text{O}_{2}$ peak were compared. Improvements in both absolute and relative $\dot{V}\text{O}_{2}$ peak were significantly greater (P > 0.005) in the high HRV group (17% and 20% respectively) compared to the low HRV group (6% and 1% respectively). The groups did not differ in mean age, pretraining oxygen consumption, or resting heart rate. These results suggest that a short aerobic training programme does not alter HRV in middle-aged men. Individual differences in HRV, however, may be associated with the $\dot{V}\text{O}_{2}$ peak response to aerobic training.

20.
Genetic parameters for growth, stem straightness, pilodyn penetration, relative bark thickness and survival were estimated in a base population of five open-pollinated provenance/progeny trials of Eucalyptus viminalis. The trials, located in northern, central and southern Buenos Aires Province, Argentina, comprised 148 open-pollinated families from 13 Australian native provenances and eight local Argentinean seedlots. The Australian native provenances come from a limited range of the natural distribution. Overall survival, based on the latest assessment of each trial, was 62.4%. Single-site analyses showed statistically significant provenance differences (p < 0.05) for at least one of the studied traits in three out of the five trials analyzed. The local land race performed inconsistently in this study. The average narrow-sense individual-tree heritability estimate ($\hat{h}^2$) was 0.27 for diameter and 0.17 for total height. Values of $\hat{h}^2$ also increased with age. Pilodyn penetration, assessed at only one site, was more heritable ($\hat{h}^2 = 0.32$) than the average of growth traits. Estimated individual-tree heritabilities were moderate to low for stem straightness (average of 0.20) and relative bark thickness (0.16). The estimated additive genetic correlations ($r_A$) between diameter and height were consistently high and positive ($r_A$ average of 0.90). High additive genetic correlations were observed between growth variables and pilodyn penetration ($r_A$ average of 0.58). Relative bark thickness showed a negative correlation with diameter ($r_A = -0.39$) and height ($r_A = -0.51$). The average estimated additive genetic correlation between sites was high for diameter (0.67). The implications of all these parameter estimates for genetic improvement of E. viminalis in Argentina are discussed.
